Project

General

Profile

Help! RawTransfer is stuck -- how to debug?

Added by Patrick Goetz over 6 years ago

I have rawtransfer.py running out of a service file; usually this is fine except occasionally the camera PC is rebooted and the image server loses its Samba mountpoint. Yesterday this was all working fine, and nothing on the system related has changed.

Today I got a call from a user saying images weren't showing up on the web page. When I checked, the Samba share had become unmounted and the rawtransfer service had died. I remounted and restarted, but now nothing happens. Here is the rawtransfer.py command:

ExecStart=/usr/local/lib/python2.7/dist-packages/leginon/rawtransfer.py --method=rsync --source_path=/data2/krios-k2-frames --camera_host=krios-k2 --destination_head=/cryodata 2>&1 >/var/log/rawtransfer-krios-k2.log

It's as if the program is stuck: images are not rotating out of /data2/krios-k2-frames and nothing appears in the log files; i.e. the log file is persistently empty -- shouldn't something be getting logged? Finally, there is no new folder in /cryodata/frames.

Is there any way to debug this, given that nothing is being logged? Is the log file syntax incorrect?


Replies (3)

RE: Help! RawTransfer is stuck -- how to debug? - Added by Anchi Cheng over 6 years ago

First, check the mount point, make sure it is o.k by checking if you can access the files in /data2/krios-k2-frames

Second, I recommend run it not as service and without piping to the log as a test. This way all errors comes to standard output and you can use ctrl-c to see where it would break it out.

I do not see syntex error but I do not know what the part

2>&1

is for.

RE: Help! RawTransfer is stuck -- how to debug? - Added by Patrick Goetz over 6 years ago

It seems to be working now:

root@kraken:/data2/krios-k2-frames# systemctl status rawtransfer-krios
● rawtransfer-krios.service - rawtransfer.py service for Krios
   Loaded: loaded (/etc/systemd/system/rawtransfer-krios.service; enabled; vendor preset: enabled)
   Active: active (running) since Fri 2018-06-08 18:01:18 CDT; 11h ago
 Main PID: 12510 (python)
    Tasks: 2
   Memory: 37.3G
      CPU: 56min 3.930s
   CGroup: /system.slice/rawtransfer-krios.service
           └─12510 python /usr/local/lib/python2.7/dist-packages/leginon/rawtransfer.py --method=rsync --source_pat

Jun 09 05:33:35 kraken rawtransfer.py[12510]: mkdirs /cryodata/frames/18jun08a/rawdata
Jun 09 05:33:35 kraken rawtransfer.py[12510]: rsync -av --remove-sent-files /data2/krios-k2-frames/20180609_0532250
Jun 09 05:34:26 kraken rawtransfer.py[12510]: chown -R 1005:1005 /cryodsending incremental file list
Jun 09 05:34:26 kraken rawtransfer.py[12510]: 20180609_05334110.mrc
Jun 09 05:34:40 kraken rawtransfer.py[12510]: sent 854,549,528 bytes  received 43 bytes  58,934,453.17 bytes/sec
Jun 09 05:34:40 kraken rawtransfer.py[12510]: total size is 854,340,848  speedup is 1.00
Jun 09 05:37:08 kraken rawtransfer.py[12510]: sending incremental file list
Jun 09 05:37:08 kraken rawtransfer.py[12510]: 20180609_05361711.mrc
Jun 09 05:37:23 kraken rawtransfer.py[12510]: sent 854,549,528 bytes  received 43 bytes  55,132,230.39 bytes/sec
Jun 09 05:37:23 kraken rawtransfer.py[12510]: total size is 854,340,848  speedup is 1.00

I tried stopping the systemd service and running rawtransfer by hand in a shell previously, an it just hung, with no output sent to var/log/rawtransfer-krios-k2.log.

By the way, this seems to be a mistake in my service file, as systemd is capturing all the output and error messages and redirecting them to the systemd journal (hence the status messages above). This is why nothing is ever showing up in the log files.

The remaining mystery and what was confusing me is these images seem to be stuck in the Samba transfer folder mounted from the Camera PC:

root@kraken:/data2/krios-k2-frames# ls
20180506_18432300.mrc  20180506_19304703.mrc  20180514_18512002.mrc
20180506_19222000.mrc  20180508_09474223.mrc  20180515_19260211.mrc
20180506_19254501.mrc  20180514_18410200.mrc  20180609_05435616.mrc
20180506_19274502.mrc  20180514_18461201.mrc

A 18jun08a folder was created in the frames folder, and it's filled with images, and the mounted folder is writeable. How can I determine what project these images belong to, and why are they not being transferred somewhere by rawtransfer.py?

RE: Help! RawTransfer is stuck -- how to debug? - Added by Anchi Cheng over 6 years ago

I am glad it works now but that does not answer why it did not run before.

Regarding images remaining on K2. An image that is saved to database contains a metadata field of the frame movie filename on K2 camera. The ocassional leftover on K2 camera means that they failed the search of the filename search in the database. This happens ocassionally. We collect data alternating between frame saved and not saved for different functions. Some timing issue may cause a transient acquisition intended not to save frames to save them. Since a transient image (such as an image taken with Acquire tool in Navigation node) is not recorded in the database, it will fail the filename search. These filenames reads like a timestamp. You can, and should remove them periodically so that rawtransfer.py won't have to check them for transferring again and again.

I do my best to make sure the command I issue for transient image acquisition to stripe off the frame saving part of the preset, but it seems that I may have still missed some sequence of image acquisition route. Until I figure out what sequence of event causes this, I can't not close the gap. Nevertheless, it does not miss movies that should be saved, so it is a minor bug.

    (1-3/3)