Conflict between local leginon.cfg and rawtransfer.py command line options?

Added by Patrick Goetz over 6 years ago

This ticket is related to the question asked here: http://emg.nysbc.org/boards/6/topics/3139?r=3149

In pursuit of addressing everyone's image security concerns, I created per-lab image collection accounts on the main processing system that have limited privileges and can only log in from machines in the microscope area. I will document this once it is all working, which it is not yet.

Yesterday we tried testing the taylorimages image collection account. This account has the following local leginon.cfg setting:

[Images]
path: /EM/taylorlab/leginon

Images were subsequently being saved to the correct leginon folder:

root@kraken:/EM/taylorlab/leginon/18aug09b# ls -l rawdata
total 131072
-rw-rw-r-- 1 taylorimages taylorlab 56956944 Aug  9 13:09 18aug09b_00001en.count.mrc
-rw-rw-r-- 1 taylorimages taylorlab 56956944 Aug  9 13:20 18aug09b_00002en.count.mrc

But nothing was being written to the frames folder. When I checked the systemd rawtransfer.py log for this microscope, I noticed the following error messages:

Aug 10 11:33:33 kraken rawtransfer.py[19649]: frames_path = /EM/taylorlab/frames/18aug09b/rawdata
Aug 10 11:33:33 kraken rawtransfer.py[19649]:     Destination frame path does not starts with /cryodata. Skipped
Aug 10 11:33:33 kraken rawtransfer.py[19649]: **running /local/talos-k2-frames/20180809_13194701.mrc
Aug 10 11:33:33 kraken rawtransfer.py[19649]: frames_path = /EM/taylorlab/frames/18aug09b/rawdata
Aug 10 11:33:33 kraken rawtransfer.py[19649]:     Destination frame path does not starts with /cryodata. Skipped
Aug 10 11:33:33 kraken rawtransfer.py[19649]: **running /local/talos-k2-frames/20180809_13280000.mrc
Aug 10 11:33:33 kraken rawtransfer.py[19649]: frames_path = /EM/taylorlab/frames/18aug09c/rawdata
Aug 10 11:33:33 kraken rawtransfer.py[19649]:     Destination frame path does not starts with /cryodata. Skipped

Of course, this is because the rawtransfer script references a different destination_head:

root@kraken:/etc/systemd/system# cat rawtransfer-talos.service 
[Unit]
Description=rawtransfer.py service for Talos
After=network.target

[Service]
Type=simple
Restart=on-failure
User=root
Group=root

ExecStart=/usr/local/lib/python2.7/dist-packages/leginon/rawtransfer.py --method=rsync --source_path=/local/talos-k2-frames --camera_host=talos-k2 --destination_head=/cryodata 2>&1 >/var/log/rawtransfer-talos-k2.log

[Install]
WantedBy=multi-user.target

I think this just means that everything must be under the same root-level destination_head, but I'm concerned that we may run into problems because there are multiple leginon/frames subdirectories under this head. That is, instead of

               ---  leginon
             /  
   /cryodata
             \
               --- frames

The proposed directory structure is going to be

                   ---- leginon
                  /
             labA
                  \
          /        ---- frames
         /
        /           ---- leginon
       /           / 
   /EM ----- labB
       \           \
        \           ---- frames
         \
          \        ---- leginon
                  /
             labC
                  \
                   ---- frames

So the question is, is there some way to localize the location of the frames folder similar to how having a local leginon.cfg file localizes the location of the leginon folder?

Second, an ongoing issue. Note the permissions on the rawdata:

root@kraken:/EM/taylorlab/leginon/18aug09b# ls -l rawdata
total 131072
-rw-rw-r-- 1 taylorimages taylorlab 56956944 Aug  9 13:09 18aug09b_00001en.count.mrc
-rw-rw-r-- 1 taylorimages taylorlab 56956944 Aug  9 13:20 18aug09b_00002en.count.mrc

One of the PI's conditions is that other labs not be able to see his data. It would save me jumping through a lot of hoops if the umask could be changed so that the file permissions are rw-r----- rather than rw-rw-r--. rawtransfer runs as root, but the files are saved under the name of the user who runs the start-leginon command; in this case taylorimages. I think Anchi mentioned that rawtransfer is using the default umask; the question is just whose default umask: the system's, root's, or taylorimages'? I would prefer not to modify the system-wide umask, and am wondering if there is any other way to get more restrictive permissions without doing so (which, by the way, I think should be the default behavior under any circumstances).
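
For reference, here is a minimal sketch of how a umask turns a requested file mode into the permissions seen in the listing above. The umask is a per-process attribute inherited from the parent process, so the one that matters is the umask of whichever process actually creates the file. The helper names are hypothetical and are not part of rawtransfer.py or Leginon.

```python
import os
import stat


def mode_after_umask(requested_mode, umask):
    """Return the permission bits that remain after a umask is applied."""
    return requested_mode & ~umask & 0o777


# Regular files are usually requested with 0o666; the umask strips bits from that.
print(stat.filemode(stat.S_IFREG | mode_after_umask(0o666, 0o002)))  # -rw-rw-r--
print(stat.filemode(stat.S_IFREG | mode_after_umask(0o666, 0o027)))  # -rw-r-----


def create_with_umask(path, umask):
    """Create an empty file under a temporary umask and return its mode bits."""
    previous = os.umask(umask)  # os.umask returns the previous value
    try:
        fd = os.open(path, os.O_CREAT | os.O_WRONLY, 0o666)
        os.close(fd)
    finally:
        os.umask(previous)  # restore so the rest of the process is unaffected
    return stat.S_IMODE(os.stat(path).st_mode)
```

A umask of 002 would account for the rw-rw-r-- files shown above, and 027 would give the rw-r----- result I am after, without touching the system-wide default.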


Replies (8)

RE: Conflict between local leginon.cfg and rawtransfer.py command line options? - Added by Anchi Cheng over 6 years ago

The destination_head option in rawtransfer.py is a filter, as in the following pseudocode:

if leginon_image_path_in_database not start_with(destination_head):
  skip

Therefore, in your new scheme, you can either omit the option, which means everything is processed by default, or use the non-discriminating --destination_head=/EM. The separation by group then comes from the per-lab directories you created and the paths set in each user-based leginon.cfg.
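
The filter can be sketched in Python roughly as follows. This is an illustration of the rule described above, not the actual rawtransfer.py code: the function name is hypothetical, and the /cryodata path in the example is made up for comparison.

```python
def should_transfer(image_path, destination_head=""):
    """Keep a path only if it starts with destination_head.

    An empty destination_head matches every path, which corresponds to
    leaving the option unspecified.
    """
    return image_path.startswith(destination_head)


paths = [
    "/EM/taylorlab/leginon/18aug09b/rawdata",
    "/cryodata/leginon/18aug09b/rawdata",  # illustrative path only
]
for head in ("/cryodata", "/EM", ""):
    kept = [p for p in paths if should_transfer(p, head)]
    print(repr(head), kept)
```

With --destination_head=/cryodata the taylorlab path is skipped, which matches the log messages; with /EM or no head at all it is processed.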

Regarding localizing the frames directory, it follows the same rule by which the leginon folder is made: it is a simple string replacement of leginon with frames. Therefore, as long as the leginon path you specify contains the word leginon, the frames path will be derived correctly.
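
As a rough sketch of that replacement rule (the function name is hypothetical, and whether the real code replaces every occurrence of the word or only one is an assumption here):

```python
def frames_path_from_image_path(image_path):
    """Derive the frames path by swapping 'leginon' for 'frames' in the path."""
    return image_path.replace("leginon", "frames")


print(frames_path_from_image_path("/EM/taylorlab/leginon/18aug09b/rawdata"))
# /EM/taylorlab/frames/18aug09b/rawdata

# A path that does not contain the word is returned unchanged, so no
# separate frames location would be derived from it.
print(frames_path_from_image_path("/EM/taylorlab/images/18aug09b/rawdata"))
# /EM/taylorlab/images/18aug09b/rawdata
```

The first result is the same frames_path that appears in your rawtransfer log.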

The umask function we use when creating the leginon session image and frames paths preserves the umask of the parent directory. In this case, for labC, that would be /EM/labC/leginon and /EM/labC/frames. Since you created these two directories as the administrator, you should have control over how they behave.

RE: Conflict between local leginon.cfg and rawtransfer.py command line options? - Added by Patrick Goetz over 6 years ago

The umask function we use when creating the leginon session image and frames paths preserves the umask of the parent directory. In this case, for labC, that would be /EM/labC/leginon and /EM/labC/frames. Since you created these two directories as the administrator, you should have control over how they behave.

We couldn't get this to work as advertised. Here are the permissions on the parent directories we are able to control by hand (i.e. those not created by leginon):

root@kraken:/EM/taylorlab/frames/18aug09c/rawdata# cd /
root@kraken:/# ls -ld EM
drwxr-xr-x 5 root root 72 Jul 19 12:57 EM
root@kraken:/# cd EM
root@kraken:/EM# ls -ld taylorlab/
drwxr-s--- 4 taylorimages taylorlab 47 Aug  9 13:24 taylorlab/
root@kraken:/EM# cd taylorlab
root@kraken:/EM/taylorlab# ls -l
total 0
drwxr-s--- 16 taylorimages taylorlab 290 Aug 17 11:43 frames
drwxr-s--- 17 taylorimages taylorlab 310 Aug 17 11:42 leginon
root@kraken:/EM/taylorlab# cd frames
root@kraken:/EM/taylorlab/frames# ls -l
total 0
drwxrwsr-x 3 taylorimages taylorlab 29 Aug 17 11:33 18aug09b
drwxrwsr-x 3 taylorimages taylorlab 29 Aug 17 11:33 18aug09c
drwxrwsr-x 3 taylorimages taylorlab 29 Aug 17 11:34 18aug09d
drwxrwsr-x 3 taylorimages taylorlab 29 Aug 17 11:34 18aug09e
drwxrwsr-x 3 taylorimages taylorlab 29 Aug 17 11:34 18aug09f
drwxrwsr-x 3 taylorimages taylorlab 29 Aug 17 11:38 18aug17a
drwxrwsr-x 3 taylorimages taylorlab 29 Aug 17 11:43 18aug17b
drwxrwsr-x 3 taylorimages taylorlab 29 Jul 19 12:35 18jul19g
drwxrwsr-x 3 taylorimages taylorlab 29 Jul 19 12:54 18jul19h
drwxrwsr-x 3 taylorimages taylorlab 29 Jul 19 13:05 18jul19i
drwxrwsr-x 3 taylorimages taylorlab 29 Jun  1 15:08 18jun01a
drwxrwsr-x 3 taylorimages taylorlab 29 Jun  1 15:15 18jun01b
drwxrwsr-x 3 taylorimages taylorlab 29 May 24 12:11 18may24b
drwxrwsr-x 3 taylorimages taylorlab 29 May 24 12:16 18may24c
root@kraken:/EM/taylorlab/frames# cd 18aug09c
root@kraken:/EM/taylorlab/frames/18aug09c# ls -l
total 0
drwxrwsr-x 3 taylorimages taylorlab 77 Aug 17 11:34 rawdata
root@kraken:/EM/taylorlab/frames/18aug09c# ls -l rawdata
total 556216
-rwxrwxrwx 1 taylorimages taylorlab 569561248 Aug  9 13:28 18aug09c_00001en.count.frames.mrc
drwxrwsr-x 2 taylorimages taylorlab       297 Aug 17 11:33 references

Notice that the leginon and frames folders (and the parent of these folders) have no permissions for "other", yet the final mrc file is being written with world-writable permissions, and the intermediate folders are set to 775 with the setgid bit (drwxrwsr-x). This is not so much a problem for people who are not members of the lab: the --- "other" permission on the parent directories prevents them from viewing, copying, or modifying the underlying data. The problem is that a person in the taylorlab group will be able to modify, copy, or even delete the mrc files, because the parent directories grant the group r-x access and the session directories and files below them are group-writable. I can't restrict the parent directories any further, as lab members need to be able to access the images to do research.
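
To make the permission strings in the listing concrete, here is a small sketch using Python's stat module. The helper names are hypothetical, and the delete check is simplified: it ignores the file owner, root, the sticky bit, and ACLs.

```python
import stat


def describe(mode, is_dir=False):
    """Render numeric permission bits the way ls -l prints them."""
    file_type = stat.S_IFDIR if is_dir else stat.S_IFREG
    return stat.filemode(file_type | mode)


def group_can_delete_entries(dir_mode):
    """Deleting a file needs group write and search (execute) on its directory."""
    needed = stat.S_IWGRP | stat.S_IXGRP
    return (dir_mode & needed) == needed


print(describe(0o2750, is_dir=True))  # drwxr-s---  (frames, leginon)
print(describe(0o2775, is_dir=True))  # drwxrwsr-x  (session and rawdata folders)
print(describe(0o777))                # -rwxrwxrwx  (the frames mrc file)
print(describe(0o640))                # -rw-r-----  (what I would like instead)

print(group_can_delete_entries(0o2775))  # True
print(group_can_delete_entries(0o2750))  # False
```

So it is the group write bit on the rawdata folder, not the mode of the mrc file itself, that lets a group member delete the file.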

One of the PIs has requested the specific system design criterion that none of his grad students or postdocs be able to delete raw data files (some of the research is corporate sponsored), and at this point I'm not sure how to accomplish this without introducing data access restrictions that will be onerous. It would be nice if the final mrc file could be written with rw-r----- permissions.

RE: Conflict between local leginon.cfg and rawtransfer.py command line options? - Added by Anchi Cheng over 6 years ago

I just checked the code. In rawtransfer.py, after root runs rsync to put the movie files into the directory, it only does a chown, not a chmod. Therefore, the behavior you see is the default behavior on your system.

If you so desire, we can enforce the desired mode as an option.

RE: Conflict between local leginon.cfg and rawtransfer.py command line options? - Added by Patrick Goetz over 6 years ago

It seems prudent to have the option of restricting permissions further by default. I can't imagine there aren't other cryo-em groups facing similar challenges. In our case, we're going to be making the facility available to external research groups, which could be competitors; hence the need for raw data security. And my PI's request that there should be a distinction between people who can work with raw data and people who can edit/delete raw data seems reasonable and necessary as well.

RE: Conflict between local leginon.cfg and rawtransfer.py command line options? - Added by Anchi Cheng over 6 years ago

Added this feature in Issue #5958 and pushed to trunk and myami-beta.

I recommend using this to modify, not redefine, the permissions. It operates recursively from the session path.
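
To illustrate what modifying rather than redefining means, here is a generic sketch that clears selected permission bits recursively and leaves all other bits, including setgid, as they were. It is not the code or the option added in Issue #5958; the function name and the choice of bits are assumptions for the example.

```python
import os
import stat


def remove_permission_bits(session_path, bits_to_remove):
    """Recursively clear bits_to_remove on session_path and everything below it."""
    for dirpath, _dirnames, filenames in os.walk(session_path):
        targets = [dirpath] + [os.path.join(dirpath, f) for f in filenames]
        for target in targets:
            if os.path.islink(target):
                continue  # os.chmod would follow the link, so leave links alone
            current = stat.S_IMODE(os.lstat(target).st_mode)
            os.chmod(target, current & ~bits_to_remove)


# Example: drop group write and all "other" access below a session folder.
# remove_permission_bits("/EM/taylorlab/frames/18aug09c",
#                        stat.S_IWGRP | stat.S_IRWXO)
```

Because only the listed bits are cleared, a file that is already rw-r----- stays that way, while a 777 file becomes 750.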

RE: Conflict between local leginon.cfg and rawtransfer.py command line options? - Added by Patrick Goetz about 6 years ago

Thanks! However, it's not entirely clear how to use this. Does this now automatically set permissions to those of the parent directory?

RE: Conflict between local leginon.cfg and rawtransfer.py command line options? - Added by Anchi Cheng about 6 years ago

See Issue #5958. I repeated the option description there.
