question about focussing and speed of data acquisition
Added by Anonymous almost 18 years ago
Data acquisition seems to be somewhat slower (2-3 x) in Leeds than what I read in papers on work done at Scripps with Leginon (even on good days when we do not have problems with drift in Leeds). The suspected reasons are that some of the steps in focussing and/or exposure acquisition are CPU-intensive or require fetching a lot of data over the network. I'm not sure if it is both of those reasons or maybe just the slower network. (I think our network manager said that we are on a gigabit network, but at present I can use 100 Mbps at most because there are problems with getting a gigabit ethernet card for the laptop I run Leginon on.) I can't account for the slower speed just on the basis of the exposure image or exposure pair though. I was wondering how many fa preset images the Focussing node acquires (with different beam tilts to do the autofocus related calculations), and whether the number depends on whether the option to do astigmatism correction is picked? Also, is the autofocussing calculation computationally intensive? Presumably it computes a number of cross-correlation functions to get the image shift for the various images with different beam tilt. (De-spiking and other corrections of the fa preset images may be a concern as well.)
William
Replies (1)
- Added by Jim Pulokas over 17 years ago
Over the last few months, I had a chance to do an extensive timing analysis on Leginon. I broke down the very basic tasks, like acquiring an image, saving an image and transferring an image over the network.
I found that network transfers made up only a small part of the overall process; however, this was on a gigabit network. On a 100 Mbit network the network transfers may well become significant. Also, one or two seconds per image may not seem significant, but when acquiring thousands of images, that can become important. Each image is transferred over the network from the camera host to the Leginon host. If the image is to be saved, then it may be transferred over the network again if using NFS.
Here are a few examples of timings:
- readout time of 4kx4k image from Gatan camera: 8.0 s
- network transfer of 4kx4k, 16bit/pixel image (32MB total): 0.6 s
- saving 4kx4k, 32bit float image (64MB) local disk: 0.4 s
- saving 4kx4k, 32bit float image (64MB) NFS: 1.4 s
- calculate stats (min,max,mean,std) on 4k image: 0.8 s
- 4k flat field correction: 0.5 s
- 4k corrector despike: 1.8 s
- 4k corrector clip: 1.1 s
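As a rough cross-check of the network figure in the list above, the ideal transfer time of one frame can be estimated from its size and the link speed. This is a back-of-the-envelope sketch, not a measurement: it assumes a "4k" frame is exactly 4096 x 4096 pixels, it ignores all protocol overhead, and the function name is made up for the example.

```python
def ideal_transfer_seconds(width, height, bytes_per_pixel, link_mbps):
    """Lower bound on transfer time: payload bits divided by link rate."""
    payload_bits = width * height * bytes_per_pixel * 8
    return payload_bits / (link_mbps * 1_000_000)


# 4k x 4k, 16 bit/pixel frame (assumed 4096 x 4096), gigabit vs 100 Mbit
for mbps in (1000, 100):
    t = ideal_transfer_seconds(4096, 4096, 2, mbps)
    print(f"{mbps:>4} Mbit/s: at least {t:.2f} s per frame")
```

On these assumptions a 100 Mbit link cannot move one frame in much less than 2.7 s, against roughly 0.27 s on gigabit, so the transfer would no longer be a small part of the per-image time.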
If you want to produce similar timings, there is a module called speed.py in Leginon 1.4.0. There are several functions for timing different things (but not much documentation).
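A generic way to time one basic operation looks something like the sketch below. It does not use speed.py or reproduce any of its functions; `time_call` and `basic_stats` are names made up for the example, and the image is a small pure-Python stand-in rather than a real 4k frame.

```python
import time


def time_call(func, *args, repeats=3):
    """Return the best wall-clock time in seconds over `repeats` calls."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        func(*args)
        best = min(best, time.perf_counter() - start)
    return best


def basic_stats(pixels):
    """Return (min, max, mean, std) of a flat sequence of pixel values."""
    n = len(pixels)
    mean = sum(pixels) / n
    var = sum((p - mean) ** 2 for p in pixels) / n
    return min(pixels), max(pixels), mean, var ** 0.5


# Small stand-in image; a real 4k frame has about 16.8 million pixels.
image = [(i * 37) % 4096 for i in range(256 * 256)]
print(f"stats on {len(image)} pixels: {time_call(basic_stats, image):.4f} s")
```

Taking the best of a few repeats reduces the influence of whatever else the machine happens to be doing at the time.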
It turned out that there was not much improvement to be made in these basic operations, but there were significant problems in larger procedures. For instance, stats were often being calculated 3 times on the same image. Many extra pauses were inserted in various places. As you mentioned, focusing was very inefficient.
Since version 1.3, there have been several improvements to speed things up. In the focuser, it was acquiring a total of 6 images at fa preset (2 for a drift measurement, 2 for x beam tilt measurement, 2 for y beam tilt measurement). This is now cut down to 4 images (2 for drift check, one more for x tilt, one more for y tilt). More specifically it is now doing this:
- acquire first image at 0 beam tilt
- acquire second image at 0 beam tilt (measure drift)
- acquire image at +x beam tilt
- acquire image at +y beam tilt (measure defocus and stig)
If the stig option is turned off, it will now only do the x beam tilt, for a total of 3 images acquired.
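The sequence above, with and without the stig option, can be written out as a short sketch. This only illustrates the order and count of acquisitions described here; it is not the focuser code, and the function name and tuples are made up.

```python
def focus_acquisitions(correct_stig=True):
    """List the fa preset images acquired in one focus cycle, in order."""
    steps = [
        ("0 tilt", "first image"),
        ("0 tilt", "second image, measure drift"),
        ("+x tilt", "x beam tilt measurement"),
    ]
    if correct_stig:
        # The y tilt image is only taken when stig is measured as well.
        steps.append(("+y tilt", "y beam tilt measurement"))
    return steps


for stig in (True, False):
    steps = focus_acquisitions(stig)
    print(f"stig={stig}: {len(steps)} images")
    for tilt, purpose in steps:
        print(f"  {tilt:8s} {purpose}")
```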
One of the problems with the new method is that we are doing the correlation of 0 tilt and a +tilt. Before we correlated a -tilt and +tilt. We are now measuring a smaller total tilt angle so the measurement is not quite as good.
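To put a rough number on this, assume the simple first-order picture in which the image shift between two beam tilts is proportional to defocus times the tilt difference (higher-order terms such as spherical aberration ignored), and that the correlation peak can be located to within a fixed error. Both are simplifying assumptions, and the values below are invented purely for illustration.

```python
def defocus_error(shift_error_nm, tilt_difference_mrad):
    """Defocus uncertainty in nm, assuming shift = defocus * tilt difference."""
    return shift_error_nm / (tilt_difference_mrad * 1e-3)


tilt = 10.0        # mrad, illustrative beam tilt
shift_error = 1.0  # nm, illustrative error in locating the correlation peak

old = defocus_error(shift_error, 2 * tilt)  # -tilt correlated with +tilt
new = defocus_error(shift_error, tilt)      # 0 tilt correlated with +tilt
print(f"old: {old:.0f} nm, new: {new:.0f} nm, ratio: {new / old:.1f}")
```

Under these assumptions, halving the total tilt difference doubles the defocus uncertainty for the same shift error, which is the price paid for acquiring fewer images.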
There were also a few extra unnecessary preset changes and pauses going on during focusing. There are now no hidden pauses, but the essential ones are available for the user to set. Mainly, the pause configured in presets manager is used after every preset change, and the pause in each acquisition node should be tuned to the shortest pause necessary to allow everything to be stable after a move (image shift or stage).
When statistics are calculated on an image, those stats are now cached in case another function needs them.
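The idea is simply to compute once and hand back the stored result afterwards. A minimal sketch of that pattern, assuming the pixel data is not modified after the first calculation (the class and attribute names are hypothetical, not Leginon's):

```python
class CachedStats:
    """Wrap a flat pixel sequence and compute its statistics only once."""

    def __init__(self, pixels):
        self.pixels = pixels
        self._stats = None
        self.computations = 0  # counts real calculations, for the demo only

    def stats(self):
        if self._stats is None:
            n = len(self.pixels)
            mean = sum(self.pixels) / n
            var = sum((p - mean) ** 2 for p in self.pixels) / n
            self._stats = {
                "min": min(self.pixels),
                "max": max(self.pixels),
                "mean": mean,
                "std": var ** 0.5,
            }
            self.computations += 1
        return self._stats


image = CachedStats([3, 5, 7, 9])
for _ in range(3):  # three callers asking for the same stats
    image.stats()
print("calculations performed:", image.computations)
```

Three requests result in a single calculation, which is what removes the case of stats being computed 3 times on the same image.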
We typically use VNC to remotely access the Tecnai while running Leginon on a linux computer in another room. It was determined that having a VNC connection open while running Leginon could significantly slow down data transfers and computation.
If you can get away with acquiring raw images for the final acquisition, this can save a lot of time by avoiding flat field correction, despike, clipping, etc. We have been doing this and then running a post-processing script later. This saves computation time, and it also saves some time because only a 16bit image is written to disk rather than a float.
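Using the per-image timings quoted earlier in this reply, the saving can be estimated as below. The only assumption added here is that a "4k" frame is exactly 4096 x 4096 pixels; the count of 1000 images is just an example.

```python
# Per-image correction timings quoted earlier in this reply (seconds, 4k image).
correction_steps = {"flat field": 0.5, "despike": 1.8, "clip": 1.1}


def savings(n_images):
    """Seconds of correction time avoided by saving raw frames."""
    return n_images * sum(correction_steps.values())


pixels = 4096 * 4096           # assumed exact size of a "4k" frame
raw_mb = pixels * 2 / 2**20    # 16 bit integer
float_mb = pixels * 4 / 2**20  # 32 bit float
print(f"raw {raw_mb:.0f} MB vs float {float_mb:.0f} MB per image")
print(f"1000 images: about {savings(1000) / 60:.0f} min of correction avoided")
```

That is about 3.4 s per image, or close to an hour over a thousand images, before counting the smaller file being written.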
Soon we will add an option to explicitly convert a corrected float image back to integer before saving.
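Such a conversion could look roughly like the sketch below. How the planned option will treat negative or out-of-range values is not decided here, so the rounding and clamping to an unsigned 16 bit range are assumptions, as is the function name.

```python
from array import array


def float_to_uint16(values):
    """Round corrected float pixels and clamp them to the 16 bit range."""
    return array("H", (min(max(round(v), 0), 65535) for v in values))


corrected = [-2.3, 0.4, 1234.6, 70000.0]
as_int = float_to_uint16(corrected)
print(list(as_int), as_int.itemsize, "bytes per pixel")
```

Negative values are clamped to 0 and values above 65535 to 65535 in this sketch; NaN pixels are not handled.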
In Leginon 1.4, there is a way to store timing information in the database while Leginon runs. It is disabled by default. Right now enabling/disabling this requires editing node.py. The method "storeTime" is disabled by returning immediately. Enable it by removing the return statement. In many places throughout the Leginon code, a section of code can be timed by calling startTimer and stopTimer. This inserts time information into the TimerData table. This table grows very fast, which is why this is turned off by default.
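The pattern is roughly the one in the toy class below. This is not the code from node.py: the method names are borrowed from the description above, but the arguments, the `enabled` flag (node.py uses a hard-coded return instead) and the in-memory list standing in for the TimerData table are all assumptions made for the example.

```python
import time


class TimedNode:
    """Toy stand-in for a node; not Leginon's node.py."""

    def __init__(self, enabled=False):
        self.enabled = enabled
        self.started = {}  # label -> start time
        self.rows = []     # stands in for the TimerData table

    def storeTime(self, label, seconds):
        if not self.enabled:
            return  # returning immediately is what disables timing
        self.rows.append((label, seconds))

    def startTimer(self, label):
        self.started[label] = time.perf_counter()

    def stopTimer(self, label):
        elapsed = time.perf_counter() - self.started.pop(label)
        self.storeTime(label, elapsed)


node = TimedNode(enabled=True)
node.startTimer("acquire")
time.sleep(0.01)  # the section of code being timed
node.stopTimer("acquire")
print(node.rows)
```

Every timed section adds one row, so with many timed sections per image it is easy to see why the table grows quickly.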