sacrificial square questions

Added by Anonymous over 18 years ago

I have some questions about using a sacrificial square to set up suitable parameters in the Hole Targeting and Exposure Targeting nodes. While I am choosing suitable parameters for Hole Targeting, is the sacrificial square exposed for long enough, sitting in the sq preset, that significant radiation damage is done to it? I'm wondering whether I will get suitable (production-quality) data from the sacrificial square after setting the Hole Targeting parameters, or whether this is not possible because of radiation damage. (The sq preset is low magnification, but setting the Hole Targeting parameters can take quite a long time: from a few minutes to about half an hour, with about fifteen minutes being typical.) Also, what about the (smaller) area exposed while setting parameters in Exposure Targeting, in the hl preset? (The hl preset is at a higher magnification than the sq preset, and it can still sometimes take a number of minutes to find suitable parameters for Exposure Targeting.) Finally, can I de-select squares picked in the atlas in the Square Targeting node? I might want to de-select the square used as the sacrificial square (after the parameters have been set in the Hole Targeting and Exposure Targeting nodes), a square I picked by accident, or a square I later decide to remove because I notice a problem with it.

William


Replies (14)

- Added by Anchi Cheng over 18 years ago

The sacrificial square is so called mainly because you may have to adjust the beam size and location for the presets on the microscope yourself. The CCD exposure times for sq and hl are usually very short (5-100 ms) in comparison to the time it takes a human to check on the microscope (several seconds at minimum). If it takes you ten rounds of adjustment at all presets, you may fry an unknown number of holes. Because of the uncertainty about whether it is fried, we recommend not using the same square for data collection. However, if you get through the preset alignment part quickly, you might be able to use the part you have not exposed at high magnification.

Setting suitable parameters in Hole Targeting and Exposure Targeting requires only one image each. When the CCD is not acquiring an image, the beam is blanked and the grid is not exposed, so the length of time you spend setting these parameters does not count as time the grid is exposed.
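As a rough illustration of this point, the sketch below compares beam-on time with wall-clock time, assuming the beam really is blanked between acquisitions. The image count and session length are made-up numbers; the 100 ms exposure is taken from the upper end of the range quoted above.

```python
def beam_on_time_ms(n_images, exposure_ms):
    """Total time the grid is exposed, assuming the beam is blanked between images."""
    return n_images * exposure_ms

# Hypothetical session: 10 test images at 100 ms each, taken over 15 minutes
# of parameter tuning. Both numbers are assumptions for illustration only.
exposed_ms = beam_on_time_ms(10, 100)
wall_clock_ms = 15 * 60 * 1000

print("beam-on: %.1f s out of %.0f s elapsed" % (exposed_ms / 1000.0, wall_clock_ms / 1000.0))
```

Under these assumptions the grid sees about one second of beam during a fifteen-minute tuning session, so the number of images taken matters far more than the time spent looking at them.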

The sacrificial square you use for parameter setting will be considered done if you follow the Quick Start for MSI. You will see this when you refresh targets in Square Targeting. There is no need to "deselect" it.

Noticing problems on one or several targets after submission is a different situation. You can deselect a target in Square Targeting only if it has not been submitted. After targets are submitted, you can only abort all remaining targets in the acquisition node that receives the target list, Square in this case. In its current state, Leginon cannot remove a specific target from the middle of its list while the list is being processed. If you use the user confirmation option in Hole Targeting, you can stop further processing of a particular problem square that the Square node has just acquired by submitting an empty hole target list for it.

See the Trouble Shooting chapter of the Leginon Manual, regarding Pausing and Aborting, for more tips on what to do if you need to abort.

- Added by Anonymous over 18 years ago

Today I somehow ended up with a problematic set of targets in Square Targeting. At some point, I exited Leginon because it was hanging (which happened immediately after turning off all the checks in the Focus and Z Focus nodes and also the checks on targets in Hole Targeting and Exposure Targeting). I exited and re-started the Leginon client and Leginon. Then I loaded up the tiles again in Square Targeting and re-submitted the targets. All of the nodes seem to be waiting or completely inactive (as indicated by tool tips and also the stationary green wheel). The waiting nodes are the Square, Hole Targeting, Hole and Exposure Targeting. None of the nodes has any error messages, although I notice that Exposure Targeting has the information message "Already processed this image.... republishing". I tried pressing the abort button and then the play button in some of the waiting nodes in an attempt to get rid of what appears to be an impossible-to-process target.

There does not seem to be any way of dealing with this problem apart from collecting a completely new atlas. If I reload the old atlas, Leginon still has the square target that was not completely processed and somehow remembers that there is a target (image/exposure) on the square that was not completely done, but then gets mixed up somehow and fails to process the target,

William

- Added by Anonymous over 18 years ago

I tried again with the same cold grid. I quit and re-started Leginon and the Leginon client. I obtained a new atlas. All the checks were off (for focussing in the Focus and Z Focus nodes and hole targeting in the Hole Targeting and Exposure Targeting nodes). I kept settings obtained from the previous run of MSI-T on this grid (i.e. settings for hole targeting etc including template radius, threshold used in correlation, etc). I picked a number of squares in Square Targeting again and submitted the targets. I left Leginon to run and now, two hours later, it looks like nothing has happened. Various nodes again seem to be waiting on each other or for something external to happen. The following nodes are waiting and have a stationary green wheel/circle next to them: Grid, Square, Hole Targeting, Hole, Z Focus and Exposure Targeting. None of the other nodes are doing anything (i.e. not even waiting; they do not have any green wheel next to them). According to the tool tip, the Grid node is processing and the last message in its log is: "(i) 6:21:17 Continuing...", Square node is waiting and the last message in its log is: "(i) 6:32:56 processing: Waiting for 6434 to be processed", Hole Targeting node is waiting and the last message in its log is: "(i) 6:33:00 Waiting for target list ID 16241...", Hole node is waiting and the last message in its log is: "(i) 6:34:04 processing: Waiting for 6436 to be processed", Z Focus node is processing and the last two messages in its log are: "(i) 6:33:50 Z Focus done with target list ID: (('bmsd027', 49152), 16265), status: success (i) 6:33:50 Continuing..." and the Exposure Targeting node is waiting with the last messages in its log: "(i) 6:34:06 Acquisition Targets: 0 (i) 6:34:06 Focus Targets: 0 (i) 6:34:06 Publishing targets... (i) 6:34:06 Waiting for target list ID 16611...",

William

- Added by Anonymous over 18 years ago

Also, when I quit Leginon after the described unsuccessful run I got the error message below from Leginon. There are not any error messages from the Leginon client (although I think the Instrument node vanished after quitting the main Leginon),

William


Exception happened during processing of request from ('127.0.0.2', 36007)

Unhandled exception in thread started by <bound method Thread.__bootstrap of <Thread(Thread-44, stopped daemon)>>

Traceback (most recent call last):

File "/usr/lib/python2.3/threading.py", line 444, in __bootstrap

_print_exc(file=s)

File "/usr/lib/python2.3/traceback.py", line 209, in print_exc

etype, value, tb = sys.exc_info()

AttributeError: 'NoneType' object has no attribute 'exc_info'

Waiting for targets to process problem - Added by Anchi Cheng over 18 years ago

Hi, William,

The problem you have happens occasionally to us, too, although we could never reproduce it reliably enough to debug it. Let us know if you find a way to reproduce it from a fresh session every time. What happens is that a node is waiting for a target to complete its processing, which was almost completed in the sense that it did everything it needed to do except log in the database that it has finished the task. As long as you are in the same session, Leginon nodes will always try to complete that particular target but cannot, because the node doing the task thinks it is done while the one waiting thinks it is not. Because of the crash, the session is sometimes so corrupted that it will not recover even if you go into the database and mark the apparent problem target as done.

You can try using a Python script Jim wrote, called killtargets.py. It is in the Leginon distribution, in the Leginon subdirectory of the directory where the package is installed. Just run it as a Python script. You will need to know the session ID number, which you can find by moving the cursor onto the summary label in your web Image Viewer. The script forces all targets to be marked done. You can then select new targets on the same atlas after refreshing it.

If this does not work out, the sure thing to do is to start a new, clean session. All target processing information goes with the same session only.

Anchi

Re: Waiting for targets to process problem - Added by Anonymous over 18 years ago

It looks like the problem happens very often, perhaps every time I run Leginon in fully-automated mode, with checks turned off, in the current release of Leginon. (I think I may be able to avoid this problem of nodes apparently deadlocking by running in queuing mode.) By the way, I recently checked the network speed between the Leginon laptop and the microscope PC. Apparently, it is about 13 Mbps (based on the average time to send 1 kilobyte packets with ping between the systems; maybe this is an incorrect method). The Leginon notes recommend gigabit with a minimum of 100 Mbps. I'm not sure if this means we are on a 10 Mbps network with virtually no contention (although it's possibly too fast for that) or a 100 Mbps network with rather a lot of contention. We were told by the network administrator that we were getting 100 Mbps but it may have changed to a slower speed at some time in the past. Would this be expected to cause these problems and, if not, what other problems would occur?

William

Re: Waiting for targets to process problem - Added by Anonymous over 18 years ago

Apparently netperf is more suitable for testing the bandwidth than ping and according to netperf we are getting about 80 Mbps at an off-peak time which suggests we really are on a 100 Mbps network. I'm not sure how much the speed drops during normal working hours though with other users possibly on the network,

William
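A rough sketch of why timing small ping packets can understate the bandwidth of a link: each packet pays a fixed latency that has nothing to do with payload size. The function name and the 0.55 ms latency are assumptions, chosen only so that the small-packet figure lands near the 13 Mbps mentioned earlier in the thread; they are not measurements from this network.

```python
def apparent_mbps(payload_bytes, link_mbps, fixed_latency_s):
    """Bandwidth you would infer from timing a single transfer of payload_bytes."""
    bits = payload_bytes * 8
    transfer_s = bits / (link_mbps * 1e6)      # time spent actually moving data
    return bits / (fixed_latency_s + transfer_s) / 1e6

LINK_MBPS = 100          # nominal link speed
LATENCY_S = 0.00055      # hypothetical fixed per-packet overhead

small = apparent_mbps(1024, LINK_MBPS, LATENCY_S)              # ping-sized payload
large = apparent_mbps(10 * 1024 * 1024, LINK_MBPS, LATENCY_S)  # bulk transfer

print("1 KB packet: %.1f Mbps, 10 MB transfer: %.1f Mbps" % (small, large))
```

With these assumed numbers the 1 KB packet suggests roughly 13 Mbps while the bulk transfer approaches the full 100 Mbps, which is consistent with a bulk-throughput tool such as netperf reporting a much higher figure than the ping estimate.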

Re: Waiting for targets to process problem - Added by Anonymous about 18 years ago

I seem to get the problem with nodes hanging quite often, when I run fully automatically (i.e. with checks turned off and without using queuing mode). It happens less often subsequent to upgrading to SUSE 10.0 and also I can generally recover from the problem by just re-starting Leginon and re-loading the atlas without using killtargets.py now, after the upgrade. However, it is still a nuisance as it can be a while before I discover that Leginon is hanging in this way (and stopped collecting data from the time the hang occurred) e.g. on the last session it was a couple of hours before I found out and re-started Leginon.

While Leginon was running and ended up with the hang I got the following error message:

Exception in thread data binder handler thread:

Traceback (most recent call last):

File "/usr/lib/python2.4/threading.py", line 442, in __bootstrap

self.run()

File "/usr/lib/python2.4/threading.py", line 422, in run

self.__target(*self.__args, **self.__kwargs)

File "/usr/lib/python2.4/site-packages/Leginon/databinder.py", line 131, in handleData

method(args)

File "/usr/lib/python2.4/site-packages/Leginon/targethandler.py", line 168, in handleTargetListDone

self.targetlistevents[targetlistid]['received'].set()

KeyError: 'received'
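One thing the traceback itself shows: if targetlistevents is a plain dictionary, then KeyError: 'received' means the lookup by target list ID succeeded and it was the inner dictionary that lacked a 'received' entry (a missing ID would have produced a KeyError naming the ID instead). The sketch below reproduces that shape of failure with a stand-in dictionary. Everything except the names targetlistevents and 'received' is hypothetical, and the tolerant variant is only an illustration, not a proposed Leginon fix.

```python
import threading

# Hypothetical stand-in for the mapping named in the traceback:
# target list ID -> dict of threading.Event objects.
targetlistevents = {5077: {}}   # the ID is registered, but 'received' was never added

def set_received_strict(targetlistid):
    # Same shape as the failing line in the traceback.
    targetlistevents[targetlistid]['received'].set()

def set_received_tolerant(targetlistid):
    # Defensive variant: report a missing event instead of raising.
    event = targetlistevents.get(targetlistid, {}).get('received')
    if event is None:
        return False
    event.set()
    return True

try:
    set_received_strict(5077)
except KeyError as err:
    missing_key = err.args[0]
    print("KeyError:", repr(missing_key))
```

Whether silently tolerating the missing event would be appropriate inside Leginon is a separate question; the sketch only shows which lookup failed.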

Nodes below all have a stationary green circle.

"Waiting" nodes (with last logged output) are:

Square

6:53.08 pm processing: Waiting for 9472 to be processed

6:53.08 pm output: Image displayed

6:53.08 pm output: Displaying image

6:53.08 pm output: Stats published...

6:53.08 pm output: Publishing stats...

6:53.07 pm output: Image published

Hole Targeting

6:53.12 pm Waiting for target list ID 5077...

6:53.12 pm Publishing targets...

6:53.12 pm Focus Targets: 1

6:53.11 pm Acquisition Targets: 5

6:53.11 pm Holes with good ice: 5

6:53.11 pm limit thickness

Hole

6:53.58 pm processing: Waiting for 9474 to be processed

6:53.58 pm output: Image displayed

6:53.58 pm output: Displaying image

6:53.58 pm output: Stats published...

6:53.58 pm output: Publishing stats...

6:53.58 pm output: Image published

6:53.57 pm output: Publishing image...

Exposure Targeting

6:53.59 pm Waiting for target list ID 5395...

6:53.59 pm Publishing targets...

6:53.59 pm Focus Targets: 0

6:53.59 pm Acquisition Targets: 0

6:53.59 pm apply template

6:53.59 pm Holes with good ice: 0

.

.

.

6:52.52 pm Target ID 2613 has been processed.

6:42.16 pm Waiting for target list ID 2613...

6:42.16 pm Publishing targets...

.

.

.

On previous "round" Exposure Targeting node did successfully get some

acquisition targets and a focus target.

"Processing" node is:

Focus

6:47:43 pm Continuing...

6:47:43 pm Focus done with target list ID: (('bmsd027',49152),2627), status: success

6:47:43 pm processing: Processing complete

6:47:43 pm output: Image displayed

6:47:43 pm output: Displaying image

.

.

.

According to information in Square Targeting node (not waiting or processing),

2 of 9 targets are done. This means that one square was done on this run (as

the other square was done as a 'sacrificial square' previously while setting

up - and which may have been 'cancelled' by using killtargets.py, I think),

William

Re: Waiting for targets to process problem - Added by Anonymous about 18 years ago

The problem seems to occur whenever the Exposure Targeting node fails to find any acquisition or focus targets and then presumably submits an empty target list. (On normal runs of Leginon this can happen if there are difficulties with detecting a hole again in Exposure Targeting, possibly because of thick ice, or if the hole is poorly centred, especially if a large-ish border is used to ensure that the hole close to the centre is used, possibly due to the alignment between presets getting worse over the course of a run. In a long run, there will always be at least one hl preset image with this problem that the Exposure Targeting node has to process, even if the grid is an easy one to deal with. It certainly happens with negative stain grids, where the holes are easy to detect.) I can reliably reproduce the node deadlock problem by using an artificially high value for Threshold in the Exposure Targeting settings (e.g. 100).

I'm wondering if changing the settings for "wait for a node to process the image" and "publish and wait for rejected targets" in the different nodes could help. However, I have checked that I have the recommended settings for the different nodes in table 9.3 of the manual, and I'm not sure exactly what kind of effect changing either "wait for a node to process the image" or "publish and wait for rejected targets" in a node is going to have,

William

- Added by Anchi Cheng about 18 years ago

We often submit empty target lists in target finder nodes here and it has never been a problem. It is puzzling that you can reproduce the problem every time. When we had the hanging problem previously, it was normally associated with randomly aborting a target list that was being processed.

The settings "Wait for a node to process image" and "Publish and wait for rejected targets" on an Acquisition node such as "Grid", "Square", "Hole" or "Exposure" determine the workflow of the node relative to the others. It is crucial to have them assigned specifically for each node, because the nodes need to behave differently from each other according to their position in the workflow. Changing these to other values will only cause more hanging or incorrect behavior.

"Wait for a node to process image" needs to be active when the node is in the middle of a depth-first traversal of the tree structure (see the figure in MSI Queuing option). The node receives multiple targets but acquires an image of the first target, then notifies the target finder node following it to select targets on that image for acquisition at the next level. If, say, the node "Hole" does not wait for the "Exposure Targeting" node to process image 1, it will attempt to process the next target (No. 2) in its waiting list. Since the "Exposure Targeting" node does not know this, it will select targets on image 1 and send the target list to "Exposure". The "Exposure" node has no way to know that "Hole" is acquiring images out of order, so it will also try to acquire images from the target list generated from image 1.

As you can see, this can be a big mess. On the other hand, if you turn on "Wait for a node to process image" at a position where the image is not used by the next step in the workflow, you will be stuck at that node forever. This is because the Acquisition node class broadcasts that it has just acquired an image and will not continue to its next step, whether that is to process the next target or to send back to the target finder node the message that it has completed the target list, UNTIL it receives a message that says the particular image has been processed. Since no node processes the image, the message will never come.
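The waiting behaviour described here can be sketched with a plain threading.Event. This is a generic illustration, not Leginon code: the function and variable names are hypothetical, and the timeout exists only so that the example terminates, whereas the node as described waits with no timeout at all.

```python
import threading

def acquire_then_wait(processed_event, timeout_s):
    """Publish an image (omitted here), then block until it is reported processed."""
    return processed_event.wait(timeout_s)

# Case 1: a downstream node is bound to the image and reports back shortly after.
bound = threading.Event()
threading.Timer(0.05, bound.set).start()
continued_when_bound = acquire_then_wait(bound, 2.0)

# Case 2: no node processes the image, so nothing ever sets the event.
unbound = threading.Event()
continued_when_unbound = acquire_then_wait(unbound, 0.2)

print(continued_when_bound, continued_when_unbound)
```

In the second case the wait only returns because of the artificial timeout; without it the caller would block indefinitely, which matches the stuck node described above.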

"Publish and wait for rejected targets" allows the focus targets to be bundled with the acquisition targets, routed through the corresponding acquisition node, and processed before the acquisition takes place. You have to follow the recommended settings or Leginon will not do any autofocusing. In the MSI applications, the target finders send a target list that contains both focus and acquisition targets to, say, the "Exposure" node. With "Publish and wait for rejected targets" enabled in "Exposure", the node divides the target list by type and, since the acquisition class that "Exposure" is derived from only processes acquisition targets, the focus targets are published and processed first by the "Focus" node that receives the published focus target list. The "Exposure" node just waits there until the "Focus" node declares that it is done with the focus target list.

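The division of a mixed target list described for "Publish and wait for rejected targets" can be sketched as follows. This is a simplified model under assumed data shapes (targets as dictionaries with a "type" field, nodes as plain callables); none of these names come from Leginon itself.

```python
def split_by_type(target_list):
    """Separate a mixed target list into focus and acquisition targets."""
    focus = [t for t in target_list if t["type"] == "focus"]
    acquisition = [t for t in target_list if t["type"] == "acquisition"]
    return focus, acquisition

def process_target_list(target_list, focus_node, acquire):
    focus, acquisition = split_by_type(target_list)
    if focus:
        # Hand the focus targets on and wait: this call returns only
        # when the (hypothetical) focus node is done with its list.
        focus_node(focus)
    for target in acquisition:
        acquire(target)

order = []
targets = [
    {"type": "acquisition", "name": "a1"},
    {"type": "focus", "name": "f1"},
    {"type": "acquisition", "name": "a2"},
]
process_target_list(
    targets,
    focus_node=lambda ts: order.extend("focus:" + t["name"] for t in ts),
    acquire=lambda t: order.append("acquire:" + t["name"]),
)
print(order)
```

The recorded order shows the focus target handled before either acquisition target, regardless of where it appeared in the submitted list.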
What is important to remember is that these settings are specific to the application and the workflow, because they match the event bindings behind the application you run. If you are curious, there is a figure showing the event bindings of MSI in the chapter on Create and Edit Applications in the full manual.

Jim and I discussed the error message you reported in your last posting and thought it might have come from losing network communication during information transfer. Does the same message show up every time you get this problem? We have never gotten the same error as you.

Anchi

- Added by Anonymous about 18 years ago

I wasn't able to reproduce the problem again although it is possible that this is because of changes made in the Leginon client for a bug fix relating to a misspelt class name (described in Bugzilla) which I previously only made on the main Leginon system.

The error message about line 168 etc started to happen subsequent to the upgrade from SUSE 9.3 to SUSE 10.0. I got the error message described earlier in this thread before the upgrade. It reverted to the old apparently less informative error message with the bug fix for the misspelt class name in the main Leginon. I haven't observed the problem since putting the bug fix in the Leginon client but as I made the change recently and have not run Leginon many times since, I can't be sure that the problem has gone away yet,

William

Deadlock problem still in Leginon 1.3 - Added by Anonymous over 17 years ago

By the way, I recently tried running a fully automated run (without queuing) with Leginon using version 1.3 and got the deadlock problem again. The error message on subsequently exiting Leginon was:


Exception happened during processing of request from ('129.11.140.56', 58537)

Traceback (most recent call last):

File "/usr/lib/python2.4/SocketServer.py", line 463, in process_request_threadException in thread Thread-25 (most likely raised during interpreter shutdown):

Traceback (most recent call last):

File "/usr/lib/python2.4/threading.py", line 442, in __bootstrap

File "/usr/lib/python2.4/threading.py", line 422, in run

File "/usr/lib/python2.4/SocketServer.py", line 466, in process_request_thread File "/usr/lib/python2.4/SocketServer.py", line 270, in handle_error

File "/usr/lib/python2.4/traceback.py", line 212, in print_exc

File "/usr/lib/python2.4/traceback.py", line 125, in print_exception

File "/usr/lib/python2.4/traceback.py", line 69, in print_tb

File "/usr/lib/python2.4/linecache.py", line 14, in getline

File "/usr/lib/python2.4/linecache.py", line 40, in getlines

File "/usr/lib/python2.4/linecache.py", line 107, in updatecache

exceptions.TypeError: object does not support item assignment

/home/bmswvn>

I'm not aware if the previous suggestion to use killtargets.py to start from scratch works or not (as I have not tried this workaround in Leginon 1.3 yet). I actually tried to do something else: using the abort buttons in some of the nodes, because I wanted to keep the targets I had chosen in the atlas in Square Targeting. This did not work and I subsequently got deadlocking with a different error message:

Exception in thread data binder handler thread:

Traceback (most recent call last):

File "/usr/lib/python2.4/threading.py", line 442, in __bootstrap

self.run()

File "/usr/lib/python2.4/threading.py", line 422, in run

self.__target(*self.__args, **self.__kwargs)

File "/usr/lib/python2.4/site-packages/Leginon/databinder.py", line 131, in handleData

method(args)

File "/usr/lib/python2.4/site-packages/Leginon/targethandler.py", line 282, in handleTargetListDone

self.targetlistevents[targetlistid]['received'].set()

KeyError: 'received'

I suppose the first error message is not all that helpful as the traceback does not appear to go back into Leginon code for some reason,

William

- Added by Anonymous over 17 years ago

I've confirmed that the deadlock problem still affects us fairly consistently in Leeds for fully automated non-queuing runs of Leginon, in version 1.3. It appears that the deadlock happens more consistently if a browser (possibly with a busy JavaScript/AJAX application like Yahoo's new Web 2.0 style mail, which is what seemed to give me more problems anyway) or other busy application is running at the same time on the laptop I use to run Leginon (but not necessarily so that it uses a significant percentage of CPU as observed by the "top" program). Also, having an empty target list generated by either the Hole Targeting or Exposure Targeting node at some point during the course of a run seems to be required for deadlock to occur. (Runs in MSI raster, which often do not have empty target lists, generally do not deadlock, for example.) So maybe this gives some kind of indication of what would have to be done to reproduce the problem at Scripps. Unfortunately, quitting other applications like the browser does not reliably avoid deadlock though. I'm also wondering if it really is a true deadlock problem, with threads/nodes holding resources that the other thread/node requires to proceed, or if one of the threads/nodes dies or times out waiting for a resource from the operating system (e.g. fetching an image over the network) and then the other nodes wait for required stuff to happen in the failed node/thread,

William

- Added by Anonymous almost 17 years ago

I probably should mention that I think I have Leginon 1.4.1 working on the new desktop PC and also it appears to avoid the problem with deadlock for non-queuing automated runs that I had with Leginon 1.3 (and some earlier versions) in the past on the laptop. I do not know if I'm avoiding deadlock due to using the newer PC (faster and with other differences) or the newer version of Leginon. I still have to use some features in the newer version of Leginon (like doing calibrations) on the new PC before I'm confident that everything is working and I'm ready to upgrade the version of Leginon on the laptop,

William
