Leginon overnight crash
Added by Christopher Lilienthal almost 5 years ago
We encountered an error while a user was collecting data using Leginon overnight and I was hoping someone had some insight into the issue.
Leginon stopped working on 11/5/2019 at 11:59:53 p.m. during session 19nov05g. The microscope was operating in automatic data collection mode and should have run overnight; however, something caused data collection to stop. Leginon was stuck in Exposure Targeting mode (green arrow circling) with the following traceback in the terminal.
Traceback (most recent call last):
  File "/usr/lib64/python2.7/threading.py", line 812, in __bootstrap_inner
    self.run()
  File "/usr/lib64/python2.7/threading.py", line 765, in run
    self.__target(*self.__args, **self.__kwargs)
  File "/lsi/cryoem_automation/myami/trunk/lib/python2.7/site-packages/leginon/databinder.py", line 131, in handleData
    method(args)
  File "/lsi/cryoem_automation/myami/trunk/lib/python2.7/site-packages/leginon/watcher.py", line 35, in handleEvent
    self.processEvent(pubevent)
  File "/lsi/cryoem_automation/myami/trunk/lib/python2.7/site-packages/leginon/watcher.py", line 43, in processEvent
    self.processData(newdata)
  File "/lsi/cryoem_automation/myami/trunk/lib/python2.7/site-packages/leginon/imagewatcher.py", line 47, in processData
    self.processImageData(idata)
  File "/lsi/cryoem_automation/myami/trunk/lib/python2.7/site-packages/leginon/targetfinder.py", line 458, in processImageData
    self.setImageTiltAxis(imagedata)
  File "/lsi/cryoem_automation/myami/trunk/lib/python2.7/site-packages/leginon/targetfinder.py", line 561, in setImageTiltAxis
    tem = imagedata['scope']['tem']
  File "/lsi/cryoem_automation/myami/trunk/lib/python2.7/site-packages/sinedon/data.py", line 517, in __getitem__
    return self.special_getitem(key, dereference=True)
  File "/lsi/cryoem_automation/myami/trunk/lib/python2.7/site-packages/sinedon/data.py", line 502, in special_getitem
    value = value.getData(**kwargs)
  File "/lsi/cryoem_automation/myami/trunk/lib/python2.7/site-packages/sinedon/data.py", line 254, in getData
    referent = datamanager.getData(self, **kwargs)
  File "/lsi/cryoem_automation/myami/trunk/lib/python2.7/site-packages/sinedon/data.py", line 159, in getData
    raise DataAccessError('referenced data can not be found: %s' % (datareference,))
DataAccessError: referenced data can not be found: DataReference[class: ScopeEMData, dmid: (('my_linux_host', 49152), 169321), dbid: 882333, referent: weak]
I have replaced the actual Linux hostname with the placeholder my_linux_host; this is the host where the user ran start_leginon.py. I reached out to Anchi, and she suggested that I make this a forum post but provided the following information:
"We have seen this on some newer installations where local IT either uses a newer MySQL server or restricts the connection time allowed for a single connection.
Reference data is kept as a weak reference, and its related query only executes when the information within is needed. Therefore, if something needs to be held for a really long time, you can get this error if the original connection was closed.
For these people and for future compatibility, I started a myami-pymysql branch. pymysql allows automatic reconnection and solved this problem. This branch will be updated alongside myami-beta. I may switch the whole installation over once I solve the installation issue of the module on Windows.
Therefore, if this becomes an issue that you cannot resolve by increasing the connection time allowed, this is an option for you."
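To make the mechanism Anchi describes more concrete, here is a minimal sketch of a lazily resolved reference whose query fails once the original connection has been dropped, and of how reconnecting before retrying avoids the failure. This is not the sinedon or myami-pymysql code; ToyConnection, LazyReference, StaleConnectionError, and the fake clock are all hypothetical names invented for illustration, and the idle limit of 28800 seconds is only borrowed from the default wait_timeout.

```python
import time


class StaleConnectionError(Exception):
    """Raised by the toy connection once it has been idle for too long."""


class ToyConnection:
    """Hypothetical stand-in for a DB connection that the server drops
    after idle_limit seconds without a query."""

    def __init__(self, rows, idle_limit, clock=time.monotonic):
        self.rows = rows
        self.idle_limit = idle_limit
        self.clock = clock
        self.last_used = clock()

    def query(self, dbid):
        now = self.clock()
        if now - self.last_used > self.idle_limit:
            raise StaleConnectionError("server closed the idle connection")
        self.last_used = now
        return self.rows[dbid]


class LazyReference:
    """Hypothetical weak-style reference: it stores only the row id and
    runs the query when the data is first needed."""

    def __init__(self, dbid, connect):
        self.dbid = dbid
        self.connect = connect
        self.conn = connect()

    def get(self):
        try:
            return self.conn.query(self.dbid)
        except StaleConnectionError:
            # Reconnect once and retry, instead of surfacing the error.
            self.conn = self.connect()
            return self.conn.query(self.dbid)


# A fake clock lets the example simulate hours of idle time instantly.
now = [0.0]


def fake_clock():
    return now[0]


ROWS = {882333: {"tem": "hypothetical TEM record"}}


def connect():
    return ToyConnection(ROWS, idle_limit=28800, clock=fake_clock)


plain = connect()                      # connection with no reconnect logic
ref = LazyReference(882333, connect)   # reference that reconnects on failure
now[0] += 30000                        # idle for longer than idle_limit

try:
    plain.query(882333)
    plain_failed = False
except StaleConnectionError:
    plain_failed = True

value = ref.get()
```

In this toy model, the plain connection fails after the idle period while the reference that reconnects still returns its row. As far as I know, pymysql exposes a comparable idea through Connection.ping(reconnect=True), but how the myami-pymysql branch actually uses it is not shown here.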
With that information I went and interrogated my database server.
The database server in question is running Red Hat Enterprise Linux 7 and has been the database server since October of 2017. The database being used is MariaDB v5.5.64.
According to the documentation I created when I set up and configured this database server, I did not alter the default timeout values of the MariaDB server. Just to be sure, I connected to the database and ran a query to find the values of all the timeout variables:
MariaDB [(none)]> SHOW VARIABLES LIKE '%timeout%';
+----------------------------+----------+
| Variable_name              | Value    |
+----------------------------+----------+
| connect_timeout            | 10       |
| deadlock_timeout_long      | 50000000 |
| deadlock_timeout_short     | 10000    |
| delayed_insert_timeout     | 300      |
| innodb_lock_wait_timeout   | 50       |
| innodb_rollback_on_timeout | OFF      |
| interactive_timeout        | 28800    |
| lock_wait_timeout          | 31536000 |
| net_read_timeout           | 30       |
| net_write_timeout          | 60       |
| slave_net_timeout          | 3600     |
| thread_pool_idle_timeout   | 60       |
| wait_timeout               | 28800    |
+----------------------------+----------+
13 rows in set (0.01 sec)
I verified that these values are in fact the default values.
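For scale, the arithmetic below converts the wait_timeout value to hours and works out the latest moment a connection could have last been used if an idle timeout were what closed it before the stop time. This is only a sketch under the assumption that wait_timeout is the relevant limit; the constant names are mine, and the result is not a diagnosis.

```python
from datetime import datetime, timedelta

# Values taken from this post; the names themselves are illustrative.
WAIT_TIMEOUT_S = 28800                          # wait_timeout, in seconds
STOP_TIME = datetime(2019, 11, 5, 23, 59, 53)   # when Leginon stopped

# 28800 s / 3600 s per hour = 8 hours
wait_timeout_hours = WAIT_TIMEOUT_S / 3600.0

# A connection idle-closed by the stop time must have been unused since at
# least this moment (assuming wait_timeout is the limit that applies).
latest_last_use = STOP_TIME - timedelta(seconds=WAIT_TIMEOUT_S)

print(wait_timeout_hours)   # 8.0
print(latest_last_use)      # 2019-11-05 15:59:53
```

So under that assumption, a connection would have had to sit unused from about 3:59 p.m. onward to be closed by the time of the crash.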
Has anyone encountered this sort of error, or can anyone point me in the right direction?
Thank you,
Chris