Bug #3025

open

MySQL server has gone away

Added by Melody Campbell almost 10 years ago. Updated over 9 years ago.

Status: In Test
Priority: High
Category: -
Target version: -
Start date: 03/22/2015
Due date:
% Done: 0%
Estimated time:
Affected Version: Appion/Leginon 3.1.0
Show in known bugs: No
Workaround:

Description

Hi,
I think we might have had an issue with this earlier, but I can't seem to find it. My CL2D jobs (both xmipp2 and xmipp3) won't upload to Appion when I run them from garibaldi. I get the following error:

Traceback (most recent call last):
File "/opt/applications/myami/trunk/bin/runXmipp3CL2D.py", line 634, in <module>
cl2d.start()
File "/opt/applications/myami/trunk/bin/runXmipp3CL2D.py", line 605, in start
self.apix = apStack.getStackPixelSizeFromStackId(self.runparams['stackid'])*self.runparams['bin']
File "/opt/applications/myami/trunk/lib/python2.6/site-packages/appionlib/apStack.py", line 560, in getStackPixelSizeFromStackId
stackdata = getOnlyStackData(stackId, msg=msg)
File "/opt/applications/myami/trunk/lib/python2.6/site-packages/appionlib/apStack.py", line 144, in getOnlyStackData
stackdata = appiondata.ApStackData.direct_query(stackid)
File "/opt/applications/myami/trunk/lib/python2.6/site-packages/sinedon/data.py", line 428, in direct_query
result = db.direct_query(cls, dbid, **kwargs)
File "/opt/applications/myami/trunk/lib/python2.6/site-packages/sinedon/dbdatakeeper.py", line 59, in direct_query
result = self.dbd.multipleQueries(queryinfo, readimages=readimages)
File "/opt/applications/myami/trunk/lib/python2.6/site-packages/sinedon/sqldict.py", line 246, in multipleQueries
return multipleQueries(self.db, queryinfo, readimages)
File "/opt/applications/myami/trunk/lib/python2.6/site-packages/sinedon/sqldict.py", line 520, in __init__
self.execute()
File "/opt/applications/myami/trunk/lib/python2.6/site-packages/sinedon/sqldict.py", line 533, in execute
c = self._cursor()
File "/opt/applications/myami/trunk/lib/python2.6/site-packages/sinedon/sqldict.py", line 523, in _cursor
self.db.ping()
_mysql_exceptions.OperationalError: (2006, 'MySQL server has gone away')
Exception _mysql_exceptions.OperationalError: (2006, 'MySQL server has gone away') in <bound method CL2D.__del__ of <__main__.CL2D object at 0x1b0c210>> ignored

The directory is here: /gpfs/group/em/appion/15mar13a/align/cl2d3-why

Dmitry had written a script to upload the results to Appion even if the job crashed in this way, but it doesn't seem to work for me. Perhaps this is related to #2414 and has to do with the server being reset?

Thanks!
Melody
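
The traceback above ends with self.db.ping() raising error 2006, which is what a client sees when it reuses a connection the server has already closed. As a rough illustration only, and not the sinedon code, a caller could retry once the connection has been reopened. In the sketch below, run_with_reconnect, is_gone_away, and the operation/reconnect callables are hypothetical names. It also assumes the error carries the code as its first argument, as in the (2006, '...') tuple shown in the traceback.

```python
import time


def is_gone_away(exc):
    """True for errors shaped like OperationalError(2006, '...')."""
    return bool(exc.args) and exc.args[0] == 2006


def run_with_reconnect(operation, reconnect, is_connection_lost,
                       retries=3, delay=0.0):
    """Call operation(); on a lost-connection error, reconnect and retry.

    operation and reconnect are caller-supplied callables (hypothetical
    stand-ins for a query and for reopening the database connection).
    """
    attempt = 0
    while True:
        try:
            return operation()
        except Exception as exc:
            # Re-raise unrelated errors, or give up once retries are spent.
            if not is_connection_lost(exc) or attempt >= retries:
                raise
            attempt += 1
            time.sleep(delay)
            reconnect()
```

A retry like this only helps when the connection was dropped for being idle. If the server closes the connection because a packet exceeds max_allowed_packet, the same query will fail again after reconnecting.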

Actions #1

Updated by Sargis Dallakyan almost 10 years ago

  • Status changed from New to In Test
  • Assignee changed from Sargis Dallakyan to Melody Campbell

Hi Melody,

I've added the following lines under [mysqld] in fishtail:/etc/my.cnf:

interactive_timeout = 864000
wait_timeout = 864000
max_allowed_packet = 64M

I've restarted mysqld on fishtail and hope that fixes this issue. I searched for 'MySQL server has gone away' here, and the only other reported case is indeed #2414. I added max_allowed_packet = 64M following http://stackoverflow.com/questions/12425287/mysql-server-has-gone-away-when-importing-large-sql-file
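
To make the units in those three settings concrete, here is a small sketch that converts them to days and bytes. The helper names parse_size and summarize are made up for illustration, and it assumes the K/M/G suffixes in my.cnf are powers of 1024. For comparison, MySQL's stock wait_timeout is 28800 seconds (8 hours).

```python
def parse_size(value):
    """Convert a my.cnf size such as '64M' to bytes (assumes 1024-based K/M/G)."""
    units = {"K": 1024, "M": 1024 ** 2, "G": 1024 ** 3}
    value = value.strip()
    suffix = value[-1].upper()
    if suffix in units:
        return int(value[:-1]) * units[suffix]
    return int(value)


def summarize(settings_text):
    """Parse 'name = value' lines; return timeouts in days, packet size in bytes."""
    settings = {}
    for line in settings_text.splitlines():
        if "=" in line:
            name, value = line.split("=", 1)
            settings[name.strip()] = value.strip()
    return {
        "interactive_timeout_days": int(settings["interactive_timeout"]) / 86400.0,
        "wait_timeout_days": int(settings["wait_timeout"]) / 86400.0,
        "max_allowed_packet_bytes": parse_size(settings["max_allowed_packet"]),
    }


# The three lines added under [mysqld], copied from the comment above.
SETTINGS = """
interactive_timeout = 864000
wait_timeout = 864000
max_allowed_packet = 64M
"""

print(summarize(SETTINGS))
```

With these values both timeouts come out to 10 days (864000 / 86400) and the packet limit to 67108864 bytes, so an idle connection should survive a long-running job before the upload step.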

Actions #2

Updated by Melody Campbell over 9 years ago

Hi Sargis,

I now get this error:
!!! WARNING: could not create stack average, average.mrc
... Inserting CL2D Run into DB
!!! WARNING: could not find average mrc file: /gpfs/group/em/appion/15mar13a/align/cl2d3-newsargisedit/average.mrc

lines= ['\tlibmpi.so.0 => /opt/applications/openmpi/1.4.3/gnu/lib/libmpi.so.0 (0x00002abe5a18b000)\n', '\tlibmpi_cxx.so.0 => /opt/applications/openmpi/1.4.3/gnu/lib/libmpi_cxx.so.0 (0x00002abe5a574000)\n', '\tlibopen-rte.so.0 => /opt/applications/openmpi/1.4.3/gnu/lib/libopen-rte.so.0 (0x00002abe5a78f000)\n', '\tlibopen-pal.so.0 => /opt/applications/openmpi/1.4.3/gnu/lib/libopen-pal.so.0 (0x00002abe5aa1d000)\n']
/gpfs/group/em/appion/15mar13a/align/cl2d3-newsargisedit/alignedStack.hed
Traceback (most recent call last):
File "/opt/applications/myami/trunk/bin/runXmippCL2D.py", line 629, in <module>
cl2d.start()
File "/opt/applications/myami/trunk/bin/runXmippCL2D.py", line 610, in start
self.insertAlignStackRunIntoDatabase("alignedStack.hed")
File "/opt/applications/myami/trunk/bin/runXmippCL2D.py", line 392, in insertAlignStackRunIntoDatabase
apDisplay.printError("could not find reference stack file: "+refstackfile)
File "/opt/applications/myami/trunk/lib/python2.6/site-packages/appionlib/apDisplay.py", line 65, in printError
raise Exception, colorString("\n * FATAL ERROR *\n"+text+"\n\a","red")
Exception: * FATAL ERROR *
could not find reference stack file: /gpfs/group/em/appion/15mar13a/align/cl2d3-newsargisedit/part15mar23o07_level_-1_.hed

which I have also documented here: #2303

Any suggestions?
Thanks!
Melody

Actions #3

Updated by Melody Campbell over 9 years ago

Also, I can't change the status from "In Test".

Actions #4

Updated by Sargis Dallakyan over 9 years ago

Hi Melody,

I've added you to the Appion project members list. Hopefully, this will fix the issues you have been having with Redmine lately.

Melody Campbell wrote:

I now get this error:
!!! WARNING: could not create stack average, average.mrc
... Inserting CL2D Run into DB
!!! WARNING: could not find average mrc file: /gpfs/group/em/appion/15mar13a/align/cl2d3-newsargisedit/average.mrc

At first, I thought there might be a permission issue on /gpfs/group/em/appion/15mar13a/align/cl2d3-newsargisedit, but after reading #2303 I'm no longer sure. Please run the following commands and paste the output here so we can see if there are more clues:

ls /gpfs/group/em/appion/15mar13a/align/cl2d3-newsargisedit
more /gpfs/group/em/appion/15mar13a/align/cl2d3-newsargisedit/*.o*

Thanks,
Sargis

Actions #5

Updated by Melody Campbell over 9 years ago

Hi Sargis, I ran the two commands, this is what I get:

home/melody> ls /gpfs/group/em/appion/15mar13a/align/cl2d3-newsargisedit
alignedStack.hed cl2d3-newsargisedit.appionsub.job.e7418509 runXmippCL2D.log xmipp.std
alignedStack.img cl2d3-newsargisedit.appionsub.job.o7418509 thread000.log
cl2d-15mar23o07-params.pickle cl2d3-newsargisedit.appionsub.log thread001.log
cl2d3-newsargisedit.appionsub.job run_commands.log xmipp.log

more /gpfs/group/em/appion/15mar13a/align/cl2d3-newsargisedit/*.o*
the local scratch directory for this session is: /scratch/melody/7418509.garibaldi01-adm.cluster.net
the distributed scratch directory for this session is: /gpfs/work/melody/7418509.garibaldi01-adm.cluster.net
Warning: no access to tty (Bad file descriptor).
Thus no job control in this shell.

Actions #6

Updated by Sargis Dallakyan over 9 years ago

Hi Melody,

I've looked at /gpfs/group/em/appion/15mar13a/align/cl2d3-newsargisedit/xmipp.std and see lots of messages like this one:

/opt/applications/xmipp/2.4/gnu/bin/xmipp_mpi_class_averages: symbol lookup error: /opt/applications/xmipp/2.4/gnu/bin/xmipp_mpi_class_averages: undefined symbol: _ZN7SelFileC1Ev

This probably means that /opt/applications/xmipp/2.4/gnu/bin/xmipp_mpi_class_averages was built with a different MPI than the one loaded in your environment at runtime. Ask JC whether he knows which MPI was used to build xmipp_mpi_class_averages. runXmippCL2D.py removes its temporary files, so it would be difficult to make a test case to troubleshoot this. Try using Xmipp 3 instead to see if it works. Dmitry committed a bugfix for Xmipp3 CL2D yesterday (#3027).

Ask JC to update /opt/applications/myami/trunk first before testing Xmipp3 CL2D.
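
One way to see which libraries the binary actually picks up at runtime is to parse ldd-style output such as the lines= list pasted earlier in this thread. The sketch below is only an illustration: resolved_libraries and libraries_outside are hypothetical helper names, and the regular expression assumes the usual "name => path (0xaddress)" layout, so "not found" entries are skipped. As a side note, the undefined symbol demangles (for example with c++filt) to SelFile::SelFile(), so it may also be worth checking which Xmipp libraries are resolved, not only the MPI ones.

```python
import re

# Matches lines such as "\tlibmpi.so.0 => /path/libmpi.so.0 (0x00002abe...)\n".
LDD_LINE = re.compile(r"^\s*(\S+)\s+=>\s+(\S+)\s+\(0x[0-9a-fA-F]+\)\s*$")


def resolved_libraries(ldd_lines):
    """Map each shared-library name to the path the loader resolved it to."""
    resolved = {}
    for line in ldd_lines:
        match = LDD_LINE.match(line)
        if match:
            resolved[match.group(1)] = match.group(2)
    return resolved


def libraries_outside(resolved, prefix):
    """Return the library names whose resolved path does not start with prefix."""
    return sorted(name for name, path in resolved.items()
                  if not path.startswith(prefix))


# Two of the lines from the job output quoted earlier in this thread.
lines = [
    "\tlibmpi.so.0 => /opt/applications/openmpi/1.4.3/gnu/lib/libmpi.so.0 (0x00002abe5a18b000)\n",
    "\tlibopen-rte.so.0 => /opt/applications/openmpi/1.4.3/gnu/lib/libopen-rte.so.0 (0x00002abe5a78f000)\n",
]
resolved = resolved_libraries(lines)
print(libraries_outside(resolved, "/opt/applications/openmpi/1.4.3/"))
```

An empty list here means every parsed library resolves under the given prefix. Anything listed points at a library coming from somewhere else, which would be the place to start looking for a build/runtime mismatch.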
