Feature #2744 (Closed)

Change the default memory for CL2D jobs submitted to garibaldi

Added by Melody Campbell about 10 years ago. Updated almost 10 years ago.

Status: Closed
Priority: Normal
Assignee: Amber Herold
Category: Web interface
Target version: -
Start date: 04/25/2014
Due date: -
% Done: 0%
Estimated time: -

Description

Hi,

Whenever a cl2d job is submitted to garibaldi, the default memory request is 2gb. This always causes the run to fail, as 2gb is not adequate for the alignments. What we have been doing is simply editing the job file to request the amount of memory we want (each garibaldi node has 47gb of memory, so for a run on n nodes we usually request 46*n gb). With multiple new users in the lab, it would be great if this could be the default; it's just one more thing for them to troubleshoot when their job fails, and modifying job files is rather overwhelming for a first-time user.

Alternatively, it would also be great if we could specify the memory needed alongside the number of processors in the user interface.

Thanks,
Melody
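
As a concrete illustration of the sizing rule described above, here is a minimal sketch (the function name is hypothetical; only the 47gb-per-node and 46*n figures come from the report):

    # Each garibaldi node has 47gb of memory, so a run spanning n nodes
    # requests 46*n gb instead of the queue's 2gb default.
    def cl2d_mem_request_gb(nodes):
        return 46 * nodes

    print(cl2d_mem_request_gb(4))  # 184 gb for a 4-node run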


Files

runXmipp3CL2D.py_Launcher.png (12.1 KB), Sargis Dallakyan, 06/13/2014 02:36 PM

Related issues: 1 (0 open, 1 closed)

Related to Appion - Bug #2880: cl2d jobs won't run (Closed; Anchi Cheng, 08/05/2014)

#1

Updated by Sargis Dallakyan about 10 years ago

  • Status changed from New to Assigned
  • Affected Version changed from Appion/Leginon 2.1.0 to Appion/Leginon 3.0.0

I've looked to see where the default memory of 2gb is coming from. It is set in the myami queue on garibaldi:

garibaldi00 align/cl2d7> qstat -f -Q myami
Queue: myami
    queue_type = Execution
    total_jobs = 2
    state_count = Transit:0 Queued:0 Held:0 Waiting:0 Running:2 Exiting:0 
    resources_max.cput = 200000:00:00
    resources_max.mem = 9588gb
    resources_max.ncpus = 1760
    resources_max.nodect = 173
    resources_max.walltime = 900:00:00
    resources_default.cput = 01:00:00
    resources_default.mem = 2gb <=====
    resources_default.ncpus = 1
    resources_default.nodect = 1
    resources_default.nodes = 1
    resources_default.walltime = 00:00:12
    mtime = 1385248723
    resources_assigned.mem = 45097156608b
    resources_assigned.ncpus = 2
    resources_assigned.nodect = 3
    max_user_run = 200
    keep_completed = 60
    enabled = True

I found the 'memorymax' => '47' line in myamiweb/config.php for the garibaldi host. I'll read through the code to see how to use this to set the default for the requested memory.
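
A minimal sketch of what that could look like (the structure and names below are assumptions; only the 'memorymax' => '47' value is from config.php): seed the default from the per-host memorymax instead of letting the queue's resources_default.mem of 2gb apply.

    # Hypothetical: derive the default memory request from the per-host
    # 'memorymax' entry in myamiweb/config.php (47 for garibaldi).
    PROCESSING_HOSTS = {"garibaldi": {"memorymax": 47}}

    def default_mem_gb(host, nodes=1):
        # Request memorymax gb for each allocated node.
        return PROCESSING_HOSTS[host]["memorymax"] * nodes

    print(default_mem_gb("garibaldi", 4))  # 188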

#2

Updated by Sargis Dallakyan almost 10 years ago

Added Processing Host Parameters options to the Xmipp 3 Clustering 2D Alignment page, as shown in the attached screenshot (runXmipp3CL2D.py_Launcher.png).

Users can now specify the amount of memory needed (in gb) and it defaults to 47gb for garibaldi.
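
For reference, a hedged sketch of the resource header the launcher would now write into the job file (the helper function is hypothetical; the #PBS -l directives are standard Torque/PBS syntax):

    # Hypothetical helper building the Torque/PBS resource request that
    # ends up in the generated job file once the memory field is set.
    def pbs_header(nodes, ppn, mem_gb):
        return ("#PBS -l nodes=%d:ppn=%d\n"
                "#PBS -l mem=%dgb" % (nodes, ppn, mem_gb))

    print(pbs_header(4, 8, 188))  # e.g. 47gb/node x 4 nodes = 188gb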

#3

Updated by Melody Campbell almost 10 years ago

Hi Sargis,

Thanks so much for coding this. It does generate the job file correctly, which is awesome. The only thing is I'm not sure whether the Xmipp 3 version of cl2d works properly on garibaldi; I have been using the Xmipp 2 cl2d. I have launched a job on garibaldi and we'll see if it runs or not (which will take a while; the garibaldi queue for big jobs might be a couple of days).

I'll keep you posted,
Melody

#4

Updated by Melody Campbell almost 10 years ago

Hi Sargis,

Emily and I just tested Xmipp 3. I tried to test it on garibaldi but it's not installed there. Emily tested it on guppy and it didn't upload properly; however, the PBS script seemed to work correctly. Here is Emily's directory where she tried to run it: /ami/data15/appion/14jun11c/align/cl2d1

That said, since Xmipp 2 has been running to completion lately, we would all be really happy if you could add the "Processing Host Parameters" Appion module for the Xmipp 2 version of cl2d, because then we could all start using it immediately on garibaldi.

Thanks so much, and if you have any questions I'm happy to clarify.

Cheers,
Melody

#5

Updated by David Veesler almost 10 years ago

Appion now lets us choose the queue we want to submit to and adapts the number of processors requested per node accordingly (e.g. 8 procs/node on garibaldi); see the sketch below.

However, it still defaults to 2gb/node for jobs submitted to garibaldi, which is an issue.
Ideally, we would like to be able to type in the requested memory.
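
A small sketch of the per-queue behavior described above (names are assumed; the 8 procs/node figure is from this note, and the resulting --nproc=32 for a 4-node run matches the job command quoted in note #8 below):

    # Hypothetical per-queue defaults applied by the interface on submit.
    QUEUE_PPN = {"myami": 8}  # garibaldi's myami queue: 8 procs/node

    def nproc_for(queue, nodes):
        return nodes * QUEUE_PPN[queue]

    print(nproc_for("myami", 4))  # 32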

#6

Updated by Amber Herold almost 10 years ago

  • Status changed from In Test to Assigned
  • Assignee changed from Melody Campbell to Amber Herold

I can add in the mem parameter today.
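
A minimal sketch of what adding that parameter could look like (the optparse registration below is an assumption; the --mem flag name itself appears in the job command quoted in note #8):

    # Hypothetical registration of the new flag in an Appion launcher
    # script; the value is passed through to the job file's memory request.
    from optparse import OptionParser

    parser = OptionParser()
    parser.add_option("--mem", dest="mem", type="int", default=2,
                      help="requested memory in gb", metavar="#")
    (options, args) = parser.parse_args(["--mem=180"])
    print(options.mem)  # 180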

#7

Updated by Amber Herold almost 10 years ago

  • Status changed from Assigned to In Test
  • Assignee changed from Amber Herold to David Veesler

David,
Go ahead and give this a try. It should be working on longboard today and everywhere else tomorrow.

#8

Updated by Melody Campbell almost 10 years ago

  • Assignee changed from David Veesler to Amber Herold

Hi,

So I'm pretty sure cl2d2 will not work at all anymore.

On longboard/beta I get this error after the job is submitted:

!!! WARNING: could not create stack average, average.mrc
... Inserting CL2D Run into DB

lines= ['\tlibmpi.so.1 => /usr/lib64/libmpi.so.1 (0x0000003c67000000)\n', '\tlibmpi_cxx.so.1 => /usr/lib64/libmpi_cxx.so.1 (0x00007f4a85cb7000)\n']
/ami/data00/appion/14jul31e/align/cl2d15/alignedStack.hed
Traceback (most recent call last):
  File "/opt/myamisnap/bin/runXmippCL2D.py", line 624, in <module>
    cl2d.start()
  File "/opt/myamisnap/bin/runXmippCL2D.py", line 605, in start
    self.insertAlignStackRunIntoDatabase("alignedStack.hed")
  File "/opt/myamisnap/bin/runXmippCL2D.py", line 386, in insertAlignStackRunIntoDatabase
    apDisplay.printError("could not find average mrc file: "+avgmrcfile)
  File "/opt/myamisnap/lib/appionlib/apDisplay.py", line 65, in printError
    raise Exception, colorString("\n * FATAL ERROR *\n"+text+"\n\a","red")
Exception: * FATAL ERROR *
could not find average mrc file: /ami/data00/appion/14jul31e/align/cl2d15/average.mrc

In this directory: /ami/data00/appion/14jul31e/align/cl2d15

And on cronus3/beta I get this error and it won't even submit:

ERROR in job submission. Check the cluster setup. Ensure the .appion.cfg configuration file is correct (http://emg.nysbc.org/projects/appion/wiki/Configure_appioncfg) Job type: partalign partalign ['runXmippCL2D.py', '--stack=116', '--lowpass=15', '--highpass=2000', '--num-part=1999', '--num-ref=20', '--bin=2', '--max-iter=15', '--nproc=32', '--fast', '--classical_multiref', '--correlation', '--commit', '--nodes=4', '--ppn=8', '--mem=180', '--walltime=240', '--cput=24000', '--queue=myami', '--description=data00', '--runname=cl2d2', '--rundir=/ami/data00/appion/14jul31e/align/cl2d2', '--projectid=414', '--expid=13758', '--jobtype=partalign', '--jobid=548']

ERROR in job submission. Check the cluster setup. Ensure the .appion.cfg configuration file is correct (http://emg.nysbc.org/projects/appion/wiki/Configure_appioncfg) Job type: partalign partalign ['/opt/myamisnap/bin/appion', 'runXmippCL2D.py', '--stack=116', '--lowpass=15', '--highpass=2000', '--num-part=1999', '--num-ref=20', '--bin=2', '--max-iter=15', '--nproc=14', '--fast', '--classical_multiref', '--correlation', '--commit', '--nodes=4', '--ppn=4', '--walltime=2', '--cput=200', '--description=test', '--runname=cl2d6', '--rundir=/ami/data15/appion/14jul31e/align/cl2d6', '--projectid=414', '--expid=13758', '--jobtype=partalign', '--jobid=551']

#9

Updated by Amber Herold almost 10 years ago

  • Related to Bug #2880: cl2d jobs won't run added
#10

Updated by Amber Herold almost 10 years ago

Looks like the Cronus3 issue was related to data00 access, and the other error, related to average.mrc, was reported by David before I added the processing host parameters, so let's continue discussion of that in #2880.

#11

Updated by Amber Herold almost 10 years ago

  • Status changed from In Test to Closed