Bug #4622: AverageStack fails with Stack is too large error when the computer has too much memory - Appion - Electron Microscopy Group

Added by Anchi Cheng almost 9 years ago. Updated over 7 years ago.

Status: Closed
Priority: Normal
Assignee: Neil Voss
Category:
Target version: Appion/Leginon 3.3
Start date: 11/28/2016
Due date:
% Done:
Estimated time:
Affected Version: Appion/Leginon 3.2
Show in known bugs:
Workaround:

Description

A local user on semc-cluster reported this error. It can be reproduced with a stack of 154635 particles with a box size of 160. The stack-averaging step fails because the memory needed for one chunk, which is sized from self.freememory, is larger than the hard-coded memorylimit value defined at the start of the apImagicFile module.

>>> from appionlib import apStack
>>> a = apStack.AverageStack(True)
>>> a.start('start.hed',None)

 ... processStack: Free memory: 206.4 GB
 ... processStack: Box size: 160
 ... processStack: Memory used per part: 100.0 kB
 ... processStack: Max particles in memory: 2164206
 ... processStack: Particles allowed in memory: 108210
 ... processStack: Number of particles in stack: 154635
 ... processStack: Particle loop num chunks: 2
 ... processStack: Particle loop step size: 77317
 ... processStack: partnum 1 to 77317 of 154635
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/acheng/myami/appion/appionlib/apImagicFile.py", line 974, in start
    stackdata = readImagic(stackfile, first=first, last=last, msg=False)
  File "/home/acheng/myami/appion/appionlib/apImagicFile.py", line 168, in readImagic
    %(numpart, apDisplay.bytes(partbytes*numpart)))
  File "/home/acheng/myami/appion/appionlib/apDisplay.py", line 65, in printError
    raise Exception, colorString("\n *** FATAL ERROR ***\n"+text+"\n\a","red")
Exception: 
 *** FATAL ERROR ***
Stack is too large to read 77317 particles, requesting 7.4 GB

Neil, how do you want this to be handled? Many cluster nodes have a large amount of memory, but it is shared by many processors. Unless the user always blocks everyone else from accessing a node, the shared memory is not really all available. I hope the rest of you weigh in as well, since setting the limit too small will slow things down.

Related issues 1 (0 open — 1 closed)

Updated by Anchi Cheng almost 9 years ago

Related to Bug #4651: Xmipp added

Updated by Anchi Cheng almost 9 years ago

Neil,

If you are doing anything about this, I am going to increase the limit to 16 GB as a workaround.

Updated by Neil Voss almost 9 years ago

I could take the available memory divided by the number of processors.
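
One way to read this suggestion, as a minimal sketch: give each process an even share of the free memory. The helper name `per_process_limit` is hypothetical, the core count in the example is an assumption, and the free-memory figure is passed in by the caller instead of being read through Appion's own memory module.

```python
import os


def per_process_limit(free_bytes, nproc=None):
    """Return an even per-processor share of free memory, in bytes.

    Hypothetical helper: nproc defaults to the number of CPUs the
    operating system reports, falling back to 1 if that is unknown.
    """
    if nproc is None:
        nproc = os.cpu_count() or 1
    if nproc < 1:
        raise ValueError("nproc must be at least 1")
    return free_bytes // nproc


# Free memory back-computed from the log (2164206 particles of 102400 bytes),
# on a node assumed, for illustration only, to have 32 processors.
share = per_process_limit(2164206 * 102400, nproc=32)
print(share)  # 6925459200 bytes, about 6.4 GB in binary units
```

With these assumed numbers the per-processor share is smaller than the 7.4 GB chunk from the report, so the chunk planner would also need to size its chunks from the same share.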

Updated by Anchi Cheng almost 9 years ago

How about setting it to mem.free()*1024 / 2 (i.e. half of the free memory)?

Updated by Neil Voss almost 9 years ago

Actually, it is already mem.free()*1024 / 10

Updated by Neil Voss almost 9 years ago

 
 ... processStack: Free memory: 206.4 GB
 ... processStack: Memory used per part: 100.0 kB
 ... processStack: Max particles in memory: 2164206
 ... processStack: Particles allowed in memory: 108210
 ... processStack: Number of particles in stack: 154635
 ... processStack: Particle loop num chunks: 2
 ... processStack: Particle loop step size: 77317
 ... processStack: partnum 1 to 77317 of 154635

So based on these numbers, it is only requesting 7.4 GB of the 206.4 GB that is free.
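
The figures in this log can be reproduced with a few lines of arithmetic. The sketch below is not the Appion code. It assumes 4 bytes per pixel (which matches the reported 100.0 kB for a 160-pixel box), binary units (1 GB = 1024^3 bytes), and a divisor of 20 for "Particles allowed in memory", which is inferred from the log (2164206 / 108210 is about 20) and not taken from the source. The function name `plan_chunks` is hypothetical.

```python
import math

BYTES_PER_GB = 1024 ** 3  # the "GB" figures in the log match binary units


def plan_chunks(free_bytes, boxsize, numpart, fraction=20, bytes_per_pixel=4):
    """Recompute the chunk-planning figures shown in the log (assumed logic)."""
    part_bytes = bytes_per_pixel * boxsize * boxsize  # 102400 bytes = 100.0 kB
    max_in_memory = free_bytes // part_bytes          # particles that fit in free memory
    allowed = max_in_memory // fraction               # inferred 1/20 safety fraction
    num_chunks = math.ceil(numpart / allowed)         # chunks needed to cover the stack
    step = numpart // num_chunks                      # floor division matches the logged step
    return {
        "part_bytes": part_bytes,
        "max_in_memory": max_in_memory,
        "allowed": allowed,
        "num_chunks": num_chunks,
        "step": step,
        "request_gb": round(step * part_bytes / BYTES_PER_GB, 1),
    }


# Free memory back-computed from "Max particles in memory: 2164206".
free_bytes = 2164206 * 102400
plan = plan_chunks(free_bytes, boxsize=160, numpart=154635)
print(plan)
```

Under these assumptions the plan comes out as 2 chunks of 77317 particles, each requesting 7.4 GB, which agrees with the log and the error message.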

Updated by Anchi Cheng almost 9 years ago

Two parts of the code conflict, which is why it failed:

There is a hard-coded bytelimit at the start of the file, used in readImagicData, and there is self.freememory, used in initValues of processStack.

I have tested changing bytelimit so that it is calculated from mem.free()*1024 / 2, which resolves this. As long as mem.free() does not change suddenly, it should work.
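
A minimal sketch of that idea, not the committed change: derive the read limit from free memory and check each chunk request against it. The names `derived_byte_limit` and `check_request` are hypothetical, and the free-memory value is assumed to be in kilobytes, as the `*1024` in the expression above suggests.

```python
def derived_byte_limit(free_kb, divisor=2):
    """Byte limit computed from free memory (assumed to be in kB), not hard-coded."""
    return free_kb * 1024 // divisor


def check_request(numpart, part_bytes, limit_bytes):
    """Return the bytes requested, or raise if the request exceeds the limit."""
    requested = numpart * part_bytes
    if requested > limit_bytes:
        raise MemoryError(
            "Stack is too large to read %d particles, requesting %.1f GB"
            % (numpart, requested / 1024.0 ** 3)
        )
    return requested


# Figures from the report: 206.4 GB free, 77317 particles of 102400 bytes each.
free_kb = 2164206 * 100                 # free memory expressed in kB
limit = derived_byte_limit(free_kb)     # half of the free memory, in bytes
requested = check_request(77317, 102400, limit)
print(requested, limit)
```

Because the limit is computed once while the chunk size is planned separately, a sudden drop in free memory between the two steps could still trigger the error, which is the caveat noted above.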
