Bug #4396

closed

Relion 2d alignment fails on upload with empty classes

Added by Neil Voss over 8 years ago. Updated almost 7 years ago.

Status: Closed
Priority: Normal
Assignee:
Category: -
Target version:
Start date: 08/18/2016
Due date:
% Done: 0%
Estimated time:
Affected Version: Appion/Leginon 3.3
Show in known bugs: No
Workaround:
Description

Works under trunk; I wonder if something is missing in the branches.

Trunk: works
Beta: ?
3.2: ?


Related issues 4 (3 open, 1 closed)

Related to Appion - Feature #3971: Add RELION 2D alignment and classification (In Test, Carl Negro, 02/19/2016)
Related to Appion - Bug #3851: uplodaded maxlikeruns may have wrong references in the database (Assigned, Anchi Cheng, 12/23/2015)
Related to Appion - Bug #4566: Relion 2D tool in appion uploads wrong results (In Code Review, 11/04/2016)
Has duplicate Appion - Bug #4322: Upload RMAXLIKE alignment failed? (Closed, 07/19/2016)

Actions #1

Updated by Neil Voss over 8 years ago

  • Related to Feature #3971: Add RELION 2D alignment and classification added
Actions #2

Updated by Neil Voss over 8 years ago

I've tracked it down. Sometimes it crashes, sometimes it does not. The problem occurs during the alignment of the class averages. I do this so that if two class averages are similar, they are in the same orientation, which is necessary for things like coran/PCA.

/emg/sw/relion/bin/relion_refine   --i part16aug18v24_it008_classes.mrcs  \
--o /emg/data/appion/06jul12a/align/maxlike2/ref16aug18v24  --angpix 4.8900  \
--iter 8  --K 1  --psi_step 5  --tau2_fudge 1.0  --particle_diameter 190.0  --j 1  --dont_check_norm 
 Running in double precision. 
 Estimating initial noise spectra 
   0/   0 sec ............................................................~~(,_,">
WARNING: There are only 9 particles in group 1
WARNING: You may want to consider joining some micrographs into larger groups to obtain more robust noise estimates. 
         You can do so by using the same rlnMicrographName label for particles from multiple different micrographs in the input STAR file. 
 Estimating accuracies in the orientational assignment ... 
   0/   0 sec ............................................................~~(,_,">
 Auto-refine: Estimated accuracy angles= 0.05 degrees; offsets= 0.05 pixels
 CurrentResolution= 65.2 Angstroms, which requires orientationSampling of at least 36 degrees for a particle of diameter 190 Angstroms
 Oversampling= 0 NrHiddenVariableSamplingPoints= 2088
 OrientationalSampling= 5 NrOrientations= 72
 TranslationalSampling= 2 NrTranslations= 29
=============================
 Oversampling= 1 NrHiddenVariableSamplingPoints= 66816
 OrientationalSampling= 2.5 NrOrientations= 576
 TranslationalSampling= 1 NrTranslations= 116
=============================
 Estimated memory for expectation step  > 0.10236 Gb, available memory = 2 Gb.
 Estimated memory for maximization step > 0.000152752 Gb, available memory = 2 Gb.
 Expectation iteration 1 of 8
000/??? sec ~~(,_,">                                                          [oo] exp_thisparticle_sumweight= nan
 part_id= 6
 group_id= 0 mymodel.scale_correction[group_id]= 1
 exp_ipass= 0
 sampling.NrDirections(0, true)= 1 sampling.NrDirections(0, false)= 1
 sampling.NrPsiSamplings(0, true)= 72 sampling.NrPsiSamplings(0, false)= 72
 mymodel.sigma2_noise[ipart]= 
      -nan
       nan
       nan
       nan
      -nan
       nan
       nan
       nan
      -nan
       nan
       nan
      -nan

 wsum_model.sigma2_noise[ipart]= 
         0
         0
         0
         0
         0
         0
         0
         0
         0
         0
         0
         0
         0
         0
         0
         0
         0
         0
         0
         0
         0
         0
         0
         0
         0
         0
         0
         0
         0
         0
         0
         0
         0
         0
         0
         0
         0
         0
         0
         0
         0

 wsum_model.pdf_direction[ipart]= 
         0

written out Mweight.spi
 exp_thisparticle_sumweight= nan
 exp_min_diff2[ipart]= 9.9e+100
ERROR!!! zero sum of weights....
File: src/ml_optimiser.cpp line: 4103

In thread 0
^CTraceback (most recent call last):
  File "/emg/sw/myami/appion/bin/relionMaxlikeAlignment.py", line 386, in <module>
!!! WARNING: could not run eman command: /emg/sw/relion/bin/relion_refine   \
--i part16aug18v24_it008_classes.mrcs  --o /emg/data/appion/06jul12a/align/maxlike2/ref16aug18v24  \
--angpix 4.8900  --iter 8  --K 1  --psi_step 5  --tau2_fudge 1.0  --particle_diameter 190.0  \
--j 1  --dont_check_norm 
Traceback (most recent call last):
  File "/emg/sw/myami/appion/bin/uploadRelion2DMaxlikeAlign.py", line 537, in <module>
    maxLike.start()
  File "/emg/sw/myami/appion/bin/relionMaxlikeAlignment.py", line 381, in start
    maxLike.start()
  File "/emg/sw/myami/appion/bin/uploadRelion2DMaxlikeAlign.py", line 508, in start
    self.runUploadScript()
  File "/emg/sw/myami/appion/bin/relionMaxlikeAlignment.py", line 217, in runUploadScript
    self.alignReferences(runparams)
  File "/emg/sw/myami/appion/bin/uploadRelion2DMaxlikeAlign.py", line 437, in alignReferences
    proc.communicate()
  File "/usr/lib64/python2.6/subprocess.py", line 728, in communicate
    self.wait()
  File "/usr/lib64/python2.6/subprocess.py", line 1307, in wait
    pid, sts = _eintr_retry_call(os.waitpid, self.pid, 0)
  File "/usr/lib64/python2.6/subprocess.py", line 462, in _eintr_retry_call
    return func(*args)
KeyboardInterrupt
    apEMAN.executeEmanCmd(relioncmd, verbose=True, showcmd=True)
  File "/usr/lib64/python2.6/site-packages/appionlib/apEMAN.py", line 233, in executeEmanCmd
    out, err = emanproc.communicate()
  File "/usr/lib64/python2.6/subprocess.py", line 728, in communicate
    self.wait()
  File "/usr/lib64/python2.6/subprocess.py", line 1307, in wait
    pid, sts = _eintr_retry_call(os.waitpid, self.pid, 0)
  File "/usr/lib64/python2.6/subprocess.py", line 462, in _eintr_retry_call
    return func(*args)
KeyboardInterrupt

Actions #3

Updated by Neil Voss over 8 years ago

  • Assignee changed from Neil Voss to Anchi Cheng

Hi Anchi, I got this to break once. I ran the command again from scratch and it worked (the upload still fails on the failed run). I am having trouble reproducing this otherwise (over 20 runs of maxlike), and I wanted to check that this is the error you are seeing.

Actions #4

Updated by Anchi Cheng over 8 years ago

  • Assignee changed from Anchi Cheng to Neil Voss

Took me a while to reproduce, but I have two examples now for you. These are on guppy:/home/acheng/tests/test_relion_uploadbug. The error message from relion is the same.

badclasses.mrcs fails in the same way as yours, while goodclasses.mrcs gets through uploadscipt.sh fine.

The two are from a user's run in /gpfs/appion/gscapin/16jun19a/align/rmaxlike13. The bad one is iter030/part16jul11k49_it030_classes.mrcs and the good one is from the same run but iter029.

What I can tell is that something corrupted iteration 30. The mrcs had nan in the header for amean and rms. I can read some of the frames with the mrc module. After checking the sub numpy arrays one by one, it turned out that one of the stack frames (13) is all nan. Once I replaced it with a zero array, as in the other empty frames, everything was o.k. (cleanclass.mrcs).

Please check if you have the same thing. I think it is a relion bug since these files are generated by relion. We can add a correction step to look for nan frames in the classes.mrcs before this step to avoid this, I think.
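
A rough sketch of the kind of correction step described above, for illustration only. It assumes the class stack has already been read into a numpy array of shape (nframes, ny, nx); reading and writing the .mrcs file is left out. The names find_nan_frames and replace_nan_frames are made up for this example and are not Appion functions. It zeroes any frame containing a nan, which is slightly broader than the all-nan frame seen here.

```python
import numpy as np


def find_nan_frames(stack):
    """Return indices of frames in a (nframes, ny, nx) stack that contain any NaN."""
    stack = np.asarray(stack)
    # Collapse every axis except the first, so each frame yields one boolean.
    has_nan = np.isnan(stack).reshape(stack.shape[0], -1).any(axis=1)
    return [int(i) for i in np.flatnonzero(has_nan)]


def replace_nan_frames(stack):
    """Return a copy of the stack with NaN-containing frames zeroed, plus their indices."""
    # Work on a copy so the caller's array is left untouched.
    cleaned = np.array(stack, dtype=np.float32, copy=True)
    bad = find_nan_frames(cleaned)
    for i in bad:
        # Zero array, matching how the other empty class frames look.
        cleaned[i] = 0.0
    return cleaned, bad


# Synthetic stand-in for a classes.mrcs stack: 4 frames of 8x8 pixels.
demo = np.ones((4, 8, 8), dtype=np.float32)
demo[2] = np.nan  # simulate one corrupted class average
cleaned, bad = replace_nan_frames(demo)
print("NaN frames:", bad)
```

Returning the list of bad frame indices along with the cleaned stack makes it easy to log which class averages were replaced before the reference alignment runs.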

Actions #5

Updated by Anchi Cheng about 8 years ago

  • Has duplicate Bug #4322: Upload RMAXLIKE alignment failed? added
Actions #6

Updated by Anchi Cheng about 8 years ago

  • Has duplicate Bug #4229: Relion 2d class average upload errror added
Actions #7

Updated by Anchi Cheng about 8 years ago

  • Has duplicate deleted (Bug #4229: Relion 2d class average upload errror)
Actions #8

Updated by Anchi Cheng about 8 years ago

  • Status changed from Assigned to In Code Review
  • Target version set to Appion/Leginon 3.3
  • Affected Version changed from Appion/Leginon 3.2 to Appion/Leginon 3.3

Added a replaceNaNImageInStack function. Tested with a 100-class run in which one frame needed correction. Seems to work.

Neil, please check if you would like the function moved to another place.

Actions #9

Updated by Neil Voss over 7 years ago

  • Related to Bug #3851: uplodaded maxlikeruns may have wrong references in the database added
Actions #10

Updated by Neil Voss over 7 years ago

  • Related to Bug #4566: Relion 2D tool in appion uploads wrong results added
Actions #11

Updated by Anchi Cheng almost 7 years ago

  • Status changed from In Code Review to Closed

not a problem any more
