Bug #4396
Closed: Relion 2d alignment fails on upload with empty classes
Added by Neil Voss about 9 years ago. Updated over 7 years ago.
Description
Works under trunk; I wonder if something is missing in the branches.
Trunk: works
Beta: ?
3.2: ?
Updated by Neil Voss about 9 years ago
- Related to Feature #3971: Add RELION 2D alignment and classification added
Updated by Neil Voss about 9 years ago
I've tracked it down. Sometimes it crashes, sometimes it does not. The problem occurs during the alignment of the class averages. I do this so that if two class averages are similar, they are in the same orientation; this is necessary for things like coran/PCA.
/emg/sw/relion/bin/relion_refine --i part16aug18v24_it008_classes.mrcs \
--o /emg/data/appion/06jul12a/align/maxlike2/ref16aug18v24 --angpix 4.8900 \
--iter 8 --K 1 --psi_step 5 --tau2_fudge 1.0 --particle_diameter 190.0 --j 1 --dont_check_norm
Running in double precision.
Estimating initial noise spectra
0/ 0 sec ............................................................~~(,_,">
WARNING: There are only 9 particles in group 1
WARNING: You may want to consider joining some micrographs into larger groups to obtain more robust noise estimates.
You can do so by using the same rlnMicrographName label for particles from multiple different micrographs in the input STAR file.
Estimating accuracies in the orientational assignment ...
0/ 0 sec ............................................................~~(,_,">
Auto-refine: Estimated accuracy angles= 0.05 degrees; offsets= 0.05 pixels
CurrentResolution= 65.2 Angstroms, which requires orientationSampling of at least 36 degrees for a particle of diameter 190 Angstroms
Oversampling= 0 NrHiddenVariableSamplingPoints= 2088
OrientationalSampling= 5 NrOrientations= 72
TranslationalSampling= 2 NrTranslations= 29
=============================
Oversampling= 1 NrHiddenVariableSamplingPoints= 66816
OrientationalSampling= 2.5 NrOrientations= 576
TranslationalSampling= 1 NrTranslations= 116
=============================
Estimated memory for expectation step > 0.10236 Gb, available memory = 2 Gb.
Estimated memory for maximization step > 0.000152752 Gb, available memory = 2 Gb.
Expectation iteration 1 of 8
000/??? sec ~~(,_,"> [oo] exp_thisparticle_sumweight= nan
part_id= 6
group_id= 0 mymodel.scale_correction[group_id]= 1
exp_ipass= 0
sampling.NrDirections(0, true)= 1 sampling.NrDirections(0, false)= 1
sampling.NrPsiSamplings(0, true)= 72 sampling.NrPsiSamplings(0, false)= 72
mymodel.sigma2_noise[ipart]=
-nan
nan
nan
nan
-nan
nan
nan
nan
-nan
nan
nan
-nan
wsum_model.sigma2_noise[ipart]=
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
wsum_model.pdf_direction[ipart]=
0
written out Mweight.spi
exp_thisparticle_sumweight= nan
exp_min_diff2[ipart]= 9.9e+100
ERROR!!! zero sum of weights....
File: src/ml_optimiser.cpp line: 4103
In thread 0
^CTraceback (most recent call last):
File "/emg/sw/myami/appion/bin/relionMaxlikeAlignment.py", line 386, in <module>
!!! WARNING: could not run eman command: /emg/sw/relion/bin/relion_refine \
--i part16aug18v24_it008_classes.mrcs --o /emg/data/appion/06jul12a/align/maxlike2/ref16aug18v24 \
--angpix 4.8900 --iter 8 --K 1 --psi_step 5 --tau2_fudge 1.0 --particle_diameter 190.0 \
--j 1 --dont_check_norm
Traceback (most recent call last):
File "/emg/sw/myami/appion/bin/uploadRelion2DMaxlikeAlign.py", line 537, in <module>
maxLike.start()
File "/emg/sw/myami/appion/bin/relionMaxlikeAlignment.py", line 381, in start
maxLike.start()
File "/emg/sw/myami/appion/bin/uploadRelion2DMaxlikeAlign.py", line 508, in start
self.runUploadScript()
File "/emg/sw/myami/appion/bin/relionMaxlikeAlignment.py", line 217, in runUploadScript
self.alignReferences(runparams)
File "/emg/sw/myami/appion/bin/uploadRelion2DMaxlikeAlign.py", line 437, in alignReferences
proc.communicate()
File "/usr/lib64/python2.6/subprocess.py", line 728, in communicate
self.wait()
File "/usr/lib64/python2.6/subprocess.py", line 1307, in wait
pid, sts = _eintr_retry_call(os.waitpid, self.pid, 0)
File "/usr/lib64/python2.6/subprocess.py", line 462, in _eintr_retry_call
return func(*args)
KeyboardInterrupt
apEMAN.executeEmanCmd(relioncmd, verbose=True, showcmd=True)
File "/usr/lib64/python2.6/site-packages/appionlib/apEMAN.py", line 233, in executeEmanCmd
out, err = emanproc.communicate()
File "/usr/lib64/python2.6/subprocess.py", line 728, in communicate
self.wait()
File "/usr/lib64/python2.6/subprocess.py", line 1307, in wait
pid, sts = _eintr_retry_call(os.waitpid, self.pid, 0)
File "/usr/lib64/python2.6/subprocess.py", line 462, in _eintr_retry_call
return func(*args)
KeyboardInterrupt
Updated by Neil Voss about 9 years ago
- Assignee changed from Neil Voss to Anchi Cheng
Hi Anchi, I got this to break once. I ran the command again and it worked (from scratch; the upload still fails on the failed run). I am having trouble reproducing this otherwise (over 20 runs of maxlike), and I wanted to check that this is the error you are seeing.
Updated by Anchi Cheng about 9 years ago
- Assignee changed from Anchi Cheng to Neil Voss
Took me a while to reproduce, but I have two examples now for you. These are on guppy:/home/acheng/tests/test_relion_uploadbug. The error message from relion is the same.
badclasses.mrcs fails in the same way as yours, while goodclasses.mrcs gets through uploadscipt.sh fine.
The two are from a user's run in /gpfs/appion/gscapin/16jun19a/align/rmaxlike13. The bad one is iter030/part16jul11k49_it030_classes.mrcs and the good one is from the same run but iter029.
As far as I can tell, something corrupted iteration 30. The mrcs had nan in the header for amean and rms. I can read some of the frames with the mrc module. After checking the numpy sub-arrays one by one, it turned out that one of the stack frames (13) is all nan. Once I replaced it with a zero array, as in the other empty frames, everything was OK (cleanclass.mrcs).
Please check if you have the same thing. I think it is a relion bug since these files are generated by relion. We can add a correction step to look for nan frames in the classes.mrcs before this step to avoid this, I think.
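A rough sketch of what such a correction step could look like, assuming the class stack has already been read into a numpy array of shape (nframes, ny, nx). The function name zero_nan_frames is hypothetical and is not the Appion code. Reading and writing the .mrcs file is left out. Treating a frame with any nan pixel as bad, rather than only all-nan frames, is an assumption made here to be on the safe side.

```python
import numpy as np

def zero_nan_frames(stack):
    """Return a copy of a (nframes, ny, nx) stack with NaN frames zeroed.

    Also returns the indices of the frames that were replaced.
    Hypothetical helper for illustration only.
    """
    stack = np.asarray(stack, dtype=np.float32)
    # A frame counts as bad if any pixel is NaN (this covers all-NaN frames too)
    bad = np.isnan(stack).any(axis=(1, 2))
    cleaned = stack.copy()
    # Replace bad frames with zeros, matching how other empty classes look
    cleaned[bad] = 0.0
    return cleaned, np.flatnonzero(bad).tolist()

# Synthetic 3-frame stack; frame 1 is all NaN, like frame 13 in the bad file
demo = np.ones((3, 4, 4), dtype=np.float32)
demo[1] = np.nan
cleaned, replaced = zero_nan_frames(demo)
print(replaced)
```

Returning the list of replaced indices makes it easy to log which class averages were patched before the stack is handed to relion_refine.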
Updated by Anchi Cheng almost 9 years ago
- Has duplicate Bug #4322: Upload RMAXLIKE alignment failed? added
Updated by Anchi Cheng almost 9 years ago
- Has duplicate Bug #4229: Relion 2d class average upload errror added
Updated by Anchi Cheng almost 9 years ago
- Has duplicate deleted (Bug #4229: Relion 2d class average upload errror)
Updated by Anchi Cheng almost 9 years ago
- Status changed from Assigned to In Code Review
- Target version set to Appion/Leginon 3.3
- Affected Version changed from Appion/Leginon 3.2 to Appion/Leginon 3.3
Added a replaceNaNImageInStack function. Tested with a 100-class run in which one frame needed correction. Seems to work.
Neil, please check if you would like the function moved to another place.
Updated by Neil Voss over 8 years ago
- Related to Bug #3851: uplodaded maxlikeruns may have wrong references in the database added
Updated by Neil Voss over 8 years ago
- Related to Bug #4566: Relion 2D tool in appion uploads wrong results added
Updated by Anchi Cheng over 7 years ago
- Status changed from In Code Review to Closed
Not a problem any more.