Bug #4396
closed
Relion 2d alignment fails on upload with empty classes
Added by Neil Voss over 8 years ago. Updated almost 7 years ago.
Description
Works under trunk; I wonder if something is missing in the branches.
Trunk: works
Beta: ?
3.2: ?
Updated by Neil Voss over 8 years ago
- Related to Feature #3971: Add RELION 2D alignment and classification added
Updated by Neil Voss over 8 years ago
I've tracked it down. Sometimes it crashes, sometimes it does not. The problem occurs during the alignment of the class averages. I do this so that if two class averages are similar, they end up in the same orientation, which is necessary for things like coran/PCA.
/emg/sw/relion/bin/relion_refine --i part16aug18v24_it008_classes.mrcs \
  --o /emg/data/appion/06jul12a/align/maxlike2/ref16aug18v24 --angpix 4.8900 \
  --iter 8 --K 1 --psi_step 5 --tau2_fudge 1.0 --particle_diameter 190.0 --j 1 --dont_check_norm
Running in double precision.
Estimating initial noise spectra
0/ 0 sec ............................................................~~(,_,">
WARNING: There are only 9 particles in group 1
WARNING: You may want to consider joining some micrographs into larger groups to obtain more robust noise estimates.
You can do so by using the same rlnMicrographName label for particles from multiple different micrographs in the input STAR file.
Estimating accuracies in the orientational assignment ...
0/ 0 sec ............................................................~~(,_,">
Auto-refine: Estimated accuracy angles= 0.05 degrees; offsets= 0.05 pixels
CurrentResolution= 65.2 Angstroms, which requires orientationSampling of at least 36 degrees for a particle of diameter 190 Angstroms
Oversampling= 0 NrHiddenVariableSamplingPoints= 2088
OrientationalSampling= 5 NrOrientations= 72
TranslationalSampling= 2 NrTranslations= 29
=============================
Oversampling= 1 NrHiddenVariableSamplingPoints= 66816
OrientationalSampling= 2.5 NrOrientations= 576
TranslationalSampling= 1 NrTranslations= 116
=============================
Estimated memory for expectation step > 0.10236 Gb, available memory = 2 Gb.
Estimated memory for maximization step > 0.000152752 Gb, available memory = 2 Gb.
Expectation iteration 1 of 8
000/??? sec ~~(,_,"> [oo]
exp_thisparticle_sumweight= nan
part_id= 6
group_id= 0
mymodel.scale_correction[group_id]= 1
exp_ipass= 0
sampling.NrDirections(0, true)= 1
sampling.NrDirections(0, false)= 1
sampling.NrPsiSamplings(0, true)= 72
sampling.NrPsiSamplings(0, false)= 72
mymodel.sigma2_noise[ipart]= -nan nan nan nan -nan nan nan nan -nan nan nan -nan
wsum_model.sigma2_noise[ipart]= 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
wsum_model.pdf_direction[ipart]= 0
written out Mweight.spi
exp_thisparticle_sumweight= nan
exp_min_diff2[ipart]= 9.9e+100
ERROR!!! zero sum of weights....
File: src/ml_optimiser.cpp line: 4103
In thread 0
^CTraceback (most recent call last):
  File "/emg/sw/myami/appion/bin/relionMaxlikeAlignment.py", line 386, in <module>
!!! WARNING: could not run eman command: /emg/sw/relion/bin/relion_refine \
  --i part16aug18v24_it008_classes.mrcs --o /emg/data/appion/06jul12a/align/maxlike2/ref16aug18v24 \
  --angpix 4.8900 --iter 8 --K 1 --psi_step 5 --tau2_fudge 1.0 --particle_diameter 190.0 \
  --j 1 --dont_check_norm
Traceback (most recent call last):
  File "/emg/sw/myami/appion/bin/uploadRelion2DMaxlikeAlign.py", line 537, in <module>
    maxLike.start()
  File "/emg/sw/myami/appion/bin/relionMaxlikeAlignment.py", line 381, in start
    maxLike.start()
  File "/emg/sw/myami/appion/bin/uploadRelion2DMaxlikeAlign.py", line 508, in start
    self.runUploadScript()
  File "/emg/sw/myami/appion/bin/relionMaxlikeAlignment.py", line 217, in runUploadScript
    self.alignReferences(runparams)
  File "/emg/sw/myami/appion/bin/uploadRelion2DMaxlikeAlign.py", line 437, in alignReferences
    proc.communicate()
  File "/usr/lib64/python2.6/subprocess.py", line 728, in communicate
    self.wait()
  File "/usr/lib64/python2.6/subprocess.py", line 1307, in wait
    pid, sts = _eintr_retry_call(os.waitpid, self.pid, 0)
  File "/usr/lib64/python2.6/subprocess.py", line 462, in _eintr_retry_call
    return func(*args)
KeyboardInterrupt
    apEMAN.executeEmanCmd(relioncmd, verbose=True, showcmd=True)
  File "/usr/lib64/python2.6/site-packages/appionlib/apEMAN.py", line 233, in executeEmanCmd
    out, err = emanproc.communicate()
  File "/usr/lib64/python2.6/subprocess.py", line 728, in communicate
    self.wait()
  File "/usr/lib64/python2.6/subprocess.py", line 1307, in wait
    pid, sts = _eintr_retry_call(os.waitpid, self.pid, 0)
  File "/usr/lib64/python2.6/subprocess.py", line 462, in _eintr_retry_call
    return func(*args)
KeyboardInterrupt
Updated by Neil Voss over 8 years ago
- Assignee changed from Neil Voss to Anchi Cheng
Hi Anchi, I got this to break once. I ran the command again from scratch and it worked (the upload still fails on the failed run). I am having trouble reproducing this otherwise (over 20 runs of maxlike), and I wanted to check that this is the error you are seeing.
Updated by Anchi Cheng over 8 years ago
- Assignee changed from Anchi Cheng to Neil Voss
Took me a while to reproduce, but I have two examples now for you. These are on guppy:/home/acheng/tests/test_relion_uploadbug. The error message from relion is the same.
badclasses.mrcs fails in the same way as yours, while goodclasses.mrcs gets through uploadscipt.sh fine.
The two are from a user's run in /gpfs/appion/gscapin/16jun19a/align/rmaxlike13. The bad one is iter030/part16jul11k49_it030_classes.mrcs and the good one is from the same run but iter029.
What I can tell is that something corrupted iteration 30. The mrcs had NaN in the header for amean and rms. I can read some of the frames with the mrc module. After checking the numpy sub-arrays one by one, it turned out that one of the stack frames (13) is all NaN. Once I replaced it with a zero array, as in the other empty frames, everything was OK (cleanclass.mrcs).
Please check if you have the same thing. I think it is a relion bug since these files are generated by relion. We can add a correction step to look for nan frames in the classes.mrcs before this step to avoid this, I think.
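The correction step proposed above could look roughly like the following. This is only a minimal sketch, not the code that ended up in Appion: it assumes the class-average stack has already been read into a 3D numpy array of shape (frames, rows, cols), and the names find_nan_frames and zero_nan_frames are hypothetical. It also flags any frame that contains a NaN, which is slightly stricter than the all-NaN frame seen in badclasses.mrcs.

```python
import numpy as np


def find_nan_frames(stack):
    """Return indices of frames that contain at least one NaN.

    Hypothetical helper: `stack` is assumed to be a 3D array
    (frames, rows, cols) already loaded from the classes.mrcs file.
    """
    arr = np.asarray(stack, dtype=float)
    # Flatten each frame so one boolean per frame can be computed.
    flat = arr.reshape(arr.shape[0], -1)
    return np.flatnonzero(np.isnan(flat).any(axis=1)).tolist()


def zero_nan_frames(stack):
    """Return (cleaned_copy, bad_indices); the input is left untouched.

    Frames containing NaN are replaced by all-zero frames, matching
    how the other empty classes appear in the stack.
    """
    cleaned = np.array(stack, dtype=np.float32, copy=True)
    bad = find_nan_frames(cleaned)
    if bad:
        cleaned[bad] = 0.0
    return cleaned, bad


if __name__ == "__main__":
    # Tiny synthetic stack standing in for a real class-average stack.
    demo = np.ones((4, 3, 3), dtype=np.float32)
    demo[2] = np.nan
    cleaned, bad = zero_nan_frames(demo)
    print("frames replaced:", bad)
```

Returning the list of replaced frames lets the caller log which classes were patched before the stack is written back and handed to relion_refine.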
Updated by Anchi Cheng about 8 years ago
- Has duplicate Bug #4322: Upload RMAXLIKE alignment failed? added
Updated by Anchi Cheng about 8 years ago
- Has duplicate Bug #4229: Relion 2d class average upload errror added
Updated by Anchi Cheng about 8 years ago
- Has duplicate deleted (Bug #4229: Relion 2d class average upload errror)
Updated by Anchi Cheng about 8 years ago
- Status changed from Assigned to In Code Review
- Target version set to Appion/Leginon 3.3
- Affected Version changed from Appion/Leginon 3.2 to Appion/Leginon 3.3
Added a replaceNaNImageInStack function. Tested with a 100-class run in which one frame needed correction. Seems to work.
Neil, please check if you would like the function moved to another place.
Updated by Neil Voss over 7 years ago
- Related to Bug #3851: uplodaded maxlikeruns may have wrong references in the database added
Updated by Neil Voss over 7 years ago
- Related to Bug #4566: Relion 2D tool in appion uploads wrong results added
Updated by Anchi Cheng almost 7 years ago
- Status changed from In Code Review to Closed
Not a problem any more.