Bug #2873
Xmipp2 CL2D class averages summary page does not always correctly map the class averages to the underlying stack of raw particles
Status: Closed
% Done: 0%
Workaround: Don't request more than 64 classes from a CL2D calculation
Description
This is a somewhat odd bug that was noticed by Gabe Ozorowski. It is most easily demonstrated with the following data set in project 444 (RHoffman - testing), session 14jul28b. (Anyone who wants to view that project/session but can't, please email me. I've attached a graphic that should suffice.)
I ran the same CL2D calculation twice, once stopping at 32 classes and once continuing to 256 classes. "Level 4" of the 256-class calculation also has 32 classes.
So the summary page for the 32-class calculation:
http://longboard.scripps.edu/gammamyamiweb/processing/viewstack.php?expId=13761&clusterId=15&file=/gpfs/group/em/appion/rmhoff/14jul28b/align/cl2d1/part14jul28o16_level_04_.hed
and for "level 4" of the 256-class calculation:
http://longboard.scripps.edu/gammamyamiweb/processing/viewstack.php?expId=13761&clusterId=20&file=/gpfs/group/em/appion/rmhoff/14jul28b/align/cl2d2/part14jul29r01_level_04_.hed
should have the same results (varying only in whatever random seeding is associated with a CL2D run).
Indeed, the images displayed at those two links are nearly identical (see cl2dbug.png, attached). But the single particles correctly correspond to the class averages in the 32-class example (top part of the figure, annotated with green arrows) and do not correctly correspond in the level-4-of-256-class example (bottom part of the figure, annotated with red arrows).
From looking at this yesterday, we think there is reason to suspect that the problem isn't with CL2D itself but with the post-processing required to generate these summaries.
It's a pernicious problem because the incorrectly mapped single particles will also be transferred to any substacks. We've found this problem in every (Appion-based Xmipp2) CL2D calculation that produces more than 64 classes. Confirmation from Dmitry and any other watchers would be appreciated.
Files
Updated by Dmitry Lyumkis over 10 years ago
Strange ... actually every cluster is wrong in CL2D2 (2, 4, 8, up to 256), whereas all of them are correct in CL2D1. I don't have this issue in the recent CL2D data that I've processed. I'm inclined to think that something else might be messed up with the records (or job?) rather than the post-processing. How come there's no job file in either of those directories? I wanted to check the difference there, but didn't see a .job file.
Updated by Ryan Hoffman over 10 years ago
- File cl2d added
- File cl2d.e5564585 added
- File cl2d.o5564585 added
Do the .job files get generated if the command was run from a manually-submitted batch script?
I've attached the PBS script and the log files for the 256-class calculation. I believe I ran the 32-class companion calculation on an interactive node, so its STDERR was lost.
Updated by Sargis Dallakyan about 10 years ago
- Related to Bug #2880: cl2d jobs won't run added
Updated by Sargis Dallakyan about 10 years ago
- Status changed from New to In Test
- Assignee changed from Dmitry Lyumkis to Ryan Hoffman
I have recently made a revision (r18592) that should fix this. As suspected, this wasn't a problem with CL2D but with the post-processing required to generate these summaries. It took good teamwork to figure out that it was related to the file system in use and the order in which Python's glob function returns directory listings. Please try again using myami/trunk and close this issue if the test passes.
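For anyone hitting a similar mapping problem, here is a minimal, hypothetical sketch of the failure mode described above (the glob pattern and filenames are illustrative and do not come from the Appion source): Python's glob.glob() returns entries in whatever order the underlying filesystem lists them, which is not guaranteed to be sorted, so post-processing that assumes the Nth result corresponds to class N can silently mis-map classes on some filesystems (e.g. GPFS).

```python
import glob

# Hypothetical pattern for class-average related files; illustrative only.
pattern = "part*_level_04_*"

# Fragile: glob returns entries in filesystem listing order, which may not
# be sorted, so index i here need not correspond to class i.
files_unordered = glob.glob(pattern)

# Robust: sort explicitly so the i-th entry always maps to the i-th class.
files_ordered = sorted(glob.glob(pattern))
```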
Updated by Ryan Hoffman about 10 years ago
- Status changed from In Test to Closed
Gabe and I have independently confirmed that the bug is resolved. Thanks very much.