Feature #5927
closed
Archive data as LZW TIFF rather than mrc.bz2?
Description
Hi Anchi
Not sure whether this is a leginon, appion, or myami thing, but would it be possible to add an option to archive stacks as LZW compressed TIFF files, rather than bzipped mrc files?
Relion 3, cryoSPARC 2, EMAN, CisTEM and Motioncor2 all support direct use of LZW TIFFs for frame alignment and, in the case of Relion 3 and cryoSPARC 2, polishing, which avoids a lot of complications and storage issues.
Cheers
Oli
Updated by Oliver Clarke over 6 years ago
Meant to note - this would also be really helpful during data collection for use with Relion's new on-the-fly auto-processing script, relion_it.py. (e.g. see this tweet from one of the devs, Takanori Nakane, which I think demonstrates the utility: https://twitter.com/biochem_fan/status/1026147739367813120)
Without LZW TIFF support this is a little awkward, as one has to decompress the mrc.bz2 and then convert it to LZW TIFF, which is a bit slow for on-the-fly processing.
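As a rough illustration of the first of those two steps, here is a minimal sketch that stream-decompresses an archived `.mrc.bz2` file using only the Python standard library. The function name `decompress_bz2`, the file names, and the chunk size are hypothetical and not part of leginon, appion, or myami; the later conversion to LZW TIFF would need a separate tool and is not shown.

```python
import bz2
import shutil
from pathlib import Path


def decompress_bz2(src: Path, dest_dir: Path, chunk_size: int = 16 * 1024 * 1024) -> Path:
    """Stream-decompress src (e.g. stack.mrc.bz2) into dest_dir; return the output path."""
    if src.suffix != ".bz2":
        raise ValueError(f"expected a .bz2 file, got {src.name}")
    dest_dir.mkdir(parents=True, exist_ok=True)
    # Dropping the last suffix turns "stack.mrc.bz2" into "stack.mrc".
    out_path = dest_dir / src.stem
    # Copy in chunks so a multi-gigabyte frame stack is never held in memory at once.
    with bz2.open(src, "rb") as fin, open(out_path, "wb") as fout:
        shutil.copyfileobj(fin, fout, length=chunk_size)
    return out_path
```

Every movie has to pass through a step like this before it can be converted, which is the extra I/O that makes the round trip slow during collection.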
Oli
Updated by Anchi Cheng over 6 years ago
- Status changed from New to Won't Fix or Won't Do
Archiving in bzip2 is more space-efficient than LZW TIFF, so I will not implement this.
Relion takes MRC stacks; it just wants them to be named .mrcs.
For systems with good network throughput, it is more efficient to keep the frame stacks uncompressed so that the programs do not have to decompress them. The TIFF option is good for those without such a fast network.
K2 data collection in LZW TIFF is covered by Issue #2727, which is almost ready. You can use that instead.
Updated by Oliver Clarke over 6 years ago
Hi Anchi,
Yes, relion takes mrc stacks, but it does not take mrc.bz2, whereas it can process tiffs directly. Storing decompressed mrc stacks is not practical for many users because of the huge amount of space they take up, and motion correction of lzw tiffs in relion is very fast: 400-800 movies per hour on my workstation (2x12 cores, 256 GB RAM).
Also, while it is true that archiving in mrc.bz2 is slightly more space efficient, it is not a huge difference: lzw tiffs take up ~20% more space. I think the convenience of being able to process them directly in relion, cistem, cryosparc, or really any major package now makes it worth it, but of course it is totally your call.
Doesn't issue #2727 relate to SerialEM, meaning this would not be a feature available for data collection in leginon? Or am I misunderstanding?
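To put the ~20% figure in concrete terms, here is a small back-of-the-envelope sketch. The function name `lzw_tiff_size_tb` and the 10 TB archive size are hypothetical; only the ~20% overhead comes from the estimate above, and the real ratio will vary with the data.

```python
def lzw_tiff_size_tb(bz2_size_tb: float, overhead: float = 0.20) -> float:
    """Estimate LZW TIFF storage from the size of the same data as mrc.bz2.

    overhead is the assumed fractional increase (0.20 means ~20% more space).
    """
    return bz2_size_tb * (1.0 + overhead)


# Hypothetical example: an archive that occupies 10 TB as mrc.bz2.
archive_tb = 10.0
estimate_tb = lzw_tiff_size_tb(archive_tb)
print(f"{archive_tb:.1f} TB as mrc.bz2 is roughly {estimate_tb:.1f} TB as LZW TIFF")
```

Under that assumption, each 10 TB of bzipped stacks costs about 2 TB extra when stored as LZW TIFF.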
Cheers
Oli
Updated by Anchi Cheng over 6 years ago
Also, motioncor speed is not CPU-determined.
Updated by Oliver Clarke over 6 years ago
Oh well, that's great then! If direct collection in TIFF format will be available, that totally obviates the need for this. Sorry, I didn't realize it was on the cards! :)
Motioncor2 (the original) is GPU accelerated. The relion-3 implementation of Motioncor2 is written for CPU, and the number of cores and RAM you have available absolutely does make a difference.
Cheers
Oli