Feature #6748: Provide native Appion support for Topaz particle picking - Appion - Electron Microscopy Group

Added by Neil Voss over 6 years ago. Updated almost 6 years ago.

Status: Assigned (open)
Priority: Normal
Assignee: Sargis Dallakyan
Category: Image Processing
Target version: Appion/Leginon 3.4
Start date: 02/15/2019

Description

Based on Micah's recent presentation, Neil is trying to get Topaz into Appion:

Scope of Project:
Implement the Topaz deep-learning particle picker within the Appion particle-picking framework and provide compatibility with the Relion processing software.

Project Stages:
1. Python backend to process images and launch Topaz picking sessions
   A. Image processing, prepare micrographs for Topaz
   B. Copy manual, template, or DoG particle picks from Appion and give to Topaz for initial training
   C. Implement Topaz training, choose default or user custom values
   D. Apply Topaz neural network and process all micrographs
   E. Python script to extract particles from Topaz (star file) for Appion use
   F. Apply threshold and upload particles to Appion database
   G. Automatically generate Star file for going straight into Relion
2. Topaz GPU and parameter optimization
3. Web interface
   A. Create standard PHP web interface to launch Topaz jobs; initial implementation will be simple for new users to use
   B. Create PHP web interface for re-processing images with a different threshold
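
Stages 1E and 1F (read the Topaz picks, then apply a threshold) can be sketched as follows. This is a minimal sketch, not the Appion implementation: it assumes the picks arrive as a tab-delimited table with `image_name`, `x_coord`, `y_coord`, and `score` columns, and the function name `filter_picks` and the sample values are hypothetical. The database upload itself is not shown.

```python
import csv
import io


def filter_picks(table_text, threshold):
    """Return picks whose score is at least `threshold`.

    `table_text` is assumed to be tab-delimited with a header row holding
    image_name, x_coord, y_coord, and score columns.
    """
    reader = csv.DictReader(io.StringIO(table_text), delimiter="\t")
    picks = []
    for row in reader:
        score = float(row["score"])
        if score >= threshold:
            picks.append(
                (row["image_name"], int(row["x_coord"]), int(row["y_coord"]), score)
            )
    return picks


# Hypothetical sample data, not real Topaz output.
SAMPLE = (
    "image_name\tx_coord\ty_coord\tscore\n"
    "mic_001\t120\t340\t2.5\n"
    "mic_001\t400\t80\t-1.0\n"
    "mic_002\t55\t610\t0.3\n"
)

kept = filter_picks(SAMPLE, threshold=0.0)
```

Because the filter works on saved scores, a different threshold could in principle be applied to the same table without re-running the network.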

Limitations of the proposed implementation that could be addressed in the future:
• Limited feedback: only standard particle picking tools and images will be available.
• Requires an existing particle picking run to be used to train the neural network.
• Complex model training methods will be unavailable in the first implementation (they require more user interaction during the training process than Appion is built for).
• Neural network models are saved but not easily transferred to other data sets.
• Neural network is trained from scratch for each job submission.
• Particle picking threshold is chosen at initial launch and cannot be changed without launching another job (design proposed above).

Files

topaz_instructions.pptx (979 KB), Micah presentation, Neil Voss, 02/15/2019 03:29 PM
topaz_input_prep.py (2.63 KB), /home/mrapp/topaz_input_prep.py, Sargis Dallakyan, 02/15/2019 03:40 PM
topaz_GUI_full.html (429 KB), Alex Noble, 02/15/2019 04:33 PM
topaz_GUI_full_3-1-19d.html (464 KB), Alex Noble, 03/05/2019 05:29 PM
topaz.html (510 KB), Alex Noble, 10/06/2019 11:05 PM

Updated by Neil Voss over 6 years ago

File topaz_instructions.pptx added

Updated by Sargis Dallakyan over 6 years ago

File topaz_input_prep.py added

Updated by Neil Voss over 6 years ago

It looks like, as of Feb 1, 2019, Topaz uses MRC files directly; TIFF files are no longer required.

https://github.com/tbepler/topaz/commit/0079bc4ce46084e94a9754970e62696aa55ee301#diff-d077fc113d32c352d0d8fb0f6b400d9c

I would prefer to stick to MRC, which means you would need to upgrade Topaz to a newer version. Do you see any problem with this?

Updated by Sargis Dallakyan over 6 years ago

I don't see a problem. As a side note, we have topaz running in a singularity container.

[root@SEMC-head ~]# more /gpfs/sw/bin/topaz
#!/bin/bash
/gpfs/sw/bin/singularity  exec --nv -B /gpfs:/gpfs  /gpfs/sw/singularity/images/centos7 topaz  "$@"

Updated by Alex Noble over 6 years ago

File topaz_GUI_full.html added

Topaz training needs to be done on preprocessed images (normalization in particular is a must). 'topaz preprocess' can output as mrcs, tiffs, and pngs. Note that mrcs and tiffs use floats while pngs are integers.

Things have been simplified some since the workflow that Micah wrote up. Here are some ways:

1) 'topaz train' now accepts a single folder --train-images. All that is required is that the file passed in --train-targets points to images in that directory.
2) 'topaz train' doesn't need --test-images or --test-targets. Just use the --k-fold option to split the training data into k folds, one of which is held out as test data (by default the first, --fold=0). E.g. --k-fold=5 will divide the training data into 5 chunks and the first chunk (20%) will be used as the test data [this has actually always been true].
3) 'topaz train' now accepts star, tab-delimited, and CSV particle coordinates (CSV is from the standalone GUI).
4) 'topaz convert' now exists to convert between the above formats.
5) The Relion issue with the extracted star file is fixed.
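
The hold-out described in point 2 can be illustrated with a minimal sketch. It is not Topaz's implementation: it assumes the split is made per micrograph (as discussed elsewhere in this thread) and uses contiguous, unshuffled chunks. The function `kfold_split` and the micrograph names are hypothetical.

```python
def kfold_split(image_names, k, fold=0):
    """Split micrograph names into (train, test) lists.

    The names are cut into k contiguous chunks and chunk `fold` is held out
    as the test set; everything else is used for training.
    """
    if k < 2:
        raise ValueError("k must be at least 2")
    if not 0 <= fold < k:
        raise ValueError("fold must satisfy 0 <= fold < k")
    n = len(image_names)
    # Integer chunk boundaries; chunk sizes differ by at most one.
    start = (fold * n) // k
    stop = ((fold + 1) * n) // k
    test = image_names[start:stop]
    train = image_names[:start] + image_names[stop:]
    return train, test


# Ten hypothetical micrographs, five folds, first fold held out.
names = ["mic_%03d" % i for i in range(10)]
train, test = kfold_split(names, k=5, fold=0)
```

With 10 micrographs and k=5, the held-out fold contains 2 micrographs (20%) and the remaining 8 are used for training.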

Also, I am actively writing a standalone GUI that includes command generation and particle picking all in one html file. The preprocessing command generation is done. I am working on the train and extract commands now. I have attached the current dev version. A release version will be ready, possibly in a few days.

Tristan is also working on a basic tutorial, a workflow for cross-validation, and implementation of a method that will allow for pi to be optimized.

Updated by Neil Voss over 6 years ago

Hi Micah,

Can you upload a sample of all_images.txt, all_filtered_images.txt, and all_coord.txt? I am trying to recreate these files from an Appion dump, so having an actual file would be helpful.

Updated by Neil Voss over 6 years ago

Hi again,

I managed to find a copy. Computer was just running slow.

Updated by Neil Voss over 6 years ago

One question.

I see that when topaz splits the test and training sets, it selects whole micrographs for each category. Would it be better to have test and training particles within the same micrograph?

Updated by Alex Noble over 6 years ago

Hi Neil,

Tristan and I have just released Topaz v0.1.0 with a full working GUI:

https://github.com/tbepler/topaz/

You can use this GUI to understand the whole workflow and which options are useful. For instance, the user only needs to provide a directory to micrographs and particle picks (in either .star, .box, Topaz .txt, or Topaz GUI .csv files); the K-fold option takes care of splitting the dataset into test/train.

To answer your question, I think that Topaz requires test and train sets to be different sets of micrographs.

#10

Updated by Neil Voss over 6 years ago

Hi Alex,

I know I am kind of re-inventing the wheel here, but as we discussed my initial hope was to create a fully-automated Topaz picker. I do worry that no one will use my fully automated picker, because you would have to have particle picks in Appion to launch Topaz. Let me know if I am doing something worthless to the group.

So, instead of using 15 Topaz scripts, I am doing all the heavy lifting in one simple Appion script. From the format of the input txt files, it looks like I could split the training and test sets in the same micrograph, but there might be unexpected memory limitations on the GPU side, because now we are using all 30 micrographs for training instead of 20 for training and 10 for testing.

Neil

#11

Updated by Alex Noble over 6 years ago

File topaz_GUI_full_3-1-19d.html added

Hi Neil,

NYSBC will use your Appion implementation. Several internal users/staff are already using it with success and pushing resolution/isotropy with it. We want it to become routine and will likely begin including Topaz in our regular Appion workshops.

Right now, Topaz requires 3 or 4 commands that will take you from {micrographs + particle picks} to {particle coordinates}, not 15. The GUI goes through these steps and fills in most of the values for the user. I have attached it - please look it over. There is also a basic tutorial now on the GitHub page that goes over the same steps with less verbosity: https://github.com/tbepler/topaz/blob/gui/tutorial/01_quick_start_guide.ipynb

I don't think RAM considerations are an issue - someone internally tried training on 60,000 particles across several thousand images and it worked fine. It just took a little longer.

BTW, I am going on a 2-week vacation in a couple days, so I won't be available very soon. Tristan will be available: tbepler at gmail

#12

Updated by Alex Noble over 6 years ago

Hi Neil,

What is the status on Topaz integration, and can Tristan and I help? Have you tried the Topaz GUI? It might be easy to just throw that into Appion and connect it to the database.

Thanks,
-Alex

#13

Updated by Sargis Dallakyan over 6 years ago

Assignee changed from Neil Voss to Sargis Dallakyan

Thanks Alex, Topaz GUI looks good. I've added it to myamiweb/processing. For now, I made sure it's positioned correctly with the rest of Appion UI. I'll work on pre-processing next; will need to change the input/output and add an option to run jobs from the web UI.

#14

Updated by Bridget Carragher over 6 years ago

Cool! I am delighted that this might be a possibility - Topaz is very popular with our users and I would love for it to be an integral Appion option.

#15

Updated by Sargis Dallakyan over 6 years ago

Finished the pre-processing part. Logged-in users can now submit pre-processing jobs. I've created a new submitJob function based on submitAppionJob so that it returns to the Topaz GUI, polls up to 100 times until it finds the output files, then selects the Picking tab and loads the images there.
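
The bounded wait for output files can be sketched as below. This is an illustration only: the actual submitJob function lives in the web GUI and is not shown in this thread, and the helper name `wait_for_outputs`, the delay between attempts, and the file name are assumptions.

```python
import os
import tempfile
import time


def wait_for_outputs(paths, max_tries=100, delay=1.0):
    """Poll until every path in `paths` exists.

    Returns True as soon as all files are present, or False if they are
    still missing after `max_tries` attempts.
    """
    for _ in range(max_tries):
        if all(os.path.exists(p) for p in paths):
            return True
        time.sleep(delay)
    return False


# Small demonstration with a hypothetical output file.
with tempfile.TemporaryDirectory() as tmp:
    out = os.path.join(tmp, "preprocessed.mrc")
    missing = wait_for_outputs([out], max_tries=2, delay=0.01)  # file absent
    open(out, "w").close()
    found = wait_for_outputs([out], max_tries=2, delay=0.01)    # file present
```

Capping the number of attempts keeps the page from waiting forever if a job fails without producing output.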

Since topaz.html has around 11k lines, it takes a lot of scrolling to go up and down. Should we split it into .css and .js files to make the development process faster?

#16

Updated by Bridget Carragher over 6 years ago

This sounds great! Shall we have a little mini meeting to chat this through? We can also chat at Appion developers tomorrow of course...

#17

Updated by Sargis Dallakyan over 6 years ago

Sounds good. Would be great to have feedback from users and developers as we move forward on this project.

#18

Updated by Alex Noble over 6 years ago

Hi Sargis,

Awesome work, this is a great start!

I took a look at the code difference and it seems that apart from adding the 'name=' to the input tags, there are only a few code changes. This is nice because it allows for easier re-integration of future GUI releases. I think the remaining parts of the GUI will be similar.

Do you have time some day to skype with me and show me what was done on the php side? I may be able to finish integration if it's relatively straight-forward.

Thanks!
-Alex

#19

Updated by Sargis Dallakyan over 6 years ago

Hi Alex,

Sounds good. I also have saving particle picks working. I will commit that code later this afternoon. The rest should be relatively straight-forward. I have a meeting with Bridget tomorrow at 3pm EST (Tuesday July 2nd). If you can join us, we can go through code changes. If not, we can schedule a meeting some other time.

Thanks,
Sargis

#20

Updated by Alex Noble over 6 years ago

Hi Sargis,

Buffer4 was not responding. I've rebooted it and rerun the same command with updated topazDenoiser.py. It seems to be working; created efn_td preset and processed 7 images so far.

#26

Updated by Alex Noble about 6 years ago

Hi Sargis,

Thank you!!

I missed one important thing: Can you make the --device option in `topaz denoise` default to 0, and add the option to the Appion webpage? The options are: positive integers for GPU id, or -1 for CPU.

Best,
-Alex

#27

Updated by Alex Noble about 6 years ago

Hi Sargis,

Oh this is weird... It seems that the device is set to 0, but it doesn't use the GPU and instead uses the CPU... I am testing on semccatchup01 and krios04buffer where nobody else is running anything. Do you see the same behavior? Here is what the output is for me:

Starting image 5 ( skip:468, remain:5 ) id:12182678, file: 19jul10e_grid1new_022gr_01sq_02hln_014enn
... Pixel size: 0.854905
...
... # using device=0 with cuda=False

using model: L2

1 of 1 completed.

==== Committing data to database ====

SUMMARY: topazDenoiser
------------------------------------------
TIME: 1 min 21 sec
AVG TIME: 1.31 +/- 0.64 min
(- REMAINING TIME: 7 min 50 sec for 4 images )
-----------------------------------------

When I run a `topaz denoise` command outside of the loop, it has the same problem.

BUT when I run the same topaz_denoiser.py command on node44, it uses the GPU properly! Any idea what the issue is?

Also, could the ice thickness measurement be retained for the denoised image? And could the `topaz denoise` command be printed to the screen in the Appion loop?

Thanks!
-Alex

#28

Updated by Alex Noble about 6 years ago

Hi Sargis,

Since Topaz Denoiser works best on non-frame aligned images, early return creates an issue: the enn images become a single frame rather than a sum. Since there is no flag in the database saying whether early return was used, could you put a flag in the Appion form asking the user if early return was used? If yes, then before denoising a normal summed enn.mrc image should be created to replace the original enn.mrc image that had one frame (using either whatever default method is used in Leginon for returning enn.mrc or motioncor2 with -Align set to 0).

This would be very useful because at least 1/3 of datasets use early return, from my estimation.

Thanks!
-Alex

#29

Updated by Sargis Dallakyan about 6 years ago

Hi Alex,

Great, thanks. I plan on making the following changes:

Print `topaz denoise` command to the screen in the Appion loop.
Add --device option and help text (positive integers for GPU id, or -1 for CPU). If we leave it blank, does topaz automatically pick the right device? I don't want to set the default to 0, in case someone else is already using device 0.

When I ran on krios04buffer yesterday with no --device option, it was using the GPU correctly. Do you mean that on semccatchup01 you asked for GPU and it used CPU instead?

I'll look into the ice thickness measurement. I copy the image info and change the preset and file name in the Appion loop. The ice thickness measurements are taken from another table. I'll need to create new entries in that table or modify the viewer to handle it.

I'll need to read up on what early return is and how to implement related changes.

#30

Updated by Alex Noble about 6 years ago

Hi Sargis,

Awesome, thank you! You're the best!

When I run `topaz denoise --device 0 [...]` on either semccatchup01 or krios04buffer, it recognizes that it should use the GPU, but doesn't, i.e. it says 'using device=0 with cuda=False', but it should say 'using device=0 with cuda=True'. When I monitor nvidia-smi using `watch -n 1 nvidia-smi`, it never has any processes running, but `top` shows high CPU usage by topaz and it takes about 1.5 minutes to denoise an image. When I run it on a cluster GPU node like node44, it says 'using device=0 with cuda=True', which is correct; nvidia-smi shows a topaz process running and it takes 13 seconds to denoise an image. I'm not sure what the problem here might be.
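
One reading of that log line is that a GPU was requested but the runtime did not report CUDA as available on that host. The sketch below illustrates that assumed logic only; it is not Topaz's code, and `resolve_device` and `describe` are hypothetical names. On a PyTorch installation, the `cuda_available` value could be obtained from `torch.cuda.is_available()`.

```python
def resolve_device(device, cuda_available):
    """Return (device, use_cuda) for a requested device id.

    A device id of 0 or higher requests that GPU; -1 forces the CPU.
    The GPU is only used when CUDA is actually reported as available.
    """
    use_cuda = device >= 0 and cuda_available
    return device, use_cuda


def describe(device, cuda_available):
    """Format a status line in the style of the log quoted above."""
    dev, use_cuda = resolve_device(device, cuda_available)
    return "using device=%d with cuda=%s" % (dev, use_cuda)


on_buffer = describe(0, cuda_available=False)  # GPU requested, CUDA not visible
on_node = describe(0, cuda_available=True)     # GPU requested and usable
forced_cpu = describe(-1, cuda_available=True) # CPU forced by the user
```

Under this reading, the difference between the buffer machines and node44 would lie in the environment (driver, container, or library build) rather than in the `--device` value.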

Early return is an option on the K2 camera to just return the first frame of a movie so that the camera takes less time to return control to Leginon, which speeds up collection overall. Leginon then simply writes the one frame to enn.mrc (or whatever the preset is). Since Topaz Denoiser works best on non-frame aligned images, it would be great to have the Topaz Denoiser Appion integration ask the user whether early return was used during collection - if it was, then the frames should be summed and gain corrected (not frame aligned) before denoising. It might be best to just leave the single-frame enn.mrc files alone and just add the summing+gain correction step to the denoising loop. Or maybe enn.mrc should also be replaced by the sum.

Let me know if I can explain further.

Best,
-Alex

Hi Sargis,

Sometimes the denoising loop skips images for no apparent reason. See the beginning of 19jul18g, for example. Do you know why this happens?

Thanks!
-Alex

#36

Updated by Alex Noble about 6 years ago

I found the answer to my previous question. It was running out of memory for some images...

Sargis, could you change the default for 'patch size' from 2048 to 1536?

Thanks!
-Alex
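
The effect of the smaller patch size can be estimated with simple arithmetic. This is a rough sketch under two assumptions not stated in the thread: that memory per patch scales with the number of pixels in the patch, and that patches tile the image without overlap or padding. The image dimensions are example values and the function names are hypothetical.

```python
import math


def patch_pixel_ratio(new_size, old_size):
    """Ratio of pixels in a square patch of new_size to one of old_size."""
    return (new_size * new_size) / (old_size * old_size)


def patches_needed(width, height, patch_size):
    """Number of non-overlapping square patches needed to cover an image."""
    return math.ceil(width / patch_size) * math.ceil(height / patch_size)


ratio = patch_pixel_ratio(1536, 2048)

# Example image dimensions (assumed for illustration).
before = patches_needed(3838, 3710, 2048)
after = patches_needed(3838, 3710, 1536)
```

Under these assumptions each 1536-pixel patch holds about 56% of the pixels of a 2048-pixel patch, at the cost of processing more patches per image.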

#37

Updated by Sargis Dallakyan about 6 years ago

Hi Alex,

Good find. I've changed the default 'patch size' to 1536. This will be live tomorrow after nightly updates.

Thanks.

#38

Updated by Alex Noble about 6 years ago

Hi Sargis,

Denoising has been working great here! Thank you.

Has there been progress on making early return collections compatible so that the enn images can be denoised?

Thank you,
-Alex

#39

Updated by Sargis Dallakyan about 6 years ago

Hi Alex,

Glad to hear that denoising has been working great! I've made some progress with the early return option, but got sidetracked by other projects recently.

I've read the Leginon code and couldn't find any code that would sum and gain correct the frames. This made me think that the early return option is something that is implemented in the Gatan software.

I asked Anchi where to find code to sum the frames and do gain correction. She directed me to the right place (apDDprocess.py); thanks Anchi. We have code to do this for mrc files but not for tiff files. I recently wrote code to read individual frames from LZW compressed tiff movies (#7713). I'll use that to sum the frames and do gain correction. Hope to have it working by the end of next week, unless something more urgent comes up.

Thank you,
Sargis

#40

Updated by Alex Noble about 6 years ago

Wonderful, thank you Sargis! You're awesome.

Another way would be to use motioncor2 with the flag '-Align 0', and it should just sum and gain correct (I think).

Best,
-Alex

#41

Updated by Sargis Dallakyan about 6 years ago

Added early return option. This will be available tomorrow after nightly updates.

I've used apK2process.GatanK2Processing.correctFrameImage to sum and gain correct the frames. Seems to be working fine.
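
For readers unfamiliar with the operation, summing and gain correcting frames without alignment can be sketched as below. This is not the code of apK2process.GatanK2Processing.correctFrameImage, whose internals are not shown in this thread. The sketch assumes gain correction is an element-wise multiplication by a gain reference, skips dark subtraction, and uses plain nested lists in place of image arrays to stay dependency-free; all names and values are hypothetical.

```python
def sum_frames(frames):
    """Element-wise sum of equally sized 2D frames (lists of rows)."""
    rows = len(frames[0])
    cols = len(frames[0][0])
    return [
        [sum(frame[r][c] for frame in frames) for c in range(cols)]
        for r in range(rows)
    ]


def gain_correct(image, gain_ref):
    """Multiply each pixel by the matching gain-reference value."""
    return [
        [pixel * gain for pixel, gain in zip(img_row, gain_row)]
        for img_row, gain_row in zip(image, gain_ref)
    ]


# Three tiny 2x2 "frames" and a gain reference, for illustration only.
frames = [
    [[1, 2], [3, 4]],
    [[1, 0], [2, 2]],
    [[0, 1], [1, 1]],
]
gain_ref = [[1.0, 0.5], [2.0, 1.0]]

summed = sum_frames(frames)
corrected = gain_correct(summed, gain_ref)
```

No frame is shifted before summing, which matches the requirement in this thread that the denoiser input be summed and gain corrected but not frame aligned.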

File topaz.html added

Could http://emgweb.nysbc.org/topaz.html be updated to the attached html file?

Thanks!
