Feature #397: prepFrealign should take specified parameter file in addition to database values
Status: Closed · 30% done
Description
Currently prepFrealign gives you two choices for generating the jobs to run on a cluster: 1) you can import Eulers from a previous EMAN run, or 2) you can use Frealign to generate the Eulers. It would be very useful, however, to be able to specify a Frealign parameter file to use for the initial Eulers, and have prepFrealign prepare all the job files for you. Also, this would allow you to change other parameters from iteration to iteration like we do with EMAN.
Updated by Gabriel Lander over 14 years ago
I strongly agree with this idea - how should the input parameter file be defined?
Updated by Scott Stagg over 14 years ago
I think it should just be a command line parameter that is simply a path to the parameter file.
Updated by Gabriel Lander over 14 years ago
Sorry, I meant: how would the parameters within the file be defined (particle numbers, comma/tab delimited, etc.)?
Updated by Scott Stagg over 14 years ago
I guess it would follow the frealign format and prepFrealign wouldn't parse it at all. In other words, it would just take the results of a previous Frealign run, or Eulers from a previously generated EMAN conversion, and generate the job files from that. For example, it would look like this:
C      PSI   THETA     PHI    SHX    SHY     MAG  FILM      DF1      DF2  ANGAST  PRESA  DPRES
1   353.86   92.00  115.65  -5.10  10.73  10000.     1  22888.3  22888.3    0.00  64.61  64.61
2     2.52  144.86  114.59  -1.68   5.96  10000.     1  22888.3  22888.3    0.00  67.08  67.08
3   354.14   99.45  117.08  -2.65  -0.67  10000.     1  22888.3  22888.3    0.00  67.14  67.14
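[Editor's note: not from the thread, just a minimal Python sketch of what such a pass-through split might look like, where the parameter file is chunked into per-job pieces without interpreting its columns. The file names, chunk count, and helper name are made up; prepFrealign itself may do this differently.]

# Hypothetical sketch: split a Frealign-format .par file into per-job chunks
# without parsing the columns. Data lines are taken to be any non-blank line
# that does not start with the "C" comment marker.
def split_par_file(par_path, njobs):
    """Return the data lines of a Frealign .par file split into njobs chunks."""
    with open(par_path) as f:
        lines = [l for l in f if l.strip() and not l.lstrip().startswith("C")]
    chunk = max(1, (len(lines) + njobs - 1) // njobs)
    return [lines[i:i + chunk] for i in range(0, len(lines), chunk)]

if __name__ == "__main__":
    # illustrative file names only
    for jobnum, rows in enumerate(split_par_file("previous_run.par", njobs=4)):
        with open("particles.%03d.par" % jobnum, "w") as out:
            out.writelines(rows)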
Updated by Gabriel Lander over 14 years ago
I see what you mean. I thought you meant getting the initial Eulers from some reconstruction package outside of appion & frealign.
Updated by Scott Stagg over 14 years ago
That's also a good idea. Separate out the conversion part from prepFrealign, and have another stand alone program that just generates the frealign parameter files. For that matter, another program could convert the initial model to what is read by frealign. Then all prepFrealign would need to do is generate the job files.
Updated by Neil Voss over 14 years ago
The program, as is, is separated into many subfunctions, so it should be easy to split the load as you suggest.
I was thinking of changing the way the program input parameters (not the particle input parameters) are given to Frealign. I want to put them into a single file for each iteration so that they can be easily manipulated, instead of the current system where we have a file for each processor.
What I hate is that Niko changes the format for the parameter files with each version of Frealign. Frealign v8.06 parameters are incompatible with Frealign v8.08 and vice versa. This is why tab or comma separated files are better.
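[Editor's note: purely as an illustration of the single-file-per-iteration idea, not existing Appion code. The file name and keys below are invented; a simple tab-separated file could hold the program inputs for one iteration and be read by the cluster-side script before launching that iteration's jobs.]

# Illustration only: read key<TAB>value pairs from a per-iteration settings file.
def read_iteration_params(path):
    """Return a dict of program input parameters for one iteration."""
    params = {}
    with open(path) as f:
        for line in f:
            line = line.strip()
            if not line or line.startswith("#"):
                continue
            key, value = line.split("\t", 1)
            params[key] = value
    return params

# e.g. params = read_iteration_params("iter003.params")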
Updated by Scott Stagg over 14 years ago
Neil, I completely agree with you about having the program input parameters in a single file for each iteration. That would make a huge difference if you're trying to tweak parameters without having to run prepFrealign every time.
Updated by Scott Stagg over 14 years ago
- runfrealign.py will live on the cluster.
- runfrealign.py when called will create a series of frealign jobs that split up the alignment and classification step so that it is spread over multiple processors (much like what is done now).
- It will then create and launch single processor PBS jobs that run the frealign scripts.
- It will monitor all those jobs and wait for them to finish.
- When classification jobs are finished, it will launch a single processor job to make the reconstruction. This is the slow step for frealign. I believe that new versions of frealign will allow this step to be parallelized, but that can be easily incorporated into the proposed script.
- It would require only a stack, an initial model, and the particle parameters. Thus it would be independent of the rest of appion. prepFrealign's job would only be to create those files.
- I would really want it to be a completely self-contained script, so all the functions it uses should be defined within the runfrealign.py script itself (a rough sketch of this flow follows).
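[Editor's note: this is not Scott's actual script, only a minimal sketch of the flow outlined above, assuming a PBS cluster where qsub and qstat are available. Script names, the polling interval, and the job list are illustrative.]

import subprocess, time

def launch_pbs_job(script_path):
    """Submit a PBS job script and return its job id."""
    out = subprocess.check_output(["qsub", script_path])
    return out.strip()

def wait_for_jobs(jobids, poll=60):
    """Block until none of the submitted job ids appear in the queue."""
    while True:
        queue = subprocess.check_output(["qstat"])
        if not any(jid in queue for jid in jobids):
            return
        time.sleep(poll)

# 1) write one single-processor refinement script per chunk of particles,
# 2) submit them all, 3) wait for them, 4) submit one job for the reconstruction.
refine_ids = [launch_pbs_job(s) for s in ["refine.000.sh", "refine.001.sh"]]
wait_for_jobs(refine_ids)
launch_pbs_job("reconstruct.sh")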
What do you think? Do y'all know of anyone else who is running frealign who should be included in this discussion?
Updated by Scott Stagg over 14 years ago
I have a working version of the script that I outlined above. It works on my cluster, and I think I have set it up so that it will work on other clusters. Can I get you two to give it a try? All you need is a stack, a volume, and a parameter file. The script will do the rest. First run:
runFrealign.py -h
to see the options. Then run the program with the options filled out and --setuponly to see the frealign scripts it will create. If you are feeling adventurous, you can actually try running it on the cluster. To do that, you will need to set --launcher, --ppn, --wallclock, and --maxprocs. It will launch "maxprocs" individual jobs for the refinement then 1 "ppn" job for the reconstruction. Please let me know if you have any questions.
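[Editor's note: for concreteness, an invocation might look like the line below. Only the option names come from the description above; the values, the launcher choice, and whether flags take "=" or a space are guesses for illustration.]

runFrealign.py --launcher=qsub --ppn=8 --wallclock=24 --maxprocs=64 --setuponly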
Updated by Neil Voss over 14 years ago
It sounds redundant to prepFrealign.py, but it is probably a better solution; mine is written for clusters without Python. Does it work with uploadFrealign.py? uploadFrealign requires a particular file organization.
Updated by Scott Stagg over 14 years ago
It is redundant to prepFrealign, but the goal is to take the script-making functions out of prepFrealign and only have it generate the initial files. If you, or whoever else uses Frealign, like runFrealign, I will modify prepFrealign and uploadFrealign so that they are compatible with runFrealign.
Updated by Scott Stagg over 14 years ago
- Category set to Python scripting
- Status changed from New to In Code Review
- Assignee set to Scott Stagg
- % Done changed from 0 to 30
Updated by Amber Herold almost 14 years ago
- Assignee changed from Scott Stagg to Lauren Fisher
- Target version set to Appion/Leginon 2.2.0
Lauren will test out Scott's version of runFrealign.py.
Updated by Anchi Cheng over 13 years ago
Scott,
This is one of the few issues I was not on the watch list for before :) I just asked Lauren about the status on this, and there has been no update. I would like to take a look at your version of this. Is it in the repository? If not, could you post it here?
Updated by Anchi Cheng over 13 years ago
- Assignee changed from Lauren Fisher to Scott Stagg
Scott,
We are doing an overall recon refactoring now, and the direction is quite the opposite of what you propose here, so I want to get your input.
Considering all the different ways appion does 3D refinement, we've decided to have something like prepFrealign for all 3D refinement, run on a local cluster or single computer where it has full access to the data appion has generated. The point is to separate the choice of parallelization and cluster-specific scripting from the generation of the refinement script, and to allow some preprocessing of the stack, such as binning, to be handled uniformly.
See #1336 for more description.
What I propose is that we use part of your runFrealign.py as an advanced feature for generating a modified job from what is already prepared, if modifying these run parameters is what you'd like to achieve. The refinement script in the new prepRefineFrealign.py assumes only 1 processor, so the modification or replacement should be easy. The prepParallelFrealign.py I wrote and committed will do the splitting (i.e., copying the refine script in each iteration by particles) in the main job file before the actual call to the refinement script. Dmitry and others have been using pickle to pack the parameters in the preparation step. If we do the same in your modification script, it should be possible to pass the relevant info back and save it in the database without doing much work on the upload part.
Would this work?
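[Editor's note: not part of the thread, just a rough sketch of the pickle-based hand-off Anchi describes. The file name and keys are invented; the real preparation code may store different fields.]

# Illustrative sketch: pack per-iteration run parameters with pickle on the
# preparation side, then read them back when uploading results to the database.
import pickle

iterparams = {"iteration": 3, "maxprocs": 64, "symmetry": "C1"}  # made-up keys
with open("iterparams.pickle", "wb") as f:
    pickle.dump(iterparams, f)

# later, on the upload side:
with open("iterparams.pickle", "rb") as f:
    restored = pickle.load(f)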
Updated by Neil Voss over 13 years ago
I think Scott's biggest complaint (with which I agree) is that my prepFrealign script creates something like 1000 different shell scripts, each with its own parameters hard-coded in. So, if one wants to change a single parameter after running prepFrealign, they would have to modify 1000 files. Sadly, I did this so it would work on clusters without Python installed, which was overkill.
Scott implemented a more elegant solution using a Python script on the cluster node that spawns the 1000 different jobs, rather than creating them on the local side.
So (correct me if I am wrong, Scott), the point is just to make it easy to adjust parameters after the preparation is complete.
Updated by Scott Stagg over 13 years ago
Yes, Neil hit the nail on the head. There are some other advantages to how I wrote runFrealign, such as that it is faster and makes better use of cluster resources. For the Euler refinement part of an iteration, it runs many (as many as you request) single processor jobs. The reconstruction part of Frealign is unable to do parallelization on more than the cores on one node. Thus it launches a one node many core job. runFrealign expands and contracts with the needs of the refinement. There are many more aspects that I would like to discuss, but it's too hard to talk over redmine. Can we have a phone conference about this?
Updated by Anchi Cheng over 13 years ago
Part of the refactoring is a standardized way to set different parameters per iteration in the preparation step, so that is not a concern. The prepParallelFrealign.py follows the same concept of propagating the refinement to many processors on the cluster side. I do see that runFrealign.py creates multiple threads during the alignment; that does make it easier to get the resources to run them. I think that part can be put in place of the prepFrealign approach. It should work out with what we have in mind here, too.
From what I see, runFrealign.py does the equivalent of one prepFrealign.py job iteration each time. Is there a further iteration that uses the output of the previous one?
Scott, if you have time to call, I should be at my desk the whole day after 11 am on Friday.
Updated by Amber Herold over 12 years ago
Anchi, what is the status on this? Can I move it to 3.0? I'm assuming there is no new code in the trunk for this at the moment.
Updated by Anchi Cheng almost 7 years ago
- Status changed from In Code Review to Won't Fix or Won't Do
This has not happened in all this time.