Bug #1081
Frealign job launcher not working (closed)
Description
I tried to launch a Frealign job. I thought I did all the right things and used a previous EMAN job as a starter, but the dmf commands are broken from the start, as shown here (see http://cronus3.scripps.edu/myamiweb/processing/runFrealign.php?expId=8152):
dmf mkdir -p /home/bcarr/appion/10oct21z/recon/frealign_recon26/
dmf put / /home/bcarr/appion/10oct21z/recon/frealign_recon26/
dmf put /ami/data00/appion/10oct21z/models/accepted/file_10nov04o16/upload-emd_1571bin4-10nov04o17.mrc /home/bcarr/appion/10oct21z/recon/frealign_recon26/upload-emd_1571bin4-10nov04o17.mrc
echo done
Not sure what to do next. Has anyone successfully run Frealign?
Updated by Neil Voss almost 14 years ago
I stopped using DMF, so it may not have been tested properly. I thought I told Pick-wei to try it out using DMF and it worked for him.
Updated by Neil Voss almost 14 years ago
Most likely the garibaldi.php/guppy.php files are not set up correctly on whatever server you are running on.
The files in my home directory do not use DMF, but I thought I fixed the ones on the cronus3 main install.
Updated by Bridget Carragher almost 14 years ago
Well, I do not think anything to do with dmf is working right. Does anyone have a pointer as to where to start on fixing this? Or should we just abandon dmf completely? How are people who use Appion avoiding dmf? Are the refines being submitted by hand? I would like to get this issue solved as a high priority.
Updated by Amber Herold almost 14 years ago
- Priority changed from Normal to High
- Target version set to Appion/Leginon 2.2.0
- Deliverable set to 2.2 Bug Reduction
Updated by Neil Voss almost 14 years ago
The way to run without DMF is to set up your account so you can log into amibox03 (or something similar) from garibaldi without using your password. If that is set up, we can put commands in the garibaldi script to download the necessary files without needing your password.
Unfortunately, Bridget, I tried to set up your account to do this, but I could not get it working. Christopher may have better luck.
Updated by Lauren Fisher almost 14 years ago
DMF is working fine for me, but the Frealign job launcher is not. I think the problem is that the php page is not finding the stack. On the form, the "Stack (img or hed):" default input is blank, and even when I enter the stack filename with the path, the job file doesn't contain the stack information. Notice that stackname1 and stackpath are blank in the job file excerpt below.
## frealign_recon29
#PBS -l nodes=4:ppn=4
#PBS -l walltime=240:00:00
#PBS -l cput=240:00:00
#PBS -l mem=4gb
#PBS -m e
#PBS -r n
#PBS -j oe
~bcarr/appionbin/updateAppionDB.py 165 R 286
# jobId: 165
# projectId: 286
# refineStackId: 25
# reconStackId:
# modelId: 9
#DEBUGGING INFO
# clusterfull ~lfisher/appion/10oct21z/recon/frealign_recon29
# jobname frealign_recon29
# jobfile frealign_recon29.job
# outfullpath /ami/data00/appion/10oct21z/recon/frealign_recon29/
# stackname1
# stackpath
# modelname upload-emd_1571bin4-10nov04o17.mrc
# modelpath /ami/data00/appion/10oct21z/models/accepted/file_10nov04o16
# send file /ami/data00/appion/10oct21z/stacks/stack22/start.hed
# send file /ami/data00/appion/10oct21z/stacks/stack22/start.img
# send file /ami/data00/appion/10oct21z/recon/frealign_recon29/frealign_recon29.tar
# get file results.tgz
# get file models.tgz
In addition, the "put files in DMF" commands don't have any stack information. If the stacks are not copied to DMF, the job cannot run. Notice that the second command puts nothing into DMF; I believe this is where the stack path should be.
dmf mkdir -p /home/lfisher/appion/10oct21z/recon/frealign_recon29/
dmf put / /home/lfisher/appion/10oct21z/recon/frealign_recon29/
dmf put /ami/data00/appion/10oct21z/models/accepted/file_10nov04o16/upload-emd_1571bin4-10nov04o17.mrc /home/lfisher/appion/10oct21z/recon/frealign_recon29/upload-emd_1571bin4-10nov04o17.mrc
echo done
If you try to just copy and paste these commands it gives you this error:
storing / to dmf:/home/lfisher/appion/10oct21z/recon/frealign_recon29/
rcp: /: not a plain file
Command failed to execute.
Wed Dec 15 11:37:02 PST 2010
When I manually copy the stack using:
dmf put /ami/data00/appion/10oct21z/stacks/stack22/start.* /home/lfisher/appion/10oct21z/recon/frealign_recon29/
I get a new error in the job output file telling me:
scp: /home/lfisher/appion/10oct21z/recon/frealign_recon29/frealign_recon29.tar: No such file or directory
Therefore I manually copied this file to DMF as well using:
dmf put /ami/data00/appion/10oct21z/recon/frealign_recon29/frealign_recon29.tar /home/lfisher/appion/10oct21z/recon/frealign_recon29/
Now it is telling me:
mpirun: Command not found.
And that's where I get stuck because while logged into garibaldi I can run "which mpirun" and it is found.
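The bare "dmf put / ..." line is what you would get if an empty stack path and an empty stack name were joined with "/", which fits the blank stackname1 and stackpath fields in the job file. The following is a minimal sketch of a guard against that, not the code used by the web page: the helper name dmf_put_cmd is hypothetical, and it only prints the command rather than running dmf.

```shell
#!/bin/sh
# Hypothetical helper (not part of Appion): print a "dmf put" command,
# refusing to build one from an empty source directory or file name.
# Joining "" + "/" + "" gives "/", the bad path seen in the generated commands.
dmf_put_cmd() {
    src_dir=$1
    src_name=$2
    dest_dir=$3
    if [ -z "$src_dir" ] || [ -z "$src_name" ]; then
        echo "error: empty source path or file name" >&2
        return 1
    fi
    # Drop one trailing slash, if present, so the joined paths stay clean.
    src_dir=${src_dir%/}
    dest_dir=${dest_dir%/}
    printf 'dmf put %s/%s %s/%s\n' "$src_dir" "$src_name" "$dest_dir" "$src_name"
}
```

Called with the stack22 directory and start.hed it prints the expected put command; called with empty strings it prints an error and returns non-zero instead of emitting "dmf put /".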
Updated by Neil Voss almost 14 years ago
The missing stack info could come from a bad garibaldi.php file.
The part
mpirun: Command not found.
is related to your garibaldi setup. Are you sourcing the appion.csh (or whatever it is that sets up your environment) in Bridget's home directory?
Updated by Christopher Irving almost 14 years ago
I changed Lauren's .cshrc to add /lustre/people/applications/openmpi/current/gnu/bin to her path and /lustre/people/applications/openmpi/current/gnu/lib to her LD_LIBRARY_PATH. Her jobs should find mpirun now.
-Christopher
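Since "which mpirun" worked in a login session but the job reported "Command not found", the search path seen by the job may differ from the interactive one. As a rough sketch for checking any given path string (for example one printed from inside the job), the hypothetical helper below walks a colon-separated list and reports the first executable match. The name find_in_path is an assumption, not an existing tool.

```shell
#!/bin/sh
# Hypothetical helper: print the first executable called NAME found in a
# colon-separated PATH-style string; return non-zero if none is found.
find_in_path() {
    name=$1
    path_list=$2
    old_ifs=$IFS
    IFS=:
    for dir in $path_list; do
        # Skip empty entries rather than treating them as a directory.
        [ -n "$dir" ] || continue
        if [ -x "$dir/$name" ] && [ ! -d "$dir/$name" ]; then
            IFS=$old_ifs
            printf '%s\n' "$dir/$name"
            return 0
        fi
    done
    IFS=$old_ifs
    return 1
}
```

For example, find_in_path mpirun "$PATH" inside a job script would show whether the openmpi bin directory is visible there.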
Updated by Lauren Fisher almost 14 years ago
- Workaround updated (diff)
Adding "source ~bcarr/appion.csh" to my .cshrc also fixes the mpirun issue, but I think Christopher's fix gives me the latest version.
Frealign is now running on garibaldi for session 10oct21z (frealign_recon29.job) with the workaround. I'll try to find out why the stack info isn't being found. I think something isn't being set properly in runFrealign.php (maybe in summarytables.inc?).
Neil, what is garibaldi.php and where would that be located? I can't seem to find it.
Updated by Neil Voss almost 14 years ago
garibaldi.php is a file created on the webserver, so it would be on cronus3.scripps.edu in the main web folder where you access myamiweb. A single file is used for all users. Anchi is probably your best bet to get it working.
Updated by Amber Herold almost 14 years ago
- Assignee set to Dmitry Lyumkis
Dmitry and Christopher are working on DMF issues.
Updated by Lauren Fisher almost 14 years ago
- Status changed from New to Assigned
- Assignee changed from Dmitry Lyumkis to Eric Hou
- Priority changed from High to Immediate
Updated by Eric Hou almost 14 years ago
- Status changed from Assigned to In Code Review
- Assignee changed from Eric Hou to Neil Voss
In the cluster configuration file (garibaldi.php), the function ‘post_data’ needs two variables (stackval and model). The runFrealign tool is missing the ‘stackval’ variable, so all the stack information is lost in the function call.
The other problem is that the form also needs a hidden input called ‘stackval’ so that the cluster configuration file can process the stack information.
Neil:
Can you go over this? I saw that for some reason you changed the variable name from ‘stackval’ to ‘refinestackvals’.
To test:
Please run through the Frealign tool from beginning to end.
Thanks.
Eric
Updated by Neil Voss almost 14 years ago
Make sure you consult the file default_cluster.php when making changes, because who knows where garibaldi.php came from -- it is not in subversion and likely needs upgrading. Sorry, I am not really in a position to test this -- it is too much of a pain to do it from way offsite.
Updated by Neil Voss almost 14 years ago
Oh, to answer your stackval question: Frealign has not one but two input stacks, so it is configured slightly differently.
Updated by Eric Hou almost 14 years ago
- Status changed from In Code Review to In Test
- Assignee changed from Neil Voss to Lauren Fisher
After reviewing default_cluster, I found that the post_data function requires two POST variables, 'stackval' and 'model', which were missing in the runFrealign tool.
The bug should be fixed now and is ready for testing.
Thanks.
Eric
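One quick way to check for this kind of problem is to save the rendered form page and look for the field names that post_data expects. The sketch below assumes the fields appear as name="..." attributes in the HTML. The helper name missing_fields and the saved-page file name are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: given a saved copy of a form page and a list of
# required field names, print each name that has no name="..." attribute.
missing_fields() {
    page=$1
    shift
    for field in "$@"; do
        # Accept either double or single quotes around the attribute value.
        if ! grep -q "name=[\"']$field[\"']" "$page"; then
            printf '%s\n' "$field"
        fi
    done
}
```

For example, missing_fields runFrealign.html stackval model prints nothing when both fields are present, and prints stackval if only the model field is in the page.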
Updated by Lauren Fisher almost 14 years ago
I have two jobs running right now, one on Guppy and one on Garibaldi. Both look good so far! I believe everything got transferred to the proper directories successfully. As soon as I verify that both jobs run to completion I will close out this bug. Thanks Eric!
Updated by Neil Voss almost 14 years ago
Good work guys, sorry I was not of more help.
Updated by Lauren Fisher almost 14 years ago
Frealign ran in my sandbox, but not on betamyami. I'm pretty sure this is still a DMF issue because I am set up to bypass DMF in my sandbox. A couple more files need to be put into DMF in order for it to run. Right now the .hed stack file and the model are the only ones being transferred. The .img stack file and .tar file need to be transferred as well. The final "Put files in DMF" commands should be:
dmf mkdir -p /home/lfisher/appion/10oct21z/recon/frealign_recon26/
dmf put /ami/data00/appion/10oct21z/stacks/centered25/ali.hed /home/lfisher/appion/10oct21z/recon/frealign_recon26/ali.hed
dmf put /ami/data00/appion/10oct21z/stacks/centered25/ali.img /home/lfisher/appion/10oct21z/recon/frealign_recon26/ali.img
dmf put /ami/data00/appion/10oct21z/recon/frealign_recon26/frealign_recon26.tar /home/lfisher/appion/10oct21z/recon/frealign_recon26/frealign_recon26.tar
dmf put /ami/data00/appion/10oct21z/models/accepted/file_10nov04o16/upload-emd_1571bin4-10nov04o17.mrc /home/lfisher/appion/10oct21z/recon/frealign_recon26/upload-emd_1571bin4-10nov04o17.mrc
echo done
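The command list above follows a simple pattern: one mkdir, then one put each for the .hed file, the .img file, the run's .tar file, and the model. As an illustration only, the hypothetical function below prints that list from its parts. The function name and argument order are assumptions, and it does not run dmf itself.

```shell
#!/bin/sh
# Hypothetical helper: print the DMF transfer commands for one Frealign run.
frealign_dmf_cmds() {
    stack_dir=$1    # directory holding the stack, e.g. .../stacks/centered25
    stack_base=$2   # stack name without extension, e.g. ali
    recon_dir=$3    # local recon directory holding <run_name>.tar
    run_name=$4     # e.g. frealign_recon26
    model_path=$5   # full path to the initial model (.mrc)
    dmf_dir=$6      # destination directory in DMF, without a trailing slash

    for arg in "$stack_dir" "$stack_base" "$recon_dir" "$run_name" "$model_path" "$dmf_dir"; do
        if [ -z "$arg" ]; then
            echo "error: missing argument" >&2
            return 1
        fi
    done

    printf 'dmf mkdir -p %s/\n' "$dmf_dir"
    # Each file keeps its own name in the destination directory.
    for src in "$stack_dir/$stack_base.hed" "$stack_dir/$stack_base.img" \
               "$recon_dir/$run_name.tar" "$model_path"; do
        printf 'dmf put %s %s/%s\n' "$src" "$dmf_dir" "${src##*/}"
    done
    echo 'echo done'
}
```

Printing the commands instead of executing them matches how the "Put files in DMF" box is used: the user copies the lines into a terminal.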
Updated by Lauren Fisher almost 14 years ago
- File frealign_recon58.job.o18370 added
- Workaround updated (diff)
All stages of Frealign run properly (prep > launch > upload) through betamyamiweb (the trunk) on the garibaldi cluster. Myamiweb (the branch) is not set up to bypass dmf so it still requires the user to copy the dmf commands to the terminal. The commands to copy the .img and .tar files are missing from the "Put files in DMF" pop-up box. Until this is fixed, it will not run on garibaldi using Myamiweb. When running on guppy it creates links to all the right stack/tar files, but then it crashes during iteration 1 and gives a long output file that means nothing to me (see attachment).
Updated by Lauren Fisher almost 14 years ago
- Status changed from In Test to Assigned
- Assignee changed from Lauren Fisher to Eric Hou
Updated by Eric Hou almost 14 years ago
- Status changed from Assigned to Closed
Myamiweb won't get the newest change until we have a new branch.
Thanks.
Eric