Bug #1081

closed

Frealign job launcher not working

Added by Bridget Carragher almost 14 years ago. Updated almost 14 years ago.

Status:
Closed
Priority:
Immediate
Assignee:
Eric Hou
Category:
-
Target version:
Start date:
12/13/2010
Due date:
% Done:

0%

Estimated time:
Affected Version:
Appion/Leginon 2.1.0
Show in known bugs:
No
Workaround:

Description

I tried to launch a Frealign job. I thought I did all the right things and used a previous EMAN job as a starter, but the dmf commands are broken from the start, as shown here (see http://cronus3.scripps.edu/myamiweb/processing/runFrealign.php?expId=8152):

dmf mkdir -p /home/bcarr/appion/10oct21z/recon/frealign_recon26/

dmf put / /home/bcarr/appion/10oct21z/recon/frealign_recon26/

dmf put /ami/data00/appion/10oct21z/models/accepted/file_10nov04o16/upload-emd_1571bin4-10nov04o17.mrc /home/bcarr/appion/10oct21z/recon/frealign_recon26/upload-emd_1571bin4-10nov04o17.mrc

echo done

Not sure what to do next. Has anyone successfully run Frealign?


Files

frealign_recon58.job.o18370 (5.35 KB), Lauren Fisher, 01/19/2011 12:06 PM

Related issues 1 (0 open, 1 closed)

Related to Appion - Bug #1090: Extra dmf command in get files from dmf (after EMAN run on garibaldi) that is not needed. Status: Won't Fix or Won't Do. Assignee: Amber Herold. Date: 12/15/2010

Actions #1

Updated by Neil Voss almost 14 years ago

I stopped using DMF, so it may not have been tested properly. I thought I told Pick-wei to try it out using DMF and it worked for him.

Actions #2

Updated by Neil Voss almost 14 years ago

You know, it's most likely that the garibaldi.php/guppy.php files are not set up correctly on whatever server you are running on.

The files in my home directory do not use DMF, but I thought I fixed the ones on cronus3 main install.

Actions #3

Updated by Bridget Carragher almost 14 years ago

Well, I do not think anything to do with dmf is working right. Does anyone have a pointer as to where to start on fixing this? Or should we just abandon dmf completely? How are people using Appion avoiding using dmf? Are the refines being submitted by hand? I would like to get this issue solved as a high priority.

Actions #4

Updated by Amber Herold almost 14 years ago

  • Priority changed from Normal to High
  • Target version set to Appion/Leginon 2.2.0
  • Deliverable set to 2.2 Bug Reduction
Actions #5

Updated by Neil Voss almost 14 years ago

The way to run without DMF is to set up your account so you can log into amibox03 (or something similar) from garibaldi without using your password. If that is set up, we can put commands in the garibaldi script to download the necessary files without needing your password.

Unfortunately, Bridget, I tried to set up your account to do this, but I could not get it working. Christopher may have better luck, but I could not figure it out.
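
A minimal sketch of how the passwordless login described in this comment could be checked from garibaldi. It assumes OpenSSH is installed; the function names are hypothetical, and amibox03 is simply the host named above. With BatchMode=yes, ssh fails instead of prompting for a password, so a zero exit status suggests key-based login is working.

```python
import subprocess


def passwordless_ssh_command(host, timeout_s=5):
    """Build an ssh command that fails instead of prompting for a password."""
    return [
        "ssh",
        "-o", "BatchMode=yes",                    # never ask for a password
        "-o", "ConnectTimeout=%d" % timeout_s,    # give up quickly if unreachable
        host,
        "true",                                   # trivial remote command
    ]


def can_login_without_password(host):
    """Return True if ssh to `host` succeeds without any interactive prompt."""
    try:
        result = subprocess.run(
            passwordless_ssh_command(host),
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            timeout=30,
        )
    except (OSError, subprocess.TimeoutExpired):
        # ssh is not installed, or the connection hung.
        return False
    return result.returncode == 0
```

Calling can_login_without_password("amibox03") from a garibaldi login would then indicate whether the job script could fetch files without a password.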

Actions #6

Updated by Lauren Fisher almost 14 years ago

DMF is working fine for me, but the frealign job launcher is not. I think the problem is that the php page is not finding the stack. On the form, the "Stack (img or hed):" default input is blank, and even when I enter the stack filename with the path, the job file doesn't contain the stack information. Notice stackname1 and stackpath are blank in the jobfile excerpt below.

## frealign_recon29
#PBS -l nodes=4:ppn=4
#PBS -l walltime=240:00:00
#PBS -l cput=240:00:00
#PBS -l mem=4gb
#PBS -m e
#PBS -r n
#PBS -j oe

~bcarr/appionbin/updateAppionDB.py 165 R 286

# jobId: 165
# projectId: 286
# refineStackId: 25
# reconStackId: 
# modelId: 9

#DEBUGGING INFO
# clusterfull ~lfisher/appion/10oct21z/recon/frealign_recon29
# jobname     frealign_recon29
# jobfile     frealign_recon29.job
# outfullpath /ami/data00/appion/10oct21z/recon/frealign_recon29/
# stackname1  
# stackpath   
# modelname   upload-emd_1571bin4-10nov04o17.mrc
# modelpath   /ami/data00/appion/10oct21z/models/accepted/file_10nov04o16
#   send file /ami/data00/appion/10oct21z/stacks/stack22/start.hed
#   send file /ami/data00/appion/10oct21z/stacks/stack22/start.img
#   send file /ami/data00/appion/10oct21z/recon/frealign_recon29/frealign_recon29.tar
#   get file  results.tgz
#   get file  models.tgz
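
As a rough illustration of the symptom in the excerpt above, here is a small sketch that scans a job file's comment header and reports which fields are blank. The field names come from the "#DEBUGGING INFO" block; the function name and the choice of required fields are assumptions made for this example, not part of Appion.

```python
import re

# Field names taken from the "#DEBUGGING INFO" block of the job file.
REQUIRED_FIELDS = ("stackname1", "stackpath", "modelname", "modelpath")


def blank_debug_fields(job_text, required=REQUIRED_FIELDS):
    """Return the required header fields that are missing or have no value."""
    found = {}
    for line in job_text.splitlines():
        # Matches "# name   value" and also "# name" followed only by spaces.
        match = re.match(r"^#\s+(\w+)[ \t]*(.*)$", line)
        if match:
            found[match.group(1)] = match.group(2).strip()
    return [name for name in required if not found.get(name)]
```

Run against the excerpt, this would flag stackname1 and stackpath, the same two fields that are blank by eye.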

In addition, the "put files in DMF" commands don't have any stack information. If the stacks are not copied to DMF then it cannot run. Notice how the second command is putting nothing into DMF; I believe this is where the stack path should be.

dmf mkdir -p /home/lfisher/appion/10oct21z/recon/frealign_recon29/

dmf put / /home/lfisher/appion/10oct21z/recon/frealign_recon29/

dmf put /ami/data00/appion/10oct21z/models/accepted/file_10nov04o16/upload-emd_1571bin4-10nov04o17.mrc /home/lfisher/appion/10oct21z/recon/frealign_recon29/upload-emd_1571bin4-10nov04o17.mrc

echo done

If you try to just copy and paste these commands it gives you this error:

storing / to dmf:/home/lfisher/appion/10oct21z/recon/frealign_recon29/
rcp: /: not a plain file

Command failed to execute.
Wed Dec 15 11:37:02 PST 2010

When I manually copy the stack using:

dmf put /ami/data00/appion/10oct21z/stacks/stack22/start.* /home/lfisher/appion/10oct21z/recon/frealign_recon29/

I get a new error in the job output file telling me:

scp: /home/lfisher/appion/10oct21z/recon/frealign_recon29/frealign_recon29.tar: No such file or directory

Therefore I manually copied this file to DMF as well using:

dmf put /ami/data00/appion/10oct21z/recon/frealign_recon29/frealign_recon29.tar /home/lfisher/appion/10oct21z/recon/frealign_recon29/

Now it is telling me:

mpirun: Command not found. 

And that's where I get stuck because while logged into garibaldi I can run "which mpirun" and it is found.

Actions #7

Updated by Neil Voss almost 14 years ago

The missing stack info could come from a bad garibaldi.php file.

The part

mpirun: Command not found.

is related to your garibaldi setup. Are you sourcing the appion.csh (or whatever it is that sets up your environment) in Bridget's home directory?

Actions #8

Updated by Christopher Irving almost 14 years ago

I changed Lauren's .cshrc to add /lustre/people/applications/openmpi/current/gnu/bin to her path and /lustre/people/applications/openmpi/current/gnu/lib to her LD_LIBRARY_PATH. Mpirun should be found now when she runs.

-Christopher
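
One plausible reading of the mpirun error is that the batch job does not see the same PATH as an interactive login. The sketch below shows a way to check whether a command resolves with and without an extra directory; the function name is hypothetical, and the directory in the usage note is the one quoted in this comment.

```python
import os
import shutil


def find_command(name, extra_dirs=()):
    """Look up `name` on PATH, optionally searching extra directories first."""
    search_path = os.pathsep.join(list(extra_dirs) + [os.environ.get("PATH", "")])
    # shutil.which returns the full path of the executable, or None if not found.
    return shutil.which(name, path=search_path)
```

For example, find_command("mpirun") returning None while find_command("mpirun", ["/lustre/people/applications/openmpi/current/gnu/bin"]) returns a path would point at a missing PATH entry in the job's environment.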

Actions #9

Updated by Lauren Fisher almost 14 years ago

  • Workaround updated (diff)

Adding "source ~bcarr/appion.csh" to my .cshrc also fixes the mpirun issue, but I think Christopher's fix gives me the latest version.

Frealign is now running on garibaldi for session 10oct21z (frealign_recon29.job) with the workaround. I'll try to find out why the stack info isn't being found. I think something isn't being set properly in runFrealign.php (maybe in summarytables.inc?).

Neil, what is garibaldi.php and where would that be located? I can't seem to find it.

Actions #10

Updated by Neil Voss almost 14 years ago

garibaldi.php is a file created on the webserver, so it would be on cronus3.scripps.edu in the main web folder where you access myamiweb. A single file is used for all users. Anchi is probably your best bet to get it working.

Actions #11

Updated by Amber Herold almost 14 years ago

  • Assignee set to Dmitry Lyumkis

Dmitry and Christopher are working on DMF issues.

Actions #12

Updated by Lauren Fisher almost 14 years ago

  • Status changed from New to Assigned
  • Assignee changed from Dmitry Lyumkis to Eric Hou
  • Priority changed from High to Immediate
Actions #14

Updated by Eric Hou almost 14 years ago

  • Status changed from Assigned to In Code Review
  • Assignee changed from Eric Hou to Neil Voss

In the cluster configuration file (garibaldi.php), the function ‘post_data’ needs two variables (stackval and model). The runFrealign tool is missing the ‘stackval’ variable, so all the stack information was lost in the function call.
The other problem is that it also needs a hidden input called ‘stackval’ so that the cluster configuration file can process the stack information.

Neil:
Can you go over this? I saw that for some reason you changed the variable name from ‘stackval’ to ‘refinestackvals’.

To test:
Please run through the Frealign tool from beginning to end.

Thanks.

Eric
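
The requirement described in this comment can be sketched as a simple check that the submitted form carries every variable post_data expects. This is an illustration in Python rather than the actual PHP; the function name and the sample form values are hypothetical, and only the variable names 'stackval', 'model' and 'refinestackvals' come from the thread.

```python
# Variable names that post_data is said to require.
REQUIRED_POST_FIELDS = ("stackval", "model")


def missing_post_fields(form, required=REQUIRED_POST_FIELDS):
    """Return the required form fields that are absent or empty."""
    return [name for name in required if not str(form.get(name, "")).strip()]
```

A form that only sends 'refinestackvals' and 'model' would be reported as missing 'stackval', which matches the blank stack fields seen in the job file.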

Actions #15

Updated by Neil Voss almost 14 years ago

Make sure you consult the file default_cluster.php when making changes, because who knows where garibaldi.php came from -- it is not in subversion and likely needs upgrading. Sorry, I am not really in a position to test this -- it is too much of a pain to do it from way offsite.

Actions #16

Updated by Neil Voss almost 14 years ago

Oh, to answer your stackval question: Frealign has not one but two input stacks, so it is configured slightly differently.

Actions #17

Updated by Eric Hou almost 14 years ago

  • Status changed from In Code Review to In Test
  • Assignee changed from Neil Voss to Lauren Fisher

After reviewing default_cluster, I found that the post_data function requires two POST variables, 'stackval' and 'model', which were missing in the runFrealign tool.

The bug should be fixed now and is ready for testing.
Thanks.

Eric

Actions #18

Updated by Lauren Fisher almost 14 years ago

I have two jobs running right now, one on Guppy and one on Garibaldi. Both look good so far! I believe everything got transferred to the proper directories successfully. As soon as I verify both jobs run to completion I will close out this bug. Thanks Eric!

Actions #19

Updated by Neil Voss almost 14 years ago

Good work guys, sorry I was not of more help.

Actions #20

Updated by Lauren Fisher almost 14 years ago

Frealign ran in my sandbox, but not on betamyami. I'm pretty sure this is still a DMF issue because I am set up to bypass DMF in my sandbox. A couple more files need to be put into DMF in order for it to run. Right now the .hed stack file and the model are the only ones being transferred. The .img stack file and .tar file need to be transferred as well. The final "Put files in DMF" commands should be:

dmf mkdir -p /home/lfisher/appion/10oct21z/recon/frealign_recon26/

dmf put /ami/data00/appion/10oct21z/stacks/centered25/ali.hed /home/lfisher/appion/10oct21z/recon/frealign_recon26/ali.hed

dmf put /ami/data00/appion/10oct21z/stacks/centered25/ali.img /home/lfisher/appion/10oct21z/recon/frealign_recon26/ali.img

dmf put /ami/data00/appion/10oct21z/recon/frealign_recon26/frealign_recon26.tar /home/lfisher/appion/10oct21z/recon/frealign_recon26/frealign_recon26.tar

dmf put /ami/data00/appion/10oct21z/models/accepted/file_10nov04o16/upload-emd_1571bin4-10nov04o17.mrc /home/lfisher/appion/10oct21z/recon/frealign_recon26/upload-emd_1571bin4-10nov04o17.mrc

echo done
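
The command list above can be sketched as a small generator that emits one 'dmf put' line per file and refuses an empty or directory-like source, which is what produced the bare "/" in the broken output. This is an illustration only, not the code in runFrealign.php; the function name is hypothetical, and it only builds strings in the same form as the commands quoted in this thread.

```python
import posixpath


def dmf_put_commands(local_files, dmf_dir):
    """Build 'dmf mkdir' and 'dmf put' lines for a list of local files."""
    dmf_dir = dmf_dir.rstrip("/") + "/"
    commands = ["dmf mkdir -p " + dmf_dir]
    for path in local_files:
        name = posixpath.basename(path)
        if not name:
            # An empty stack path joined with "/" gives a source of just "/".
            raise ValueError("source is not a plain file: %r" % path)
        commands.append("dmf put %s %s%s" % (path, dmf_dir, name))
    commands.append("echo done")
    return commands
```

Passing the .hed, .img, .tar and model paths listed above would reproduce the four 'dmf put' lines, while a blank stack path would raise an error instead of silently emitting "dmf put / ...".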

Actions #21

Updated by Lauren Fisher almost 14 years ago

All stages of Frealign run properly (prep > launch > upload) through betamyamiweb (the trunk) on the garibaldi cluster. Myamiweb (the branch) is not set up to bypass dmf so it still requires the user to copy the dmf commands to the terminal. The commands to copy the .img and .tar files are missing from the "Put files in DMF" pop-up box. Until this is fixed, it will not run on garibaldi using Myamiweb. When running on guppy it creates links to all the right stack/tar files, but then it crashes during iteration 1 and gives a long output file that means nothing to me (see attachment).

Actions #22

Updated by Lauren Fisher almost 14 years ago

  • Status changed from In Test to Assigned
  • Assignee changed from Lauren Fisher to Eric Hou
Actions #23

Updated by Eric Hou almost 14 years ago

  • Status changed from Assigned to Closed

Myamiweb won't get the newest change until we have a new branch.

Thanks.

Eric
