Revision 17 (Neil Voss, 05/27/2010 11:20 AM) → Revision 18/43 (Neil Voss, 05/27/2010 11:29 AM)
h1. Setup job submission server
In this case, we are setting up a job submission server that has all of the data directories mounted and the external packages (EMAN, Xmipp, etc.) installed on the system. Most institutions already have a job submission server, but the data is not accessible from it. Appion is not set up for that scenario, except for large reconstruction jobs.
______
h2. PBS and the Torque Resource Manager
PBS stands for "Portable Batch System":http://en.wikipedia.org/wiki/Portable_Batch_System. It is a job submission system: users submit many jobs, and the server prioritizes and executes each job as resources permit. Below we show how to install a popular open source PBS implementation called "TORQUE":http://en.wikipedia.org/wiki/TORQUE_Resource_Manager.
A TORQUE cluster consists of one head node and many compute nodes. The head node runs the *pbs_server daemon* and the compute nodes run the *pbs_mom daemon*. Client commands for submitting and managing jobs can be installed on any host (including hosts not running pbs_server or pbs_mom). More documentation about Torque is "available here.":http://www.clusterresources.com/products/torque/docs/
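
To make the submit-and-execute cycle concrete, here is a minimal sketch of a job script. The job name, resource requests, and the helper function job_report are illustrative assumptions, not part of the Appion setup. The script also runs by hand outside of Torque, in which case the PBS variables are simply unset.

```shell
#!/bin/bash
#PBS -N hello_torque          # job name (illustrative)
#PBS -l nodes=1:ppn=1         # request one processor on one node
#PBS -l walltime=00:05:00     # stop the job after five minutes
#PBS -j oe                    # merge stderr into stdout

# Torque sets PBS_O_WORKDIR to the directory qsub was called from;
# fall back to the current directory when the script is run by hand.
cd "${PBS_O_WORKDIR:-$PWD}"

# job_report is a hypothetical helper that prints where the job is running.
job_report() {
    echo "job ${PBS_JOBID:-not-under-pbs} running on $(uname -n)"
}

job_report
```

Once the cluster is configured, a script like this would typically be submitted with qsub and monitored with qstat from any host that has the client commands installed.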
______
h2. Head node installation
h3. Install Torque-server
Torque is available for Fedora and CentOS 5.4 (through the EPEL repository). On YUM-based systems, type:
<pre>
sudo yum -y install torque-server torque-scheduler
</pre>
h3. Initialize Torque-server

Because of the PATH settings, you will need to become root:
<pre>
sudo su
/usr/share/doc/torque-2.3.10/torque.setup root
exit
</pre>
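
The path above is tied to Torque version 2.3.10, so it will differ on systems with another version installed. The following sketch looks the script up instead of hard-coding the version. The function name find_torque_setup is hypothetical, and the default directory /usr/share/doc is assumed from the path shown above.

```shell
# find_torque_setup (hypothetical helper): print the path of the first
# torque.setup script found in a torque-* directory under the given root.
find_torque_setup() {
    local doc_root="${1:-/usr/share/doc}"
    local candidate
    for candidate in "$doc_root"/torque-*/torque.setup; do
        # An unmatched glob stays a literal string, which fails this test.
        if [ -f "$candidate" ]; then
            echo "$candidate"
            return 0
        fi
    done
    echo "torque.setup not found under $doc_root" >&2
    return 1
}

# Report the location; check it before running the script as root.
if setup_script=$(find_torque_setup); then
    echo "found: $setup_script"
fi
```
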
h3. Activate Torque-server
Enable the Torque pbs_server daemon on reboot:
<pre>
sudo /sbin/chkconfig pbs_server on
sudo /sbin/service pbs_server restart
</pre>
h3. Add nodes to Torque-server nodes file: /var/torque/server_priv/nodes
The format is:
<pre>
node-name[:ts] [np=] [properties]
</pre>
To add the localhost with two processors as a node, you would add:
<pre>
localhost np=2
</pre>
You should add every *compute node* to this file, e.g.,
<pre>
node01.INSTITUTE.EDU np=2
node02.INSTITUTE.EDU np=4
node03.INSTITUTE.EDU np=2
</pre>
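
With many compute nodes, it can help to script these entries. The sketch below appends a node only if it is not already listed. The helper add_node is hypothetical, it handles only the plain "name np=N" form (no :ts suffix or properties), and it is demonstrated on a scratch file rather than the real nodes file.

```shell
# add_node (hypothetical helper): append "name np=N" to a nodes file
# unless a line for that node name already exists.
add_node() {
    local nodes_file="$1" name="$2" np="$3"
    # Compare the node name against the first field of each line.
    if awk -v n="$name" '$1 == n { found = 1 } END { exit !found }' "$nodes_file" 2>/dev/null; then
        echo "$name already listed in $nodes_file"
    else
        echo "$name np=$np" >> "$nodes_file"
    fi
}

# Try it on a scratch file first; the real file is
# /var/torque/server_priv/nodes and needs root permissions to edit.
scratch_nodes=$(mktemp)
add_node "$scratch_nodes" node01.INSTITUTE.EDU 2
add_node "$scratch_nodes" node02.INSTITUTE.EDU 4
add_node "$scratch_nodes" node01.INSTITUTE.EDU 2   # duplicate, file is left unchanged
cat "$scratch_nodes"
```

After editing the real nodes file, pbs_server generally has to be restarted before it picks up the change, and pbsnodes -a can then be used to list the nodes the server knows about.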
______
h2. Compute node installation
h3. Install Torque-mom
Torque is available for Fedora and CentOS 5.4 (through the EPEL repository). On YUM-based systems, type:
<pre>
sudo yum -y install torque-mom torque-client
</pre>
h3. Activate Torque-mom
Enable the torque pbs_mom daemon on reboot:
<pre>
sudo /sbin/chkconfig pbs_mom on
sudo /sbin/service pbs_mom start
</pre>
h3. Configure node to receive jobs from headnode:
bq. see http://www.clusterresources.com/products/torque/docs/1.2basicconfig.shtml#initializenode for more details
Edit the /var/torque/mom_priv/config file:
<pre>
$pbsserver headnode # hostname running pbs_server
</pre>
If pbs_server runs on the same machine (localhost), add:
<pre>
$pbsserver localhost # hostname running pbs_server
</pre>
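
When the same setting has to be applied on many compute nodes, the edit can be scripted. This is a minimal sketch: the helper set_pbsserver is hypothetical, and the $logevent line only stands in for any unrelated setting that should be preserved. It is demonstrated on a scratch file rather than /var/torque/mom_priv/config.

```shell
# set_pbsserver (hypothetical helper): replace any existing $pbsserver
# line in a pbs_mom config file and keep all other lines.
set_pbsserver() {
    local config_file="$1" head_node="$2"
    local tmp_file
    tmp_file=$(mktemp)
    if [ -f "$config_file" ]; then
        # Copy every line except old $pbsserver entries; grep exits
        # non-zero when nothing is left, which is not an error here.
        grep -v '^\$pbsserver[[:space:]]' "$config_file" > "$tmp_file" || true
    fi
    echo "\$pbsserver $head_node" >> "$tmp_file"
    cat "$tmp_file" > "$config_file"
    rm -f "$tmp_file"
}

# Demonstration on a scratch file with one unrelated setting.
scratch_config=$(mktemp)
echo '$logevent 255' > "$scratch_config"
set_pbsserver "$scratch_config" headnode
set_pbsserver "$scratch_config" localhost   # replaces the previous entry
cat "$scratch_config"
```

After changing the real config file, restart pbs_mom (for example with sudo /sbin/service pbs_mom restart) so that the new head node setting takes effect.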
_________
[[Setup Remote Processing|^ Setup Remote Processing]] | [[Configure web server to submit jobs|Configure web server to submit jobs >]]
______