SPIDER's PubSub System for Distributed Processing

This page illustrates techniques for running SPIDER using the PubSub system on a cluster.

Introduction

With PubSub, SPIDER procedures can be run in parallel across a distributed cluster of computers or on multiple processors within a single machine. Users place their SPIDER jobs in a shared queue, and each subscriber machine can take jobs from that queue. Each subscriber machine can specify when it will take jobs and how many jobs it can take at a time. If the machines vary greatly in processing power, it is best to partition the SPIDER jobs so that each takes a reasonable length of time (e.g. 20 to 60 minutes), which keeps the subscription process efficient. Several master SPIDER procedures are provided that handle the preparation of the parallel SPIDER procedures.
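As a rough illustration only, a master procedure might loop over the groups and hand one job per group to the shared queue. This is a hedged sketch, not one of the distributed master procedures: the procedure name pub_refine, register x77, and the group range 1 to 9 are assumptions patterned on the publish example in step 3 below.

    ; Sketch of a master procedure that partitions the work into
    ; per-group jobs and publishes each one (names are assumptions).
    DO LB1 x20=1,9         ; x20 = group number (range assumed)
       VM                  ; place one job in the shared PubSub queue
       publish "./spider pam/acn @pub_refine {***x20} x77={***x20}"
    LB1
    EN

Each published job is independent, so subscribers can pick them up in any order as slots become available.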


Running SPIDER jobs using PubSub

  1. Install PubSub and start the subscriber daemon as described elsewhere.
  2. cd YOUR_SPIDER_WORKING_DIR
    e.g. cd $HOME/spider/data
  3. Submit your SPIDER job to the publisher using publish, e.g.:
    publish "./spider pam/acn @pub_refine 17 x77=17"
  4. Further instructions for using PubSub are available in the SPIDER documentation.
  5. To write SPIDER batches which run under PubSub, see the examples above and the information in: parallel.html; a minimal sketch of a subordinate procedure follows.
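For orientation when writing such batches, here is a hedged sketch of what a subordinate procedure might look like. The name pub_refine and the placeholder operations are assumptions, not the contents of any distributed procedure; the group number arrives in register x77, set on the publish command line in step 3 above.

    ; pub_refine: sketch of a subordinate procedure run under PubSub.
    ; Register x77 carries the group number (x77=17 in step 3 above).
    VM                     ; report which group this subscriber took
    echo "Starting group {***x77}"

    ; ... per-group SPIDER operations go here ...

    EN                     ; EN ends the job and frees the subscriber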

Source: spider/pubsub/pubsub.html     Page updated: 6 Mar. 2009     ArDean Leith