LAM/MPI General User's Mailing List Archives

From: Neil Storer (Neil.Storer_at_[hidden])
Date: 2004-03-24 13:05:14


Hi,

You could try testing for a return code of 215 (the exit code if the LAM
daemons are not running).

e.g.
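      EXIT_CODE=0
      if [ "$npart" -le 1 ]; then
        # npart is at most 1: run the program directly.
        program.exe
      else
        # Probe the LAM daemons; "||" captures a failing exit status.
        tping >/dev/null 2>&1 || EXIT_CODE=$?
        if [ "$EXIT_CODE" -eq 215 ] ; then
           # 215: daemons are not running, so boot LAM first.
           lamboot -v
        fi
        mpiexec -machinefile ... -n 1 master.exe <config
      fi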

"tping" is part of the LAM/MPI distribution. My system is down at present, so I
can't check this part.

The "|| EXIT_CODE=$?" construct should allow the job to carry on even if the
user has "set -e" in effect to abort on errors.
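
The effect of the "||" construct under "set -e" can be checked without LAM
installed. The sketch below is only an illustration: "fake_tping" is a
hypothetical stand-in that returns 215, the exit code quoted above (which was
not verified in the original message), and RESULT is a name made up for this
example.

```shell
#!/bin/sh
set -e   # abort on any unguarded non-zero exit status

# Hypothetical stand-in for "tping" when the daemons are not running.
fake_tping() {
    return 215
}

EXIT_CODE=0
# A command on the left of "||" does not trigger "set -e";
# its failing status is captured in EXIT_CODE instead.
fake_tping >/dev/null 2>&1 || EXIT_CODE=$?

if [ "$EXIT_CODE" -eq 215 ]; then
    RESULT="would run lamboot"
else
    RESULT="daemons already running"
fi
echo "exit code: $EXIT_CODE, $RESULT"
```

Without the "|| EXIT_CODE=$?" part, the failing call would end the script at
that line and the test for 215 would never be reached.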

Regards
        Neil

-- 
+-----------------+---------------------------------+------------------+
| Neil Storer     |    Head: Systems S/W Section    | Operations Dept. |
+-----------------+---------------------------------+------------------+
| ECMWF,          | email: neil.storer_at_[hidden]    |    //=\\  //=\\  |
| Shinfield Park, | Tel:   (+44 118) 9499353        |   //   \\//   \\ |
| Reading,        |        (+44 118) 9499000 x 2353 | ECMWF            |
| Berkshire,      | Fax:   (+44 118) 9869450        | ECMWF            |
| RG2 9AX,        |                                 |   \\   //\\   // |
| UK              | URL:   http://www.ecmwf.int/    |    \\=//  \\=//  |
+--+--------------+---------------------------------+----------------+-+
    | ECMWF is the European Centre for Medium-Range Weather Forecasts |
    +-----------------------------------------------------------------+