
LAM/MPI General User's Mailing List Archives


From: Anthony J. Ciani (aciani1_at_[hidden])
Date: 2005-02-08 15:32:08


Hi Everyone,

On Tue, 8 Feb 2005, Michael Arndt wrote:
> Hello *
>
> When i use lam 7.0 my applications are not able to connect the lamd:
>
> cae1 FAU micha 71 (~/PAM/1): /usr/local/lam-7.0/bin/lamnodes
> -----------------------------------------------------------------------------
> It seems that there is no lamd running on the host cae1.
>
> This indicates that the LAM/MPI runtime environment is not operating.
>
> but:
>
> cae1 FAU micha 72 (~/PAM/1): ps auxww | grep -i lamd
> micha 10080 0.0 0.0 10844 1208 ? SN 17:40 0:00 /usr/local/lam-7.0/bin/lamd -H 192.168.100.1 -P 33488 -n 0 -o 0 -d -sessionsuffix lsf-284-0
>
> The Platform is a Quad Opteron running Redhat ES 3.0
> Lam Versions 7.0 and 7.06
> connection types i have tried: usysv and tcp
>
> Q: Are there any known problems related to Quad Opterons and / or RH ES 3.0 ?
> Q: Any proposal what i can do for debugging ?

There are no known problems running LAM on Opteron. The "lamd" process in
your "ps" output appears to have been started under the LSF batch system
(job 284, judging by the "lsf-284-0" session suffix). If you ran
"lamnodes" and "ps" outside LSF, then "lamnodes" would not see that LAM,
but you should not see a lamd left running from a dead job either. If you
ran "lamnodes" from inside the LSF job, then it should have shown the LAM.
My best guess is that either you are not using the queues properly, or LSF
has been improperly configured for LAM (or both).

Try running "lamnodes" (as a parallel job) and "ps" inside your LSF
submitted job, along with "env" and/or "set" (depending on your shell).
If "lamnodes" says there is no LAM but "ps" shows a "lamd", then check
the environment for the LSF variables (e.g. LSB_JOBID, LSB_JOBINDEX).
If the LSF variables don't exist inside the LSF job, then something is
wrong with LSF; otherwise, something is wrong with LAM. Also run
"ls /tmp" to check that the session suffix on LAM's temporary directory
matches the LSF job ID.
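
The checks above could be collected into a few shell functions, roughly
as sketched below. This is only an illustration, not something shipped
with LAM or LSF: the function names (expected_suffix, lamd_suffixes,
diagnose) are made up for this example, and the suffix format
"lsf-<jobid>-<jobindex>" is an assumption read off the
"-sessionsuffix lsf-284-0" argument in your "ps" output.

```shell
#!/bin/sh
# Diagnostic sketch, meant to be run from inside an LSF-submitted job.
# All function names here are hypothetical helpers for illustration.

# Print the session suffix we expect LAM to use for this LSF job.
# ASSUMPTION: the format "lsf-<jobid>-<jobindex>" is inferred from the
# "-sessionsuffix lsf-284-0" seen in the ps output, not from LAM docs.
expected_suffix() {
    if [ -z "${LSB_JOBID:-}" ]; then
        return 1            # not inside an LSF job (or LSF is misconfigured)
    fi
    printf 'lsf-%s-%s\n' "$LSB_JOBID" "${LSB_JOBINDEX:-0}"
}

# Read "ps" output on stdin and print the -sessionsuffix value of
# every lamd command line found in it.
lamd_suffixes() {
    sed -n 's/.*lamd .*-sessionsuffix \([^ ]*\).*/\1/p'
}

# Run the checks: LSF variables, lamnodes, /tmp, and running lamds.
diagnose() {
    if ! want=$(expected_suffix); then
        echo "LSB_JOBID is not set: not inside an LSF job, or LSF is misconfigured"
        return 1
    fi
    echo "expected session suffix: $want"

    # What LAM itself reports for this session.
    lamnodes || echo "lamnodes did not find a running LAM"

    # LAM's temporary session directories should carry the same suffix.
    ls /tmp | grep -i lam || echo "no lam entries found in /tmp"

    # Compare the suffix of any running lamd with the expected one.
    if ps auxww | lamd_suffixes | grep -qxF -- "$want"; then
        echo "a lamd with the matching session suffix is running"
    else
        echo "no running lamd matches suffix $want"
        return 2
    fi
}
```

To use it, put these functions in the script you submit to LSF and call
"diagnose" as the last line. A return status of 1 points at LSF (the
variables are missing), while 2 means the variables are there but no
lamd belongs to this job.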

------------------------------------------------------------
                Anthony Ciani (aciani1_at_[hidden])
             Computational Condensed Matter Physics
     Department of Physics, University of Illinois, Chicago
                http://ciani.phy.uic.edu/~tony
------------------------------------------------------------