Hello,
We have a group of users who use "lamexec" along with an application
schema file to execute MPMD jobs. Since we upgraded to "lam-7.1.2.8",
"lamexec" fails with the messages shown below. Interestingly, the user's guide
for "lam 7.1.2" has the following text:
"The lamexec command is similar to mpirun but is used for non-MPI programs"
So, the question is this: What version(s) of LAM support "lamexec"? Our
earlier version of LAM, "lam-6.5.8-4", worked just fine using "lamexec".
Thanks,
Pat O'Bryant
Code that generated Error Messages
**********************************
.......
lamboot -v /tmp/lam_boot.$PBS_JOBID
lamexec -w -v schema1
Error Messages
**************
n-1<24782> ssi:boot:base:linear: booting n0 (xxxxxxxxxx)
n-1<24782> ssi:boot:base:linear: booting n1 (yyyyyyyyy)
n-1<24782> ssi:boot:base:linear: finished
-----------------------------------------------------------------------------
It seems that [at least] one of the processes that was started with
mpirun did not invoke MPI_INIT before quitting (it is possible that
more than one process did not invoke MPI_INIT -- mpirun was only
notified of the first one, which was on node n0).
mpirun can *only* be used with MPI programs (i.e., programs that
invoke MPI_INIT and MPI_FINALIZE). You can use the "lamexec" program
to run non-MPI programs over the lambooted nodes.
-----------------------------------------------------------------------------
J.W. (Pat) O'Bryant,Jr.
Business Line Infrastructure
Technical Systems, HPC
Office: 713-431-7022