
LAM/MPI General User's Mailing List Archives


From: Sam Steingold (sds_at_[hidden])
Date: 2007-08-21 09:38:27


-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

I am trying to run lamboot twice on the same computer: the first time as
user foo, the second time as user bar, each with a different bhost file
(the slave nodes have 4 CPUs each, so I want to share them between users
foo and bar). Then I run one program using mpirun as user foo and another
program using mpirun as user bar.
My logs clearly indicate that the client processes are started and do
receive and complete the initial task, but it appears that the server
process never receives the results and thus never sends out any further
tasks, i.e., everything just sits there and nothing happens.
ps reports these lamd processes on the master node:
foo /usr/bin/lamd -H 10.10.0.1 -P 51685 -n 0 -o 0
bar /usr/bin/lamd -H 10.10.0.1 -P 36057 -n 0 -o 0
and on a slave node:
foo /usr/bin/lamd -H 10.10.0.1 -P 51685 -n 2 -o 0
bar /usr/bin/lamd -H 10.10.0.1 -P 36057 -n 2 -o 0

So, what am I doing wrong?
What does "-o" mean? (I guess "-n" is the "rank", but the lamd manual does
not mention any command line options.)
Is it at all possible to do what I want?

Thanks.

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.6 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iD8DBQFGyurTPp1Qsf2qnMcRAlM6AJ4s81VA76MUkdbYZYHc0FGN/R6c/QCeNgqH
zufKfWU3Ogkd1YM9dW//H7g=
=HrR4
-----END PGP SIGNATURE-----