Jeff Squyres (jsquyres) wrote:
> MPI_COMM_SELF
> actually requires the lam_basic coll module; the shmem and smp modules
> are not capable of being used on COMM_SELF.
>
> So you really need to:
>
> mpirun -ssi coll lam_basic,shmem ...
> and mpirun -ssi coll lam_basic,smp ...
Jeff,

Thanks for the clarification. smp works now, but I'm still having
trouble with shmem. On a single quad-processor node, here's what I get
from the verbose output (from one of the four processes -- the other
three are similar):
n0<28337> ssi:coll:open: Opening
n0<28337> ssi:coll:open:crossover: 4 processes
n0<28337> ssi:coll:open:associative: 0
n0<28337> ssi:coll:open: opening coll module lam_basic
n0<28337> ssi:coll:open: opened coll module lam_basic
n0<28337> ssi:coll:open: opening coll module shmem
n0<28337> ssi:coll:open: opened coll module shmem
n0<28337> ssi:coll:open: skipping non-selected module smp
n0<28337> ssi:coll:query: querying coll module shmem
n0<28337> ssi:coll:query: coll module shmem available
n0<28337> ssi:coll:query: querying coll module lam_basic
n0<28337> ssi:coll:query: coll module lam_basic available
n0<28337> ssi:coll:init_comm: new communicator: MPI_COMM_SELF (cid 1)
n0<28337> ssi:coll:init_comm: module not available: shmem, priority: -1
n0<28337> ssi:coll:init_comm: module available: lam_basic, priority: 100
n0<28337> ssi:coll:lam_basic: init communicator MPI_COMM_SELF
n0<28337> ssi:coll:init_comm: Selected coll module lam_basic
n0<28337> ssi:coll:init_comm: new communicator: MPI_COMM_WORLD (cid 0)
n0<28337> ssi:coll:init_comm: module not available: shmem, priority: -1
n0<28337> ssi:coll:init_comm: module available: lam_basic, priority: 0
n0<28337> ssi:coll:lam_basic: init communicator MPI_COMM_WORLD
n0<28337> ssi:coll:init_comm: Selected coll module lam_basic
n0<28337> ssi:coll:init_comm: new communicator: <no name> (cid 2)
n0<28337> ssi:coll:init_comm: module not available: shmem, priority: -1
n0<28337> ssi:coll:init_comm: module available: lam_basic, priority: 0
n0<28337> ssi:coll:lam_basic: init communicator
n0<28337> ssi:coll:init_comm: Selected coll module lam_basic
n0<28337> ssi:coll:finalize_comm: communicator: MPI_COMM_SELF (cid 1)
n0<28337> ssi:coll:lam_basic: finalize communicator MPI_COMM_SELF
n0<28337> ssi:coll:finalize_comm: communicator: MPI_COMM_WORLD (cid 0)
n0<28337> ssi:coll:lam_basic: finalize communicator MPI_COMM_WORLD
n0<28337> ssi:coll:finalize_comm: communicator: <no name> (cid 2)
n0<28337> ssi:coll:lam_basic: finalize communicator
n0<28337> ssi:coll:close: Closing
Is there some option that will show me why shmem is not being selected?
I tried increasing the coll_verbose level from 1000 to 10000, but the
output was the same.
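
To compare the init_comm results across communicators (or across the
four processes), the verbose output can be condensed into a short
per-communicator summary. The following is only a minimal sketch in
Python, assuming the line shapes shown in the output above and input
from a single process; `summarize` and `SAMPLE` are hypothetical names
used for illustration and are not part of LAM/MPI.

```python
import re

# Patterns assumed from the verbose output quoted in this message.
NEW_COMM = re.compile(r"ssi:coll:init_comm: new communicator: (.+) \(cid (\d+)\)")
MODULE = re.compile(
    r"ssi:coll:init_comm: module (not available|available): (\w+), priority: (-?\d+)"
)
SELECTED = re.compile(r"ssi:coll:init_comm: Selected coll module (\w+)")

# A few lines copied from the output above, used as sample input.
SAMPLE = """\
n0<28337> ssi:coll:init_comm: new communicator: MPI_COMM_SELF (cid 1)
n0<28337> ssi:coll:init_comm: module not available: shmem, priority: -1
n0<28337> ssi:coll:init_comm: module available: lam_basic, priority: 100
n0<28337> ssi:coll:init_comm: Selected coll module lam_basic
n0<28337> ssi:coll:init_comm: new communicator: MPI_COMM_WORLD (cid 0)
n0<28337> ssi:coll:init_comm: module not available: shmem, priority: -1
n0<28337> ssi:coll:init_comm: module available: lam_basic, priority: 0
n0<28337> ssi:coll:init_comm: Selected coll module lam_basic
"""


def summarize(lines):
    """Map each communicator id to its name, module results, and selection."""
    result = {}
    current = None  # cid of the communicator currently being initialized
    for line in lines:
        m = NEW_COMM.search(line)
        if m:
            current = int(m.group(2))
            result[current] = {"name": m.group(1), "modules": {}, "selected": None}
            continue
        if current is None:
            continue  # ignore lines before the first "new communicator"
        m = MODULE.search(line)
        if m:
            available = m.group(1) == "available"
            result[current]["modules"][m.group(2)] = (available, int(m.group(3)))
            continue
        m = SELECTED.search(line)
        if m:
            result[current]["selected"] = m.group(1)
    return result


if __name__ == "__main__":
    for cid, info in sorted(summarize(SAMPLE.splitlines()).items()):
        print(cid, info["name"], info["modules"], "->", info["selected"])
```

This only restates what the log already reports (shmem at priority -1
on every communicator); it does not explain why shmem declines.
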
A few more details: the box has 4 GB of physical memory and 8 GB of
virtual memory. The maximum shared memory segment size (a tunable
Solaris parameter) is set to 256 MB, with 20 segments available.
I'm using "usysv" as the RPI; should I be using something different?
-Tom
--
Tom Crockett
College of William and Mary email: tom_at_[hidden]
Computational Science Cluster phone: (757) 221-2762
Savage House fax: (757) 221-2023
P.O. Box 8795
Williamsburg, VA 23187-8795