LAM/MPI General User's Mailing List Archives

From: Lei_at_[hidden]
Date: 2005-07-16 19:56:07


I found this in the LAM/MPI Archives:
"As a point of clarification, the behavior of MPI_Comm_spawn (and
MPI_Comm_spawn_multiple) is such that it does not grow the MPI Universe
size, which is set when you lamboot. If you request N more processes by
using MPI_Comm_spawn then those processes will be distributed over the
resources in the universe essentially oversubscribing them, but will
not try to expand the universe by requesting more resources from PBS."

This is exactly what I was worried about. So does this mean:
1) from a pure sequential program (say, a.out), there is no way to
grow the MPI universe from a function that is computationally intensive;
2) if this sequential program is Matlab scripts + C code (via MEX),
the only way is to have Matlab create the MPI universe through the
use of an MPI toolbox MPITB from Univ. of Granada, Spain?

I just started reading about MPITB. If anybody has a simple "hello world"
example that shows how to create an MPI_COMM_WORLD in
Matlab and pass it to MEX C code, could you please post it or
point me to the webpage? It sounds like I will need to have a
Matlab process/Matlab license on each and every node of my cluster.
Is there a way around this? This is really what I want to avoid.

Thanks a lot,

-Lei

Lei_at_ICS wrote:

>Hi Jeff:
>
>I am trying to construct a very simple prototype based on your
>suggestion #1.
>And I have another question now. In a normal MPI run, the LAM daemon
>network is started before mpirun, and mpirun specifies how many and
>which PEs to use, e.g., mpirun -np 3 a.out or mpirun n0-2 a.out.
>
>In the following quoted design, how does the master spawn a bunch of
>slaves on the PEs that I specify? In
>MPI_COMM_SPAWN(command, argv, maxprocs, info, root, comm, intercomm,
>array_of_errcodes)
>there is a way to specify the max number of procs, but if the master is not
>started by mpirun, wouldn't all maxprocs processes be spawned to the
>local PE?
>Is there a way to start a LAM daemon network among a list of IPs
>using MPI_Init(int *argc, char ***argv) from a sequential program like
>a.out?
>
>Hope my question makes sense. BTW, your design seems to be exactly
>what I wanted.
>
>Thanks,
>
>-Lei
>
>
>-------------------- quoted msg ------------------------
>
>Actually, my ordering wasn't exactly right. Try this:
>
>
>
>> - your matlab script launches
>> - it calls MPI_Init
>> - check for a published name
>> - if the published name does not exist
>>     - spawn a master (i.e., a new, independent process)
>>     - the master spawns a bunch of slaves to do the work
>>     - the master publishes a name
>> - if the published name does exist
>>     - MPI_Comm_connect to the master
>> - the matlab script sends a bunch of work to the master
>> - the master farms it out to all the slaves
>> - the slaves do all the work and eventually send the result(s) to the
>>   master
>> - the master sends the result(s) to the matlab script
>> - the matlab script disconnects from the master
>> - the matlab script finishes
>>
>>
>
>
>