On Thu, 15 May 2003, Pak, Anne O wrote:
> [snipped]
> so far, i've only managed to get the following working:
> 1. matlab script checks for a published name
> 2. it spawns a master (using MPI_Comm_spawn)
> 3. In the spawned code, the master publishes a name (lobelia)
>
> questions:
> 1. i am using mpi_comm_spawn to spawn this independent master process,
> but mpi_comm_spawn doesn't seem to like it when i specify a maxproc < 2,
> which is why i have
>
> MPI_Comm_spawn("/home/Galadriel/matlab/anne/uplink/update_master3",
> MPI_ARGV_NULL,
> 2, MPI_INFO_NULL, 0, MPI_COMM_SELF,
> &client,MPI_ERRCODES_IGNORE);
>
> in my code, where i spawn off 2 processes. 'lobelia' is the one i treat
> as master out of the two spawned. is there another way of spawning only
> one process?
There shouldn't be a problem with spawning fewer than 2 processes -- what
exactly happens when you spawn 1?
> 2. because my update_master3 code exits upon completion, when i run
> MEX_master.c a second time, with the intention of just connecting to
> 'lobelia' instead of having to spawn it again, MATLAB script does see
> the published name 'lobelia', but because i exited update_master3.c upon
> completion of the previous invocation, update_master3.c is no longer
> running on 'lobelia'. how can i go about keeping 'lobelia' alive so
> that in subsequent calls to MEX_master.c, it can still connect to
> 'lobelia'. do i just not exit the function update_master3.c? does that
> mean i should put in a loop somewhere? not clear on how to implement
> this!!! please help!
Yes -- your master3 program will need to loop instead of exit. Hence, it
will become an "event driven" program, and the matlab script tells it what
to do. I would suggest that your slaves use the same model -- it would be
a lot more efficient to do this rather than re-spawn everything every time
(spawning is not a fast process).
So the main loop of master3 would be something like (typed off the top
of my head -- pardon typos):

  /* Publish the name */
  MPI_Publish_name(...);
  /* Spawn all the slaves */
  MPI_Comm_spawn(...);

  while (1) {
    /* Accept a connection from the matlab script */
    MPI_Comm_accept(...);

    /* Receive the command from the matlab script */
    MPI_Recv(&command, 1, MPI_INT, 0, COMMAND_TAG, accept_comm,
             MPI_STATUS_IGNORE);

    if (command == DIE_COMMAND) {
      send_slave_die_message();
      break;
    } else if (command == INITIAL_DATA_COMMAND) {
      receive_data_from_master();
      scatter_data_to_slaves();
      start_slaves_working();
      receive_slave_outputs();
      send_outputs_to_matlab();
    } else if (command == UPDATE_DATA_COMMAND) {
      receive_updates_from_master();
      scatter_updates_to_slaves();
      start_slaves_working();
      receive_slave_outputs();
      send_outputs_to_matlab();
    } else {
      printf("Master got unrecognized command! (%d)\n", command);
      /* Abort with a nonzero code so the failure is visible */
      MPI_Abort(MPI_COMM_WORLD, 1);
    }

    /* Disconnect from the matlab script */
    MPI_Comm_disconnect(...);
  }

  MPI_Unpublish_name(...);
  MPI_Finalize();

It'll probably need to be a little more elaborate than that (indeed, I
made some assumptions about how Matlab works, etc.), but you get the
basic idea.
>
> 3. how do i have MEX_master.c tell 'lobelia' to PLEASE DIE NOW?
Connect and send it a message with the command == DIE_COMMAND. This
will cause the logic above to quit the loop, unpublish the name, and
call MPI_Finalize to quit in an orderly fashion.
Your slave code will operate on the same principle, but you won't need
to connect/accept to them all the time. You can see above that the
Master will spawn all the slaves at once. They'll have a similar
while(1) loop to receive commands from the master. Probably 4 main
commands:
- receive scatter of initial A values
- receive updates of A values
- perform an iteration and send the results to the master
- time to die
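
To make the slave side a bit more concrete, here is a minimal sketch of the dispatch for the four commands listed above. None of this is from the original code: the enum values, struct slave_state, slave_handle(), MAX_VALUES, and the placeholder "sum the values" iteration are all hypothetical names made up for illustration. The MPI receive is deliberately replaced by plain function arguments so the logic compiles without MPI; in a real slave the command and payload would presumably arrive from the master over the parent intercommunicator inside the while(1) loop.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical command codes: the message above names the four
   commands but does not assign them values. */
enum slave_command {
    CMD_INITIAL_DATA,  /* receive scatter of initial A values */
    CMD_UPDATE_DATA,   /* receive updates of A values */
    CMD_ITERATE,       /* perform an iteration, keep the result */
    CMD_DIE            /* time to die */
};

#define MAX_VALUES 16  /* assumed upper bound, for illustration only */

/* State that persists between commands for the life of the slave. */
struct slave_state {
    double a[MAX_VALUES];  /* this slave's share of the A values */
    size_t n;              /* number of valid entries in a[] */
    double last_result;    /* result of the most recent iteration */
    int iterations;        /* iterations done since the initial data */
};

/* Handle one command.
   Returns 1 to keep looping, 0 to leave the loop (die),
   and -1 for an unknown command or an oversized payload. */
int slave_handle(struct slave_state *s, enum slave_command cmd,
                 const double *payload, size_t count)
{
    size_t i;

    assert(s != NULL);

    switch (cmd) {
    case CMD_INITIAL_DATA:
        if (count > MAX_VALUES)
            return -1;
        for (i = 0; i < count; i++)
            s->a[i] = payload[i];
        s->n = count;
        s->iterations = 0;
        return 1;

    case CMD_UPDATE_DATA:
        /* Overwrite the first `count` values already held. */
        if (count > s->n)
            return -1;
        for (i = 0; i < count; i++)
            s->a[i] = payload[i];
        return 1;

    case CMD_ITERATE:
        /* Placeholder for the real computation: sum the values.
           A real slave would send last_result to the master here. */
        s->last_result = 0.0;
        for (i = 0; i < s->n; i++)
            s->last_result += s->a[i];
        s->iterations++;
        return 1;

    case CMD_DIE:
        return 0;

    default:
        return -1;
    }
}
```

The slave's loop would then receive a command, call slave_handle(), and keep going while it returns 1; a return of 0 falls through to MPI_Finalize, and -1 would be treated as an error.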
Again, these are general suggestions -- I'm sure that you'll have to
tweak these ideas to fit your specific requirements, but I think it
should be a good enough general framework to let you do what you want.
The whole point here is that since Matlab doesn't play nicely with MPI
(i.e., it appears to unload your C code when the function finishes --
that's an assumption, but that's what it sounds like Matlab is doing),
you can't keep any persistent state in your Matlab C code. Hence, you
have to spawn a standalone master (which, in turn, spawns a set of
persistent slaves) that persists through the entire matlab job.
Hope that helps...
--
{+} Jeff Squyres
{+} jsquyres_at_[hidden]
{+} http://www.lam-mpi.org/