LAM/MPI General User's Mailing List Archives

From: Jeff Squyres (jsquyres_at_[hidden])
Date: 2003-06-24 20:43:59


On Mon, 23 Jun 2003, Pak, Anne O wrote:

> I have a configuration where a single master spawns about 15 slave nodes
> and runs the same piece of code on all 15 slaves. In this piece of code,
> I have something like (pseudo-code):
>
> call subroutine_A(arguments)
> do some stuff
> call subroutine_B(arguments)
>
> about half of the slave nodes execute A and B
> about 2-3 of them execute only A
> the rest of them execute A twice and not B

How are you determining this? Are they doing printf's, or outputting to
a file, or something else? Be aware that because of timing issues, it may
be difficult to establish exactly who did what.

We have found that in the most difficult situations, it helps to have each
MPI process write to a *separate* file (e.g.,
"mpi-slave-<MPI_COMM_WORLD_RANK>.out") and do an fflush()
immediately after every fprintf(). This usually makes the output "good
enough" to read, and you avoid multiple processes clashing over the same
output stream.
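
Something along these lines is what I have in mind (just a sketch; the file
name and the subroutine_A/subroutine_B calls are placeholders taken from
your pseudo-code):

#include <stdio.h>
#include <mpi.h>

/* Per-rank debug logging: each process writes to its own file and flushes
   after every line, so the log is usable even if a process hangs or dies. */
int main(int argc, char **argv)
{
    int rank;
    char filename[64];
    FILE *log;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    snprintf(filename, sizeof(filename), "mpi-slave-%d.out", rank);
    log = fopen(filename, "w");
    if (log == NULL)
        MPI_Abort(MPI_COMM_WORLD, 1);

    fprintf(log, "rank %d: calling subroutine_A\n", rank);
    fflush(log);                 /* flush immediately after every fprintf */
    /* subroutine_A(...); */

    fprintf(log, "rank %d: calling subroutine_B\n", rank);
    fflush(log);
    /* subroutine_B(...); */

    fclose(log);
    MPI_Finalize();
    return 0;
}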

> Is there any reason why this would be, given that all slaves are
> supposed to be running the same C code??

No. Well, no *good* reason. :-(

> I am using MPI_Comm_spawn to spawn these slave processes, and I am using
> a schema file (specified by an MPI_Info variable passed as one of the
> arguments to MPI_Comm_spawn).
>
> The slave routine was an excerpt from another program; it worked in the
> other program and no changes were made to the code when I copied it, so
> it should still work :(

I don't suppose you have access to a parallel debugger, do you? I don't
want this to sound like a plug, but since we [finally] got around to
including support for the Etnus TotalView parallel debugger in LAM 7.0, we
have started using TotalView on a regular basis. It is *extremely* helpful
for debugging in parallel.

> Could it be a memory limitation at some of the slave nodes? The calling
> routine for this C program is a MEX function, and this MEX function is
> being sent about 50 arguments. Perhaps there is a limitation on the
> number of arguments that can be sent to a C function??

That would be a MEX limitation, not a C limitation (I'm guessing). As I
understand your architecture, though, the MEX function is only talking to
a single dispatcher function, and that dispatcher is talking to the
slaves, right? So if your dispatcher is getting the 50 args properly,
this should not have any effect on the slaves.

You might want to double check that your slaves are actually getting the
messages that the dispatcher (or whoever is giving the work to them) is
sending. I.e., if you send abc123, ensure that they are receiving abc123.
If the slaves do work depending on what commands/messages they receive, a
mismatch in communication may result in "wrong" behavior like you are
seeing.
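
For example, each slave could log exactly what it receives from the
dispatcher (a sketch only -- the tag, the buffer size, and the assumption
that the command arrives over the parent intercommunicator are all made up
for illustration):

#include <stdio.h>
#include <mpi.h>

#define CMD_TAG 42        /* made-up tag; use whatever your dispatcher uses */
#define CMD_LEN 128

/* Slave side: receive the command from the dispatcher over the parent
   intercommunicator and log it verbatim, so you can compare it against
   what the dispatcher thinks it sent.  Assumes this process was spawned
   via MPI_Comm_spawn, so the parent intercommunicator is valid. */
void log_received_command(FILE *log, int rank)
{
    char cmd[CMD_LEN];
    MPI_Comm parent;
    MPI_Status status;

    MPI_Comm_get_parent(&parent);
    MPI_Recv(cmd, CMD_LEN, MPI_CHAR, 0, CMD_TAG, parent, &status);

    fprintf(log, "rank %d received: \"%s\"\n", rank, cmd);
    fflush(log);
}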

> Lastly, I can't seem to run a program over a cluster two consecutive
> times. I seem to always need to execute a 'wipe' and 'lamboot' between
> calls. Does it have to do with me calling mpi_disconnect at the
> conclusion of the first run and then MPI_Init at the start of the second
> one?

Can you describe more fully what is happening? Why can't you run the code
a second time? You should be able to call disconnect and not be forced to
call lamhalt (or wipe) and lamboot again.
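
For reference, the master-side shutdown order I have in mind looks roughly
like this (a sketch only; I've dropped your schema-file MPI_Info and all
error handling):

#include <mpi.h>

int main(int argc, char **argv)
{
    MPI_Comm children;
    int errcodes[15];

    MPI_Init(&argc, &argv);

    /* Spawn the slaves (your MPI_Info with the schema file would go where
       MPI_INFO_NULL is in this sketch). */
    MPI_Comm_spawn("slave", MPI_ARGV_NULL, 15, MPI_INFO_NULL, 0,
                   MPI_COMM_SELF, &children, errcodes);

    /* ... hand out work and collect all results, so that no messages are
       still in flight between the master and the slaves ... */

    MPI_Comm_disconnect(&children);   /* master side disconnects when done */
    MPI_Finalize();
    return 0;
}

The slaves would need to do the matching MPI_Comm_disconnect on the
intercommunicator from MPI_Comm_get_parent before their own MPI_Finalize;
if either side skips that, it might explain why a second run won't start
without re-booting.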

Is there any chance that you could send a tarball of your code?
(probably just to me; no need to send a huge tarball to everyone on the
list)

-- 
{+} Jeff Squyres
{+} jsquyres_at_[hidden]
{+} http://www.lam-mpi.org/