LAM/MPI General User's Mailing List Archives

From: Pak, Anne O (anne.o.pak_at_[hidden])
Date: 2003-08-08 15:58:53


hello:

I have a piece of code that works perfectly fine on one cluster but not on another, even though both clusters have been loaded with the same versions of Linux and LAM/MPI and are running exactly the same code.

I've pinpointed the line that causes the failure on the second cluster: it's an MPI_Scatter call on one variable. When I comment this MPI_Scatter call out on both the master and slave sides, the simulation runs to completion. When the call IS there, the simulation stalls, and mpitask shows that all the slave nodes have disappeared; the only thing remaining in mpitask is an attempt to do Comm_connect by the master node.

I've tried using printf's to discern why the simulation stalls when the MPI_Scatter call is in the code. Right before I call MPI_Scatter, I write out the contents of the variable, which is an array that I've allocated with malloc (I have plenty of other variables that I've malloc'ed and scattered successfully both before AND after this fatal MPI_Scatter call).

What I've noticed is that the contents of the variable are printed out twice in a row, even though I only have code to print them once. What does this mean? In the same position as this MPI_Scatter in the code, I've tried scattering other variables and the simulation runs fine, so it seems like there's something specifically wrong with this variable. But I don't know how to go about pinpointing what the problem is. It doesn't seem to be the size of the variable or the length of the variable name. What else could it be? And why would it be killing off the slaves?
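
For anyone chasing a similar symptom, two checks may help narrow it down. The sketch below is only an illustration in plain C, not a diagnosis of the problem above: `scatter_sizes_ok` and `debug_line` are hypothetical helper names, not part of LAM/MPI, and the element counts are assumed to be tracked by the caller at malloc time. The first helper checks that the root's send buffer and each receive buffer are large enough for the per-process count passed to MPI_Scatter; the second tags each debug line with a rank and flushes it immediately, so a line that appears twice can be attributed to a process.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper (not a LAM/MPI function): returns 1 when the buffers
 * are large enough for a scatter of `count` elements to each of `nprocs`
 * processes, and 0 otherwise. Sizes are in elements, not bytes. */
int scatter_sizes_ok(size_t root_elems, size_t recv_elems,
                     size_t count, size_t nprocs)
{
    if (nprocs == 0 || count == 0)
        return 0;
    /* The root must hold `count` elements for every process.
     * Dividing instead of multiplying avoids size_t overflow. */
    if (count > root_elems / nprocs)
        return 0;
    /* Each receiver needs room for its own slice. */
    return recv_elems >= count;
}

/* Hypothetical helper: write one rank-tagged line and flush right away,
 * so output is not left sitting in a stdio buffer. */
void debug_line(FILE *out, int rank, const char *msg)
{
    fprintf(out, "[rank %d] %s\n", rank, msg);
    fflush(out);
}
```

In an MPI program these would be called just before the suspect scatter, with the rank taken from MPI_Comm_rank and the process count from MPI_Comm_size, for example `debug_line(stderr, rank, "before scatter")`. If the size check fails on either cluster, the scatter is reading or writing past the end of a malloc'ed array, which can behave differently from one machine to another.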

any help would be most appreciated as i'm basically spinning my wheels here..

thank you,

anne

___________________________________________________
Anne Pak, L1-50
Building 153 2G8
1111 Lockheed Martin Way
Sunnyvale, CA 94089
(408) 742-4369 (W)
(408) 742-4697 (F)