Sorry for taking so long to reply. This error means that one of the
MPI processes has died -- typically because it segfaulted or hit some
other fatal error.
Do you get core files, or any other kind of indication of which
process died and why?
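
If you don't see any core files, it may just be that your shell's core
size limit is 0 (the default on OS X). Here's a rough sketch, assuming
a Bourne-style shell -- the core file name below is only illustrative,
since the actual pid will differ:

  ulimit -c unlimited        # allow core files to be written
  mpiexec -boot -np 2 \
    /Volumes/Disk/jsw/gfsdist/cfs6264/cfs.25797/cfs6228
  # on OS X, cores usually land in /cores/core.<pid>
  gdb /Volumes/Disk/jsw/gfsdist/cfs6264/cfs.25797/cfs6228 /cores/core.1234
  (gdb) bt                   # backtrace shows where the process died

Since you're using mpiexec -boot, LAM is booted from that same shell,
so the limit should be inherited by the LAM daemon and the MPI
processes it spawns.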
On Mar 25, 2005, at 4:37 PM, Barry J Mcinnes wrote:
> We are trying to get a model operational on a Mac G5 cluster, but
> model code that runs fine on Lintel with MPICH 1.2.x gives the
> following errors when run on a local Mac OS X 10.3.8 box.
> Eventually we want to run it under SGE 6u3 on a Mac cluster.
>
> Fortran code compiled with xlf 8.1, and LAM/MPI compiled under Mac OS X.
>
> + mpiexec -boot -np 2
> /Volumes/Disk/jsw/gfsdist/cfs6264/cfs.25797/cfs6228
>
>
> Deleted startup output....
>
> PROGRAM gsm HAS BEGUN. COMPILED 0.00 ORG: np23
> STARTING DATE-TIME MAR 25,2005 12:50:49.379 84 FRI 2453455
>
>
> &NAM_MRF
> FHMAX=6.00000000000000000, FHOUT=6.00000000000000000,
> FHRES=6.00000000000000000, FHZER=6.00000000000000000,
> FHSEG=0.000000000000000000E+00, FHROT=0.000000000000000000E+00,
> DELTIM=1200.00000000000000, IGEN=82, FHDFI=3.00000000000000000,
> FHSWR=1.00000000000000000, FHLWR=3.00000000000000000,
> FHCYC=0.000000000000000000E+00, RAS=F, LDIAG3D=F
> /
> From compns : iret= 0 nsout= 18 nsswr= 3 nslwr= 9 nszer= 18
> nsres= 18 nsdfi= 9 nscyc= 0 ras= F
> Reduced grid, nb points= 6536 full= 9024
> nfile,fhour,idate= 11 0.0000000000E+00 0 10 9 2003 ntozi= 1 ntcwi=
> 2 ncldi= 1 ntraci= 2 tracers= 3.000000000 vtid= 21.00000000
> 1.000000000 xgf= 0.0000000000E+00
> in fixio nread= 14 HOUR= 0.00 IDATE= 0 10 9 2003
> lonsfc,latsfc,ivssfc= 192 94 200004
> MPI_Recv: process in local group is dead (rank 1, comm 4)
> Rank (1, MPI_COMM_WORLD): Call stack within LAM:
> Rank (1, MPI_COMM_WORLD): - MPI_Recv()
> Rank (1, MPI_COMM_WORLD): - MPI_Scatter()
> Rank (1, MPI_COMM_WORLD): - main()
>
> _______________________________________________
> This list is archived at http://www.lam-mpi.org/MailArchives/lam/
--
{+} Jeff Squyres
{+} jsquyres_at_[hidden]
{+} http://www.lam-mpi.org/