Qing BAO wrote:
> Dear all,
>
> I use 'mpirun c0-3 -np 4 my_program' to run an MPI job, but I find it only works when the node list includes 'c0' (for example: c0-3 or c0,3-5). If I use 'mpirun c3-6 -np 4 my_program' (without c0), I get the following output ('echam4' is my executable program):
>
> forrtl: severe (174): SIGSEGV, segmentation fault occurred
> Image PC Routine Line Source
> echam4 000000000069C2D1 Unknown Unknown Unknown
> echam4 000000000069E05A Unknown Unknown Unknown
> echam4 0000000000706BA1 Unknown Unknown Unknown
> echam4 0000000000706C20 Unknown Unknown Unknown
> echam4 0000000000590694 Unknown Unknown Unknown
> echam4 00000000004B63E3 Unknown Unknown Unknown
> echam4 00000000004412AA Unknown Unknown Unknown
> libc.so.6 0000002A95EC74BB Unknown Unknown Unknown
> echam4 00000000004411EA Unknown Unknown Unknown
>
> Could you help me to figure it out?
>
Hi Qing,
the problem most likely lies in your program or cluster setup rather than
in LAM. With no information beyond the program name (I suppose you are
running the Echam global atmospheric model), my guess is that the MPI process
with rank 0, which runs on the login node when you use c0-3, needs some
resources (files, network?) that are available only on the login node and
not on the others. So make sure that the files and filesystems required by
your program are available (either shared through NFS or on a local disk)
on all the nodes where you run your parallel program.
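
One rough way to test this guess is to run a small check on every node and see whether the inputs are readable there. The script below is only a sketch: the name check_files.py and the paths you pass to it are hypothetical, and I do not know which files your Echam setup actually reads. It simply reports any path the current node cannot read.

```python
import os
import socket
import sys


def missing_paths(paths):
    """Return the paths that the current node cannot read."""
    return [p for p in paths if not os.access(p, os.R_OK)]


def report(paths):
    """Print one line per unreadable path, prefixed with the host name."""
    host = socket.gethostname()
    missing = missing_paths(paths)
    for p in missing:
        print(f"{host}: cannot read {p}")
    if not missing:
        print(f"{host}: all {len(paths)} path(s) readable")
    return missing


if __name__ == "__main__":
    # The paths to check are given on the command line (hypothetical inputs).
    report(sys.argv[1:])
```

You could start it by hand on each compute node (for instance through ssh, or with LAM's lamexec if your installation provides it), passing the input and namelist files your run needs. A node that prints "cannot read" lines is a likely candidate for the crash.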
greetings, Davide
--
__________________________________________________________
Davide Cesari ARPA-Servizio Idro Meteorologico __
tel (39) 051/525926 ||\
fax (39) 051/6497501 |||\
e-mail dcesari_at_[hidden] |||/
www http://www.arpa.emr.it/sim ---
Address: ARPA-SIM, Viale Silvani 6, 40122 Bologna, Italy
__________________________________________________________