yong.li_at_[hidden] wrote:
> lam:
> Hi, I am an engineer from Beijing, China. I ran into trouble last week while setting up my HPC Beowulf cluster. I am trying to run the LS-DYNA demo on an 8-node cluster with a demo license. The problem is shown below. I would appreciate your help, thanks.
> [hpc_at_node1 ~]$ lamboot -v
>
> LAM 6.5.9/MPI 2 C++/ROMIO - Indiana University
>
> Executing hboot on n0 (node1 - 1 CPU)...
> Executing hboot on n1 (node2 - 1 CPU)...
> Executing hboot on n2 (node3 - 1 CPU)...
> Executing hboot on n3 (node4 - 1 CPU)...
> Executing hboot on n4 (node5 - 1 CPU)...
> Executing hboot on n5 (node6 - 1 CPU)...
> Executing hboot on n6 (node7 - 1 CPU)...
> Executing hboot on n7 (node8 - 1 CPU)...
> topology done
> [hpc_at_node1 ~]$ mpirun -np 8 mpp970.exe info
> -----------------------------------------------------------------------------
> It seems that [at least] one of processes that was started with mpirun
> did not invoke MPI_INIT before quitting (it is possible that more than
> one process did not invoke MPI_INIT -- mpirun was only notified of the
> first one, which was on node n0).
>
> mpirun can *only* be used with MPI programs (i.e., programs that
> invoke MPI_INIT and MPI_FINALIZE). You can use the "lamexec" program
> to run non-MPI programs over the lambooted nodes.
> -----------------------------------------------------------------------------
> forrtl: info: Fortran error message number is 78.
> forrtl: warning: Could not open message catalog: ifcore_msg.cat.
> forrtl: info: Check environment variable NLSPATH and protection of /usr/lib/ifcore_msg.cat.
> [hpc_at_node1 ~]$
>
> PS: LAM version is lam-6.5.9
> dyna version is mpp970_s_intelsse_linux_lam659.exe
> OS is Red Hat Fedora Core 4 on each node
Your forrtl messages indicate that your application has encountered a
failure, but the catalog file which translates error 78 into English
text (ifcore_msg.cat) could not be opened. Error 78 is most likely
"process received a signal requesting termination of this process."
There is a range of possible causes. If your run produced the expected
lstc.log file, this termination is normal. Otherwise, you may have to
consult LSTC for assistance.
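
To get the full text of the Fortran run-time message, the catalog has to be readable and NLSPATH has to point at it. Here is a minimal sketch of how one might check that on a node. The function name find_ifcore_catalog is made up for illustration, and /opt/intel/lib is only a guess at an install location; /usr/lib is the path named in the forrtl message. Adjust the directories to wherever your Intel Fortran run-time libraries actually live.

```shell
# Sketch: locate the Intel Fortran message catalog and point NLSPATH at it.
# find_ifcore_catalog is a hypothetical helper, not part of any Intel tool.
find_ifcore_catalog() {
    # Print the first directory argument that holds a readable ifcore_msg.cat.
    for dir in "$@"; do
        if [ -r "$dir/ifcore_msg.cat" ]; then
            printf '%s\n' "$dir"
            return 0
        fi
    done
    return 1
}

# /usr/lib comes from the forrtl message; /opt/intel/lib is an assumption.
if catdir=$(find_ifcore_catalog /usr/lib /opt/intel/lib); then
    # %N is replaced by the catalog name the run-time library asks for.
    export NLSPATH="$catdir/%N${NLSPATH:+:$NLSPATH}"
    echo "NLSPATH=$NLSPATH"
else
    echo "ifcore_msg.cat not found in the directories searched" >&2
fi
```

The setting only helps if it is in effect on every node where mpp970.exe runs, so it would normally go into the shell startup file used by the remote shells.
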
I think FC4 is much newer than lam-6.5.9, yet it is itself already
obsolete, and it was never officially supported by Intel Fortran or by
LSTC, so you are taking on some risk with this combination. No doubt
LSTC also supports a more recent LAM version.