Wow. I've been having one heck of a time getting LAM/MPI to work with LS-DYNA; it's been one problem after another. I just finished reloading the OS (Red Hat 9.0) on all 3 systems and installed a fresh copy of LAM/MPI 6.5.9 on the servers. I was unable to get this far with 7.0.3, so I am now using 6.5.9.
The current problem occurs when I execute the mpirun command for mpp970 (LS-DYNA's MPP version). I execute the command "mpirun -np 4 mpp970 i=Main.k memory=200000000". This starts mpp970 on two of the systems, with two processes per system, which I can see with top. After exactly 60 seconds of mpp970 running, I get the following error message:
becker@~/test $ mpirun -np 4 mpp970 i=Main.k memory=200000000
p0_26787: p4_error: Could not gethostbyname for host edms-dyna1; may be invalid name
: 61
-----------------------------------------------------------------------------
It seems that [at least] one of the processes that was started with
mpirun did not invoke MPI_INIT before quitting (it is possible that
more than one process did not invoke MPI_INIT -- mpirun was only
notified of the first one, which was on node n0).
mpirun can *only* be used with MPI programs (i.e., programs that
invoke MPI_INIT and MPI_FINALIZE). You can use the "lamexec" program
to run non-MPI programs over the lambooted nodes.
-----------------------------------------------------------------------------
p0_26786: p4_error: Could not gethostbyname for host edms-dyna1; may be invalid name
: 61
The weird thing is, I have all the host names in /etc/hosts, and I also have a DNS server set up that resolves these hosts. On top of that, I used IP addresses instead of hostnames in my hostfile, thinking that might solve the problem, but the error still names 'edms-dyna1', and I am unsure where it is even getting that hostname. Ideas?
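One way to narrow this down would be to check, on each node, what name the machine reports for itself and whether that name resolves there. Below is a minimal sketch, assuming Python 3 is available on the nodes. `check_hosts` is a hypothetical helper, not part of LAM/MPI or LS-DYNA, and the idea that 'edms-dyna1' comes from a node's own hostname setting is only an assumption.

```python
import socket


def check_hosts(hostnames, resolver=socket.gethostbyname):
    """Map each hostname to its IPv4 address, or None if the lookup fails.

    `resolver` defaults to socket.gethostbyname; it is a parameter only so
    the helper can be exercised without touching the network.
    """
    results = {}
    for name in hostnames:
        try:
            results[name] = resolver(name)
        except OSError:
            # socket.gaierror and socket.herror are both subclasses of OSError
            results[name] = None
    return results


if __name__ == "__main__":
    # Check the name this machine reports for itself, plus the name from the
    # error message. Assumption: the failing name is a node's own hostname.
    names = [socket.gethostname(), "edms-dyna1"]
    for name, addr in check_hosts(names).items():
        print(f"{name}: {addr if addr else 'UNRESOLVED'}")
```

Running this on every machine in the hostfile would show whether any node fails to resolve its own name or 'edms-dyna1', independent of what the hostfile itself contains.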
Rob Becker
Unix Administrator
Battelle.
11-1-029D
505 King Ave.
Columbus, OH 43201
614-424-6378