Hi,
Some process is sending a SIGTERM signal to your LAM/MPI process.
Signal 15 is SIGTERM, which is the default signal sent by the kill
command.
For more information, run "man kill" and "cat /usr/include/bits/signum.h" if
you're on a Linux system. More detail on the exact command sequence you
are running might help.
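If it helps, here is a minimal sketch (in Python, purely for illustration; it is not part of LAM/MPI) of how to map a signal number to its name and how a process can log an incoming SIGTERM instead of dying silently. The names signal_name and install_sigterm_logger are hypothetical helpers made up for this example, and the number-to-name mapping shown assumes a Linux system.

```python
import os
import signal
import sys
import time


def signal_name(signum):
    """Return the symbolic name for a signal number (on Linux, 15 -> 'SIGTERM')."""
    return signal.Signals(signum).name


def install_sigterm_logger():
    """Hypothetical helper: log SIGTERM with a timestamp before exiting."""
    def handler(signum, frame):
        # Write to stderr so the message is visible even if stdout is redirected.
        sys.stderr.write(
            "%s: pid %d received %s\n"
            % (time.ctime(), os.getpid(), signal_name(signum))
        )
        # Exit with the conventional 128 + signal number status.
        sys.exit(128 + signum)

    signal.signal(signal.SIGTERM, handler)


if __name__ == "__main__":
    print(signal_name(15))
    install_sigterm_logger()
```

A handler like this only tells you when the signal arrived, not who sent it, so the timestamp still has to be compared against whatever else was running on the node (batch system, scripts, a manual kill) at that moment.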
_____
From: lam-bounces_at_[hidden] [mailto:lam-bounces_at_[hidden]] On Behalf
Of TLehr_at_[hidden]
Sent: Thursday, August 24, 2006 1:45 PM
To: lam_at_[hidden]
Subject: LAM: failed due to signal 15
Dear all,
I'm new to LAM/MPI and unfortunately have only a limited background in
informatics.
However, I've tried to run an MPI application on a cluster system (5
nodes with 4 CPUs) and the application usually gets stuck on one CPU.
After termination I get the following error message back:
-----------------------------------------------------------------------------
One of the processes started by mpirun has exited with a nonzero exit
code. This typically indicates that the process finished in error.
If your process did not finish in error, be sure to include a "return
0" or "exit(0)" in your C code before exiting the application.
PID 468 failed on node n0 (145.117.075.157) due to signal 15.
-----------------------------------------------------------------------------
I talked to other people using the software and they never had this or
other problems before.
My questions are:
- What is signal 15? It's very hard to find information about this
signal :-(
- Does anybody have an idea what I can do to avoid this error?
Many thanks in advance!
Thorsten