LAM/MPI General User's Mailing List Archives

From: McCalla, Mac (macmccalla_at_[hidden])
Date: 2006-08-24 17:47:47


Hi,

Some process is sending a SIGTERM signal to your LAM/MPI process.
Signal 15 is SIGTERM, which is the default signal sent by the kill
command. For more information, run "man kill" and
"cat /usr/include/bits/signum.h" if you're on a Linux system. More
details on the exact command sequence you are running might help.

  _____

From: lam-bounces_at_[hidden] [mailto:lam-bounces_at_[hidden]] On Behalf
Of TLehr_at_[hidden]
Sent: Thursday, August 24, 2006 1:45 PM
To: lam_at_[hidden]
Subject: LAM: failed due to signal 15

Dear all,

I'm new to lam/mpi and have unfortunately only a limited background in
informatics.

However, I've tried to run an MPI application on a cluster system (5
nodes with 4 CPUs) and the application usually gets stuck on one CPU.

After termination I get the following error message back:

-----------------------------------------------------------------------------
One of the processes started by mpirun has exited with a nonzero exit
code. This typically indicates that the process finished in error.
If your process did not finish in error, be sure to include a "return
0" or "exit(0)" in your C code before exiting the application.

PID 468 failed on node n0 (145.117.075.157) due to signal 15.
-----------------------------------------------------------------------------

I talked to other people using the software and they never had this or
other problems before.

My question is now:

- What is signal 15? It's very hard to find information about this
signal :-(
- Does anybody have an idea what I can do to avoid this error?

Many thanks in advance!

Thorsten