LAM/MPI General User's Mailing List Archives

From: Bartlomiej Balcerek (Bartlomiej.Balcerek_at_[hidden])
Date: 2005-04-11 04:45:06


Hello,
I have a strange problem when trying to run my MPI application:

bartol_at_n13:~/cpmd/r3$ /usr/local/lam-7.1.1-intel81/bin/mpirun -v -np 4 /usr/local/CPMD/bin/cpmd.x-lammpi ./cls_R2F.inp
15420 /usr/local/CPMD/bin/cpmd.x-lammpi running on n0 (o)
15070 /usr/local/CPMD/bin/cpmd.x-lammpi running on n1
15430 /usr/local/CPMD/bin/cpmd.x-lammpi running on n2
15421 /usr/local/CPMD/bin/cpmd.x-lammpi running on n0 (o)
-----------------------------------------------------------------------------
It seems that some error has occurred during MPI_INIT. This will
cause your process to abort. These kinds of errors are usually
system-related, such as running out of disk space, running out of
memory, or something more serious such as data not being passed
between processes properly. That is, you should not be seeing this
error message; if you are, something is likely Very Wrong with your
system. :-(

Perhaps this Unix error message will help:

        Unix errno: 14
        Bad address

-----------------------------------------------------------------------------
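For reference, errno 14 on Linux is EFAULT: a system call was handed a pointer outside the address space the process can access. A minimal sketch to confirm the mapping on a given machine follows. Python is used only as a convenience and is not part of the LAM setup, and the variable names are illustrative.

```python
import errno
import os

# The errno value reported in the MPI_INIT failure above.
code = 14

# Map the number to its symbolic name and the C library's message text.
name = errno.errorcode.get(code, "unknown")
message = os.strerror(code)

# On Linux this prints: errno 14: EFAULT (Bad address)
print(f"errno {code}: {name} ({message})")
```

This only decodes the number; it does not show which call inside MPI_INIT returned it.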

This started after a kernel upgrade from Debian 2.6.8.1-mckinley to
vanilla 2.6.11.6. I used the same kernel config for both versions.
I have tried this on a few Intel Tiger2 (Itanium 2) machines.
LAM is configured with defaults; I use rsh for booting and lamd as the RPI.

How can I investigate the problem further?

Regards,

--
Bartlomiej Balcerek
Technical University of Wroclaw, Poland
Wroclaw Centre of Networking and Supercomputing
phone: +48 (71) 320-20-43 mail: bartol_at_[hidden]