LAM/MPI General User's Mailing List Archives

From: Jeff Squyres (jsquyres_at_[hidden])
Date: 2005-05-27 16:54:03


On May 27, 2005, at 5:32 PM, Riju John wrote:

> I am running lam-7.0 on an Opteron cluster running SuSE 8.1.
>
> I noticed that mpirun sometimes hangs when running multiple MPI jobs.
> These jobs run on 64 slave nodes and keep the system resources fairly
> busy. The first job is doing a fair amount of disk I/O when the second
> job starts. The second job sometimes hangs. This happens before even
> getting to MPI_Init. Has anyone seen this kind of problem before? Is
> there any option in mpirun that can help with this problem?

No, I have not seen this before. Do you know if the processes start at
all? I.e., do they reach main()? Or are they stuck somewhere between
the beginning of main() and the beginning of MPI_INIT()?
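
One way to answer that question is to print a marker at the very top of
main() and another right after MPI_Init() returns. The following is only
a minimal sketch, not part of LAM/MPI: the helper names format_marker()
and report(), the USE_MPI macro, and the file name probe.c are all
hypothetical, chosen for illustration. Each marker goes to stderr and is
flushed immediately so that it is not lost if the process hangs later.

```c
#define _POSIX_C_SOURCE 200112L  /* for gethostname() under strict C modes */
#include <stdio.h>
#include <string.h>
#include <unistd.h>

#ifdef USE_MPI
#include <mpi.h>
#endif

/* Build a one-line marker such as "[node07 pid 1234] reached main()".
 * Returns the snprintf() result (length of the formatted text). */
int format_marker(char *buf, size_t len, const char *stage)
{
    char host[256];

    if (gethostname(host, sizeof(host)) != 0) {
        strcpy(host, "unknown-host");
    }
    host[sizeof(host) - 1] = '\0';  /* gethostname may not NUL-terminate */

    return snprintf(buf, len, "[%s pid %ld] %s",
                    host, (long)getpid(), stage);
}

/* Write the marker to stderr and flush right away. */
void report(const char *stage)
{
    char line[512];

    format_marker(line, sizeof(line), stage);
    fprintf(stderr, "%s\n", line);
    fflush(stderr);
}

#ifdef USE_MPI
int main(int argc, char **argv)
{
    report("reached main()");          /* process was launched */

    MPI_Init(&argc, &argv);
    report("MPI_Init() returned");     /* startup handshake completed */

    MPI_Finalize();
    return 0;
}
#endif
```

Assuming the file is saved as probe.c, it could be built with something
like "mpicc -DUSE_MPI probe.c -o probe" and launched with mpirun in the
same way as the job that hangs. If the first marker is missing for some
nodes, those processes never started; if only the second marker is
missing, they are stuck inside MPI_Init().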

-- 
{+} Jeff Squyres
{+} jsquyres_at_[hidden]
{+} http://www.lam-mpi.org/