LAM/MPI General User's Mailing List Archives

From: Jeff Squyres (jsquyres_at_[hidden])
Date: 2004-06-11 07:46:06


The first thing that I would recommend is upgrading (if possible); the
version of LAM that you have is *extremely* old. Many bug fixes and
improvements have been included in LAM since 6.5.4 (note that 6.5.9 is the
last stable release of the 6.5 series). Indeed, we no longer formally
support the 6.5 series -- I haven't looked at 6.5.4 in several years.
The current version is 7.0.6.

Can you upgrade and see if this fixes your problem?

On Thu, 10 Jun 2004, Yu-Cheng Chou wrote:

>
> LAM boots successfully on node 0, which has 2 processors.
> The code works when it is run on a single processor.
> An error occurs when it is run on 2 processors on node 0, although this
> worked before.
> According to the error message, the problem might be system-related.
> How can I fix this problem?
>
> Error message:
>
> [cycchou_at_matrx demos]$ make test
> lamboot -v hostfile
>
> LAM 6.5.4/MPI 2 C++/ROMIO - University of Notre Dame
>
> Executing hboot on n0 (matrx.engr.ucdavis.edu - 2 CPUs)...
> topology done
> mpirun c0-1 calpi
> -----------------------------------------------------------------------------
> It seems that some error has occurred during MPI_INIT. This will
> cause your process to abort. These kinds of errors are usually
> system-related, such as running out of disk space, running out of
> memory, or something more serious such as data not being passed
> between processes properly. That is, you should not be seeing this
> error message; if you are, somethings is likely Very Wrong with your
> system. :-(
>
> Perhaps this Unix error message will help:
>
> Unix errno: 1252
> Unknown error 1252
>
> -----------------------------------------------------------------------------
> -----------------------------------------------------------------------------
>
> One of the processes started by mpirun has exited with a nonzero exit
> code. This typically indicates that the process finished in error.
> If your process did not finish in error, be sure to include a "return
> 0" or "exit(0)" in your C code before exiting the application.
>
> PID 20156 failed on node n0 with exit status 228.
> -----------------------------------------------------------------------------
> bufferd (getroute): invalid node
> make: *** [test] Broken pipe

-- 
{+} Jeff Squyres
{+} jsquyres_at_[hidden]
{+} http://www.lam-mpi.org/