
LAM/MPI General User's Mailing List Archives


From: Jeff Squyres (jsquyres_at_[hidden])
Date: 2005-06-25 06:45:47


This looks like a $PATH problem -- at least one of your machines must
have both a 7.0.0 and a 7.1.0 installation. When you log in
interactively, your $PATH is apparently being set properly to find the
7.1.0 version, but when you log in non-interactively, it looks like it
is finding the 7.0.0 version -- resulting in the problem that you are
seeing.

For example, try:

shell$ which lamboot

and compare that to:

shell$ ssh localhost which lamboot
shell$ ssh othernode which lamboot

and see whether lamboot is found in a different location in each case.
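
The same comparison can be scripted. The following is only a rough
sketch: the function names (compare_paths, check_nodes) and the script
name are made up for illustration, and the hostnames are whatever you
pass on the command line. It prints where lamboot is found locally and
over non-interactive ssh on each host, and flags any path that differs
from the local one. Note that a different path on another machine is not
necessarily wrong if LAM is installed under a different prefix there;
the telling case is a mismatch for localhost.

```shell
#!/bin/sh
# Sketch: compare the lamboot found by the current shell with the one
# found by a non-interactive ssh login on each host given as an argument.
# compare_paths and check_nodes are illustrative names, not LAM commands.

# Print "match" if the two paths are identical, "MISMATCH" otherwise.
compare_paths() {
    if [ "$1" = "$2" ]; then
        echo "match"
    else
        echo "MISMATCH"
    fi
}

# Report the lamboot location locally and on every host passed in.
check_nodes() {
    local_path=$(command -v lamboot)
    echo "local: ${local_path:-not found}"
    for node in "$@"; do
        # Same command as above: a non-interactive login on $node.
        remote_path=$(ssh "$node" which lamboot 2>/dev/null)
        result=$(compare_paths "$local_path" "$remote_path")
        echo "$node: ${remote_path:-not found} ($result)"
    done
}

# Only contact hosts when hostnames were given on the command line.
if [ "$#" -gt 0 ]; then
    check_nodes "$@"
fi
```

Saved under a name of your choosing (say, check_lamboot.sh), it could be
run as "sh check_lamboot.sh localhost othernode".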

Check the LAM FAQ in the "Booting LAM" section for information about
interactive and non-interactive logins, and how to set up your shell
startup files appropriately.
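
As a rough illustration of the kind of startup-file change involved,
here is a fragment for a Bourne-style shell. It is a sketch under
assumptions: LAM_PREFIX and its default value /opt/lam-7.1.0 are
hypothetical (substitute wherever your 7.1.0 installation actually
lives), prepend_path is a made-up helper, and which startup file is read
for non-interactive logins depends on your shell, so check the FAQ for
that. The idea is simply to put the 7.1.0 bin directory at the front of
$PATH, in a part of the file that runs for non-interactive logins too.

```shell
# Sketch of a startup-file fragment. LAM_PREFIX is a hypothetical
# install location; change it to match your 7.1.0 installation.
LAM_PREFIX="${LAM_PREFIX:-/opt/lam-7.1.0}"

# Print the PATH-style list $2 with directory $1 at the front.
# Leaves the list unchanged if $1 is already the first entry.
prepend_path() {
    case "$2" in
        "$1"|"$1":*) printf '%s\n' "$2" ;;
        "")          printf '%s\n' "$1" ;;
        *)           printf '%s\n' "$1:$2" ;;
    esac
}

# Make the 7.1.0 binaries win over any older copy later in $PATH.
PATH=$(prepend_path "$LAM_PREFIX/bin" "$PATH")
export PATH
```

If the file has a test that returns early for non-interactive shells,
this fragment would need to come before it.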

On Jun 24, 2005, at 12:38 PM, smiler21_at_[hidden] wrote:

> Hi, I'm trying to run a code between Mac and Linux and I'm getting
> either of these two errors:
>
> mpirun -ssi rpi tcp C ./sphu
> -----------------------------------------------------------------------------
> It seems that [at least] one of the processes that was started with
> mpirun chose a different RPI than its peers. For example, at least
> the following two processes mismatched in their RPI selections:
>
> MPI_COMM_WORLD rank 1: tcp (v7.1.0)
> MPI_COMM_WORLD rank 0: tcp (v7.0.0)
>
> All MPI processes must choose the same RPI module and version when
> they start. Check your SSI settings and/or the local environment
> variables on each node.
> -----------------------------------------------------------------------------
>
> mpirun -ssi rpi usysv C ./sphu
> -----------------------------------------------------------------------------
> It seems that [at least] one of the processes that was started with
> mpirun chose a different RPI than its peers. For example, at least
> the following two processes mismatched in their RPI selections:
>
> MPI_COMM_WORLD rank 1: usysv (v7.1.0)
> MPI_COMM_WORLD rank 0: usysv (v7.0.0)
>
> All MPI processes must choose the same RPI module and version when
> they start. Check your SSI settings and/or the local environment
> variables on each node.
> -----------------------------------------------------------------------------
>
> When I do laminfo on both machines, they both include:
> SSI rpi: tcp (API v1.0, Module v7.1)
> SSI rpi: usysv (API v1.0, Module v7.1)
>
> It looks like rank 0 should have (v7.1.0) instead of (v7.0.0), but
> maybe I just don't understand the error.
>
> Thanks,
> Eric
> _______________________________________________
> This list is archived at http://www.lam-mpi.org/MailArchives/lam/
>

-- 
{+} Jeff Squyres
{+} jsquyres_at_[hidden]
{+} http://www.lam-mpi.org/