
LAM/MPI General User's Mailing List Archives


From: Jeff Squyres (jsquyres_at_[hidden])
Date: 2010-09-30 14:16:04


All I can really say is that 7.0 is pretty ancient -- the last version of LAM/MPI was 7.1.4, IIRC.

If you're going to upgrade, is there any chance you can upgrade to Open MPI? That's where all the activity is these days.

On Sep 30, 2010, at 1:47 PM, David Shochat wrote:

> I will send config.log if necessary, but will need someone else to do
> it (I cannot for non-technical reasons).
> We are using LAM 7.0 on Sun Sparc Solaris 8 for interprocess messaging
> within a single node. We have a test which sends 10 messages of about
> 49 KB each in rapid succession. The send is done using MPI_Bsend
> (MPI_Buffer_attach was called during initialization, providing an 8
> Mbyte buffer). There are no errors returned from the MPI_Bsend calls.
> In the receiving process, several of the 10 messages are received
> without error (the first 2, 3, or 4 messages) but then on the next
> call to MPI_Recv, we get an error code 21=MPI_ERR_LOCALDEAD (the
> sender is not really dead at this point). We have had no problems when
> using "less stressing" messaging. This is with the tcp RPI. The
> tunable rpi_tcp_short is at its default value of 64 KB. We assume that
> is sufficient since the message is only 49 KB.
>
> Does this sound like any known bug fixed somewhere between 7.0 and
> 7.0.6? Should we be using a different RPI? It seems odd to be using
> TCP when there is no actual networking involved, but this appears to
> be the default.
> -- David
> _______________________________________________
> This list is archived at http://www.lam-mpi.org/MailArchives/lam/
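
For reference, here is a minimal sketch of the pattern you describe: rank 0 attaches an 8 MB buffer and MPI_Bsend's ten ~49 KB messages, which rank 1 drains with MPI_Recv. The ranks, the tag, and the MPI_Errhandler_set call are my assumptions for illustration -- your app must already be setting MPI_ERRORS_RETURN, or MPI_Recv would have aborted rather than returned MPI_ERR_LOCALDEAD.

#include <mpi.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define NMSGS    10
#define MSG_SIZE (49 * 1024)           /* ~49 KB per message */
#define BUF_SIZE (8 * 1024 * 1024)     /* 8 MB attached buffer */

int main(int argc, char *argv[])
{
    char *attached, msg[MSG_SIZE];
    int rank, i, size, err;
    MPI_Status status;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    /* Return error codes instead of aborting, so MPI_Recv can
       report MPI_ERR_LOCALDEAD as described above (assumption) */
    MPI_Errhandler_set(MPI_COMM_WORLD, MPI_ERRORS_RETURN);

    if (rank == 0) {
        /* Attach the 8 MB buffer once, during initialization */
        attached = malloc(BUF_SIZE);
        MPI_Buffer_attach(attached, BUF_SIZE);

        memset(msg, 'x', MSG_SIZE);
        /* Fire off the 10 sends in rapid succession */
        for (i = 0; i < NMSGS; ++i)
            MPI_Bsend(msg, MSG_SIZE, MPI_CHAR, 1, 0, MPI_COMM_WORLD);

        /* Detach blocks until all buffered sends have drained */
        MPI_Buffer_detach(&attached, &size);
        free(attached);
    } else if (rank == 1) {
        for (i = 0; i < NMSGS; ++i) {
            err = MPI_Recv(msg, MSG_SIZE, MPI_CHAR, 0, 0,
                           MPI_COMM_WORLD, &status);
            if (err != MPI_SUCCESS)
                fprintf(stderr, "MPI_Recv #%d returned error %d\n", i, err);
        }
    }

    MPI_Finalize();
    return 0;
}

As for a different RPI: LAM 7's SSI framework lets you pick the transport at run time without recompiling, e.g.

  mpirun -ssi rpi usysv -np 2 ./your_app

(usysv is the shared memory RPI name, if I remember right -- run laminfo to see which RPI modules your build actually has). You can likewise try raising the short-message threshold with -ssi rpi_tcp_short <bytes>.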

-- 
Jeff Squyres
jsquyres_at_[hidden]
For corporate legal information go to:
http://www.cisco.com/web/about/doing_business/legal/cri/