LAM/MPI General User's Mailing List Archives

From: Jeff Squyres (jsquyres_at_[hidden])
Date: 2003-08-26 08:35:37


A quick look at this code (transport.cc) shows that it is not safe: it
loops over MPI_Send calls without any corresponding MPI_Recv calls. More
specifically, it assumes that the MPI layer buffers messages, and the sysv
and usysv RPIs provide little or no buffering. It works with TCP because TCP
provides (by default) 64k of buffering. I'd suggest mailing the DaSSF
author(s) to let them know about this problem.

Troll through the mailing lists for past discussions on this topic. A
quick check to ensure that an MPI program is safe/portable is to replace
all MPI_Send calls with MPI_Ssend calls; if the program then hangs, it was
relying on buffering.
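
To illustrate the buffering argument above, here is a toy Python model. It
is not MPI code and is not taken from DaSSF's transport.cc. It assumes just
two ranks and counts buffer capacity in whole messages rather than bytes.
The names run, unsafe_programs, and safe_programs are made up for this
sketch. Setting buffer_slots to 0 stands in for the MPI_Ssend check.

```python
def unsafe_programs(n):
    """Both ranks send n messages first and only then receive n."""
    ops = ["send"] * n + ["recv"] * n
    return [list(ops), list(ops)]


def safe_programs(n):
    """Rank 0 sends then receives; rank 1 receives then sends."""
    return [["send"] * n + ["recv"] * n,
            ["recv"] * n + ["send"] * n]


def run(programs, buffer_slots):
    """Step two ranks until both finish or neither can move.

    buffer_slots is the number of unreceived messages the (hypothetical)
    transport holds per direction. 0 models a synchronous send, which
    completes only when the peer is waiting in a matching receive.
    Returns "completed" or "deadlock".
    """
    pc = [0, 0]        # index of the next operation for each rank
    buffered = [0, 0]  # buffered[r]: messages queued for rank r
    progress = True
    while progress:
        progress = False
        for rank in (0, 1):
            peer = 1 - rank
            if pc[rank] == len(programs[rank]):
                continue  # this rank has finished its program
            if programs[rank][pc[rank]] == "send":
                peer_in_recv = (pc[peer] < len(programs[peer])
                                and programs[peer][pc[peer]] == "recv")
                if buffered[peer] < buffer_slots:
                    # Room in the buffer: the send returns immediately.
                    buffered[peer] += 1
                    pc[rank] += 1
                    progress = True
                elif buffered[peer] == 0 and peer_in_recv:
                    # No buffering: hand the message straight to the
                    # waiting receiver, so both operations complete.
                    pc[rank] += 1
                    pc[peer] += 1
                    progress = True
            elif buffered[rank] > 0:
                # Receive: take one queued message if there is one.
                buffered[rank] -= 1
                pc[rank] += 1
                progress = True
    done = all(pc[r] == len(programs[r]) for r in (0, 1))
    return "completed" if done else "deadlock"


if __name__ == "__main__":
    print(run(unsafe_programs(3), buffer_slots=4))  # enough buffering
    print(run(unsafe_programs(3), buffer_slots=0))  # Ssend-style check
    print(run(safe_programs(3), buffer_slots=0))    # ordered exchange
```

In this model the send-first program finishes only when the buffer can
hold every outstanding message, while the ordered exchange finishes even
with no buffering at all.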

Also, as you correctly determined, DaSSF's helper scripts assume MPICH's
mpirun format.

On Fri, 22 Aug 2003, Ferris McCormick wrote:

> I have just installed lammpi-7.0 on a complex of two dual
> sparc-SMP linux systems, each of which identifies itself
> from 'uname -srvmpio' as
>
> Linux 2.4.21 #1 SMP Thu Jul 17 20:39:05 UTC 2003 sparc64 sun4u TI
> UltraSparc II (BlackBird) GNU/Linux
>
> I am currently working with DaSSF+MPICH on these systems, and I
> am hoping to compare LAMMPI and MPICH as communications engines
> for this simulation package (it works with any MPI implementation).
>
> Here is what I am seeing, and I am asking for hints addressing the
> noted lammpi problems.
>
> 1. As a baseline, DaSSF+MPICH works fine;
> 2. lammpi seems to build and install fine;
> 3. I have taken care not to confuse mpirun&friends between the
> two mpi implementations;
> 4. With MODES="lamd tcp crtcp" the lamtests suite runs with 100%
> success;
> 5. With MODES=sysv, all tests immediately go to 'S' state and hang
> forever, no matter what the <bhost> configuration.
> 6. With MODES=usysv, and, say using topo/cart, half the tests
> 'redline', and half the tests stay in 'S' state.
> 7. This is using the 'sched_yield()' call; I have not tried the
> select() alternative yet, nor --with-pthread-lock. Those are
> next.
> 8. DaSSF+LAMMPI works (if you fix up the initiation call to use
> 'mpiexec -machinefile ...' instead of 'mpirun -np xxx -machine...')
> UNTIL everything terminates successfully. At that point, the
> master task redlines forever.
>
> I realize that these might all be old questions, but I haven't found
> anything from a search.
>
> --
> Ferris McCormick (P44646, MI) <fmccor_at_[hidden]>
> Phone: (703) 392-0303
> Fax: (703) 392-0401
>
> _______________________________________________
> This list is archived at http://www.lam-mpi.org/MailArchives/lam/
>

-- 
{+} Jeff Squyres
{+} jsquyres_at_[hidden]
{+} http://www.lam-mpi.org/