LAM/MPI General User's Mailing List Archives

From: jess michelsen (jam_at_[hidden])
Date: 2003-10-20 19:05:20


Hi All!

In order to test whether I have the right latency and bandwidth in my
bi-directional isend/irecv communications (Gigabit), I've put together a
simple Fortran program, as seen below. For small packet sizes, I get
exactly the same timings (2*latency) as seen with NetPIPE. For larger
packets (up to 64 KB), I get almost (95%) the same bandwidth as seen
with NetPIPE (isn't NetPIPE sending the packets uni-directionally?).

However, once in a while during the test, one of the execution nodes
hangs. It's even impossible to ssh to the node - so the power button is
the only means of communication(!)
My question now is: could this be a buffer issue (buffered send with a
really big buffer didn't work better - only slower), or could there be
a hardware flaw, or should I do the communication in another fashion?
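
One alternative that comes to mind: the whole exchange could be done
with a single blocking MPI_SENDRECV, which leaves all buffering to the
MPI library and cannot deadlock for a correctly paired exchange. A
minimal sketch, using the same variables as in the program below (with
status declared as integer status(MPI_STATUS_SIZE)) - I have not
verified whether this behaves differently on the hanging nodes:

      call MPI_SENDRECV(Snd,Count,MPI_DOUBLE_PRECISION,RecProcNum-1,
     & MyProcNum,Rec,Count,MPI_DOUBLE_PRECISION,RecProcNum-1,
     & RecProcNum,MPI_COMM_WORLD,status,ierr)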

BTW: since I do not read C very well - can an explanation of all the
tweaks in NetPIPE be found anywhere? Supposedly, there are some gold
nuggets in there.

The LAM configuration was done using the following command:
./configure --with-rsh="ssh -x" --with-boot=rsh --with-tcp-short=131072
--prefix=/usr/lam > configure.out
TCP buffers on RH 8.0 are all set to 262140 bytes.
The Intel 7.1 compilers are employed with LAM 7.0.2.
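
For reference, buffers of that size would typically be set through the
usual /proc keys - the commands below illustrate the idea and are not a
transcript from the nodes:
echo 262140 > /proc/sys/net/core/rmem_max
echo 262140 > /proc/sys/net/core/wmem_max
echo 262140 > /proc/sys/net/core/rmem_default
echo 262140 > /proc/sys/net/core/wmem_default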

The job is run by:
lamboot -b -v hostfile
mpirun -np 2 MPItest

The Fortran code is as follows:

      program MPItest
      implicit none
      include 'mpif.h'
      integer,parameter::Count=8192,Cycles=10000
      real(kind=8),dimension(:),allocatable::Snd,Rec
      integer nprocs,MyProcNum,RecProcNum,ierr
      integer request(2),status(MPI_STATUS_SIZE,2)
      integer i
C-----------------------------------------------------------------------
      call MPI_INIT(ierr)
      call MPI_COMM_SIZE(MPI_COMM_WORLD,nprocs,ierr)
      call MPI_COMM_RANK(MPI_COMM_WORLD,MyProcNum,ierr)
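C     convert the 0-based MPI rank to a 1-based process number; the
C     partner's rank is then RecProcNum-1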
      MyProcNum=MyProcNum+1
      RecProcNum=mod(MyProcNum,2)+1
C-----------------------------------------------------------------------
       do i=1,Cycles
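C       each cycle: allocate fresh buffers, post both the send and
C       the receive, then wait for both - neither side blocks before
C       both transfers have been posted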
        allocate(Snd(Count),Rec(Count));Snd=1
        call MPI_ISEND(Snd,Count,MPI_DOUBLE_PRECISION,RecProcNum-1,
     & MyProcNum,MPI_COMM_WORLD,request(1),ierr)
        call MPI_IRECV(Rec,Count,MPI_DOUBLE_PRECISION,RecProcNum-1,
     & RecProcNum,MPI_COMM_WORLD,request(2),ierr)
        call MPI_WAITALL(2,request,status,ierr)
        deallocate(Snd,Rec)
       end do
C-----------------------------------------------------------------------
      call MPI_FINALIZE(ierr)
C-----------------------------------------------------------------------
      end
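
The timing itself is not shown above. A minimal sketch of how the loop
can be timed with MPI_WTIME - here with the allocation hoisted out of
the loop so that only the communication is measured, and with output
labels of my own choosing:

      program MPItime
      implicit none
      include 'mpif.h'
      integer,parameter::Count=8192,Cycles=10000
      real(kind=8),dimension(:),allocatable::Snd,Rec
      real(kind=8) t0,dt
      integer nprocs,MyProcNum,RecProcNum,ierr
      integer request(2),status(MPI_STATUS_SIZE,2)
      integer i
      call MPI_INIT(ierr)
      call MPI_COMM_SIZE(MPI_COMM_WORLD,nprocs,ierr)
      call MPI_COMM_RANK(MPI_COMM_WORLD,MyProcNum,ierr)
      MyProcNum=MyProcNum+1
      RecProcNum=mod(MyProcNum,2)+1
      allocate(Snd(Count),Rec(Count));Snd=1
      t0=MPI_WTIME()
       do i=1,Cycles
        call MPI_ISEND(Snd,Count,MPI_DOUBLE_PRECISION,RecProcNum-1,
     & MyProcNum,MPI_COMM_WORLD,request(1),ierr)
        call MPI_IRECV(Rec,Count,MPI_DOUBLE_PRECISION,RecProcNum-1,
     & RecProcNum,MPI_COMM_WORLD,request(2),ierr)
        call MPI_WAITALL(2,request,status,ierr)
       end do
      dt=(MPI_WTIME()-t0)/Cycles
C     dt is the time for one full bi-directional exchange (2*latency
C     for small packets); each direction moves Count*8 bytes in dt
      if (MyProcNum.eq.1) then
       write(*,*) 'time per exchange (us): ',dt*1d6
       write(*,*) 'bandwidth per direction (MB/s): ',Count*8d0/dt/1d6
      end if
      deallocate(Snd,Rec)
      call MPI_FINALIZE(ierr)
      end

Since both directions run concurrently, the per-direction bandwidth is
simply the message size divided by the round-trip time dt.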

Best regards, Jess Michelsen