LAM/MPI General User's Mailing List Archives

From: feldy_at_[hidden]
Date: 2005-01-26 15:43:55


I'm trying to get the Pallas AlltoAll benchmark (PMB-MPI1) to work on a cluster
of SMPs using LAM-7.1.1 (same behavior with 7.0.6). This is using standard
TCP/IP over Intel e1000 on-board NICs. The nodes are dual-processor 3 GHz Xeons
with HT disabled, running Linux FC3:
2.6.9-1.667smp #1 SMP Tue Nov 2 14:59:52 EST 2004 i686 i686 i386 GNU/Linux

If I run any of the following:
mpirun c0-3 -ssi rpi usysv PMB-MPI1 AlltoAll
or
mpirun c0-3 -ssi rpi sysv PMB-MPI1 AlltoAll
or
mpirun c0-3 PMB-MPI1 AlltoAll

the PMB-MPI1 test hangs after completing the 4 MB (4194304-byte) message size.
It nearly always hangs; maybe one run out of 10 gets past this point and then
fails on a larger test. I presume it is stuck in the barrier between tests.
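
To put a number on the "one out of 10" estimate, a small wrapper along the
following lines could rerun one of the commands above with a timeout and count
the outcomes. This is only a sketch: the function name count_hangs, the run
count, and the timeout value are my own choices and are not part of LAM or PMB.

```python
import subprocess


def count_hangs(cmd, runs=10, timeout_s=600.0):
    """Run cmd `runs` times and return (ok, failed, hung) counts.

    ok     = exited with status 0 before the timeout
    failed = exited with a non-zero status before the timeout
    hung   = still running when the timeout expired
    """
    ok = failed = hung = 0
    for _ in range(runs):
        try:
            result = subprocess.run(
                cmd,
                stdout=subprocess.DEVNULL,
                stderr=subprocess.DEVNULL,
                timeout=timeout_s,
            )
        except subprocess.TimeoutExpired:
            # subprocess.run kills only the direct child (mpirun here).
            # Assumption: remote ranks may need separate cleanup between
            # runs, for example with lamclean.
            hung += 1
            continue
        if result.returncode == 0:
            ok += 1
        else:
            failed += 1
    return ok, failed, hung
```

For example, count_hangs(["mpirun", "c0-3", "PMB-MPI1", "AlltoAll"]) would run
the default-rpi case ten times. The timeout has to be longer than a full
successful run, otherwise slow runs get counted as hangs.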

I scrounged around on the net for any similar reports, but saw nothing.

#----------------------------------------------------------------
# Benchmarking Alltoall
# ( #processes = 2 )
# ( 2 additional processes waiting in MPI_Barrier)
#----------------------------------------------------------------
       #bytes #repetitions  t_min[usec]  t_max[usec]  t_avg[usec]
            0         1000         3.36         3.36         3.36
            1         1000         3.86         3.86         3.86
            2         1000         3.87         3.87         3.87
            4         1000         3.82         3.83         3.82
            8         1000         3.76         3.76         3.76
           16         1000         3.76         3.76         3.76
           32         1000         3.78         3.78         3.78
           64         1000         6.30         6.31         6.30
          128         1000         3.90         3.90         3.90
          256         1000         4.41         4.41         4.41
          512         1000         4.85         4.85         4.85
         1024         1000         5.62         5.62         5.62
         2048         1000         7.63         7.63         7.63
         4096         1000        11.73        11.73        11.73
         8192         1000        19.39        19.39        19.39
        16384         1000        37.31        37.31        37.31
        32768         1000       108.18       108.19       108.18
        65536          640       181.80       181.81       181.81
       131072          320       424.69       424.73       424.71
       262144          160      1350.60      1350.72      1350.66
       524288           80      3317.64      3317.79      3317.71
      1048576           40      6576.52      6577.45      6576.99
      2097152           20     13094.00     13095.95     13094.97
      4194304           10     26039.20     26042.01     26040.60
[waits here forever]
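
The output above has no throughput column, so for a quick sanity check of the
last few rows, something like the following could turn #bytes and t_avg into a
rough rate. This is a sketch under my own assumptions: the helper names are
made up, and the rate is simply #bytes divided by t_avg, which ignores how many
peers each process exchanges data with.

```python
def parse_rows(text):
    """Return (nbytes, reps, t_min, t_max, t_avg) tuples from PMB output."""
    rows = []
    for line in text.splitlines():
        parts = line.split()
        # Data rows have exactly five fields; skip '#' header/comment lines.
        if len(parts) != 5 or line.lstrip().startswith("#"):
            continue
        try:
            nbytes, reps = int(parts[0]), int(parts[1])
            t_min, t_max, t_avg = (float(p) for p in parts[2:])
        except ValueError:
            continue  # not a numeric row, e.g. "[waits here forever]"
        rows.append((nbytes, reps, t_min, t_max, t_avg))
    return rows


def mbytes_per_sec(nbytes, t_avg_usec):
    """bytes per microsecond equals 10^6 bytes per second."""
    return nbytes / t_avg_usec if t_avg_usec > 0 else 0.0


# Last three rows of the run above, copied verbatim.
SAMPLE = """\
      1048576 40 6576.52 6577.45 6576.99
      2097152 20 13094.00 13095.95 13094.97
      4194304 10 26039.20 26042.01 26040.60
"""

for nbytes, _reps, _t_min, _t_max, t_avg in parse_rows(SAMPLE):
    print(f"{nbytes:>8d} bytes  {mbytes_per_sec(nbytes, t_avg):8.2f} MB/s")
```

By this measure the 4194304-byte row works out to roughly 161 MB/s.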