
LAM/MPI General User's Mailing List Archives


From: Jeff Squyres (jsquyres_at_[hidden])
Date: 2004-11-16 07:51:33


The sysv timing is perhaps the only one that makes sense. But it does
show that the Origin is about 10-15 times slower than your Altix
machine (though there are a lot of other factors here -- I assume the
machines were idle, you didn't oversubscribe the CPUs, etc.).
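
For reference, the slowdown factors implied by the quoted timings can be
worked out as below. This is only a sketch: the helper name to_seconds is
made up for illustration, and the "+1 hour" tcp entry is treated as a
lower bound of 3600 seconds, which is an assumption since no exact figure
was reported.

```python
def to_seconds(stamp):
    """Convert an 'M:SS.ss' timing string (as in the quoted table) to seconds."""
    minutes, seconds = stamp.split(":")
    return int(minutes) * 60 + float(seconds)

# Timings copied from the quoted message: (Origin 200, Altix)
timings = {
    "lamd": ("3:44.96", "0:27.94"),
    "sysv": ("0:31.74", "0:02.54"),
}

# Ratio of Origin runtime to Altix runtime for each mode
ratios = {mode: to_seconds(o) / to_seconds(a) for mode, (o, a) in timings.items()}

# tcp on the Origin was only reported as "+1 hour";
# 3600 s is therefore a lower bound (assumption).
tcp_lower_bound = 3600 / to_seconds("0:03.26")

for mode, ratio in ratios.items():
    print(f"{mode}: {ratio:.1f}x slower")
print(f"tcp: at least {tcp_lower_bound:.0f}x slower")
```

The sysv ratio comes out at about 12.5x and lamd at about 8x, while tcp
is more than 1000x, which is why only the tcp number looks badly wrong.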

But you're right; clearly, a 3 second job on the Altix should not take
over an hour on the Origin. It certainly suggests a tcp problem, but
anything is suspect for a difference that egregious. A few random
questions to look into: Are there any messages in the system logs that
indicate that things were going wrong? Can you run tcp ping-pong tests
(without MPI) to verify bandwidth and latency (e.g., NetPIPE)? Was
someone else using the Origin machine at the time? Was memory full,
so that the machine was continually swapping?
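
If NetPIPE is not handy, a tcp ping-pong without MPI can be roughed out
with plain sockets. The following is a minimal sketch, not a replacement
for a real benchmark: the function names (recv_exact, echo_server,
pingpong), the message sizes and the round count are all made up for
illustration, and it only exercises the loopback interface on a single
machine, which matches the same-machine runs described in the original
message.

```python
import socket
import threading
import time

def recv_exact(sock, nbytes):
    """Read exactly nbytes from sock (recv may return short reads)."""
    chunks = []
    remaining = nbytes
    while remaining > 0:
        chunk = sock.recv(min(remaining, 65536))
        if not chunk:
            raise ConnectionError("peer closed the connection early")
        chunks.append(chunk)
        remaining -= len(chunk)
    return b"".join(chunks)

def echo_server(listener, nbytes, rounds):
    """Accept one connection and bounce each message straight back."""
    conn, _ = listener.accept()
    with conn:
        for _ in range(rounds):
            conn.sendall(recv_exact(conn, nbytes))

def pingpong(nbytes=1 << 20, rounds=20, host="127.0.0.1"):
    """Return (average round-trip seconds, MB/s) for nbytes-sized messages."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind((host, 0))  # port 0: let the OS pick a free port
    listener.listen(1)
    port = listener.getsockname()[1]
    server = threading.Thread(target=echo_server, args=(listener, nbytes, rounds))
    server.start()

    payload = b"x" * nbytes
    with socket.create_connection((host, port)) as client:
        start = time.perf_counter()
        for _ in range(rounds):
            client.sendall(payload)      # ping
            recv_exact(client, nbytes)   # pong
        elapsed = time.perf_counter() - start
    server.join()
    listener.close()

    round_trip = elapsed / rounds
    bandwidth = (2 * nbytes * rounds) / elapsed / 1e6  # bytes moved both ways
    return round_trip, bandwidth

if __name__ == "__main__":
    for size in (1, 1024, 1 << 20):
        rtt, mbps = pingpong(nbytes=size)
        print(f"{size:>8} bytes: {rtt * 1e6:10.1f} us round trip, {mbps:8.2f} MB/s")
```

If large messages crawl here as well, the problem is probably in the
machine's TCP stack or its tuning rather than in LAM; if they run at a
sensible speed, the tcp RPI is the more likely suspect.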

On Nov 15, 2004, at 11:35 AM, morten w. pedersen wrote:

> I'm experiencing some problems with tcp throughput on one of
> our IRIX machines (a dual-CPU Origin 200).
>
> If I use the pingpong test program from the Linux performance tests
> on the LAM site (slightly modified so that it compiles),
> I observe the following runtimes when I run the program on a
> single machine (I repeated the tests on our Altix box):
>         IRIX O200    Linux Altix
> tcp     +1 hour      0:03.26
> lamd    3:44.96      0:27.94
> sysv    0:31.74      0:02.54
>
> As seen, the overall runtime of the program varies from a few seconds
> on the Altix to over an hour using tcp
> on the O200.
>
> Of course the Altix (64-bit Itanium processor) is much faster than
> the old Origin 200 machine, but
> I'm a bit worried about why the tcp performance is so poor on the IRIX
> machine (I would have expected it to be
> somewhat faster than the lamd mode).
>
> Has anyone experienced a similar pattern, where tcp ping-ponging of
> large packets is extremely slow, or have any ideas on what could
> cause it?
>
> The tests were performed using LAM 7.0.6 on both machines.
>
> -Morten
>
> _______________________________________________
> This list is archived at http://www.lam-mpi.org/MailArchives/lam/
>

-- 
{+} Jeff Squyres
{+} jsquyres_at_[hidden]
{+} http://www.lam-mpi.org/