Thank you very much. I downloaded the new version of the driver from the
Skyconnect site and installed it, and it now works perfectly. The driver
shipped with the kernel did not work at all.
Gkikas
-----Original Message-----
From: lam-bounces_at_[hidden] [mailto:lam-bounces_at_[hidden]] On Behalf Of
Neil Storer
Sent: Thursday, August 05, 2004 4:00 PM
To: General LAM/MPI mailing list
Subject: Re: LAM: LAM freezes, please help!
Gkikas,
When we had a similar problem some years ago (certain packet sizes being OK,
others not) it did in fact turn out to be hardware. There is still a chance
that the network is congested, but congestion should affect all packet
sizes equally.
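
One way to check from user space whether only certain UDP payload sizes get
through is to sweep the sizes against an echo service. The following is a
minimal Python sketch and not something posted in this thread: the helper
names start_echo_server and sweep_udp_sizes are hypothetical, it assumes you
start the echo server yourself on the remote node, and it uses only the
standard socket and threading modules.

```python
import socket
import threading


def start_echo_server(host="127.0.0.1", port=0):
    """Start a UDP echo server in a daemon thread.

    Returns (server_socket, bound_port). Port 0 lets the OS pick a free port.
    """
    srv = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    srv.bind((host, port))

    def loop():
        while True:
            try:
                data, addr = srv.recvfrom(65535)
            except OSError:
                break  # socket was closed
            srv.sendto(data, addr)  # send the datagram back unchanged

    threading.Thread(target=loop, daemon=True).start()
    return srv, srv.getsockname()[1]


def sweep_udp_sizes(host, port, sizes, timeout=0.5, tries=3):
    """Send one datagram per payload size and wait for the echo.

    Returns {size: bool}, True if an identical datagram came back
    within `tries` attempts.
    """
    results = {}
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as s:
        s.settimeout(timeout)
        for size in sizes:
            payload = bytes([size % 256]) * size
            ok = False
            for _ in range(tries):
                s.sendto(payload, (host, port))
                try:
                    data, _addr = s.recvfrom(65535)
                except OSError:
                    continue  # timeout or network error: count as lost
                if data == payload:
                    ok = True
                    break
            results[size] = ok
    return results
```

To use it between two nodes, run start_echo_server("0.0.0.0", some_port) on
one node (keeping that process alive, since the server thread is a daemon
thread), then call sweep_udp_sizes(other_host, some_port, range(1, 65)) on
the other and look for sizes that map to False. A pattern of specific sizes
failing while others pass would point at the NIC or its driver rather than
at congestion.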
I would first try reseating the adapters, to make sure you don't have a
loose connection.
You could try doing a "netstat -s", running your test (in fact "ping" also
takes a "-s packetsize" option), then doing another "netstat -s" to see how
the statistics changed over the run.
See:
http://homepage.ntlworld.com/robin.d.h.walker/cmtips/loss.html
for an explanation.
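
The before/after comparison of "netstat -s" output can be automated. The
sketch below is only an illustration under assumptions: the function names
are hypothetical, and the parser assumes the common Linux layout in which a
section header such as "Udp:" starts in column one and each counter line
below it is indented and contains one standalone number. Other platforms
format this output differently.

```python
import re
import subprocess


def snapshot():
    """Return the raw text printed by `netstat -s` (must be installed)."""
    done = subprocess.run(["netstat", "-s"], capture_output=True,
                          text=True, check=True)
    return done.stdout


def parse_netstat_stats(text):
    """Parse `netstat -s` text into {(section, label): value}.

    The label is the counter line with its number replaced by "N",
    so the same counter gets the same key in both snapshots.
    """
    counters = {}
    section = ""
    for line in text.splitlines():
        if not line.strip():
            continue
        # Unindented line ending in ":" is a section header, e.g. "Udp:".
        if not line[0].isspace() and line.rstrip().endswith(":"):
            section = line.strip().rstrip(":")
            continue
        match = re.search(r"\b\d+\b", line)  # first standalone number
        if match:
            label = (line[:match.start()] + "N" + line[match.end():]).strip()
            counters[(section, label)] = int(match.group())
    return counters


def diff_counters(before, after):
    """Return only the counters whose value changed, as after - before."""
    changed = {}
    for key, value in after.items():
        delta = value - before.get(key, 0)
        if delta != 0:
            changed[key] = delta
    return changed
```

A typical run would take before = parse_netstat_stats(snapshot()), run the
failing test, take a second snapshot the same way, and print
diff_counters(before, after). Counters in the "Udp" section that mention
errors or drops are the ones to look at first.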
---------------------------
I don't know if it is related, but I found an item on the web that was not
very complimentary to the "OFFICIAL 3com 3c2000/3c940 DRIVER". You can see
for yourself at:
http://www.linuxquestions.org/questions/archive/3/2004/02/2/143348
Regards
Neil
Gkikas Magiorkinis wrote:
> I think that you are probably right. I found out that specific UDP packets
> are rejected:
> Using ttcp -t, only words of length 5, 11 and >16 are passed to ttcp -r.
> Moreover, using traceroute, only packets of 40 bytes and >45 bytes are
> resolved.
> Isn't this weird? Now I am almost sure that this is not a LAM problem, but
> I do not know what to do. I know that this is not a network mailing list,
> but if there are any ideas I would be grateful if you could help me.
> My network card is from the 3com 3c2000 family (1 Gbit).
>
> Gkikas
>
> -----Original Message-----
> From: lam-bounces_at_[hidden] [mailto:lam-bounces_at_[hidden]] On Behalf Of
> Jeff Squyres
> Sent: Thursday, August 05, 2004 1:30 PM
> To: General LAM/MPI mailing list
> Subject: Re: LAM: LAM freezes, please help!
>
> Let me throw my $0.02 in here..
>
> LAM effectively uses UDP for process control and other "out of band"
> kinds of communications (e.g., tping heavily uses UDP). So if your
> network is having UDP problems, LAM will probably not work properly.
> The "switch lights kept blinking" issues you mentioned were probably
> due to the LAM daemon trying to re-send UDP packets that were never
> ack'ed. You also mentioned that tping would work a few times and then
> stop -- that's not good at all. It means that *some* UDP packets are
> getting through, but not all.
>
> We won't rule out a LAM bug, but I would check your network
> configuration with other programs that use UDP.
>
> I don't have much to offer as suggestions as to *why* this would
> happen, perhaps faulty hardware...?
>
>
> On Aug 5, 2004, at 4:49 AM, Gkikas Magiorkinis wrote:
>
>
>>Neil,
>>
>>I tried ttcp, and TCP works OK but UDP is problematic. Do you have any
>>idea why this could be happening?
>>
>>Gkikas
>>
>>-----Original Message-----
>>From: lam-bounces_at_[hidden] [mailto:lam-bounces_at_[hidden]] On Behalf Of
>>Neil Storer
>>Sent: Wednesday, August 04, 2004 3:43 PM
>>To: General LAM/MPI mailing list
>>Subject: Re: LAM: LAM freezes, please help!
>>
>>Gkikas,
>>
>>If "ping" is working without any packets lost, e.g.
>>
>> ping -c 100 host-abc
>>
>> 100 packets transmitted, 100 received, 0% loss
>>
>>then your network is probably OK. You should do the "pings" from each
>>node to each other node to fully test the system.
>>
>>TTCP is a simple client-server application that uses SOCKETs to send
>>data. It doesn't use MPI; however, MPI uses SOCKETs (by default), so if
>>TTCP also works OK, you have eliminated hardware and the underlying
>>TCP/IP and routing as sources of the problem.
>>
>>For information on TTCP + how to get the source, use "google" or see:
>>
>> http://www.nps.navy.mil/cs/su/cs4550/proj1.1.htm
>>
>>The MAN page on my system is:
>>NAME
>>     ttcp - test TCP and UDP performance
>>
>>SYNOPSIS
>>     ttcp -t [-lbuflen] [-s] [-nnumbufs] [-pport] [-u] [-D] [-L]
>>          [-Aalign] [-Ooffset] [-v] host [<in]
>>     ttcp -r [-lbuflen] [-s] [-pport] [-B] [-Aalign] [-Ooffset]
>>          [-u] [-v] [>out]
>>
>>DESCRIPTION
>>     Ttcp times the transmission and reception of data between two
>>     systems using the UDP or TCP protocols. It differs from common
>>     "blast" tests, which tend to measure the remote inetd as much as
>>     the network performance, and which usually do not allow
>>     measurements at the remote end of a UDP transmission.
>>
>>     Ttcp.c should be compiled for both ends of the path to be tested.
>>     It uses sockets and is easy to port to most machines based on
>>     4.3BSD.
>>
>>     The transmitter should be started with -t after the receiver has
>>     been started with -r. Tests lasting at least tens of seconds
>>     should be used to obtain accurate measurements. Graphical
>>     presentations of throughput versus buffer size for buffers
>>     ranging from tens of bytes to several "pages" can illuminate
>>     bottlenecks.
>>
>>   Options
>>     -t        Transmit mode.
>>
>>     -r        Receive mode.
>>
>>     -u        Use UDP instead of TCP.
>>
>>     -llength  Length of buffers in bytes (default 8192).
>>
>>     -nnumbufs Number of source buffers transmitted (default 2048).
>>
>>     -pport    Port number to send to or listen on (default 5001).
>>
>>     -D        If transmitting using TCP, do not buffer data when
>>               sending (sets the TCP_NODELAY socket option).
>>
>>     -s        If transmitting, do not source a data pattern to
>>               network; use stdin instead. If receiving, do not sink
>>               or discard, but print all data to stdout.
>>
>>     -B        When receiving and using the -s option, only output
>>               full blocks, using the block size specified by -l.
>>               This option is useful for programs that require
>>               complete blocks, like tar(1).
>>
>>     -Aalign   Align the start of buffers to this modulus (default
>>               16384).
>>
>>     -Ooffset  Align the start of buffers to this offset (default 0).
>>               For example, "-A8192 -O1" causes buffers to start at
>>               the second byte of an 8192-byte page.
>>
>>     -v        Verbose: print more statistics.
>>
>>     -d        Set the SO_DEBUG socket option.
>>
>>_______________________________________________
>>This list is archived at http://www.lam-mpi.org/MailArchives/lam/
>>
>
>
--
+-----------------+---------------------------------+------------------+
| Neil Storer | Head: Systems S/W Section | Operations Dept. |
+-----------------+---------------------------------+------------------+
| ECMWF, | email: neil.storer_at_[hidden] | //=\\ //=\\ |
| Shinfield Park, | Tel: (+44 118) 9499353 | // \\// \\ |
| Reading, | (+44 118) 9499000 x 2353 | ECMWF |
| Berkshire, | Fax: (+44 118) 9869450 | ECMWF |
| RG2 9AX, | | \\ //\\ // |
| UK | URL: http://www.ecmwf.int/ | \\=// \\=// |
+--+--------------+---------------------------------+----------------+-+
| ECMWF is the European Centre for Medium-Range Weather Forecasts |
+-----------------------------------------------------------------+