LAM/MPI Development Mailing List Archives

From: Jeff Squyres (jsquyres_at_[hidden])
Date: 2005-12-15 10:07:31


On Dec 15, 2005, at 9:00 AM, Samuel Richard wrote:

>> Perhaps I can answer this question, but I am not sure ;)
>>
>> For the TCP RPI of LAM-MPI, it is the TCP/IP protocol itself, not
>> LAM-MPI, that is responsible for buffering incoming messages.
>> LAM-MPI's TCP RPI only uses the TCP/IP socket interface to
>> achieve point-to-point message transfer; the actual packet
>> receiving and sending is managed by the underlying TCP/IP protocol.
>> In the TCP/IP stack, when a packet arrives, the NIC driver
>> allocates an sk_buff structure and places the packet into it,
>> and then the network-layer protocol has the chance to
>> deal with it. I think perhaps this is the buffering mechanism you
>> referred to.

This is some of the buffering (it's what I referred to as the kernel
buffering). LAM also uses rendezvous protocols so that little
additional buffering on the receiver is necessary for long messages.
Please see my first response:
http://www.lam-mpi.org/MailArchives/lam-devel/2005/12/0431.php

> If what you say is correct, can anybody tell me what the difference is
> between sendR and sendB sends using the TCP RPI?

I assume you mean the "ready" and "buffered" MPI sending modes.

> I think, according to some experiments, that when I send messages using
> the sendR method, messages are buffered by TCP and read from this buffer
> at the speed of my application (each time a read is posted).

In LAM, a "ready" send is identical to a "standard" send. So the
buffering is the same as what has already been described.

> What I wonder is what happens with the sendB method?
> Reading the LAM documentation, it seems to me that the messages should be
> buffered in the LAM buffer (attached with MPI_Buffer_attach)?

Yes. Although there is some murkiness in the definition of a
buffered send in the MPI standard, this is pretty much what happens.

Keep in mind that the buffered send mode is a pretty special case,
and we strongly discourage its use. It forces most MPI
implementations to make a secondary memcpy, and therefore simply adds
overhead.

> (in this case
> at which speed are they collected)

Messages are copied into the buffer when you call MPI_BSEND. See the
MPI definition of MPI_BSEND -- by definition, the user's buffer has
to be re-usable when it returns.
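
To make the copy semantics concrete, here is a toy model in plain C. It is
not LAM's code and not the MPI API: the names attached_buffer and
bsend_stage are hypothetical, and a real attached buffer also needs
MPI_BSEND_OVERHEAD bytes of bookkeeping per message, which this sketch
ignores. It only shows why the user's buffer is reusable on return: the
payload has already been copied out of it.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the memory handed to MPI_Buffer_attach. */
struct attached_buffer {
    char  *base;  /* start of the attached memory */
    size_t size;  /* total capacity in bytes */
    size_t used;  /* bytes already occupied by staged messages */
};

/* Stage a message for a buffered send: copy it into the attached buffer
 * so the caller may overwrite msg as soon as this returns.
 * Returns a pointer to the staged copy, or NULL if there is no room
 * (a real MPI_Bsend reports an error in that situation). */
static const char *bsend_stage(struct attached_buffer *ab,
                               const void *msg, size_t len)
{
    assert(ab->used <= ab->size);
    if (len > ab->size - ab->used)
        return NULL;

    char *slot = ab->base + ab->used;
    memcpy(slot, msg, len);  /* the extra copy that buffered mode costs */
    ab->used += len;
    return slot;
}
```

After bsend_stage returns, changing the original message has no effect on
the staged copy, which is the property the standard requires of MPI_BSEND.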

> and then the LAM buffer should be
> emptied by my application each time a receive is posted.

Not entirely. Once the messages are copied into the buffer, they are
treated with normal progression rules. I.e., they are progressed the
same as any non-blocking request (e.g., via MPI_ISEND), except that
there is no user-level MPI_Request handle that can tell you when the
send has completed. In particular, if the message is small enough,
it will be sent eagerly. If it is too large, it will be sent with a
rendezvous protocol.
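
The size-based choice can be sketched as below. This is an illustration
only: choose_protocol and eager_limit are hypothetical names, the actual
threshold is whatever the RPI is configured with, and the comments describe
the general shape of each protocol rather than LAM's exact wire format.

```c
#include <assert.h>
#include <stddef.h>

enum send_protocol {
    SEND_EAGER,      /* push envelope and payload at once; the receiver
                        may have to buffer an unexpected message */
    SEND_RENDEZVOUS  /* send the envelope first and ship the payload only
                        after the receiver signals that a matching
                        receive is ready */
};

/* Pick a protocol from the message length alone.
 * eager_limit is an assumed, caller-supplied threshold in bytes. */
static enum send_protocol choose_protocol(size_t len, size_t eager_limit)
{
    return len <= eager_limit ? SEND_EAGER : SEND_RENDEZVOUS;
}
```

The same decision applies whether the request came from a standard, ready,
or buffered send; only the source of the bytes differs.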

> Is that correct?
> If this is correct, how are the messages copied from the TCP buffer to
> the LAM buffer:
> are there two threads of execution, one for the MPI application and one
> for LAM, which manages LAM requests?

LAM is single threaded. Progress only occurs when the user calls
into the MPI library. The main progression engine uses select() to
see what sockets have data available, and calls read() on the ones
that are ready. The exact state machine is part of
ssi_rpi_tcp_low.c, and is quite complex because TCP allows for
partial reads and writes -- so the entire state machine may be
interrupted at any time and have to resume at the next call to the
progression engine.
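
A much-simplified sketch of one such resumable step, in plain C, may help.
It is not taken from ssi_rpi_tcp_low.c; recv_state and progress_once are
hypothetical names, and it handles a single descriptor and a single
fixed-length message. The point is only that the byte count lives in the
state structure, so a partial read can be picked up again on the next call.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/select.h>
#include <unistd.h>

/* Hypothetical per-connection receive state; it persists between calls. */
struct recv_state {
    int    fd;    /* descriptor to read from (a socket in the TCP case) */
    char  *buf;   /* destination for the message */
    size_t want;  /* total bytes expected */
    size_t got;   /* bytes received so far */
};

/* One pass of a single-threaded progress step: poll with select(), then
 * read() whatever is available right now.
 * Returns 1 when the message is complete, 0 if more data is still
 * needed, and -1 on error or if the peer closed early. */
static int progress_once(struct recv_state *st)
{
    assert(st->got <= st->want);
    if (st->got == st->want)
        return 1;

    fd_set readable;
    FD_ZERO(&readable);
    FD_SET(st->fd, &readable);
    struct timeval no_wait = {0, 0};  /* poll only, never block */

    int n = select(st->fd + 1, &readable, NULL, NULL, &no_wait);
    if (n < 0)
        return errno == EINTR ? 0 : -1;
    if (n == 0 || !FD_ISSET(st->fd, &readable))
        return 0;  /* nothing to do; resume on the next call */

    ssize_t r = read(st->fd, st->buf + st->got, st->want - st->got);
    if (r < 0)
        return (errno == EAGAIN || errno == EWOULDBLOCK ||
                errno == EINTR) ? 0 : -1;
    if (r == 0)
        return -1;  /* end of stream before the full message arrived */

    st->got += (size_t)r;  /* may still be short of want */
    return st->got == st->want;
}
```

Because nothing runs in the background, a message that arrives in two
pieces needs at least two calls to progress_once before it is complete,
which mirrors progress happening only inside MPI calls.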

--
{+} Jeff Squyres
{+} The Open MPI Project
{+} http://www.open-mpi.org/