LAM/MPI General User's Mailing List Archives

From: Jeff Squyres (jsquyres_at_[hidden])
Date: 2007-01-24 08:14:04


LAM 7.1.2 has been around for a while and is quite mature. I'll
never claim that it is bug free :-), but if it were truncating
messages over 130 characters long, it is likely that someone else
would have run into this same problem by now.

Can you run your program through valgrind or some other memory-
checking debugger to ensure that you don't have some kind of memory
badness going on? Remember that the MPI portion of your program is
not the only place where memory badness can occur.
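
As one hypothetical illustration of memory badness outside the MPI calls themselves (an assumption about the kind of bug to look for, not a diagnosis of your program): a char buffer filled by a receive is not NUL-terminated unless the sender sent the terminator, so building a std::string from it without an explicit length can read past the end. A minimal sketch, where the helper name `buffer_to_string` is made up for illustration:

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: build a std::string from a raw receive buffer.
// `count` stands for the number of chars actually received (in MPI terms,
// roughly what status.Get_count(MPI::CHAR) would report).
std::string buffer_to_string(const std::vector<char>& buf, std::size_t count) {
    // Never read more than the buffer holds, even if `count` is wrong.
    const std::size_t n = std::min(count, buf.size());
    // Passing the length explicitly avoids relying on a terminating '\0'.
    return std::string(buf.data(), n);
}
```

If the string were instead built with `std::string(buf.data())`, its length would depend on wherever the next zero byte happens to sit in memory, which is the kind of error valgrind is good at flagging.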

Additionally, you might want to double check for race conditions in
your receive logic. For example, you might want to check:

- that the iprobe has completed (i.e., reported a pending message) before
you receive
- that you receive the string from exactly the sender the iprobe reported
- that no other messages from that same sender arrive before the string
does
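
The checks above can be sketched with a toy in-memory mailbox. This is only an illustration of the matching logic, not the MPI API: `Mailbox`, `Envelope`, `Status`, and `receive_next` are hypothetical names, and the comments note which real MPI calls each step is meant to stand in for.

```cpp
#include <cstddef>
#include <deque>
#include <string>

// Toy stand-ins for illustration only; none of these are MPI types.
struct Envelope { int source; int tag; std::string payload; };
struct Status   { int source; int tag; std::size_t count; };

struct Mailbox {
    std::deque<Envelope> pending;

    // Stands in for Iprobe(ANY_SOURCE, ANY_TAG, status): reports the first
    // pending message without removing it. Returns false if nothing arrived.
    bool iprobe_any(Status& st) const {
        if (pending.empty()) return false;
        const Envelope& e = pending.front();
        st.source = e.source;
        st.tag = e.tag;
        st.count = e.payload.size();  // like status.Get_count(MPI::CHAR)
        return true;
    }

    // Stands in for Recv(buf, count, MPI::CHAR, source, tag): removes the
    // first message matching source and tag, copying at most `count` chars.
    bool recv(int source, int tag, std::size_t count, std::string& out) {
        for (std::deque<Envelope>::iterator it = pending.begin();
             it != pending.end(); ++it) {
            if (it->source == source && it->tag == tag) {
                out.assign(it->payload, 0, count);
                pending.erase(it);
                return true;
            }
        }
        return false;
    }
};

// Probe first, then receive from exactly the probed source and tag,
// sized by the probed count (no wildcards, no guessed lengths).
bool receive_next(Mailbox& box, std::string& out) {
    Status st;
    if (!box.iprobe_any(st)) return false;  // probe found nothing yet
    return box.recv(st.source, st.tag, st.count, out);
}
```

The point of the sketch is that the receive reuses the source, tag, and count reported by the probe, rather than wildcards or a fixed buffer size.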

On Jan 24, 2007, at 3:01 AM, Frédéric Menou wrote:

> I wrote a C++ class that wraps all of the MPI calls.
> I send messages the same way every time: MPI::Comm.Send(...),
> passing a Message (my own class, see below) that wraps a std::string and
> its length, as MPI::CHAR. So there's no problem on this side.
> On the receiving side I use IProbe + Recv to build a std::string.
>
> Messages are designed as a specific class containing all the
> data/metadata. Instances are passed as const references, so memory
> problems are excluded.
>
> I don't think I'll be able to build demo code that reproduces this
> behaviour.
>
> I've worked around my problem with a trick: I found a way to reduce the
> length of my messages.
> But that does not solve the underlying problem...
>
> On 1/23/07, Elie Choueiri <elie.choueiri_at_[hidden]> wrote:
>> Could be you're sending (or receiving) some specified size, which may
>> not be correct.
>>
>> Like sending 10 bytes, receiving 20, allocating 30, and
>> initializing 5.
>>
>>
>>
>> On 1/22/07, Frédéric Menou <frederic.menou_at_[hidden]> wrote:
>>> Oops! I forgot to give my local configuration:
>>>
>>> " LAM/MPI: 7.1.2
>>> Prefix: /usr/local
>>> Architecture: i686-pc-linux-gnu
>>> Configured by: fmenou
>>> Configured on: Fri Jan 12 18:21:45 CET 2007
>>> Configure host: fmenou-laptop
>>> Memory manager: ptmalloc2
>>> C bindings: yes
>>> C++ bindings: yes
>>> Fortran bindings: no
>>> C compiler: gcc
>>> C++ compiler: g++
>>> Fortran compiler: false
>>> Fortran symbols: none
>>> C profiling: yes
>>> C++ profiling: yes
>>> Fortran profiling: no
>>> C++ exceptions: no
>>> Thread support: yes
>>> ROMIO support: yes
>>> IMPI support: no
>>> Debug support: no
>>> Purify clean: no
>>> SSI boot: globus (API v1.1, Module v0.6)
>>> SSI boot: rsh (API v1.1, Module v1.1)
>>> SSI boot: slurm (API v1.1, Module v1.0)
>>> SSI coll: lam_basic (API v1.1, Module v7.1)
>>> SSI coll: shmem (API v1.1, Module v1.0)
>>> SSI coll: smp (API v1.1, Module v1.2)
>>> SSI rpi: crtcp (API v1.1, Module v1.1)
>>> SSI rpi: lamd (API v1.0, Module v7.1)
>>> SSI rpi: sysv (API v1.0, Module v7.1)
>>> SSI rpi: tcp (API v1.0, Module v7.1)
>>> SSI rpi: usysv (API v1.0, Module v7.1)
>>> SSI cr: self (API v1.0, Module v1.0)"
>>>
>>> On 1/22/07, Frédéric Menou <frederic.menou_at_[hidden]> wrote:
>>>> Hi everyone!
>>>>
>>>> (I'm not very experienced in MPI and I didn't check all bugs of
>>>> lam7.1.2 so maybe the problem I want to explain to you is already
>>>> solved)
>>>>
>>>> So,
>>>>
>>>> I have one master MPI program, called M, which spawns slaves of 4
>>>> types A, B, C, D, with card(A) = n and card(B) = card(C) = card(D) = p,
>>>> therefore Size(intercomm) = 1+n+3*p.
>>>>
>>>> After spawning, M sends some initialization messages to the A's, B's,
>>>> C's and D's depending on their kind. The A's are supposed to receive
>>>> 'long' messages (over 100 chars) while the others only receive
>>>> small messages (<10 chars).
>>>>
>>>> If p >= 5, 'long' messages are truncated with a strange character at
>>>> position 130.
>>>> Messages shorter than 130 are not truncated.
>>>>
>>>> This "bug" (maybe it's my fault, but I really don't know where it could
>>>> come from) is extremely reproducible. It only appears when p >= 5.
>>>>
>>>> One solution would be to split the messages and so bypass the bogus
>>>> truncation limit of 130, but I'd rather not have to...
>>>>
>>>>
>>>> Any help or feedback would be great :)
>>>> --
>>>> Frédéric Menou
>>>>
>>>> _______________________________________________
>>>> This list is archived at http://www.lam-mpi.org/MailArchives/lam/
>>>>
>>>
>>>
>>> --
>>> Frédéric Menou
>>>
>>>
>>
>>
>> --
>> (N)E
>>
>>
>
>
> --
> Frédéric Menou
>

-- 
Jeff Squyres
Server Virtualization Business Unit
Cisco Systems