LAM/MPI General User's Mailing List Archives

From: Jeff Squyres (jsquyres_at_[hidden])
Date: 2006-01-26 08:33:27


This thread moved to the lam-devel list.

On Jan 25, 2006, at 4:36 PM, wchao_at_[hidden] wrote:

> Hi, there,
>
> I am adding a feature to LAM, and the new feature runs as a single
> process of lamd. So I defined a priority for it:
> #define PRNEW PRDAEMON
> and call kinit(PRNEW) in the new process.
>
> In the new code, I use nsend()/ntry_recv() to communicate between the
> nodes:
>
> LAM_ZERO_ME(outgoing);
> outgoing.nh_node = destination; // 0, 1, 2, or 3 in the 4-node test environment
> outgoing.nh_event = EVNEW;
> outgoing.nh_type = 0;
> outgoing.nh_flags = 0;
> outgoing.nh_length = strlen(msg) + 1;
> outgoing.nh_msg = msg;
>
> nsend(&outgoing);
>
> ...
>
> LAM_ZERO_ME(incoming);
> memset((void*) msg, 0, 256);
> incoming.nh_event = EVNEW;
> incoming.nh_flags = 0;
> incoming.nh_msg = msg;
> incoming.nh_length = 256;
> incoming.nh_type = 0;
>
> while(ntry_recv(&incoming) == 0){
>
> Sometimes nsend()/ntry_recv() works, and all messages between the 4
> nodes are sent and received.
>
> But most of the time, during the message exchange, either a message
> is sent and the receiver never receives it, or a sender blocks in
> nsend() even though the receiver is reachable with tping.
>
> I tried adjusting the priority of the new process, changing nh_type
> and nh_event, and using nrecv() instead of ntry_recv(), but none of
> that fixed the issue. It seems something is wrong with the event
> queue: a message sent from the new process is picked up by another
> process of the lamd, although nh_event should have prevented that.
> I'm really confused here.
>
> So, what's wrong with it? Is my use of nsend()/nrecv() correct, or am
> I missing something?
>
> Any comments and suggestions are welcome! Thanks!
>
> Chao
>
>
>
> _______________________________________________
> This list is archived at http://www.lam-mpi.org/MailArchives/lam/

-- 
{+} Jeff Squyres
{+} The Open MPI Project
{+} http://www.open-mpi.org/