Hi there,
I am adding a feature to LAM, and the new feature runs as a single
process of lamd. So I defined a priority for it:
#define PRNEW PRDAEMON
and call kinit(PRNEW) in the new process.
In the new code, I use nsend()/ntry_recv() to communicate between the nodes:
LAM_ZERO_ME(outgoing);
outgoing.nh_node = destination; // 0, 1, 2, or 3 in the 4-node test environment
outgoing.nh_event = EVNEW;
outgoing.nh_type = 0;
outgoing.nh_flags = 0;
outgoing.nh_length = strlen(msg) + 1;
outgoing.nh_msg = msg;
nsend(&outgoing);
...
LAM_ZERO_ME(incoming);
memset((void*) msg, 0, 256);
incoming.nh_event = EVNEW;
incoming.nh_flags = 0;
incoming.nh_msg = msg;
incoming.nh_length = 256;
incoming.nh_type = 0;
while(ntry_recv(&incoming) == 0){
Sometimes nsend()/ntry_recv() works, and all messages between the 4
nodes are sent and received.
But most of the time, some messages are sent and never arrive at the
receiver, or nsend() blocks even though the receiving node is reachable
with tping.
I tried adjusting the priority of the new process, changing nh_type
and nh_event, and using nrecv() instead of ntry_recv(), but none of that
fixed the issue. It seems something is wrong with the event queue: messages
sent from the new process are picked up by other processes of lamd, although
the nh_event value should have prevented that. I'm really confused here.
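To make clear what I expected from nh_event, here is a small standalone sketch of the matching rule as I understand it. It is not LAM code: struct sync_key, would_match(), check_cases() and the event values 1000 and 2000 are made up for illustration, and the rule itself (equal events, and types either both zero or sharing at least one bit) is my assumption about how lamd pairs a sender with a receiver.

```c
#include <assert.h>

/* Hypothetical stand-in for the two synchronization fields of a message. */
struct sync_key {
    int event;  /* plays the role of nh_event */
    int type;   /* plays the role of nh_type */
};

/* Returns 1 if sender and receiver would pair up under the ASSUMED rule:
 * the events are equal, and the types are both zero or share a bit. */
int would_match(struct sync_key sender, struct sync_key receiver)
{
    if (sender.event != receiver.event)
        return 0;
    if (sender.type == 0 && receiver.type == 0)
        return 1;
    return (sender.type & receiver.type) != 0;
}

/* The cases I have in mind; 1000 stands in for EVNEW, 2000 for any
 * other event used elsewhere in lamd (both values are made up). */
void check_cases(void)
{
    struct sync_key mine      = { 1000, 0 };
    struct sync_key my_peer   = { 1000, 0 };
    struct sync_key other_evt = { 2000, 0 };
    struct sync_key typed     = { 1000, 4 };

    assert(would_match(mine, my_peer) == 1);    /* same event, both types 0 */
    assert(would_match(mine, other_evt) == 0);  /* different event */
    assert(would_match(mine, typed) == 0);      /* types 0 and 4 share no bit */
    assert(would_match(typed, typed) == 1);     /* same non-zero type */
}
```

If this rule is right, a process receiving on a different event should never see my messages, which is why the behavior above surprises me.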
So, what's wrong here? Am I using nsend()/nrecv() correctly, or am I
missing something?
Any comments and suggestions are welcome! Thanks!
Chao