
LAM/MPI General User's Mailing List Archives


From: McCalla, Mac (macmccalla_at_[hidden])
Date: 2003-07-24 12:12:19


Jeff Squyres wrote:
> I'm not entirely clear what you're showing us in the output, above.
> The only output that looks like it's coming from LAM is the "invalid
> address tag" error.

Sorry for my lack of clarity. I'll try again. What I am looking for is
the cause of the invalid address tag errors. These occur sporadically,
with many successful executions of mpirun occurring in between, and
they occur on different nodes as well, although in the previously
mentioned 3 occurrences, 2 were on the same node. In all cases,
retrying the exact same lamboot or mpirun command was successful. I
mentioned the pbind and ypbindproc errors because I thought they might
indicate NIS problems, which could be helpful to know about.

>The invalid address tag is an odd one -- it's actually a lamd error
>indicating that there was some kind of problem in the LAM
>session directory.

Could there be a connection between an NIS error and accessing the
session directory (which is local by the way)?

>Is the user's job script running mpirun a large number of times?

Yes, hundreds. Is there some inherent limit on the number of mpirun
executions per lamboot?

thanks for your time,
mac mccalla