Hi Jeff,
Thanks for the reply. Here is the information:
1. Yes, I ran the tests on the same node I ran lamboot from.
2. Yes, /lamtests-6.5.9 is NFS-mounted on all nodes, at exactly the same
path on every node, including the nodes it is exported from.
3. /lamtests-6.5.9/reporting/collector exists on the node I ran the tests
from, and has permissions 750.
4. I can't run the collector manually. I even changed the mode to 755,
but still no luck.
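For anyone repeating checks 3 and 4 on their own nodes, here is a minimal
sketch of the local file check. `check_exec` is a hypothetical helper
written for this message, not part of LAM/MPI or the test suite, and it
only looks at the file as seen by the current user on the current node:

```shell
# Hypothetical helper (not part of LAM/MPI): report whether a file exists
# and is readable and executable by the current user on this node.
check_exec() {
    path=$1
    if [ ! -e "$path" ]; then
        echo "missing: $path"
        return 1
    fi
    if [ ! -r "$path" ] || [ ! -x "$path" ]; then
        echo "not readable/executable: $path"
        return 2
    fi
    echo "ok: $path"
    return 0
}
```

It could be called as `check_exec /lamtests-6.5.9/reporting/collector` on
each node. A passing check only rules out a missing file or missing
permission bits; it does not show that mpirun can start the program.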
This is really confusing me!
Thanks very much for your time!
Best
Yu
On Thu, 22 May 2003, Jeff Squyres wrote:
> On Thu, 22 May 2003, Yu Chen wrote:
>
> > But when I run the lam-tests-6.5.9, it gives out:
> > ........
> > *** Testing -lamd mode ***
> > make[2]: Entering directory `/lamtests-6.5.9/ccl'
> > mpirun -x TEST -np 2 -s h -lamd -O /lamtests-6.5.9/ccl/allgather
> > mpirun: cannot start ../reporting/collector on n0 (o): No such file or
> > directory
> > mpirun -x TEST -np 2 -s h -lamd -O /lamtests-6.5.9/ccl/allreduce
> > mpirun: cannot start ../reporting/collector on n0 (o): No such file or
> > directory
> > mpirun -x TEST -np 2 -s h -lamd -O /lamtests-6.5.9/ccl/alltoall
> > mpirun: cannot start ../reporting/collector on n0 (o): No such file or
> > directory
> > .......
>
> That's quite odd, and clearly shouldn't be happening.
>
> > But the file is actually there, with the right permissions, and on NFS.
> > I am really lost here. I would highly appreciate it if anyone could give
> > me some advice.
>
> Note that the tests are actually running to completion (apparently
> successfully) -- the collector is simply a program that runs to collect
> any errors that may have occurred on remote nodes. I'm guessing that this
> is simply a minor error in the testing harness; I'm sure that it does not
> indicate that your LAM/MPI installation is broken.
>
> But I don't know why this is happening offhand, so let me ask a few
> questions:
>
> - are you running the test suite on the same node that you lambooted from?
>
> - is /lamtests-6.5.9 really NFS exported to, and mounted on all nodes?
>
> - does /lamtests-6.5.9/reporting/collector exist on the node that you ran
> the test suite on, and have (at least) permissions 555? (I know you
> mentioned this, but I want to nail down what the "right permissions" are)
>
> - can you manually run the collector? e.g.,
>
> cd /lamtests-6.5.9/reporting
> mpirun -s h N collector
> cd ../ccl
> mpirun -s h N ../reporting/collector
>
> --
> {+} Jeff Squyres
> {+} jsquyres_at_[hidden]
> {+} http://www.lam-mpi.org/
> _______________________________________________
> This list is archived at http://www.lam-mpi.org/MailArchives/lam/
>