Neil
The files exist and are readable. I stepped back a bit and tried running a
very simple hello.f file as written in the LAM manual. Even that program has
problems connecting to my cluster node.
Here is the text of another message I recently posted based on the outcome
of this simple test:
******************
I wrote the little test program hello.f and ran it on my dual-G5 (rotorx),
dual Xserve (xbot0) mini-cluster.
Lamboot works:
rotorx:/Volumes/bigscratch/runs/test jnt7$ lamboot -v lamhosts
LAM 7.1.1/MPI 2 C++/ROMIO - Indiana University
n-1<9804> ssi:boot:base:linear: booting n0 (rotorx)
n-1<9804> ssi:boot:base:linear: booting n1 (xbot0)
n-1<9804> ssi:boot:base:linear: finished
But when I try to launch a job I get the output:
mpirun: cannot start hello on n1: No such file or directory
Running 4 instances on my head node only gives:
Hello, world! I am 0 of 4
Hello, world! I am 3 of 4
Hello, world! I am 1 of 4
Hello, world! I am 2 of 4
Passwordless rsh works in both directions.
***************
I looked in /private/tmp, and on both the head node and the cluster node
a directory is created according to the usual convention, i.e.
lam-jnt7_at_xbot0 and lam-jnt7_at_rotorx.
One issue might be that my Xserve is running OSX Server while my head node
is running regular OSX. The user on the Xserve also has the same UID and GID
as on the head node.
Johannes
Johannes,
I realise this may be stating the obvious, but it looks like there is a
problem with the 2 "grid" files in "od_scratch". Can you first check that
they exist and are readable by you, both on your MAC and on your XSERVE
system?
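
One quick way to run that check is a small script, run from the job's
working directory on each machine in turn. This is only a sketch: the
function name check_readable is made up for illustration, and the two file
names are taken from the error messages quoted below. It reports what the
local user can see and does not test anything across the network.

```python
import os


def check_readable(paths):
    """Map each path to 'ok', 'missing', or 'unreadable' for the current user."""
    status = {}
    for path in paths:
        if not os.path.isfile(path):
            # Not present (or not a regular file) from this node's point of view.
            status[path] = "missing"
        elif not os.access(path, os.R_OK):
            # Present, but the current user lacks read permission.
            status[path] = "unreadable"
        else:
            status[path] = "ok"
    return status


if __name__ == "__main__":
    # Relative paths, so run this from the directory the job is started in.
    grid_files = ["od_scratch/grid.14", "od_scratch/grid.15"]
    for path, state in check_readable(grid_files).items():
        print(f"{path}: {state}")
```

If a file shows up as "ok" on one machine and "missing" on the other, the
two machines are not seeing the same directory at that path.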
Johannes Theron wrote:
>
> When the computational job actually starts (the Xserve (xbot0) needs to
> read these files from the head node (rotorx)), I get the following error:
>
> ***************
> ** ERROR ** UNABLE TO OPEN GRID FILE od_scratch/grid.14
>
> STOP_ALL called from routine GRID_READ, group 3
>
>
> ** ERROR ** UNABLE TO OPEN GRID FILE od_scratch/grid.15
>
> STOP_ALL called from routine GRID_READ, group 4
> ****************
>
Regards
Neil
--
+-----------------+---------------------------------+------------------+
| Neil Storer | Head: Systems S/W Section | Operations Dept. |
+-----------------+---------------------------------+------------------+
| ECMWF, | email: neil.storer_at_[hidden] | //=\\ //=\\ |
| Shinfield Park, | Tel: (+44 118) 9499353 | // \\// \\ |
| Reading, | (+44 118) 9499000 x 2353 | ECMWF |
| Berkshire,
|