On May 9, 2005, at 8:11 AM, Jiri Barnat wrote:
> I made progress with IMPI_HOST_NAME, thanks. However, I still have
> some questions ...
>
> 1) I am now able to start a computation on 2 computers (just the
> access-points of both clusters). However, the computation does not
> initialize properly, which I guess is due to the bug you mentioned.
> Am I right? I am using lam-7.2b1r10122, because using option
> --with-impi in the last stable release produces compile errors.
Yes, you are right. I haven't had the time to fix the IMPI startup bug
yet. :-( It's a quirk with the MPI collective algorithms and how they
initialize.
> 2) I am still not able to start a computation if I want local nodes to
> participate. I guess this is because local nodes of one cluster cannot
> access directly the access point of the other cluster where
> impi_server is running. So my question is which computer in the
> cluster runs impid, which represents IMPI client, and whether there is
> a way to change it. (Note that in my case impid must be run on the
> access-point of the cluster).
impid runs on the same node as rank 0 in MPI_COMM_WORLD. This will
likely be on the first node listed in your boot schema file (that you
lambooted with). So you probably want to make your access point node
be the first one in the file, and then mpirun with the first process on
that node (which is the default). So something like:
    access-point$ cat hosts
    access-point.cluster1.example.com
    node1.cluster1.example.com
    node2.cluster1.example.com
    access-point$ lamboot hosts
    LAM 7.1.1/MPI 2 C++/ROMIO - Indiana University
    access-point$ mpirun C --client <rank> <server> my_application
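
If your existing hosts file does not already list the access point first, something along these lines could reorder it before you lamboot. This is only a rough sketch: put_first is a made-up helper name, not a LAM command, and it assumes the file has one bare hostname per line (no "cpu=" options or comments).

```shell
# put_first HOST FILE
# Hypothetical helper (not part of LAM/MPI): print the boot schema in FILE
# with HOST moved to the first line, so that rank 0 lands on that node.
# Assumes one bare hostname per line.
put_first() {
    host=$1
    schema=$2
    printf '%s\n' "$host"
    # -v: invert match, -x: match whole lines only, -F: fixed string, no regex.
    # grep exits with 1 when nothing is left to print; treat that as success.
    grep -vxF -- "$host" "$schema" || [ $? -eq 1 ]
}
```

You would then run something like "put_first access-point.cluster1.example.com hosts > hosts.impi" and lamboot with hosts.impi instead of hosts.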
--
{+} Jeff Squyres
{+} jsquyres_at_[hidden]
{+} http://www.lam-mpi.org/