On Wed, 18 Jun 2003, Andrey Slepuhin wrote:
> I'm also interested in running lam-7.x with gm rpi. It compiles just
> fine, but the test application hangs in MPI_Init. The same thing happens
> for mpich over gm, so I think the problem is with gm-2.0. BTW, has
> anybody tested lam with gm-2.0 on other architectures?
I know that gm 2.0 is not quite stable yet. You might want to contact
Myricom and ask them about it directly (perhaps help_at_[hidden] -- that's
who I always ask :-).
Also, note that we still have a few remaining, persistent little buglets
in the gm RPI code (even in the most recent CVS snapshot -- #@%#$%#$%!!).
Hope to have those solved soon, but for the moment, it's not entirely
working. :-\
> And one more question: I'm thinking a lot about utilizing both onboard
> gigabit ethernet interfaces on Opteron motherboards. I can assign them
> different IP addresses and run an MPI application as if I had separate
> nodes instead of one dual-CPU node. But in that case we lose shmem
> communication between MPI processes sharing this node. How hard would it
> be to patch lam to enable binding separate MPI processes inside one node
> to different IP addresses?
My first reaction is that instead of tweaking LAM itself, you might want
to play with your IP routes similar to what the University of Toronto did
with their cluster (similar to yours: 2 NICs in each box). In such a
configuration, the OS is responsible for figuring out which NIC to talk
through, and LAM doesn't need to know anything about it. Toronto did a
variation on the University of Kentucky's work on Flat Neighborhood
Networks (FNNs). You might want to look up their work (www.aggregate.org)
for some pointers.
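To make the "let the OS pick the NIC" idea concrete, here is a rough Python
sketch of how a routing table chooses an outgoing interface by longest-prefix
match. It is only an illustration: the addresses, the interface names
(eth0/eth1), and the pick_interface() helper are made up for the example and
are not taken from the Toronto or Kentucky setups.

```python
import ipaddress

# Hypothetical routing table: (destination network, outgoing interface).
# All addresses and interface names below are assumptions for illustration.
ROUTES = [
    (ipaddress.ip_network("0.0.0.0/0"), "eth0"),       # default route
    (ipaddress.ip_network("192.168.1.0/24"), "eth0"),  # first subnet via NIC 0
    (ipaddress.ip_network("192.168.2.0/24"), "eth1"),  # second subnet via NIC 1
    (ipaddress.ip_network("192.168.1.7/32"), "eth1"),  # host route override
]


def pick_interface(dest, routes=ROUTES):
    """Return the interface of the most specific route that matches dest."""
    addr = ipaddress.ip_address(dest)
    matches = [(net.prefixlen, dev) for net, dev in routes if addr in net]
    if not matches:
        raise LookupError("no route to %s" % dest)
    # Longest-prefix match: the route with the largest prefix length wins.
    return max(matches, key=lambda m: m[0])[1]


if __name__ == "__main__":
    for peer in ("192.168.1.5", "192.168.1.7", "192.168.2.9", "10.0.0.1"):
        print(peer, "->", pick_interface(peer))
```

The point is that the application (LAM, in this case) only ever names a
destination address; which NIC carries the traffic falls out of the route
table, so per-peer host routes can spread traffic over both interfaces.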
Hope this helps...
--
{+} Jeff Squyres
{+} jsquyres_at_[hidden]
{+} http://www.lam-mpi.org/