Hi,
I was looking for the paper:
Brian Barrett, Jeff Squyres, Andrew Lumsdaine. LAM/MPI Design
Document. Open Systems Laboratory, Pervasive Technology Labs,
Indiana University.
It is mentioned in some LAM publications, but I couldn't find it.
I would like to ask two questions about the LAM RTE (Run-Time
Environment):
(1) When lamboot is run, a set of n lamd daemons is started
on the nodes listed in the hostfile, defining a multiprocessor
virtual machine (is that right?). My question is: do the lamds
establish a fully connected mesh among themselves? Is this
done using TCP connections?
(2) Does an MPI process communicate with another MPI process using
lamd as an intermediary? I mean, does an MPI process open a TCP
connection to another MPI process on a remote (or even local)
node, or not? Or does each MPI process communicate with lamd using
a Unix pipe, while the lamds communicate among themselves using TCP?
Is this correct?
In fact, I have a third question: when I use the checkpoint/restart
support, mpirun loads two additional modules: CRLAM and CRMPI. Do
these modules coordinate their behavior among the nodes using UDP
or TCP? Do they open additional TCP connection pairs dedicated to
this function, or do they communicate through lamd?
If someone could help me answer these questions or give me
pointers to the answers, I'd appreciate it. At the moment, I'm in a
bit too much of a rush to dig deep into the MPI sources to look for
these details. Any hints will be helpful.
Thanks a lot.
ASC
___________________________________________________________________
CARISSIMI, Alexandre Universidade Federal do Rio Grande do Sul
asc_at_[hidden] Instituto de Informática
Tel: +55.51.33.16.61.69 Caixa Postal 15064
Fax: +55.51.33.16.73.08 CEP:91501-970 Porto Alegre - RS - Brasil
___________________________________________________________________