On 9/22/06, Chuming Chen <chumingchen_at_[hidden]> wrote:
> Dear LAM list.
>
> I am trying to set up R SNOW on Rocks cluster with LAM/MPI installed.
>
> cl <- makeCluster(2, type = "MPI") was the R command I used to create a
> cluster. When /opt/lam/gnu/etc/lam-bhost.def contains only "localhost",
> everything is fine. But once I add compute nodes to that file, the call
> just returns to the R prompt.
> I can see that lamboot succeeds and that lamd is running on the compute
> nodes when I run this command. I wonder whether this is related to
> LAM's configuration. Could the firewall on the compute nodes be blocking
> the subsequent communication? I think lamboot uses ssh, but later
> communication may not be ssh traffic.
>
> Do I need to have lam-bhost.def on all the compute nodes with the same
> entries?
Dear Chuming,
I have no experience with Rocks, but I've been using R with LAM/MPI
for over a year, and it works flawlessly. Firewalls (or some firewall
configurations) can cause problems. For the first trials, I'd suggest
disabling the firewall completely. After everything is up and running,
you can try setting the firewall up again (I think that you must not
restrict UDP between nodes, but I am not sure about this; my firewall
setups allow unrestricted traffic between the nodes concerned on the
specified interfaces).
Here are a few things to try:
- if you ssh as the user that runs lam, can you write to the
filesystem, in the right directory? Suppose your nodes are nodeA and
nodeB; can you do
nodeA > ssh nodeB 'touch dummyfile'
and
nodeB > ssh nodeA 'touch anotherdummyfile'
- run lamboot before launching R (of course, using the lam-bhost.def file
with the other nodes in it); did it work?
- what happens if you do "lamnodes"? does it show what it should?
- try running some very simple command with lam; something like
"lamexec C hostname"; what happens?
- now, start R
- do not start snow, but rather use Rmpi (this gives a little bit more control)
- now (I do not have it with me) look at the Rnews article that
explains Rmpi; there are a few very simple examples to try, like
computing the mean of a vector on each node. What happens?
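
The shell-side checks in the list above (everything before starting R) can be collected in a small script. This is only a rough sketch, not a tested recipe: nodeA and nodeB are the placeholder host names from the example, the host file path is the one from the original question, and DRY_RUN and the two function names are hypothetical helpers introduced here (they are not part of LAM). By default the script only prints the commands it would run.

```shell
#!/bin/sh
# Sketch of the pre-R checks; run it on the node where you would start R.
# DRY_RUN is a hypothetical switch: 1 (default) prints the commands only,
# 0 actually executes them.
DRY_RUN="${DRY_RUN:-1}"
# Host file path taken from the original question; adjust for your install.
BHOST="${BHOST:-/opt/lam/gnu/etc/lam-bhost.def}"

# Print a command and, unless in dry-run mode, execute it.
run() {
    echo "+ $*"
    if [ "$DRY_RUN" != "1" ]; then
        "$@"
    fi
}

check_cluster() {
    # 1. Can the user that runs lam write files on each node over ssh?
    for node in nodeA nodeB; do
        run ssh "$node" 'touch dummyfile' || return 1
    done
    # 2. Boot LAM using the host file that lists the compute nodes.
    run lamboot -v "$BHOST" || return 1
    # 3. Does the LAM universe show the nodes it should?
    run lamnodes || return 1
    # 4. Run a trivial command on every CPU in the universe.
    run lamexec C hostname || return 1
}

check_cluster
```

Running it with DRY_RUN=0 stops at the first step that fails, which should at least show whether the problem is in ssh, in lamboot, or in the later LAM communication.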
HTH,
R.
>
> Thank you for your help.
>
> Chuming
> _______________________________________________
> This list is archived at http://www.lam-mpi.org/MailArchives/lam/51
>
--
Ramon Diaz-Uriarte
Bioinformatics Unit
Spanish National Cancer Centre (CNIO)
http://ligarto.org/rdiaz