Try some simple tests:
- Does "tping -c 3" run successfully? (It should ping all the lamd's)
[ter@uftoscar test]$ tping -c 3 n0-13
1 byte from 13 remote nodes and 1 local node: 0.006 secs
1 byte from 13 remote nodes and 1 local node: 0.005 secs
1 byte from 13 remote nodes and 1 local node: 0.005 secs
3 messages, 3 bytes (0.003K), 0.016 secs (0.368K/sec)
roundtrip min/avg/max: 0.005/0.005/0.006
- Does "lamexec N hostname" run successfully? (It should run
"hostname" on all the booted nodes)
No, it doesn't work. It only shows the head node's hostname. See below:
[ter@uftoscar ~]$ lamexec N hostname
uftoscar.latech
<the command freezes here>
However, I can run "cexec hostname" with no problem.
- When you "mpirun -np 15 ring.out", do you see ring.out executing on
all the nodes? (i.e., if you ssh into each of the nodes and run ps,
do you see it running?)
I only see one ring.out running, on the head node; no ring.out is running on the other nodes.
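For reference, the per-node "ssh in and run ps" check can be scripted roughly as follows. This is only a sketch: the function names count_procs and check_nodes are made up for illustration, it assumes passwordless ssh to each node, and the node names in the usage line are placeholders, not the real hostnames of this cluster.

```shell
#!/bin/sh
# Count the lines on stdin that are exactly the given command name.
# Intended input: the output of "ps -e -o comm=" (one command name per line).
count_procs() {
    awk -v name="$1" '$1 == name { n++ } END { print n + 0 }'
}

# Hypothetical helper: print how many copies of a program run on each node.
# Usage: check_nodes PROGRAM NODE [NODE ...]
# Assumes passwordless ssh to every node listed.
check_nodes() {
    prog="$1"
    shift
    for node in "$@"; do
        # -n keeps ssh from reading stdin inside the loop
        n=$(ssh -n "$node" ps -e -o comm= | count_procs "$prog")
        echo "$node: $n"
    done
}
```

It would be called as, for example, "check_nodes ring.out node1 node2" (placeholder node names). A count of 0 on every compute node would match what I see by hand.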
Thanks
Kulathep