Hi there,
I get an error message when I run mpirun on my 16-CPU cluster.
The command: mpirun -np 16 -tcp_buffers 262144 -tcp_long 131094 /home/......
After running for about 5 hours, the job aborts and shows the error message. Increasing the value of -tcp_buffers does not help; the problem is still there.
By the way, my switch is a 3Com 16-port dual-speed 3C16735B.
Does anyone know how to solve this problem? Thanks.
_______________________________________________
This list is archived at http://www.lam-mpi.org/MailArchives/lam/