Dear All,

I have successfully set up an MPI environment on our cluster using the lam-7.1.4 package. After booting the compute nodes with the "lamboot" command, I run the parallel program VASP like this: "mpirun -np 16 ./vasp". When the calculation finishes, I always get this error: "bufferd (getroute): invalid node". I don't know what causes this problem.

Another serious problem, which may be related to the LAM/MPI settings: when I use one node (with 8 CPU cores), the calculation gets faster as I use more cores. But when I use more than one node, the calculation gets slower as I add nodes. This is a real headache.

The hardware and software configuration of each node is listed below:
Intel Xeon E5420 2.5 GHz CPUs (2 x 4 cores), 4 GB memory, 146 GB disk space, 1 Gbit network card and 1 Gbit switch
SUSE Linux 10.0
Intel Fortran compiler 10.1.021
LAM/MPI 7.1.4
BLAS: supplied by Intel MKL 10.1.0.015
LAPACK: supplied by Intel MKL 10.1.0.015
Here are the timing results from a VASP benchmark run:

One node (8 cores):
Total CPU time used (sec): 19.897
User time (sec): 19.717
System time (sec): 0.180
Elapsed time (sec): 19.916
Two nodes (16 cores):
Total CPU time used (sec): 15.069
User time (sec): 11.309
System time (sec): 3.760
Elapsed time (sec): 91.696
Clearly the total CPU time has decreased, but the elapsed time has increased greatly. I have checked the CPU utilization of each core: when running on one node it is almost 100%, while on two nodes it is less than 20%.

Why is the calculation slower on two nodes than on one? Can anyone suggest a solution?
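
To put numbers on this, here is a small Python sketch that computes the ratio of CPU time to elapsed time from the timings above. It assumes the reported times refer to a single process, which I have not verified; the function name is just my own.

```python
def cpu_utilization(cpu_seconds, elapsed_seconds):
    """Fraction of wall-clock time the process spent on the CPU."""
    return cpu_seconds / elapsed_seconds

# Timings copied from the VASP output above:
# (total CPU time, elapsed time), both in seconds.
# Assumption: both values are reported for one process.
runs = {
    "one node (8 cores)": (19.897, 19.916),
    "two nodes (16 cores)": (15.069, 91.696),
}

for label, (cpu, elapsed) in runs.items():
    print(f"{label}: {cpu_utilization(cpu, elapsed):.1%} busy")

# How much longer the two-node run takes in wall-clock time.
slowdown = runs["two nodes (16 cores)"][1] / runs["one node (8 cores)"][1]
print(f"elapsed-time ratio, two nodes vs one: {slowdown:.1f}x")
```

This gives about 99.9% for one node and about 16.4% for two nodes, which matches the CPU utilization I observed, and the two-node run takes about 4.6 times longer in elapsed time. So on two nodes the processes seem to spend most of their time waiting rather than computing.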

I really need your help. Thanks in advance!

With best regards,

Fanghz