
LAM/MPI General User's Mailing List Archives


From: SCIPIONI Roberto (SCIPIONI.Roberto_at_[hidden])
Date: 2009-02-04 01:27:30


If I understand correctly, you are using only MPI tasks,
so for example you would have

8 MPI tasks on one node

or 8 MPI tasks on node 1 + 8 MPI tasks on node 2

Unfortunately, the tasks have to communicate between the nodes, and that communication time is what creates
the delay (the elapsed time), as you have already noticed, whereas the CPU time decreases.

The solution is to use different kinds of parallelization inside the nodes (intranode) and between the nodes
(internode).

Inside a node you basically have a shared-memory situation, and having many MPI tasks there is normally not a good idea. You should rather run threads inside each node and MPI tasks across the different nodes.

Therefore, if you have two nodes with 8 cores each, you should have:

node 1: 1 MPI task, 8 threads

node 2: 1 MPI task, 8 threads
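
To give a rough feeling for why this layout reduces network traffic, here is a small Python sketch that counts how many pairs of MPI tasks sit on different nodes. It assumes block placement of tasks on nodes and a communication pattern in which every task may talk to every other task; both are assumptions for illustration, not measurements of VASP, and the function name `cross_node_pairs` is made up.

```python
from itertools import combinations


def cross_node_pairs(nodes, tasks_per_node):
    """Count pairs of MPI tasks that are placed on different nodes."""
    # Assumed block placement: task r runs on node r // tasks_per_node.
    placement = [r // tasks_per_node for r in range(nodes * tasks_per_node)]
    return sum(1 for a, b in combinations(placement, 2) if a != b)


# Pure MPI: 2 nodes, 8 single-threaded MPI tasks per node.
pure_mpi = cross_node_pairs(nodes=2, tasks_per_node=8)
# Hybrid: 2 nodes, 1 MPI task per node (8 threads inside each task).
hybrid = cross_node_pairs(nodes=2, tasks_per_node=1)

print("pure MPI cross-node pairs:", pure_mpi)
print("hybrid cross-node pairs:  ", hybrid)
```

With these assumptions, the pure MPI layout has 64 task pairs that can only communicate over the network, while the hybrid layout has a single one.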

LAM/MPI does support installation with threads, but the best solution would be to install OpenMP, which has extensive and reliable thread support.

Hope this helps

Regards

Roberto Scipioni
ICYS Research Fellow
ICYS-CLUSTER Manager

Dear All,
 
I successfully set up an MPI environment on our cluster using the lam-7.1.4 package. After using the "lamboot" command to start the computing nodes, I execute the parallel program "VASP" like this: "mpirun -np 16 ./vasp". When the computation has finished, I always get this error: "bufferd (getroute): invalid node". I don't know what can cause this problem.
 
Another serious problem, which may be related to the LAM/MPI settings: when I use one node (with 8 CPU cores), the computation becomes faster as the number of CPU cores increases. But when I use more than one node, the computation becomes slower as the number of nodes increases. This is a real headache.
 
The hardware and software configuration on each node is listed below:

Intel Xeon E5420 2.5 GHz CPU (2*4 cores), 4 GB memory, 146 GB disk space, 1 Gb network card and 1 Gb switch

Suse Linux 10.0
Intel Fortran compiler 10.1.021
LAM/MPI 7.1.4
BLAS: Supplied by Intel MKL 10.1.0.015
LAPACK: Supplied by Intel MKL 10.1.0.015

Here are the timing results for a VASP benchmark file:
 
One node (8 cores):

Total CPU time used (sec): 19.897
User time (sec): 19.717
System time (sec): 0.180
Elapsed time (sec): 19.916

Two nodes (16 cores):

Total CPU time used (sec): 15.069
User time (sec): 11.309
System time (sec): 3.760
Elapsed time (sec): 91.696

It is obvious that the total CPU time has decreased, but the elapsed time has increased greatly. I have checked the utilization of each CPU: when running on one node, the value is almost 100%; on two nodes, the value is less than 20%.
 
Why is computing with two nodes slower than with one node? Can anyone give me a solution?
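
The observed CPU utilization can be cross-checked against the timings above with a short Python calculation. This is only a sketch: it assumes that the reported "Total CPU time used" is per process, so that CPU time divided by elapsed time approximates the busy fraction of one core. The function name `cpu_utilization` is made up for illustration.

```python
def cpu_utilization(cpu_seconds, elapsed_seconds):
    """Fraction of the elapsed time that the process spent on the CPU."""
    return cpu_seconds / elapsed_seconds


# Timings copied from the benchmark output above (seconds).
one_node = cpu_utilization(cpu_seconds=19.897, elapsed_seconds=19.916)
two_nodes = cpu_utilization(cpu_seconds=15.069, elapsed_seconds=91.696)

# How much longer the two-node run takes in wall-clock time.
slowdown = 91.696 / 19.916

print(f"one node:  {one_node:.1%} busy")
print(f"two nodes: {two_nodes:.1%} busy")
print(f"two-node run is {slowdown:.1f}x slower in elapsed time")
```

Under that assumption the ratios come out at roughly 99.9% for one node and 16.4% for two nodes, which matches the utilization I see, and the two-node run takes about 4.6 times longer in elapsed time. So on two nodes the processes spend most of the time waiting rather than computing.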
 
I really need your help. Thanks in advance!
 
 
With best regards
 
Fanghz



_______________________________________________
This list is archived at http://www.lam-mpi.org/MailArchives/lam/