In a parallel program we define the speedup, which is the ratio between the
sequential execution time (ts) of the algorithm and the parallel execution
time (tp); we write: S = ts/tp
The larger S is, the better the parallelism, but S never exceeds p (the
number of processors).
tp is made up of communication time (tcomm), idle time (tidle), and
execution time (texe):
tp = (tcomm + tidle + texe)/p
Of course texe = ts.
Then S = p*ts/(tcomm + tidle + texe) = p*ts/(tcomm + tidle + ts)
Now, which is faster: the parallel or the sequential algorithm? We cannot
answer immediately; it all depends on how S behaves with the number of
processors and the problem size. If S > 1, the parallel algorithm
outperforms the sequential one; if S < 1, the sequential algorithm
outperforms the parallel one (meaning it is better to run the sequential
algorithm, because tcomm and tidle are too large).
Example:
Take a parallel summation. Suppose tcomm = 0.01*p (to combine the partial
sums we use MPI_Reduce, whose execution time is proportional to the number
of processes), ts = 0.0001*n (n: number of loop iterations), and tidle = 0.
Then
S = 0.0001*n/(0.01 + 0 + 0.0001*n/p)
With p fixed at 10, we find the balance point near n = 111. If n < 111 the
sequential version is better; if n > 111 the parallel version is better
(though not by much); if n >> 111 the parallel version is clearly better.
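The balance point can be checked numerically. Here is a minimal sketch in
Python (the function name is mine; the constants 0.01 and 0.0001 and the
choice p = 10 are taken from the example above):

```python
def speedup(n, p, c_comm=0.01, c_exec=0.0001):
    """Speedup S = ts/tp for the summation model above.

    ts = c_exec * n                         (sequential time)
    tp = (c_comm * p + 0 + c_exec * n) / p  (parallel time, tidle = 0)
    """
    ts = c_exec * n
    tp = (c_comm * p + c_exec * n) / p
    return ts / tp

p = 10
# Smallest n for which the parallel version wins (S > 1).
break_even = next(n for n in range(1, 100_000) if speedup(n, p) > 1.0)
print(break_even)                    # first n past the balance point n = 111
print(round(speedup(10, p), 3))      # small n: S far below 1, sequential wins
print(round(speedup(10_000, p), 3))  # n >> 111: S approaches p
```

Scanning n like this also shows that S saturates near p as n grows, since
the fixed tcomm term becomes negligible next to ts/p.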
----- Original Message -----
From: <dburbano_at_[hidden]>
To: <lam_at_[hidden]>
Sent: Tuesday, March 30, 2004 7:50 AM
Subject: Re: LAM: is that possible to reduce the communication cost by
assigning processes on the same node
> When you run multiple processes on one processor and they want to
> communicate with each other, the only way they can communicate or do any
> computation is on that same processor; for that reason it takes more time
> than on many processors.
>
> For example, I have 4 processes and only one processor; they have to do
> some computation and some communication (reduce the information to process
> 0). The processes cannot communicate with each other right after doing
> their computation; each process (p0, p1, p2, p3) has to be executed on the
> single processor. The processor cannot execute 2 processes at the same
> time; it can only do so by time slicing. For that reason, the last
> processes in the queue have to wait until the first processes free the
> processor.
>
> The same happens with the communications: the processes need the processor
> to send or receive information. If you use MPI_Reduce and process p0 is
> the last one to get the processor, the first processes will try to send
> their information to p0, but p0 may not be ready.
>
> Now, what happens if you have four processors and four processes? There is
> one processor for each process; the computation is executed at the same
> time, they don't share processors, and they don't wait for a processor.
> When they finish their computation, they start sending information to each
> other. With MPI_Reduce, they send their information to one process (for
> example p0), which runs on its own processor. In this case the process
> receiving the information is ready to receive the data from the others
> (sometimes it is not ready, but that depends on many characteristics), and
> the others send their information at the same time (depending on the
> hardware configuration).
>
> This is one reason among many why the communication and computation time
> with many processes on many processors is less than with many processes on
> one processor.
>
>
> This is a good link:
>
> http://www.cs.rit.edu/~ncs/parallel.html#books
>
> thanks
>
>
>
> > Hi,
> >
> > I run them on a cluster of machines when I say processes are assigned to
> > multiple machines. It is obvious that the computation cost will be
> > increased if multiple processes run on the same host. But why is the
> > communication cost also increased a lot? Could you please give a
> > detailed explanation?
> >
> > thanks
> >
> > ----- Original Message -----
> > From: Roberto Pasianot <pasianot_at_[hidden]>
> > Date: Monday, March 29, 2004 1:25 pm
> > Subject: Re: LAM: is that possible to reduce the communication cost by
> > assigning processes on the same node
> >
> >>
> >> Hi,
> >>
> >> This might be a stupid question, but anyway: are you running on
> >> a multiprocessor? Otherwise what you get is exactly what should be
> >> expected.
> >>
> >> Bye. Roberto.
> >>
> >>
> >> On Mon, 29 Mar 2004, Ming Wu wrote:
> >>
> >> > Hi,
> >> >
> >> > I assigned multiple processes to one machine instead of several
> >> > machines. In this way, I expected the communication cost to be reduced
> >> > compared with assigning them to several machines. However, the result
> >> > is weird: both computation cost and communication cost increased
> >> > sharply, especially for MPI_Reduce and MPI_Sendrecv. It seems that the
> >> > underlying implementation of LAM/MPI doesn't favor multiple processes
> >> > on the same host.
> >> >
> >> > Your help will be greatly appreciated.
> >> >
> >> > thanks
> >> >
> >> > _______________________________________________
> >> > This list is archived at http://www.lam-mpi.org/MailArchives/lam/