This is not a good mechanism for measuring the difference. In this
case, you're going to be running 4 processes on one node (which has 2
CPUs, in your case). I'm guessing that all 4 processes will attempt to
fully utilize their CPUs, so the OS will have to schedule all 4 on 2
resources -- leading to frequent context switching, and potentially
unnecessary data movement between your processors (e.g., if process A
migrates back and forth between both CPUs). You also need to consider
how much memory each process takes -- will the sum of all 4 processes
exceed the physical memory of your machine? If so, you'll also incur a
lot of memory thrashing (swapping to disk). This will potentially be a
*lot* of overhead.

The general rule of thumb is: running N CPU-bound processes on M
processors (where N > M) simultaneously will generally take more time
than running them M at a time (until you have run a total of N
processes) because of context switching and/or memory thrashing.

If you want to compare serial performance vs. parallel performance, you
really need to have a serial version of your code -- one that can run
in a single process (or, if you're comparing by node and not by CPU,
one that can run in 2 processes since you have 2 CPUs in a node -- but
be sure to take memory constraints into consideration!). Then compare
the performance of that vs. your parallel runs.
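
As a rough illustration of that comparison (a minimal sketch, not
anything LAM provides): once you have a wall-clock time for the serial
run and for each parallel run, speedup is t(1)/t(p) and efficiency is
speedup divided by p. The function names and the timings below are
hypothetical placeholders, not measurements.

```python
def speedup(t_serial, t_parallel):
    """Speedup S(p) = t(1) / t(p), from wall-clock times in seconds."""
    if t_serial <= 0 or t_parallel <= 0:
        raise ValueError("timings must be positive")
    return t_serial / t_parallel


def efficiency(t_serial, t_parallel, p):
    """Parallel efficiency E(p) = S(p) / p for a run on p processors."""
    if p < 1:
        raise ValueError("p must be at least 1")
    return speedup(t_serial, t_parallel) / p


if __name__ == "__main__":
    # Hypothetical timings (seconds): t1 is the true serial code,
    # the dict maps processor count p to the parallel run time t(p).
    t1 = 120.0
    parallel_times = {2: 65.0, 4: 36.0}
    for p, tp in sorted(parallel_times.items()):
        s = speedup(t1, tp)
        e = efficiency(t1, tp, p)
        print(f"p={p}: speedup={s:.2f}, efficiency={e:.2f}")
```

Here t(1) should come from the real serial code, not from the
p-process decomposition squeezed onto a single CPU.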

Hope this helps.

On Jun 15, 2005, at 6:15 AM, Slawomir Kubacki wrote:
> Dear Sir, Madam,
>
> I want to measure the efficiency of the parallel program using domain
> decomposition approach.
> The efficiency (speedup=t(1)/t(p)) can be measured as the ratio of
> executing time
> on one processor to executing time on p processors.
> In order to measure the time t(1) it is necessary to run the parallel
> program
> (formally decomposed into p subdomains) on one processor.
> For example the problem decomposed into four subdomains
> I am running on one node n1:
>
> mpirun -c 4 n1 program_name
>
> As we have 2 processors on one node, I am not certain whether one or
> two processors are actually used when running the parallel program on
> one node.
> Please indicate how to force the problem to run on one processor
> only.
>
> regards,
>
> Slawomir Kubacki
> _______________________________________________
> This list is archived at http://www.lam-mpi.org/MailArchives/lam/
--
{+} Jeff Squyres
{+} jsquyres_at_[hidden]
{+} http://www.lam-mpi.org/