In my original post I used "CPU" to mean a process, and "CPUs" to mean
processes (each running on a different node, i.e. started with the -c
argument of mpirun).
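
(For example, "mpirun -c 4 solver" would start 4 such processes; "solver"
here is just a placeholder for the actual executable name.)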
----- Original Message -----
From: "Prabhanjan Kambadur" <pkambadu_at_[hidden]>
To: "General LAM/MPI mailing list" <lam_at_[hidden]>
Sent: Wednesday, June 23, 2004 1:58 AM
Subject: Re: LAM: multiple runs - different results
>
> Sorry for the delay. Ideally, it should not matter whether you are
> running on 1, 2 or 4 CPUs. Increasing the number of CPUs is usually done
> to reduce the turnaround time and/or to make better use of the available
> hardware (only in well-written applications). When you say that you are
> getting erroneous results with 2 or 4 CPUs on the largest problem size,
> are you still running the same number of processes as you did in the
> single-CPU run? That would rule out application coding errors. Could you
> please give more details?
>
> Regards,
> Anju
>
> On Sun, 20 Jun 2004, Angel Tsankov wrote:
>
> > The discrete size of the problem that I'm trying to solve can take 4
> > different values (let's say 16, 32, 64 and 128). I've written a C++
> > program to perform the appropriate computations. The program is expected
> > to be run on 1, 2 or 4 CPUs.
> >
> > When the app is run to solve the 16-, 32- or 64-size problem, it
> > returns the expected results regardless of the number of CPUs used. When
> > the program is run on a single CPU to solve the 128-size problem, it
> > also returns the expected results. Surprisingly, I get unexpected
> > results only with the largest problem size on 2 and 4 CPUs.
> > The program transfers arrays of doubles using MPI_Irecv, MPI_Issend and
> > MPI_Waitany.
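> >
> > (A minimal sketch of that exchange follows; the peer choice, tag and
> > buffer contents are illustrative, not my actual code:)
> >
> >     #include <mpi.h>
> >     #include <vector>
> >
> >     int main(int argc, char** argv) {
> >         MPI_Init(&argc, &argv);
> >         int rank, size;
> >         MPI_Comm_rank(MPI_COMM_WORLD, &rank);
> >         MPI_Comm_size(MPI_COMM_WORLD, &size);
> >
> >         const int N = 128 * 128;        // one 128x128 block of doubles
> >         std::vector<double> sendbuf(N, rank), recvbuf(N);
> >         int peer = (rank + 1) % size;   // illustrative neighbour
> >
> >         MPI_Request reqs[2];
> >         // post the receive first so the incoming message can match
> >         // immediately
> >         MPI_Irecv(&recvbuf[0], N, MPI_DOUBLE, peer, 0,
> >                   MPI_COMM_WORLD, &reqs[0]);
> >         MPI_Issend(&sendbuf[0], N, MPI_DOUBLE, peer, 0,
> >                    MPI_COMM_WORLD, &reqs[1]);
> >
> >         // neither buffer may be reused until its request completes
> >         for (int i = 0; i < 2; ++i) {
> >             int idx;
> >             MPI_Waitany(2, reqs, &idx, MPI_STATUS_IGNORE);
> >         }
> >
> >         MPI_Finalize();
> >         return 0;
> >     }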
> >
> > Does anyone have an idea what the problem could be?
> >
> > Since the cluster is homogeneous, I've also tried transferring the
> > arrays of doubles as arrays of bytes (with sizeof( double ) times as
> > many elements). This was to check whether LAM performs some conversion
> > that could result in loss of precision. Unfortunately, this did not
> > help.
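> >
> > (That variant only changes the type and count in the sketch above; on a
> > homogeneous cluster MPI_BYTE bypasses any datatype conversion:)
> >
> >     int nbytes = N * static_cast<int>(sizeof(double));
> >     MPI_Irecv(&recvbuf[0], nbytes, MPI_BYTE, peer, 0,
> >               MPI_COMM_WORLD, &reqs[0]);
> >     MPI_Issend(&sendbuf[0], nbytes, MPI_BYTE, peer, 0,
> >                MPI_COMM_WORLD, &reqs[1]);
> >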
> > I'm investigating the issue further, but it is a bit difficult to
> > debug a program that solves a problem of that size. In fact, I've tried
> > to implement the steepest descent algorithm for a block-tridiagonal
> > matrix of size 128x128, where each element is a 128x128 matrix of
> > doubles.
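> >
> > (For scale, assuming each 128x128 block travels as a single message:
> > that is 128 * 128 * 8 = 131072 bytes, i.e. 128 KiB of doubles per
> > block.)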
> >
> _______________________________________________
> This list is archived at http://www.lam-mpi.org/MailArchives/lam/
>