LAM/MPI General User's Mailing List Archives

From: Jeff Squyres (jsquyres_at_[hidden])
Date: 2008-07-03 08:11:04


This kind of error can indicate that you're mixing and matching MPI
implementations (e.g., starting a LAM job with an mpirun that doesn't
belong to LAM, or perhaps having a different version of LAM installed
on some of your nodes). In that case, the individual LAM/MPI processes
don't recognize that they're part of a larger job: each one thinks
that it is the one and only MPI process because the environmental
markers that LAM's libmpi expects are not present.

Ensure that you're using LAM/MPI on all nodes and that it's exactly
the same version everywhere.
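
To illustrate why this points at the launch environment rather than at
HPL.dat, here is a minimal Python sketch of the kind of check that
appears to be failing. It assumes HPL compares the P x Q grid against
the number of processes that MPI reports; the names hpl_process_check
and comm_size are hypothetical and are not taken from the HPL source.

```python
def hpl_process_check(p, q, comm_size):
    """Return None if a P x Q grid fits in comm_size processes,
    otherwise an HPL-style error message (illustrative only)."""
    needed = p * q
    if comm_size < needed:
        return f"Need at least {needed} processes for these tests"
    return None


# Grid from the original post: P = 3, Q = 4.
# Healthy launch: all 12 processes see each other.
print(hpl_process_check(3, 4, comm_size=12))

# Mismatched launch: each process believes it is the only one.
print(hpl_process_check(3, 4, comm_size=1))
```

With P = 3 and Q = 4 the grid needs 12 processes, so the same HPL.dat
passes when MPI reports 12 processes and fails when each process
believes it is alone.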

On Jul 3, 2008, at 2:40 AM, Aditya Vasal wrote:

> Sorry, I forgot to add a subject line.
>
> Best Regards,
> Aditya Vasal
> Software Engg | Semiconductor Solutions Group |KPIT Cummins
> Infosystems Ltd. | +91 99 70 168 581 |Aditya.Vasal_at_[hidden] |www.kpitcummins.com
>
> From: Aditya Vasal
> Sent: Thursday, July 03, 2008 12:29 PM
> To: lam_at_[hidden]
> Subject: LAM: (no subject)
>
> Hi,
>
> I want to run the Linpack test on my machine, which has 12 processing
> elements, so I want 12 parallel Linpack processes (1 for each
> element).
> My machine is also limited to 1 GB of RAM.
>
> I have modified the HPL.dat as per the documentation:
> P = 3
> Q = 4
> N = 3072 (using <80% of available memory)
> NB = 1152
> Swap threshold = 1152
>
> I created a machine.list file, which contains my machine's IP address
> 12 times (all elements are in the same node).
> I execute Linpack using:
> mpirun -np 12 -machinefile machine.list xhpl
>
> However, it gives me the following error:
>
> HPL ERROR from process # 0, on line 419 of function
> HPL_pdinfo:
> >>> Need at least 12 processes for these tests <<<
>
> I guess this suggests that there is a problem with my HPL.dat.
> Can someone please help me figure out what exactly the problem is?
>
>
> Best Regards,
> Aditya Vasal
>
> _______________________________________________
> This list is archived at http://www.lam-mpi.org/MailArchives/lam/

-- 
Jeff Squyres
Cisco Systems