LAM/MPI General User's Mailing List Archives

From: Krzysztof Bandurski (kb_at_[hidden])
Date: 2008-05-19 15:16:14


Hi,

Thanks a lot, it seems that you're right: mpirun points to
/etc/alternatives/mpi-run, which in turn points to /usr/bin/orterun, and
that, from what I see in the manual, is Open MPI's launcher... I guess I have
to find the correct LAM executable. Any idea where that might be?
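
For anyone hitting the same mix-up, the symlink chain described above can be checked in one step. This is only a minimal sketch: it assumes a POSIX shell and GNU readlink (for the -f option), and resolve_cmd is a hypothetical helper name, not something shipped with LAM or Open MPI.

```shell
#!/bin/sh
# Print where a command name really ends up after following all symlinks.
# resolve_cmd is a hypothetical helper, not part of LAM or Open MPI.
resolve_cmd() {
    # Look the name up in PATH; report and bail out if it is not there.
    path=$(command -v "$1") || { echo "$1: not found in PATH" >&2; return 1; }
    # readlink -f (GNU coreutils) follows every link in the chain.
    printf '%s -> %s\n' "$path" "$(readlink -f "$path")"
}

# Check the launchers mentioned in this thread, plus LAM's lamboot for comparison.
for cmd in mpirun mpiexec lamboot; do
    resolve_cmd "$cmd" || true
done
```

If the right-hand side for mpirun ends in orterun, as it does here, the launcher belongs to Open MPI even though lamboot comes from LAM.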

kris

McCalla, Mac wrote:
> Hi,
> This looks like a mixed LAM and Open MPI environment. What does
> a "which mpirun" command show you?
>
> Cheers,
>
> Mac McCalla
>
> -----Original Message-----
> From: lam-bounces_at_[hidden] [mailto:lam-bounces_at_[hidden]] On Behalf
> Of Krzysztof Bandurski
> Sent: Monday, May 19, 2008 1:30 PM
> To: lam_at_[hidden]
> Subject: LAM: problem with mpirun - processes don't 'see' one another
>
> Hi All,
>
> I used LAM before, but I upgraded my system and installed Fedora 8 from
> scratch. I have a dual-core Athlon 64 on an nForce chipset. I wanted to
> install some MPI environment quickly to test my parallel programs on my
> machine at home before submitting them to the cluster that I use, so I
> just "yummed" LAM to my machine. lamboot seems to work fine, but I have
> a strange problem with mpirun/mpiexec.
>
> When I run a program using mpirun, e.g. like this:
>
> mpirun -np 4 testpopmpi_release <and then follow the command line
> arguments...>
>
> I do get 4 processes running, but each of them sees only itself in
> MPI_COMM_WORLD. When I run it with --display-map, I get something like
> this at the beginning of the output:
>
> [kris_at_nothing nnworkshop]$ mpirun --display-map -np 4 testpopmpi_release -packley -d300 -T0f -v1 -Dcgpr -P256 -Mdesa-best2bin
> [nothing:05733] Map for job: 1 Generated by mapping mode: byslot
> Starting vpid: 0 Vpid range: 4 Num app_contexts: 1
> Data for app_context: index 0 app: testpopmpi_release
> Num procs: 4
> Argv[0]: testpopmpi_release
> Argv[1]: -packley
> Argv[2]: -d300
> Argv[3]: -T0f
> Argv[4]: -v1
> Argv[5]: -Dcgpr
> Argv[6]: -P256
> Argv[7]: -Mdesa-best2bin
> Env[0]: OMPI_MCA_rmaps_base_display_map=1
> Env[1]: OMPI_MCA_orte_precondition_transports=444a2d3c430e64ba-6534b32b337c12e7
> Env[2]: OMPI_MCA_rds=proxy
> Env[3]: OMPI_MCA_ras=proxy
> Env[4]: OMPI_MCA_rmaps=proxy
> Env[5]: OMPI_MCA_pls=proxy
> Env[6]: OMPI_MCA_rmgr=proxy
> Working dir: /home/kris/nnworkshop (user: 0)
> Num maps: 0
> Num elements in nodes list: 1
> Mapped node:
> Cell: 0 Nodename: nothing Launch id: -1
> Username: NULL
> Daemon name:
> Data type: ORTE_PROCESS_NAME Data Value: NULL
> Oversubscribed: True Num elements in procs list: 4
> Mapped proc:
> Proc Name:
> Data type: ORTE_PROCESS_NAME Data Value: [0,1,0]
> Proc Rank: 0 Proc PID: 0 App_context index: 0
>
> Mapped proc:
> Proc Name:
> Data type: ORTE_PROCESS_NAME Data Value: [0,1,1]
> Proc Rank: 1 Proc PID: 0 App_context index: 0
>
> Mapped proc:
> Proc Name:
> Data type: ORTE_PROCESS_NAME Data Value: [0,1,2]
> Proc Rank: 2 Proc PID: 0 App_context index: 0
>
> Mapped proc:
> Proc Name:
> Data type: ORTE_PROCESS_NAME Data Value: [0,1,3]
> Proc Rank: 3 Proc PID: 0 App_context index: 0
>
> and then follows the output of my program. As you can see, lam thinks
> that all the processes are in the same communicator (they all have
> different ranks), but when I call MPI_Comm_rank and MPI_Comm_size in my
> program, I always get rank == 0 and size == 1 in every process.
> Needless to say, the processes can't communicate and I just have 4
> independent copies of my program running (and printing exactly the same
> output on the terminal...). Does anyone have any idea what might be
> going on? This is really driving me nuts; I will appreciate any hints.
>
> best regards,
>
> kris.
>
>
>
> _______________________________________________
> This list is archived at http://www.lam-mpi.org/MailArchives/lam/