Sorry, you will need to reconfigure and recompile. :-(
The LAM install docs have information about this.
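As a rough sketch of what the C++ exception support buys once LAM has been rebuilt with it (this assumes only the standard MPI-2 C++ bindings, MPI::ERRORS_THROW_EXCEPTIONS and MPI::Exception, not anything LAM-specific; the deliberately invalid destination rank is just an illustration):

    #include <iostream>
    #include "mpi.h"

    int main(int argc, char *argv[])
    {
        MPI::Init(argc, argv);

        // With exception support compiled in, MPI errors can be reported as
        // C++ exceptions instead of aborting the whole job.
        MPI::COMM_WORLD.Set_errhandler(MPI::ERRORS_THROW_EXCEPTIONS);

        try {
            char c = 'x';
            // Send to a rank that does not exist to provoke an error.
            MPI::COMM_WORLD.Send(&c, 1, MPI::CHAR,
                                 MPI::COMM_WORLD.Get_size(), 0);
        } catch (MPI::Exception &e) {
            std::cerr << "MPI error " << e.Get_error_code() << ": "
                      << e.Get_error_string() << std::endl;
        }

        MPI::Finalize();
        return 0;
    }

If LAM itself was built without exception support, the bindings cannot actually throw, which is why the reconfigure and recompile is needed.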
On Feb 16, 2006, at 5:34 PM, rtichy_at_[hidden] wrote:
> Hi again,
>
> Thanks for the hints.
>
> I removed MPICH and LAM and reinstalled just LAM, and it is now working.
> Each process has a different rank corresponding to its LAMRANK. But LAM
> did not install with C++ exception support, since the KDE package manager
> (adept) installed it automatically... Can I enable exception support now,
> or do I need to reconfigure and recompile, which I would have to do
> manually?
>
> Again, thanks for all the tips.
>
> -- Rich
>
>
> Quoting Jeff Squyres <jsquyres_at_[hidden]>:
>
>> Please see my earlier post -- it looks like you are compiling with
>> MPICH and running with LAM.
>>
>> http://www.lam-mpi.org/MailArchives/lam/2006/02/11913.php
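For what it's worth, a quick compile-time check along these lines can show which implementation's mpi.h a binary was actually built against; LAM_MPI and MPICH_NAME are the macros the two headers are generally expected to define, so treat them as assumptions and check your local mpi.h if the output looks wrong:

    #include <iostream>
    #include "mpi.h"

    int main(int argc, char *argv[])
    {
        MPI::Init(argc, argv);

        // Report which implementation's mpi.h this binary was compiled against.
    #if defined(LAM_MPI)
        std::cout << "Compiled against LAM/MPI headers" << std::endl;
    #elif defined(MPICH_NAME)
        std::cout << "Compiled against MPICH headers" << std::endl;
    #else
        std::cout << "Compiled against some other MPI implementation" << std::endl;
    #endif

        // MPI::Get_version is a standard MPI-2 call.
        int version, subversion;
        MPI::Get_version(version, subversion);
        std::cout << "MPI standard version: " << version << "." << subversion
                  << std::endl;

        MPI::Finalize();
        return 0;
    }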
>>
>>
>> On Feb 16, 2006, at 3:55 PM, rtichy_at_[hidden] wrote:
>>
>>> Hi,
>>>
>>> I used the mpirun command from /usr/bin and then mpirun.lam in
>>> /usr/lib/lam/bin. I installed these versions of MPICH and LAM from adept
>>> (the KDE package manager) under Kubuntu Linux, so installation was more
>>> or less automatic, and this is on my home machine. If you think it
>>> necessary, I could uninstall both the LAM and MPICH versions downloaded
>>> with adept and recompile from the source on the web site.
>>>
>>> rtichy_at_darwin:~/mpi/platypus_mercer$ /usr/lib/lam/bin/mpirun.lam
>>> -----------------------------------------------------------------------------
>>> Synopsis:     mpirun [options] <app>
>>>               mpirun [options] <where> <program> [<prog args>]
>>>
>>> Description:  Start an MPI application in LAM/MPI.
>>>
>>> Notes:
>>>     [options]       Zero or more of the options listed below
>>>     <app>           LAM/MPI appschema
>>>     <where>         List of LAM nodes and/or CPUs (examples below)
>>>     <program>       Must be a LAM/MPI program that either invokes
>>>                     MPI_INIT or has exactly one of its children
>>>                     invoke MPI_INIT
>>>     <prog args>     Optional list of command line arguments to <program>
>>>
>>> Options:
>>>     -c <num>        Run <num> copies of <program> (same as -np)
>>>     -c2c            Use fast library (C2C) mode
>>>     -client <rank> <host>:<port>
>>>                     Run IMPI job; connect to the IMPI server <host>
>>>                     at port <port> as IMPI client number <rank>
>>>     -D              Change current working directory of new processes
>>>                     to the directory where the executable resides
>>>     -f              Do not open stdio descriptors
>>>     -ger            Turn on GER mode
>>>     -h              Print this help message
>>>     -l              Force line-buffered output
>>>     -lamd           Use LAM daemon (LAMD) mode (opposite of -c2c)
>>>     -nger           Turn off GER mode
>>>     -np <num>       Run <num> copies of <program> (same as -c)
>>>     -nx             Don't export LAM_MPI_* environment variables
>>>     -O              Universe is homogeneous
>>>     -pty / -npty    Use/don't use pseudo terminals when stdout is a tty
>>>     -s <nodeid>     Load <program> from node <nodeid>
>>>     -sigs / -nsigs  Catch/don't catch signals in MPI application
>>>     -ssi <n> <arg>  Set environment variable LAM_MPI_SSI_<n>=<arg>
>>>     -toff           Enable tracing with generation initially off
>>>     -ton, -t        Enable tracing with generation initially on
>>>     -tv             Launch processes under TotalView Debugger
>>>     -v              Be verbose
>>>     -w / -nw        Wait/don't wait for application to complete
>>>     -wd <dir>       Change current working directory of new processes
>>>                     to <dir>
>>>     -x <envlist>    Export environment vars in <envlist>
>>>
>>> Nodes:        n<list>, e.g., n0-3,5
>>> CPUS:         c<list>, e.g., c0-3,5
>>> Extras:       h (local node), o (origin node), N (all nodes), C (all CPUs)
>>>
>>> Examples:     mpirun n0-7 prog1
>>>               Executes "prog1" on nodes 0 through 7.
>>>
>>>               mpirun -lamd -x FOO=bar,DISPLAY N prog2
>>>               Executes "prog2" on all nodes using the LAMD RPI.
>>>               In the environment of each process, set FOO to the value
>>>               "bar", and set DISPLAY to the current value.
>>>
>>>               mpirun n0 N prog3
>>>               Run "prog3" on node 0, *and* all nodes.  This executes *2*
>>>               copies on n0.
>>>
>>>               mpirun C prog4 arg1 arg2
>>>               Run "prog4" on each available CPU with command line
>>>               arguments of "arg1" and "arg2".  If each node has a
>>>               CPU count of 1, the "C" is equivalent to "N".  If at
>>>               least one node has a CPU count greater than 1, LAM
>>>               will run neighboring ranks of MPI_COMM_WORLD on that
>>>               node.  For example, if node 0 has a CPU count of 4 and
>>>               node 1 has a CPU count of 2, "prog4" will have
>>>               MPI_COMM_WORLD ranks 0 through 3 on n0, and ranks 4
>>>               and 5 on n1.
>>>
>>>               mpirun c0 C prog5
>>>               Similar to the "prog3" example above, this runs "prog5"
>>>               on CPU 0 *and* on each available CPU.  This executes *2*
>>>               copies on the node where CPU 0 is (i.e., n0).  This is
>>>               probably not a useful use of the "C" notation; it is
>>>               only shown here for an example.
>>>
>>> Defaults:     -c2c -w -pty -nger -nsigs
>>> -----------------------------------------------------------------------------
>>> rtichy_at_darwin:~/mpi/platypus_mercer$ mpirun
>>> -----------------------------------------------------------------------------
>>> Synopsis:     mpirun [options] <app>
>>>               mpirun [options] <where> <program> [<prog args>]
>>>
>>> Description:  Start an MPI application in LAM/MPI.
>>>
>>> Notes:
>>>     [options]       Zero or more of the options listed below
>>>     <app>           LAM/MPI appschema
>>>     <where>         List of LAM nodes and/or CPUs (examples below)
>>>     <program>       Must be a LAM/MPI program that either invokes
>>>                     MPI_INIT or has exactly one of its children
>>>                     invoke MPI_INIT
>>>     <prog args>     Optional list of command line arguments to <program>
>>>
>>> Options:
>>>     -c <num>        Run <num> copies of <program> (same as -np)
>>>     -c2c            Use fast library (C2C) mode
>>>     -client <rank> <host>:<port>
>>>                     Run IMPI job; connect to the IMPI server <host>
>>>                     at port <port> as IMPI client number <rank>
>>>     -D              Change current working directory of new processes
>>>                     to the directory where the executable resides
>>>     -f              Do not open stdio descriptors
>>>     -ger            Turn on GER mode
>>>     -h              Print this help message
>>>     -l              Force line-buffered output
>>>     -lamd           Use LAM daemon (LAMD) mode (opposite of -c2c)
>>>     -nger           Turn off GER mode
>>>     -np <num>       Run <num> copies of <program> (same as -c)
>>>     -nx             Don't export LAM_MPI_* environment variables
>>>     -O              Universe is homogeneous
>>>     -pty / -npty    Use/don't use pseudo terminals when stdout is a tty
>>>     -s <nodeid>     Load <program> from node <nodeid>
>>>     -sigs / -nsigs  Catch/don't catch signals in MPI application
>>>     -ssi <n> <arg>  Set environment variable LAM_MPI_SSI_<n>=<arg>
>>>     -toff           Enable tracing with generation initially off
>>>     -ton, -t        Enable tracing with generation initially on
>>>     -tv             Launch processes under TotalView Debugger
>>>     -v              Be verbose
>>>     -w / -nw        Wait/don't wait for application to complete
>>>     -wd <dir>       Change current working directory of new processes
>>>                     to <dir>
>>>     -x <envlist>    Export environment vars in <envlist>
>>>
>>> Nodes:        n<list>, e.g., n0-3,5
>>> CPUS:         c<list>, e.g., c0-3,5
>>> Extras:       h (local node), o (origin node), N (all nodes), C (all CPUs)
>>>
>>> Examples:     mpirun n0-7 prog1
>>>               Executes "prog1" on nodes 0 through 7.
>>>
>>>               mpirun -lamd -x FOO=bar,DISPLAY N prog2
>>>               Executes "prog2" on all nodes using the LAMD RPI.
>>>               In the environment of each process, set FOO to the value
>>>               "bar", and set DISPLAY to the current value.
>>>
>>>               mpirun n0 N prog3
>>>               Run "prog3" on node 0, *and* all nodes.  This executes *2*
>>>               copies on n0.
>>>
>>>               mpirun C prog4 arg1 arg2
>>>               Run "prog4" on each available CPU with command line
>>>               arguments of "arg1" and "arg2".  If each node has a
>>>               CPU count of 1, the "C" is equivalent to "N".  If at
>>>               least one node has a CPU count greater than 1, LAM
>>>               will run neighboring ranks of MPI_COMM_WORLD on that
>>>               node.  For example, if node 0 has a CPU count of 4 and
>>>               node 1 has a CPU count of 2, "prog4" will have
>>>               MPI_COMM_WORLD ranks 0 through 3 on n0, and ranks 4
>>>               and 5 on n1.
>>>
>>>               mpirun c0 C prog5
>>>               Similar to the "prog3" example above, this runs "prog5"
>>>               on CPU 0 *and* on each available CPU.  This executes *2*
>>>               copies on the node where CPU 0 is (i.e., n0).  This is
>>>               probably not a useful use of the "C" notation; it is
>>>               only shown here for an example.
>>>
>>> Defaults:     -c2c -w -pty -nger -nsigs
>>> -----------------------------------------------------------------------------
>>>
>>> They both look the same to me.
>>>
>>> -- Rich
>>>
>>>
>>> Quoting Esteban Fiallos <erf008_at_[hidden]>:
>>>
>>>> As Jeff mentioned earlier, this might be caused by using another MPI
>>>> implementation's mpirun.
>>>>
>>>> I had the exact same problem a month ago and found out that my PATH
>>>> variable was pointing to the MPICH version of mpirun. I changed my PATH
>>>> so that it first pointed to the LAM/MPI directory, and that fixed the
>>>> problem.
>>>>
>>>> What is the output in the command prompt if you just type mpirun?
>>>>
>>>> Esteban Fiallos
>>>> Data Mining Research Laboratory
>>>> Louisiana Tech University
>>>> http://dmrl.latech.edu/
>>>>
>>>> ----- Original Message -----
>>>> From: <rtichy_at_[hidden]>
>>>> To: "General LAM/MPI mailing list" <lam_at_[hidden]>
>>>> Sent: Thursday, February 16, 2006 9:33 AM
>>>> Subject: Re: LAM: trouble testing mpi on one processor
>>>>
>>>>
>>>>> Hi again Jeff,
>>>>>
>>>>> First off, and this is a little late: thank you so much for the help!
>>>>>
>>>>> I tried the getenv("LAMRANK") idea with a simple little hello-world
>>>>> type program, and sure enough Get_rank was returning 0 for both
>>>>> processes but LAMRANK was different (0 and 1). Just to be sure you
>>>>> know what is going on, I will post the code and output from the run:
>>>>>
>>>>> #include <iostream>
>>>>> #include <string>
>>>>> #include <cstdlib>
>>>>> #include "mpi.h"
>>>>>
>>>>> using namespace std;
>>>>>
>>>>> int main(int argc, char *argv[]){
>>>>>
>>>>>     MPI::Init(argc, argv);
>>>>>     const int BUFFER_SIZE = 34;
>>>>>
>>>>>     int size = MPI::COMM_WORLD.Get_size();
>>>>>     int rank = MPI::COMM_WORLD.Get_rank();
>>>>>
>>>>>     cout << "MPI::COMM_WORLD.Get_size(): " << size << endl;
>>>>>     cout << "MPI::COMM_WORLD.Get_rank(): " << rank << endl;
>>>>>
>>>>>     // LAMRANK is only set when the process is launched by LAM's mpirun;
>>>>>     // guard against NULL so a singleton run does not crash here.
>>>>>     const char *lamrank = getenv("LAMRANK");
>>>>>     cout << "lamrank: " << (lamrank ? lamrank : "(not set)") << endl;
>>>>>
>>>>>     if(rank == 0){
>>>>>         string foo("Hello world from rank 0 to rank 1.");
>>>>>         MPI::COMM_WORLD.Send(foo.c_str(), foo.length(), MPI::CHAR, 1, 1);
>>>>>     }
>>>>>     if(rank == 1){
>>>>>         char buffer[BUFFER_SIZE];
>>>>>         MPI::COMM_WORLD.Recv(buffer, BUFFER_SIZE, MPI::CHAR, 0, 1);
>>>>>         // the received bytes are not NUL-terminated; give string the length
>>>>>         string foo(buffer, BUFFER_SIZE);
>>>>>         cout << foo << endl;
>>>>>     }
>>>>>
>>>>>     MPI::Finalize();
>>>>>     return 0;
>>>>> }
>>>>>
>>>>> ... and the commands I used, from starting the LAM daemon and compiling
>>>>> through running mpirun.lam:
>>>>>
>>>>> rtichy_at_darwin:~/mpi/hello_world$ lamboot
>>>>>
>>>>> LAM 7.1.1/MPI 2 C++/ROMIO - Indiana University
>>>>>
>>>>> rtichy_at_darwin:~/mpi/hello_world$ /etc/alternatives/mpiCC main.cc -o foo
>>>>> rtichy_at_darwin:~/mpi/hello_world$ /usr/lib/lam/bin/mpirun.lam -np 2 ./foo
>>>>> MPI::COMM_WORLD.Get_size(): 1
>>>>> MPI::COMM_WORLD.Get_rank(): 0
>>>>> lamrank: 0
>>>>>
>>>>> 0 - MPI_SEND : Invalid rank 1
>>>>> [0] Aborting program !
>>>>> [0] Aborting program!
>>>>> p0_9967: p4_error: : 8262
>>>>> MPI::COMM_WORLD.Get_size(): 1
>>>>> MPI::COMM_WORLD.Get_rank(): 0
>>>>> lamrank: 1
>>>>>
>>>>> 0 - MPI_SEND : Invalid rank 1
>>>>> [0] Aborting program !
>>>>> [0] Aborting program!
>>>>> p0_9968: p4_error: : 8262
>>>>>
>>>> -----------------------------------------------------------------------------
>>>>> It seems that [at least] one of the processes that was started with
>>>>> mpirun did not invoke MPI_INIT before quitting (it is possible that
>>>>> more than one process did not invoke MPI_INIT -- mpirun was only
>>>>> notified of the first one, which was on node n0).
>>>>>
>>>>> mpirun can *only* be used with MPI programs (i.e., programs that
>>>>> invoke MPI_INIT and MPI_FINALIZE). You can use the "lamexec" program
>>>>> to run non-MPI programs over the lambooted nodes.
>>>>>
>>>> -----------------------------------------------------------------------------
>>>>>
>>>>> ...so you were right about LAMRANK. What next?
>>>>>
>>>>> --Rich
>>>>>
>>>>>
>>>>> Quoting Jeff Squyres <jsquyres_at_[hidden]>:
>>>>>
>>>>>> On Feb 15, 2006, at 5:08 PM, rtichy_at_[hidden] wrote:
>>>>>>
>>>>>>>> Can you verify that you're invoking LAM's mpirun command?
>>>>>>>
>>>>>>> I tried using every mpirun command on my machine, including
>>>>>>> mpirun.lam in /usr/lib/lam, and I still have the same problem:
>>>>>>> all processes created by LAM believe they have rank zero...
>>>>>>> MPI::COMM_WORLD.Get_rank() returns 0. I have always used LAM
>>>>>>> over a network but was told it can be used to debug on a single
>>>>>>> machine. Is this really the case?
>>>>>>
>>>>>> Yes, I run multiple processes on a single machine all the time.
>>>>>>
>>>>>> I'm not familiar with your local installation, so I cannot verify
>>>>>> that /usr/lib/lam/mpirun.lam is the Right mpirun for the LAM
>>>>>> installation that you're using (it sounds like it, but it depends
>>>>>> on how your sysadmins set it up).
>>>>>>
>>>>>> When you run in the form:
>>>>>>
>>>>>> mpirun -np 4 myapp
>>>>>>
>>>>>> Then the lamds should set an environment variable named LAMRANK in
>>>>>> each process that they fork, indicating that process's rank in
>>>>>> MPI_COMM_WORLD. Hence, each of the 4 should get different (and
>>>>>> unique) values. Try calling getenv("LAMRANK") in your application
>>>>>> to verify this. If you get NULL back, then you're not being
>>>>>> launched by a LAM daemon, and this is your problem (LAM assumes
>>>>>> that if it gets NULL back from getenv("LAMRANK") it's running in
>>>>>> "singleton" mode, meaning that it wasn't launched via LAM's mpirun
>>>>>> and is the only process in MPI_COMM_WORLD, and therefore assumes
>>>>>> that it is MCW rank 0).
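A minimal sketch of that check, kept separate from any MPI calls so it can be run standalone (nothing here is LAM-specific beyond the LAMRANK variable name described above):

    #include <cstdlib>
    #include <iostream>

    int main()
    {
        // LAMRANK is set by the LAM daemon that forks the process; if it
        // is missing, LAM treats the process as a "singleton" with
        // MPI_COMM_WORLD rank 0.
        const char *lamrank = std::getenv("LAMRANK");
        if (lamrank == NULL) {
            std::cout << "LAMRANK not set: not launched by a LAM daemon "
                         "(singleton mode, rank 0)" << std::endl;
        } else {
            std::cout << "Launched by a LAM daemon, LAMRANK=" << lamrank
                      << std::endl;
        }
        return 0;
    }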
>>>>>>
>>>>>> If you *are* getting valid (and unique) values from getenv("LAMRANK")
>>>>>> and MPI::COMM_WORLD.Get_rank() is still returning 0 from all your
>>>>>> processes, then we need to probe a little deeper to figure out
>>>>>> what's going on.
>>>>>>
>>>>>> --
>>>>>> {+} Jeff Squyres
>>>>>> {+} The Open MPI Project
>>>>>> {+} http://www.open-mpi.org/
>>>>>>
>>>>>>
>>>>>
>>>>
>>>
>>>
>>>
>>
>>
>> --
>> {+} Jeff Squyres
>> {+} The Open MPI Project
>> {+} http://www.open-mpi.org/
>>
>>
>
>
>
--
{+} Jeff Squyres
{+} The Open MPI Project
{+} http://www.open-mpi.org/