Hi again,
Quoting Jeff Squyres <jsquyres_at_[hidden]>:
> Sorry, you will need to reconfigure and recompile. :-(
>
No problem. If I understood the docs correctly, I just run:
./configure --with-exceptions
Does that look right?
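
In full, I was planning on something like this from the LAM source tree
(the --prefix is only my guess; I'd point it at wherever makes sense locally):

  ./configure --with-exceptions --prefix=/usr/local/lam
  make
  make install

and then checking laminfo afterwards to see whether exception support
actually got compiled in.
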
--Rich
> The LAM install docs have information about this.
>
>
> On Feb 16, 2006, at 5:34 PM, rtichy_at_[hidden] wrote:
>
> > Hi again,
> >
> > Thanks for the hints.
> >
> > I removed mpich and lam and reinstalled just lam, and it is now working.
> > Each process has a different rank corresponding to its lamrank.  But...
> > lam did not install with c++ exception support as the kde package manager
> > (adept) installed it automatically...  Can I enable exception support now
> > or do I need to reconfigure and recompile... which I would have to do
> > manually.
> >
> > Again, thanks for all the tips.
> >
> > -- Rich
> >
> >
> > Quoting Jeff Squyres <jsquyres_at_[hidden]>:
> >
> >> Please see my earlier post -- it looks like you are compiling with
> >> MPICH and running with LAM.
> >>
> >> http://www.lam-mpi.org/MailArchives/lam/2006/02/11913.php
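> >>
> >> A quick way to double check (the exact flags differ between
> >> implementations, so treat these as a rough guide) is to ask the wrapper
> >> compiler and mpirun what they really are:
> >>
> >>   which mpiCC mpirun
> >>   mpiCC -showme     (LAM's wrapper compilers understand -showme)
> >>   mpirun -h         (LAM's mpirun identifies itself as LAM/MPI)
> >>
> >> If the wrapper expands to MPICH paths but mpirun is LAM's (or vice
> >> versa), you will see exactly this kind of mismatch.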
> >>
> >>
> >> On Feb 16, 2006, at 3:55 PM, rtichy_at_[hidden] wrote:
> >>
> >>> Hi,
> >>>
> >>> I used the mpirun command from /usr/bin and then mpirun.lam in
> >>> /usr/lib/lam/bin; also, I installed this version of mpich and lam from
> >>> adept (the kde package manager) on kubuntu linux.  It was installed
> >>> more or less automatically, and on my home machine.  If you think it
> >>> necessary I could uninstall both the lam and mpich versions downloaded
> >>> with adept and recompile from the source on the web-site.
> >>>
> >>> rtichy_at_darwin:~/mpi/platypus_mercer$ /usr/lib/lam/bin/mpirun.lam
> >>> -----------------------------------------------------------------------------
> >>> Synopsis:     mpirun [options] <app>
> >>>               mpirun [options] <where> <program> [<prog args>]
> >>>
> >>> Description:  Start an MPI application in LAM/MPI.
> >>>
> >>> Notes:
> >>>               [options]       Zero or more of the options listed below
> >>>               <app>           LAM/MPI appschema
> >>>               <where>         List of LAM nodes and/or CPUs (examples below)
> >>>               <program>       Must be a LAM/MPI program that either invokes
> >>>                               MPI_INIT or has exactly one of its children
> >>>                               invoke MPI_INIT
> >>>               <prog args>     Optional list of command line arguments to
> >>>                               <program>
> >>>
> >>> Options:
> >>>               -c <num>        Run <num> copies of <program> (same as -np)
> >>>               -c2c            Use fast library (C2C) mode
> >>>               -client <rank> <host>:<port>
> >>>                               Run IMPI job; connect to the IMPI server
> >>>                               <host> at port <port> as IMPI client
> >>>                               number <rank>
> >>>               -D              Change current working directory of new
> >>>                               processes to the directory where the
> >>>                               executable resides
> >>>               -f              Do not open stdio descriptors
> >>>               -ger            Turn on GER mode
> >>>               -h              Print this help message
> >>>               -l              Force line-buffered output
> >>>               -lamd           Use LAM daemon (LAMD) mode (opposite of -c2c)
> >>>               -nger           Turn off GER mode
> >>>               -np <num>       Run <num> copies of <program> (same as -c)
> >>>               -nx             Don't export LAM_MPI_* environment variables
> >>>               -O              Universe is homogeneous
> >>>               -pty / -npty    Use/don't use pseudo terminals when stdout
> >>>                               is a tty
> >>>               -s <nodeid>     Load <program> from node <nodeid>
> >>>               -sigs / -nsigs  Catch/don't catch signals in MPI application
> >>>               -ssi <n> <arg>  Set environment variable LAM_MPI_SSI_<n>=<arg>
> >>>               -toff           Enable tracing with generation initially off
> >>>               -ton, -t        Enable tracing with generation initially on
> >>>               -tv             Launch processes under TotalView Debugger
> >>>               -v              Be verbose
> >>>               -w / -nw        Wait/don't wait for application to complete
> >>>               -wd <dir>       Change current working directory of new
> >>>                               processes to <dir>
> >>>               -x <envlist>    Export environment vars in <envlist>
> >>>
> >>> Nodes:        n<list>, e.g., n0-3,5
> >>> CPUS:         c<list>, e.g., c0-3,5
> >>> Extras:       h (local node), o (origin node), N (all nodes), C (all CPUs)
> >>>
> >>> Examples:     mpirun n0-7 prog1
> >>>               Executes "prog1" on nodes 0 through 7.
> >>>
> >>>               mpirun -lamd -x FOO=bar,DISPLAY N prog2
> >>>               Executes "prog2" on all nodes using the LAMD RPI.  In the
> >>>               environment of each process, set FOO to the value "bar",
> >>>               and set DISPLAY to the current value.
> >>>
> >>>               mpirun n0 N prog3
> >>>               Run "prog3" on node 0, *and* all nodes.  This executes *2*
> >>>               copies on n0.
> >>>
> >>>               mpirun C prog4 arg1 arg2
> >>>               Run "prog4" on each available CPU with command line
> >>>               arguments of "arg1" and "arg2".  If each node has a CPU
> >>>               count of 1, the "C" is equivalent to "N".  If at least one
> >>>               node has a CPU count greater than 1, LAM will run
> >>>               neighboring ranks of MPI_COMM_WORLD on that node.  For
> >>>               example, if node 0 has a CPU count of 4 and node 1 has a
> >>>               CPU count of 2, "prog4" will have MPI_COMM_WORLD ranks 0
> >>>               through 3 on n0, and ranks 4 and 5 on n1.
> >>>
> >>>               mpirun c0 C prog5
> >>>               Similar to the "prog3" example above, this runs "prog5"
> >>>               on CPU 0 *and* on each available CPU.  This executes *2*
> >>>               copies on the node where CPU 0 is (i.e., n0).  This is
> >>>               probably not a useful use of the "C" notation; it is only
> >>>               shown here for an example.
> >>>
> >>> Defaults:     -c2c -w -pty -nger -nsigs
> >>> -----------------------------------------------------------------------------
> >>> rtichy_at_darwin:~/mpi/platypus_mercer$ mpirun
> >>> -----------------------------------------------------------------------------
> >>> Synopsis:     mpirun [options] <app>
> >>>               mpirun [options] <where> <program> [<prog args>]
> >>>
> >>> Description:  Start an MPI application in LAM/MPI.
> >>>
> >>> Notes:
> >>>               [options]       Zero or more of the options listed below
> >>>               <app>           LAM/MPI appschema
> >>>               <where>         List of LAM nodes and/or CPUs (examples below)
> >>>               <program>       Must be a LAM/MPI program that either invokes
> >>>                               MPI_INIT or has exactly one of its children
> >>>                               invoke MPI_INIT
> >>>               <prog args>     Optional list of command line arguments to
> >>>                               <program>
> >>>
> >>> Options:
> >>>               -c <num>        Run <num> copies of <program> (same as -np)
> >>>               -c2c            Use fast library (C2C) mode
> >>>               -client <rank> <host>:<port>
> >>>                               Run IMPI job; connect to the IMPI server
> >>>                               <host> at port <port> as IMPI client
> >>>                               number <rank>
> >>>               -D              Change current working directory of new
> >>>                               processes to the directory where the
> >>>                               executable resides
> >>>               -f              Do not open stdio descriptors
> >>>               -ger            Turn on GER mode
> >>>               -h              Print this help message
> >>>               -l              Force line-buffered output
> >>>               -lamd           Use LAM daemon (LAMD) mode (opposite of -c2c)
> >>>               -nger           Turn off GER mode
> >>>               -np <num>       Run <num> copies of <program> (same as -c)
> >>>               -nx             Don't export LAM_MPI_* environment variables
> >>>               -O              Universe is homogeneous
> >>>               -pty / -npty    Use/don't use pseudo terminals when stdout
> >>>                               is a tty
> >>>               -s <nodeid>     Load <program> from node <nodeid>
> >>>               -sigs / -nsigs  Catch/don't catch signals in MPI application
> >>>               -ssi <n> <arg>  Set environment variable LAM_MPI_SSI_<n>=<arg>
> >>>               -toff           Enable tracing with generation initially off
> >>>               -ton, -t        Enable tracing with generation initially on
> >>>               -tv             Launch processes under TotalView Debugger
> >>>               -v              Be verbose
> >>>               -w / -nw        Wait/don't wait for application to complete
> >>>               -wd <dir>       Change current working directory of new
> >>>                               processes to <dir>
> >>>               -x <envlist>    Export environment vars in <envlist>
> >>>
> >>> Nodes:        n<list>, e.g., n0-3,5
> >>> CPUS:         c<list>, e.g., c0-3,5
> >>> Extras:       h (local node), o (origin node), N (all nodes), C (all CPUs)
> >>>
> >>> Examples:     mpirun n0-7 prog1
> >>>               Executes "prog1" on nodes 0 through 7.
> >>>
> >>>               mpirun -lamd -x FOO=bar,DISPLAY N prog2
> >>>               Executes "prog2" on all nodes using the LAMD RPI.  In the
> >>>               environment of each process, set FOO to the value "bar",
> >>>               and set DISPLAY to the current value.
> >>>
> >>>               mpirun n0 N prog3
> >>>               Run "prog3" on node 0, *and* all nodes.  This executes *2*
> >>>               copies on n0.
> >>>
> >>>               mpirun C prog4 arg1 arg2
> >>>               Run "prog4" on each available CPU with command line
> >>>               arguments of "arg1" and "arg2".  If each node has a CPU
> >>>               count of 1, the "C" is equivalent to "N".  If at least one
> >>>               node has a CPU count greater than 1, LAM will run
> >>>               neighboring ranks of MPI_COMM_WORLD on that node.  For
> >>>               example, if node 0 has a CPU count of 4 and node 1 has a
> >>>               CPU count of 2, "prog4" will have MPI_COMM_WORLD ranks 0
> >>>               through 3 on n0, and ranks 4 and 5 on n1.
> >>>
> >>>               mpirun c0 C prog5
> >>>               Similar to the "prog3" example above, this runs "prog5"
> >>>               on CPU 0 *and* on each available CPU.  This executes *2*
> >>>               copies on the node where CPU 0 is (i.e., n0).  This is
> >>>               probably not a useful use of the "C" notation; it is only
> >>>               shown here for an example.
> >>>
> >>> Defaults:     -c2c -w -pty -nger -nsigs
> >>> -----------------------------------------------------------------------------
> >>>
> >>> They both look the same to me.
> >>>
> >>> -- Rich
> >>>
> >>>
> >>> Quoting Esteban Fiallos <erf008_at_[hidden]>:
> >>>
> >>>> As Jeff mentioned earlier, this might be caused by using another MPI
> >>>> implementation's mpirun.
> >>>>
> >>>> I had the exact same problem a month ago and I found out that my PATH
> >>>> variable was pointing to the MPICH version of mpirun.  I changed my
> >>>> path so that it first pointed to the LAM/MPI directory and that fixed
> >>>> the problem.
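> >>>>
> >>>> Something along these lines in your shell startup file should do it
> >>>> (the directory is just what it appears to be on your machine, so
> >>>> adjust to wherever LAM's bin directory actually lives):
> >>>>
> >>>>   export PATH=/usr/lib/lam/bin:$PATH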
> >>>>
> >>>> What is the output in the command prompt if you just type mpirun?
> >>>>
> >>>> Esteban Fiallos
> >>>> Data Mining Research Laboratory
> >>>> Louisiana Tech University
> >>>> http://dmrl.latech.edu/
> >>>>
> >>>> ----- Original Message -----
> >>>> From: <rtichy_at_[hidden]>
> >>>> To: "General LAM/MPI mailing list" <lam_at_[hidden]>
> >>>> Sent: Thursday, February 16, 2006 9:33 AM
> >>>> Subject: Re: LAM: trouble testing mpi on one processor
> >>>>
> >>>>
> >>>>> Hi again Jeff,
> >>>>>
> >>>>> First off, and this is a little late, thank you so much for the help!
> >>>>>
> >>>>> I tried the getenv("LAMRANK") idea with a simple little hello world
> >>>>> type thing and sure enough Get_rank was returning 0 for both processes
> >>>>> but lamrank was different (0 and 1).  Just to be sure you know what is
> >>>>> going on I will post code and output from the run:
> >>>>>
> >>>>> #include <iostream>
> >>>>> #include <cstdlib>
> >>>>> #include "mpi.h"
> >>>>>
> >>>>> using namespace std;
> >>>>>
> >>>>> int main(int argc, char *argv[]){
> >>>>>
> >>>>>     MPI::Init(argc, argv);
> >>>>>     int rank, size;
> >>>>>     const int BUFFER_SIZE = 34;
> >>>>>
> >>>>>     size = MPI::COMM_WORLD.Get_size();
> >>>>>     rank = MPI::COMM_WORLD.Get_rank();
> >>>>>
> >>>>>     cout << "MPI::COMM_WORLD.Get_size(): " << size << endl;
> >>>>>     cout << "MPI::COMM_WORLD.Get_rank(): " << rank << endl;
> >>>>>     string lamrank(getenv("LAMRANK"));
> >>>>>     cout << "lamrank: " << lamrank << endl;
> >>>>>
> >>>>>     if(rank == 0){
> >>>>>         string foo("Hello world from rank 0 to rank 1.");
> >>>>>         MPI::COMM_WORLD.Send(foo.c_str(), foo.length(), MPI::CHAR, 1, 1);
> >>>>>     }
> >>>>>     if(rank == 1){
> >>>>>         char buffer[BUFFER_SIZE];
> >>>>>         MPI::COMM_WORLD.Recv(buffer, BUFFER_SIZE, MPI::CHAR, 0, 1);
> >>>>>         string foo(buffer);
> >>>>>         cout << foo << endl;
> >>>>>     }
> >>>>>
> >>>>>     MPI::Finalize();
> >>>>>     return 0;
> >>>>> }
> >>>>>
> >>>>> ... and here are the commands I used, from starting the lam daemon
> >>>>> and compiling, through to running with mpirun.lam:
> >>>>>
> >>>>> rtichy_at_darwin:~/mpi/hello_world$ lamboot
> >>>>>
> >>>>> LAM 7.1.1/MPI 2 C++/ROMIO - Indiana University
> >>>>>
> >>>>> rtichy_at_darwin:~/mpi/hello_world$ /etc/alternatives/mpiCC main.cc -o foo
> >>>>> rtichy_at_darwin:~/mpi/hello_world$ /usr/lib/lam/bin/mpirun.lam -np 2 ./foo
> >>>>> MPI::COMM_WORLD.Get_size(): 1
> >>>>> MPI::COMM_WORLD.Get_rank(): 0
> >>>>> lamrank: 0
> >>>>>
> >>>>> 0 - MPI_SEND : Invalid rank 1
> >>>>> [0] Aborting program !
> >>>>> [0] Aborting program!
> >>>>> p0_9967: p4_error: : 8262
> >>>>> MPI::COMM_WORLD.Get_size(): 1
> >>>>> MPI::COMM_WORLD.Get_rank(): 0
> >>>>> lamrank: 1
> >>>>>
> >>>>> 0 - MPI_SEND : Invalid rank 1
> >>>>> [0] Aborting program !
> >>>>> [0] Aborting program!
> >>>>> p0_9968: p4_error: : 8262
> >>>>>
> >>>>> -----------------------------------------------------------------------------
> >>>>> It seems that [at least] one of the processes that was started with
> >>>>> mpirun did not invoke MPI_INIT before quitting (it is possible that
> >>>>> more than one process did not invoke MPI_INIT -- mpirun was only
> >>>>> notified of the first one, which was on node n0).
> >>>>>
> >>>>> mpirun can *only* be used with MPI programs (i.e., programs that
> >>>>> invoke MPI_INIT and MPI_FINALIZE).  You can use the "lamexec" program
> >>>>> to run non-MPI programs over the lambooted nodes.
> >>>>>
> >>>>> -----------------------------------------------------------------------------
> >>>>>
> >>>>> ...so you were right about LAMRANK. What next?
> >>>>>
> >>>>> --Rich
> >>>>>
> >>>>>
> >>>>> Quoting Jeff Squyres <jsquyres_at_[hidden]>:
> >>>>>
> >>>>>> On Feb 15, 2006, at 5:08 PM, rtichy_at_[hidden] wrote:
> >>>>>>
> >>>>>>>> Can you verify that you're invoking LAM's mpirun command?
> >>>>>>>
> >>>>>>> I tried using every mpirun command on my machine, including
> >>>>>>> mpirun.lam in /usr/lib/lam, and I still have the same problem: all
> >>>>>>> processes created by lam believe they have the same rank...
> >>>>>>> MPI::COMM_WORLD.Get_rank() returns 0 for every process.  I have
> >>>>>>> always used lam over a network but was told it can be used to
> >>>>>>> debug on a single machine.  Is this really the case?
> >>>>>>
> >>>>>> Yes, I run multiple processes on a single machine all the time.
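> >>>>>>
> >>>>>> For example (the host name and CPU count here are only illustrative),
> >>>>>> a one-line boot schema is enough to get a local "cluster":
> >>>>>>
> >>>>>>   $ cat bhost
> >>>>>>   localhost cpu=2
> >>>>>>   $ lamboot -v bhost
> >>>>>>   $ mpirun -np 2 ./foo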
> >>>>>>
> >>>>>> I'm not familiar with your local installation, so I cannot verify
> >>>>>> that /usr/lib/lam/mpirun.lam is the Right mpirun for the LAM
> >>>>>> installation that you're using (it sounds like it, but it depends
> >>>>>> on how your sysadmins set it up).
> >>>>>>
> >>>>>> When you run in the form:
> >>>>>>
> >>>>>> mpirun -np 4 myapp
> >>>>>>
> >>>>>> Then the lamd's should set an environment variable named LAMRANK in
> >>>>>> each process that it forks, indicating that process' rank in
> >>>>>> MPI_COMM_WORLD.  Hence, each of the 4 should get different (and
> >>>>>> unique) values.  Try calling getenv("LAMRANK") in your application
> >>>>>> to verify this.  If you get NULL back, then you're not being
> >>>>>> launched by a LAM daemon, and this is your problem (LAM assumes that
> >>>>>> if it gets NULL back from getenv("LAMRANK") it's running in
> >>>>>> "singleton" mode, meaning that it wasn't launched via LAM's mpirun
> >>>>>> and is the only process in MPI_COMM_WORLD, and therefore assumes
> >>>>>> that it is MCW rank 0).
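> >>>>>>
> >>>>>> A minimal check, right after MPI::Init, could look something like
> >>>>>> this (just a sketch; the point is to not hand a NULL pointer
> >>>>>> straight to std::string):
> >>>>>>
> >>>>>>   // Report LAMRANK if the LAM daemon set it; otherwise we were
> >>>>>>   // apparently not started by LAM's mpirun (singleton mode).
> >>>>>>   const char *lamrank = getenv("LAMRANK");
> >>>>>>   if (lamrank != NULL)
> >>>>>>       cout << "LAMRANK: " << lamrank << endl;
> >>>>>>   else
> >>>>>>       cout << "LAMRANK not set -- not launched by LAM's mpirun?" << endl;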
> >>>>>>
> >>>>>> If you *are* getting valid (and unique) values from getenv("LAMRANK")
> >>>>>> and MPI::COMM_WORLD.Get_rank() is still returning 0 from all your
> >>>>>> processes, then we need to probe a little deeper to figure out what's
> >>>>>> going on.
> >>>>>>
> >>>>>> --
> >>>>>> {+} Jeff Squyres
> >>>>>> {+} The Open MPI Project
> >>>>>> {+} http://www.open-mpi.org/
> >>>>>>
> >>>>>>
> >>>>>>
> >>>>>
> >>>>>
> >>>>>
> >>>>>
> >>>>>
> >>>>
> >>>>
> >>>
> >>>
> >>>
> >>
> >>
> >> --
> >> {+} Jeff Squyres
> >> {+} The Open MPI Project
> >> {+} http://www.open-mpi.org/
> >>
> >>
> >>
> >
> >
> >
>
>
> --
> {+} Jeff Squyres
> {+} The Open MPI Project
> {+} http://www.open-mpi.org/
>
>
>