Hello everyone,
I am currently trying to get my MPI app to run in a somewhat heterogeneous environment. Here is how I compile and launch my app:
1- Run the following command on each architecture (head node and slave nodes):
mpicc -lm -lX11 -o mandelbrot-mpi.$(laminfo -arch | cut -d' ' -f10) mandelbrot-mpi.c
This generates the following binaries:
mandelbrot-mpi.i686-pc-linux-gnu
mandelbrot-mpi.x86_64-pc-linux-gnu
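A side note on the suffix extraction: `cut -d' ' -f10` depends on the exact number of spaces that `laminfo -arch` prints before the value. Below is a minimal sketch of a less fragile variant, assuming the architecture string is the last whitespace-separated field of the line. The helper name `arch_suffix` is hypothetical, and the sample line is only an illustration of the layout, not captured output.

```shell
#!/bin/sh
# Hypothetical helper: read `laminfo -arch`-style text on stdin and print
# the last whitespace-separated field of the first non-empty line.
# Assumption: the architecture triplet is that last field.
arch_suffix() {
    awk 'NF { print $NF; exit }'
}

# Illustrative input only; the real laminfo layout may differ.
sample='        Architecture: i686-pc-linux-gnu'
suffix=$(printf '%s\n' "$sample" | arch_suffix)
echo "mandelbrot-mpi.$suffix"
```

In the compile step this would replace the `cut` call, i.e. `$(laminfo -arch | arch_suffix)`, provided the assumption about the output layout holds.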
2- Start lam-mpi on the desired nodes with lamboot:
lamboot small_hst
Where small_hst contains:
headless
thinkbig1
thinkbig21
The "headless" host is the head node (dual Opteron, x86_64) and the "thinkbig" hosts are AthlonXP nodes. lamboot starts with no complaints.
3- (Try to) use mpiexec to launch the parallel application:
mpiexec -n 4 -arch i686-pc-linux-gnu $PWD/mandelbrot-mpi.i686-pc-linux-gnu 100 200 200 1 : -arch x86_64-pc-linux-gnu $PWD/mandelbrot-mpi.x86_64-pc-linux-gnu 100 200 200 1
The output I get is:
Use of uninitialized value in concatenation (.) or string at /usr/bin/mpiexec line 641.
Use of uninitialized value in pattern match (m//) at /usr/bin/mpiexec line 640.
Use of uninitialized value in pattern match (m//) at /usr/bin/mpiexec line 640.
Use of uninitialized value in concatenation (.) or string at /usr/bin/mpiexec line 641.
Use of uninitialized value in pattern match (m//) at /usr/bin/mpiexec line 640.
Use of uninitialized value in pattern match (m//) at /usr/bin/mpiexec line 640.
Use of uninitialized value in pattern match (m//) at /usr/bin/mpiexec line 640.
Use of uninitialized value in concatenation (.) or string at /usr/bin/mpiexec line 641.
Use of uninitialized value in pattern match (m//) at /usr/bin/mpiexec line 640.
/export/home/eric/1_Files/1_ETS/1_Maitrise/MGL810/Devoir2/mandelbrot-mpi.i686-pc-linux-gnu: error while loading shared libraries: liblamf77mpi.so.0: cannot open shared object file: No such file or directory
/export/home/eric/1_Files/1_ETS/1_Maitrise/MGL810/Devoir2/mandelbrot-mpi.i686-pc-linux-gnu: error while loading shared libraries: liblamf77mpi.so.0: cannot open shared object file: No such file or directory
-----------------------------------------------------------------------------
It seems that [at least] one of the processes that was started with
mpirun did not invoke MPI_INIT before quitting (it is possible that
more than one process did not invoke MPI_INIT -- mpirun was only
notified of the first one, which was on node n0).
mpirun can *only* be used with MPI programs (i.e., programs that
invoke MPI_INIT and MPI_FINALIZE). You can use the "lamexec" program
to run non-MPI programs over the lambooted nodes.
-----------------------------------------------------------------------------
mpirun failed with exit status 252
I noticed the library loading error. I get it even though I set the following in my ~/.bashrc (which is sourced by ~/.profile; that is the only thing ~/.profile does):
if [ $(uname -m) == "x86_64" ]
then
    export LD_LIBRARY_PATH="/usr/lib64"
else
    export LD_LIBRARY_PATH="/usr/lib"
fi
This seems to have no effect unless I log in interactively (I made sure that ~/.bashrc was not being bypassed within the script in that specific case).
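To compare what an interactive login and a non-interactive remote command actually see, something like the following could be run on each node. This is only a sketch: `lib_dir_for_machine` and `report_env` are hypothetical helper names, and the mapping simply mirrors the ~/.bashrc logic above.

```shell
#!/bin/sh
# Hypothetical helper mirroring the ~/.bashrc logic: map a machine name
# (as printed by `uname -m`) to the library directory to search.
lib_dir_for_machine() {
    case "$1" in
        x86_64) echo "/usr/lib64" ;;
        *)      echo "/usr/lib" ;;
    esac
}

# Print what the current shell on this host actually sees.
report_env() {
    echo "host=$(uname -n) machine=$(uname -m)"
    echo "expected=$(lib_dir_for_machine "$(uname -m)")"
    echo "LD_LIBRARY_PATH=${LD_LIBRARY_PATH:-<unset>}"
}

report_env
```

Saved in the shared home directory (say as `report-env.sh`, a made-up name) and started non-interactively on a node, for example with `ssh thinkbig1 sh "$PWD/report-env.sh"` if ssh is the remote shell in use, or through lamexec, the output can be compared with that of an interactive login. If `LD_LIBRARY_PATH` shows as unset only in the non-interactive case, the startup files are not being read for remote commands.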
Now, the questions:
1- Am I using mpiexec correctly? I am not sure (I based my command line on the FAQ and the manpage).
2- How do I get lam-mpi to look in the correct path for the libraries? The manpage for lamboot claims that ~/.profile is sourced by default on the local nodes, but I have no way of confirming this.
3- Is setting LD_LIBRARY_PATH the real solution to my problem, or am I missing something else?
4- In this application, process 0 performs some display, so it _has_ to run on the host named "headless", where all commands are launched. Am I right to assume that process 0 will always be placed on the first node?
Thanks for the info in advance,
Eric Thibodeau
PS: I am also trying to do this with OpenMPI. If it's easier to accomplish under OpenMPI, please don't hesitate to tell me, since I found no evidence that it is (I also decided not to cross-post this to the OpenMPI list for the moment).