Yikes; this does sound like compiler incompatibilities.
The wrapper compilers use pretty straightforward STL string
code -- you might want to look at the backtraces from where it was
failing and see exactly what it was doing (from your statements, it
looked like it was failing *in* the STL somewhere, but there was no
indication of where in the wrapper compiler itself it was called from).
FWIW, C++ (and therefore the STL) is only used in a few places in
top-level MPI executables (e.g., the wrappers). It is not used in
mpirun or libmpi at all (the C++ MPI bindings are in their own
library).
Your problem is likely not with mpirun -- mpirun simply sends messages
off to the lamd's to start processes and then waits for messages back
from the lamd's saying that the processes have finished.
When you say that your application "quickly exits without errors", I
assume you mean that it doesn't appear to be doing whatever it is
supposed to be doing. FWIW: mpirun will complain if any process in
your application does not call both MPI_INIT and MPI_FINALIZE. So if
you're not getting an error from mpirun, it looks like a lot of LAM
infrastructure has run successfully (i.e., all of MPI_INIT and
MPI_FINALIZE, which involve a bunch of synchronization with mpirun).
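If you want to rule your application out entirely, a trivial test
program like this (just a sketch -- the filename is arbitrary; compile
it with your mpicc wrapper and run it under mpirun the same way) will
tell you whether the LAM infrastructure itself is working:

    /* minimal MPI sanity check */
    #include <stdio.h>
    #include <mpi.h>

    int main(int argc, char *argv[])
    {
        int rank, size;

        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);
        MPI_Comm_size(MPI_COMM_WORLD, &size);

        /* if this prints, the process really started on the node
           and made it through MPI_INIT */
        printf("hello from rank %d of %d\n", rank, size);

        MPI_Finalize();
        return 0;
    }

If that runs and prints under "mpirun -np 1", the problem is almost
certainly in your application rather than in LAM.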
Are you getting corefiles at all? Can you tell if your application is
actually starting on the nodes? Is it mpirun that is exiting
prematurely, or is it your application that is somehow calling MPI_INIT
and MPI_FINALIZE in quick succession?
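One cheap way to answer that last question is to bracket the MPI calls
in your own main() with prints to stderr (just a sketch of lines to add
around your existing calls -- stderr is unbuffered, so the output shows
up even if the process dies right afterward):

    fprintf(stderr, "before MPI_Init\n");
    MPI_Init(&argc, &argv);
    fprintf(stderr, "past MPI_Init\n");

    /* ... your application's real work ... */

    fprintf(stderr, "before MPI_Finalize\n");
    MPI_Finalize();
    fprintf(stderr, "past MPI_Finalize\n");

If all four prints appear but nothing happens in between, MPI startup
and shutdown are fine, and the problem is somewhere in your
application's own logic (perhaps in how it finds its input when run
under mpirun).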
Hope these questions help...
On Dec 1, 2004, at 1:49 PM, Jim Shepherd wrote:
> On Wed, 2004-12-01 at 13:24 -0600, Ryuta Suzuki wrote:
>> I had the same/similar problem. What version of the gcc/g++ compiler
>> are you using? In the case of gcc-3.4.x, the STL library
>> (libstdc++.so.6) is not compatible with gcc-3.2/gcc-3.3
>> (libstdc++.so.5), and the Intel compiler seems to have trouble
>> resolving this issue. I was able to successfully build with the
>> intel 8.1 + gcc-3.3.x combination, and it works fine.
>> It is a bit strange, since the Intel compiler should be able to
>> support gcc-3.4. It might be that you have multiple versions of gcc
>> installed.
>> Hope this helps.
>
> Thanks for the suggestion. I am running on a freshly installed Fedora
> Core 3 system which only has gcc-3.4.2-6.fc3 installed (and the same
> release versions of libgcc, gcc-java, gcc-g77, and gcc-c++). Given your
> finding of gcc-3.4 and Intel 8.1 incompatibilities, I came across a
> couple of compiler switches to try via a Google search. If I use the
> -gcc-version=340 option, the segmentation fault still occurs. However,
> if I use the -cxxlib-icc option, the segmentation faults no longer
> occur.
>
> Unfortunately, when I run my MPI-compiled program with mpirun
> (mpirun -np 1 ./lmp_linux), it quickly exits without any errors. If I
> run the program without mpirun (./lmp_linux), it works as expected.
> I'm going to look into debugging mpirun, but if anyone has any
> suggestions, I would appreciate it.
>
> -Jim
>
--
{+} Jeff Squyres
{+} jsquyres_at_[hidden]
{+} http://www.lam-mpi.org/