LAM/MPI General User's Mailing List Archives

From: Rich Naff (rlnaff_at_[hidden])
Date: 2007-04-16 15:16:47


You guys are tough! Don't let me get away with anything! Okay.

The 64-bit dual-core Opteron machine only has Open MPI installed, and I
would like to test my code in that environment. However, the only 64-bit
Fortran compiler available to me is gfortran 4.1.1. Tim Prince asked why
I had recommended g95 over gfortran; it is on the basis of experiences
such as these with gfortran that I suggested g95. Here is the result of
running the same problem with gfortran 4.1.1 and LAM 7.1.3 (32-bit
machine):

**************************************************************************
(bash) robocomp.pts/6% ./lamboot_robo

LAM 7.1.3/MPI 2 C++/ROMIO - Indiana University

n-1<12713> ssi:boot:base:linear: booting n0 (robocomp)
n-1<12713> ssi:boot:base:linear: finished
(bash) robocomp.pts/6% lamboot

LAM 7.1.3/MPI 2 C++/ROMIO - Indiana University

(bash) robocomp.pts/6% mpirun -np 1 parent_713_gf.ex
  Input path to spawn executable
./
  Input name of spawn executable
PPCG_713_gf.ex
  Problem size currently 8x8x8; do you wish to change the problem size? (y/n)
n
  Input max_iter, max_L2
100 .01
  Input precond, infill_no
3 0
  Input max_part
8
  Input p_flag, t_flag
0 0
  Input i_bound type: 0, 1, 2 or 3
0
-----------------------------------------------------------------------------
One of the processes started by mpirun has exited with a nonzero exit
code. This typically indicates that the process finished in error.
If your process did not finish in error, be sure to include a "return
0" or "exit(0)" in your C code before exiting the application.

PID 12722 failed on node n0 (127.0.0.1) due to signal 11.
-----------------------------------------------------------------------------
*****************************************************************************
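
For reference, the failing frame Jeff pointed at in my earlier trace
(mpi_comm_get_attr_f, quoted below) is the Fortran binding of
MPI_Comm_get_attr. A well-known 64-bit pitfall with that call is declaring
the attribute value as a default INTEGER instead of
INTEGER(KIND=MPI_ADDRESS_KIND): the mismatch can go unnoticed on a 32-bit
build and then fail on a 64-bit one. The sketch below only illustrates
that kind of query; the MPI_UNIVERSE_SIZE keyval and the variable names
are assumptions for illustration, not taken from my code.

program attr_query_sketch
  ! Illustrative sketch only: query an attribute on MPI_COMM_WORLD.
  ! The attribute value argument of MPI_Comm_get_attr must be declared
  ! INTEGER(KIND=MPI_ADDRESS_KIND); a default INTEGER may appear to work
  ! on a 32-bit build and then crash (signal 11) on a 64-bit build.
  implicit none
  include 'mpif.h'
  integer :: ierr
  integer(kind=MPI_ADDRESS_KIND) :: attr_val
  logical :: flag

  call MPI_Init(ierr)
  ! MPI_UNIVERSE_SIZE is used here purely as an example keyval.
  call MPI_Comm_get_attr(MPI_COMM_WORLD, MPI_UNIVERSE_SIZE, attr_val, &
                         flag, ierr)
  if (flag) then
     print *, 'universe size =', attr_val
  else
     print *, 'MPI_UNIVERSE_SIZE attribute not set'
  end if
  call MPI_Finalize(ierr)
end program attr_query_sketch

Whether a kind mismatch like that is actually what is biting me here is
only a guess, but it is the first thing I would rule out before blaming
the compiler.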

I can tar this thing up and send it to you, if you think it is
worthwhile.

So you believe that a trial with gfortran 4.2 would be worth the effort?

--Rich Naff

On Mon, 16 Apr 2007, Jeff Squyres wrote:

> FWIW, it looks like you are using Open MPI, not LAM/MPI. Is that
> what you intended? (you're mailing the LAM/MPI mailing list, not the
> Open MPI mailing list)
>
> I ask because if that's not what you intended, perhaps mixing and
> matching of MPI implementations could be what is causing you problems.
>
> Additionally, from the stack trace you showed, it looks like the
> problem is in a call to MPI_Comm_get_attr. That *might* be
> relatively simple to track down. I have personally been using
> gfortran for all OMPI development for years; I haven't used g77 or
> g95 in a long, long time. But then again, I'm an MPI developer --
> not a real fortran user, so it's quite possible that I'm unaware of
> glaring gfortran bugs that would be problematic for real applications...
>
>
>
> On Apr 16, 2007, at 2:14 PM, Rich Naff wrote:
>
>> The issues I have with gfortran concern version 4.1.1; the gfortran web
>> page indicates that the developers are not interested in bugs for
>> versions older than 4.2:
>>
>> "gfortran is under development. that is a polite way of saying it
>> is not
>> finished..bugs are fixed and new features are added every day. before
>> reporting a bug, please check and see if you are using an old
>> version. (at
>> this date anything older than gcc-4.2 should be upgraded)."
>>
>> As I live in a world where the computer administrator does the
>> upgrades, I
>> am reluctant to ask for upgrades unless I can justify the effort.
>>
>> I have the unfortunate experience of having only gfortran 4.1.1
>> available on our 64-bit dual Opteron machine, and a program that works
>> with other compilers on other 32-bit machines doesn't execute properly
>> when compiled with gfortran:
>>
>> ******************************************************************
>> (bash) lobo09% mpirun -np 1 -hostfile hosts_09_16 parent_ompi_gf.ex
>> Input path to spawn executable
>> ./
>> Input name of spawn executable
>> PPCG_ompi_gf.ex
>> Problem size currently 8x8x8; do you wish to change the problem
>> size? (y/n)
>> n
>> Input max_iter, max_L2
>> 100 .01
>> Input precond, infill_no
>> 3 0
>> Input max_part
>> 8
>> Input p_flag, t_flag
>> 0 0
>> Input i_bound type: 0, 1, 2 or 3
>> 0
>> Signal:11 info.si_errno:0(Success) si_code:1(SEGV_MAPERR)
>> Failing at addr:(nil)
>> [0] func:/usr/lib64/libopal.so.0 [0x2af8a06d994f]
>> [1] func:/lib64/libpthread.so.0 [0x2af8a0f321f0]
>> [2] func:/usr/lib64/libmpi.so.0(mpi_comm_get_attr_f+0x46)
>> [0x2af8a0436316]
>> [3] func:parent_ompi_gf.ex(__ms_pcg_parent__spawn+0x539) [0x41d2bc]
>> [4] func:parent_ompi_gf.ex(__ms_pcg_parent__ms_partition+0x5c3)
>> [0x4213f2]
>> [5] func:parent_ompi_gf.ex(MAIN__+0x108d) [0x410c77]
>> [6] func:parent_ompi_gf.ex(main+0xe) [0x4235be]
>> [7] func:/lib64/libc.so.6(__libc_start_main+0xf4) [0x2af8a1057e64]
>> [8] func:parent_ompi_gf.ex [0x403699]
>> *** End of error message ***
>> mpirun noticed that job rank 0 with PID 30086 on node "lobo09" exited
>> on signal 11.
>> 1 additional process aborted (not shown)
>> ***********************************************************************
>>
>> I obtain a similar result when compiling and executing on a 32-bit
>> machine (version 4.1.1). With a g95 compilation on a 32-bit machine,
>> I at least get a complete execution:
>>
>> *******************************************************************
>> (bash) grundvand.pts/9% mpirun -np 1 -hostfile hosts_grundvand
>> parent_ompi_g
>> x
>> Input path to spawn executable
>> ./
>> Input name of spawn executable
>> PPCG_ompi_g95.ex
>> Problem size currently 8x8x8; do you wish to change the problem
>> size? (y/n)
>> n
>> Input max_iter, max_L2
>> 100 .01
>> Input precond, infill_no
>> 3 0
>> Input max_part
>> 8
>> Input p_flag, t_flag
>> 0 0
>> Input i_bound type: 0, 1, 2 or 3
>> 0
>> Number of spawned processes = 8
>>
>> SOLVE PROBLEM 1
>> max_L2= 0.0000950043185981186
>> iterations required= 38
>> SQUARE ROOT, SUMMED-SQUARED RESIDUALS; 4.61819E-05
>> ABSOLUTE MAXIMUM RESIDUAL; 1.24164E-04
>> MEAN ABSOLUTE RESIDUAL; 3.61884E-05
>> sqrt(x_resid.dot.x_resid); 1.04498E-03
>>
>> r.dot.x_resid; -4.64876E-05
>> sqrt(r.dot.r); .136107
>> **************************************************************************
>>
>> Don't get me wrong; the gfortran compiler appears superior to g95 in
>> catching programming errors, so I am looking forward to its continued
>> development. However, I am now faced with the problem of asking our
>> system administrator for an upgrade: should it be to 64-bit g95, or
>> 64-bit gfortran version 4.2?
>>
>>
>> Concerning your other point, I use Makefiles to compile these complex
>> codes. Our g95 compiler sits on one machine (lobo) and I frequently
>> run on other machines. I set the following variables and flags in the
>> makefile:
>>
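>> # OMPI_FC selects the Fortran compiler that Open MPI's mpif90 wrapper
>> # invokes; exporting it makes the setting visible to the wrapper.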
>> OMPI_FC=/z/lobo/usr/local/bin/g95
>> export OMPI_FC
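>> # LD_LIBS adds g95's runtime library (libf95) from the lobo install to
>> # the final link line.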
>> LD_LIBS = -L/z/lobo/usr/local/g95/lib/gcc-lib/i686-pc-linux-gnu/4.0.1/ -lf95
>>
>> .
>> .
>> .
>>
>> FLAGS = $(USRLIB) $(SYSLIBS) $(LD_LIBS) $(FCFLAGS)
>> all: ${PROGRAM}
>>
>> ${PROGRAM} : ${OBJECT_FILES}
>> $(FC) -o $@ ${OBJECT_FILES} $(FLAGS)
>>
>>
>> So to answer your question, while I would have assumed that g95 is
>> similar to gfortran in that it is built upon gcc, it appears to behave
>> more like traditional Fortran libraries when compiled and executed.
>> Caveat emptor: I haven't really tested g95 in the cluster environment,
>> so this may be a problem. On our cluster, I generally use lf95.
>>
>> Rich Naff
>>
>> On Mon, 16 Apr 2007, Timothy C Prince wrote:
>>
>>>
>>>
>>> -----Original Message-----
>>> From: Rich Naff <rlnaff_at_[hidden]>
>>> To: General LAM/MPI mailing list <lam_at_[hidden]>
>>> Date: Mon, 16 Apr 2007 09:13:33 -0600 (MDT)
>>> Subject: Re: LAM: 2 question on lam
>>>
>>> On Sat, 14 Apr 2007, Ruhollah Moussavi Baygi wrote:
>>> ..
>>> ..
>>> ..
>>>> 1- By default, the wrapper compiler mpif77 refers to the g77
>>>> compiler. But I want to use gfortran instead of g77 in parallel
>>>> mode, i.e. I want mpif77 to use gfortran when compiling a FORTRAN
>>>> program. Please help me.
>>>>
>>>
>>> In the Linux world with the bash shell, you can temporarily change
>>> compilers with the following unset and export commands:
>>>
>>> unset LAMHF77
>>> export LAMHF77=gfortran (or path to other compiler)
>>>
>>>
>>> I recommend g95 over gfortran; I find gfortran to be a
>>> little buggy.
>>>
>>> _______________________________
>>> That's undeniably true of gfortran 4.0; there are plenty of
>>> sources for better, up-to-date versions of gfortran. You also have
>>> several options for reporting any bugs you find in current
>>> versions; please don't call it buggy without providing specifics.
>>>
>>> An earlier reply pointed out that the LAM f77 libraries must be
>>> rebuilt to match gfortran, unless you can somehow get by with g77
>>> compatibility options. Don't you have the same issues with g95?
>>> Tim Prince
>>>
>>> _______________________________________________
>>> This list is archived at http://www.lam-mpi.org/MailArchives/lam/
>>>
>>
>> --
>> Rich Naff
>> U.S. Geological Survey
>> MS 413 Box 25046
>> Denver, CO 80225 USA
>> telephone: (303) 236-4986
>> Email: rlnaff_at_[hidden]
>> _______________________________________________
>> This list is archived at http://www.lam-mpi.org/MailArchives/lam/
>
>
>

-- 
Rich Naff
U.S. Geological Survey
MS 413 Box 25046
Denver, CO 80225 USA
telephone: (303) 236-4986
Email: rlnaff_at_[hidden]