
LAM/MPI General User's Mailing List Archives


From: Ole Holm Nielsen (Ole.H.Nielsen_at_[hidden])
Date: 2005-07-04 04:17:01


The LAM/MPI integer*2 test dtyp/fint2_f.f fails with PGI 6.0-4
as described below. I added a couple of print statements to the
code as requested, and now the code works as expected!

So I went back to the original code and added a -O0 (no
optimization) to FFLAGS in the dtyp/Makefile.
Now the error is absent even with the original code.
Then I changed the optimization to -O1 (the PGI default level),
and the error returns.

I have repeated this test on another system with the
Portland Group PGI compiler 5.2-4, and exactly the same
behavior is observed: the error goes away when I insert
a print statement.

If I copy the receiver's test
         if (int2array(2) .ne. int2) then
           call lamtest_error_f('MPI_INTEGER2 receive error')
         endif
into the sender's code (rank .eq. 0) after the MPI_SEND,
the error also occurs on the sender side.

I've experimented with this a bit, and it seems to me that
the PGI optimizer has removed the variable named "int2".
If I make a "call dummy(int2)" subroutine call, or if I
turn int2 into an array(3), the error goes away.

It seems to me that an "optimizer fooler" such as
"call dummy(int2)", or something better, should be added to
dtyp/fint2_f.f.
On the other hand, I'm not enough of an expert to judge
whether the present error is really a compiler bug or not.
I'd be happy to report the error to the Portland Group, in case
someone is dead sure that it's a compiler bug.
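
For illustration, here is a minimal sketch of the "optimizer fooler"
idea, without any MPI calls. It is not the actual dtyp/fint2_f.f code:
the program name, the value 17, and the empty body of "dummy" are my
own assumptions, and this stand-alone program is not claimed to
reproduce the failure. It only shows where such a call would sit
relative to the comparison.

```fortran
      program fooler
c     Hypothetical stand-alone sketch, not the real lamtests source.
      integer*2 int2
      integer*2 int2array(3)
      integer i

      int2 = 17
      do i = 1, 3
         int2array(i) = int2
      enddo

c     "Optimizer fooler": int2 is passed by reference to an external
c     routine, so the compiler has to keep it as a real variable.
      call dummy(int2)

c     Same comparison as in the receiver's test quoted above.
      if (int2array(2) .ne. int2) then
         print *, 'mismatch'
      else
         print *, 'ok'
      endif
      end

c     Deliberately empty routine; it exists only to be called.
      subroutine dummy(val)
      integer*2 val
      return
      end
```

If "dummy" lives in the same source file, an aggressive optimizer may
inline it and see that it does nothing, so in practice it would
probably have to go into a separately compiled file.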

Best regards,
Ole

Jeff Squyres wrote:
> This is quite odd; the test is relatively simple. I've thought about
> this all day and I can't see how it would go wrong without the others
> going wrong as well. :-(
>
> If I could impose, could you stick a few print* statements in there and
> examine a few values to see what's happening at the relevant steps?
>
> Then again, if your applications do not use MPI_INTEGER2, don't worry
> about it! ;-) (however, I would like to figure this out and potentially
> fix whatever the problem is, if you've got a few free cycles)
>
> On Jun 29, 2005, at 10:44 AM, Ole Holm Nielsen wrote:
>
>> I managed to build the lamtests-7.1.1 with the Portland Group
>> PGI compilers version 6.0-4, employing a fix as discussed in
>> http://www.lam-mpi.org/MailArchives/lam/2005/06/10866.php
>>
>> When running the lamtests on 4 nodes over Ethernet, "make check"
>> gives a single unexpected error in a Fortran module:
>>
>> mpirun -x TEST C -ssi rpi crtcp /home/camp/ohnielse/lamtests-7.1.1/dtyp/./fint2_f
>> [**ERROR**]: LAM/MPI MPI_COMM_WORLD rank 1, file fint2_f: MPI_INTEGER2 receive error
>> FORTRAN STOP
>> FORTRAN STOP
>> FORTRAN STOP
>> -----------------------------------------------------------------------------
>> One of the processes started by mpirun has exited with a nonzero exit
>> code. This typically indicates that the process finished in error.
>> If your process did not finish in error, be sure to include a "return
>> 0" or "exit(0)" in your C code before exiting the application.
>>
>> PID 21610 failed on node n1 (10.2.131.139) with exit status 1.
>> -----------------------------------------------------------------------------
>>
>> This error occurs for all 5 modes: crtcp, lamd, sysv, tcp, usysv.
>> All other tests pass (except for the ones where errors are expected).
>> As requested in the README file, I attach a gzipped tar-file with
>> the files check.out config.log laminfo.log configure.log.
>>
>> What can be done to remedy this lamtests error?
>>
>> Ole Holm Nielsen
>> Department of Physics, Technical University of Denmark