LAM/MPI General User's Mailing List Archives

From: Eric Swenson (eric_at_[hidden])
Date: 2005-02-23 19:50:51


Hi,

We have an old Fortran 77 code that makes huge static memory allocations,
so we've seen the same problem on x86_64. (I'm assuming this is running
on Opterons?)

- You want to use a fairly recent distribution, such as SUSE Enterprise
9 or SUSE Pro 9.2 (64-bit versions!), with updates applied. I'm sure
other recent distros will work as well, but we've had our best luck with
SUSE. Somewhat older distros will NOT let you statically allocate blocks
of memory larger than 2GB.

- I'd recommend that you try using pgf77 or pgf90 with the
-mcmodel=medium compiler flag. Please don't use g77, for the love of
all that is holy. BTW, make sure that you are using the 64-bit version
of pgf77; I've seen people install the 32-bit version and wonder why
they couldn't produce 64-bit code with it.

And with that, you should be good to go. Of course, the better option
would be to do as others here have suggested and allocate the memory
dynamically instead, but the two recommendations above will get you
going if you wish to stick with static allocation.
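
For anyone who wants to see what the dynamic route looks like, here is a
minimal sketch in C (not the original poster's Fortran, and not code from
anyone in this thread). The names complex8, matrix_bytes, exceeds_2gb and
alloc_matrix are made up for illustration. It works out the storage for an
n x n single-precision complex matrix at 8 bytes per element, checks that
against the signed 32-bit range that an R_X86_64_32S relocation can encode
(a simplification: the real limit also depends on where the linker places
the section), and takes the memory from the heap instead of .bss.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Single-precision complex: two 4-byte floats, 8 bytes per element. */
typedef struct { float re, im; } complex8;

/* Bytes needed for an n x n complex8 matrix; returns 0 on overflow. */
size_t matrix_bytes(size_t n)
{
    if (n != 0 && n > SIZE_MAX / n)
        return 0;
    size_t elems = n * n;
    if (elems > SIZE_MAX / sizeof(complex8))
        return 0;
    return elems * sizeof(complex8);
}

/* Returns 1 if the size is beyond what a signed 32-bit offset can reach.
   Assumption: this ignores the section's own start address. */
int exceeds_2gb(size_t bytes)
{
    return bytes > (size_t)INT32_MAX;
}

/* Allocate a zeroed n x n matrix on the heap; NULL on failure or n == 0.
   The caller releases it with free(). */
complex8 *alloc_matrix(size_t n)
{
    assert(sizeof(complex8) == 8);
    if (n == 0 || matrix_bytes(n) == 0)
        return NULL;
    return calloc(n * n, sizeof(complex8));
}
```

For n = 16400, matrix_bytes gives 2151680000 bytes, which matches the
2.15GB figure quoted below and is past the signed 32-bit limit. A caller
would use alloc_matrix(n), check the result for NULL, and free() it when
done. The node still needs enough memory for the whole matrix, of course;
this only gets you past the linker.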

Good luck,
Eric

On Tue, 2005-02-22 at 17:43 -0500, Brian Barrett wrote:
> On Feb 22, 2005, at 4:32 PM, Srinivasa Prade Patri wrote:
>
> > I am running an LU decomposition algorithm on a 23-node cluster.
> > Each node has 2GB of RAM. When I try to run the algorithm for a
> > single-precision complex matrix of size 16400*16400, I get the
> > following compilation errors.
> >
> > /tmp/ccmtcg1e.o(.text+0xa1): In function `MAIN__':
> > : relocation truncated to fit: R_X86_64_32S .bss
>
> <snip>
>
> > I know that for such a big matrix the matrix
> > size is more than 2GB (16400*16400*8 bytes = 2.15GB). Is this the
> > reason for the above compilation errors?
> >
> > If this is the case, are there any
> > techniques that would let me run the algorithm for matrix sizes larger
> > than 16400*16400? Or is the cluster I am using limited to a certain
> > problem size (16384*16384)?
>
> Based on your error messages and some Google searches, it looks like
> you're tripping over a limitation in at least some versions of the GCC
> suite on x86_64 machines. They don't seem to do well with large static
> sections (anywhere they have to have hard-coded addresses > 2GB in
> size). I was unable to reproduce the problems on our Opteron machines,
> but they are running a very recently installed version of Gentoo, so
> maybe the bug was fixed along the way? Based on that, upgrading might
> help. The other option is to stick the large matrix in the heap
> instead of the text or data section by making it dynamically allocated.
> That should keep you from angering the linker gods.
>
>
> Hope this helps,
>
> Brian
>