On Feb 11, 2005, at 4:47 PM, Paul Mitchell wrote:
First, I wanted to apologize for the slow reply. I'm a bit behind this
month (Oral qualifier exams on Friday). Anyway, on with the
question....
> Just joined the list and the LAM world in general. Trying to run LAM on
> a set of G5 Xserves and have run into a problem;
>
> After downloading and installing the package on one machine (call it bp08)
> I get the following:
>
> laminfo
> LAM/MPI: 7.1.1
> Prefix: /usr/local
> Architecture: powerpc-apple-darwin7.5.0
>
> Of course, it turns out that we need Fortran support, and I have to
> compile the code.
Yeah, I wish there was a better way to do packaging with all the
Fortran compilers :/. If there is sufficient interest, I can make the
scripty foo that we use to build the OS X package available. You could
then build a custom package using the compilers of your choice. Any
opinions, anyone?
> I've done this on the same machine, however, as it turns out, /usr/local
> is NFS exported from another box (call it bp01). So, I tarred the
> directories up and moved them to bp01, untarred them and ran make
> install. On this box I get:
>
> ./laminfo
> LAM/MPI: 7.1.1
> Prefix: /usr/local
> Architecture: m68k-apple-macos
> Configured by: pmitchel
> Configured on: Fri Feb 11 15:22:40 EST 2005
> Configure host: bp08.isis.unc.edu
> Memory manager: ptmalloc2
> C bindings: yes
> C++ bindings: yes
> Fortran bindings: yes
> C compiler: gcc
> C++ compiler: g++
> Fortran compiler: /opt/ibmcmp/xlf/8.1/bin/f77
> Fortran symbols: plain
> C profiling: yes
> C++ profiling: yes
> Fortran profiling: yes
> C++ exceptions: no
> Thread support: yes
> ROMIO support: yes
> IMPI support: no
> Debug support: no
> Purify clean: no
> Segmentation fault
>
> (Let's disregard the Segmentation fault for the moment; at least I have the
> Fortran bindings). The /usr/local directory is exported back to bp08:
Actually, I think there's something critically wrong with your build of
LAM/MPI. There are three things that jump out at me right away.
First, the architecture is completely wrong (it should be
powerpc-apple-darwin*, not m68k-apple-macos). We don't support
apple-macos as a platform, as that's OS 9 (and therefore not very
Unix-like). Second, the wrong memory manager was used (Memory manager
should be none or darwin7malloc on OS X). Finally, the segmentation
fault - this is probably due to the wrong memory manager being chosen.
Of course, the wrong memory manager was probably chosen because the
wrong architecture was detected.
Which is a really long-winded way of saying that you've got a borked
build that probably isn't ever going to work properly :(. I'd
recommend fixing that by rebuilding from scratch (untar, configure,
etc.) on your NFS server, assuming they all run OS X. Otherwise, I
would be tempted to get the NFS server set up so that you can run "make
install" from the host where you want to run LAM jobs.
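In case it's useful, here is roughly what I mean, as a sketch rather than a tested recipe. The tarball name, passing the Fortran compiler as FC=..., and the helper names rebuild_lam and check_laminfo are my assumptions for illustration (check ./configure --help for how your version wants the Fortran compiler specified); the xlf path is simply the one from your laminfo output. The check only looks at the two fields discussed above.

```shell
#!/bin/sh
# Sketch only: rebuild LAM/MPI from a pristine tarball, then sanity-check it.
# rebuild_lam and check_laminfo are hypothetical helper names, not LAM tools.

# Rebuild from scratch. The body runs in a subshell so the caller's
# working directory is left alone.
# Assumed: the tarball name, and that configure accepts FC=<compiler>.
rebuild_lam() (
    tarball=${1:-lam-7.1.1.tar.gz}
    srcdir=$(basename "$tarball" .tar.gz)
    # Only build on OS X, so the binaries match the hosts that will run them.
    if [ "$(uname -s)" != "Darwin" ]; then
        echo "rebuild_lam: not an OS X host" >&2
        exit 1
    fi
    tar xzf "$tarball" &&
    cd "$srcdir" &&
    ./configure --prefix=/usr/local FC=/opt/ibmcmp/xlf/8.1/bin/f77 &&
    make &&
    make install
)

# Read plain laminfo output on stdin and complain about the two fields
# that looked wrong in this thread. Exit status is 1 if anything is off.
check_laminfo() {
    awk -F': *' '
        /^ *Architecture:/   { arch = $2; sub(/[ \t\r]+$/, "", arch) }
        /^ *Memory manager:/ { mm = $2;   sub(/[ \t\r]+$/, "", mm) }
        END {
            bad = 0
            if (arch == "") arch = "(missing)"
            if (arch !~ /^powerpc-apple-darwin/) {
                print "unexpected architecture: " arch
                bad = 1
            }
            # A missing memory manager line is not treated as an error.
            if (mm != "" && mm != "none" && mm != "darwin7malloc") {
                print "unexpected memory manager: " mm
                bad = 1
            }
            exit bad
        }'
}

# Usage: rebuild_lam lam-7.1.1.tar.gz && laminfo | check_laminfo
```

If check_laminfo prints nothing and returns 0, the architecture and memory manager at least look like what I'd expect on OS X.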
> bp01.isis.unc.edu:/mnt 80287128 10123064 69908064 13% /private/automount/usr/local
>
> Based on a letter I found from Feb 10th on this list, I went to
> /Library/Receipts/Contents/Resources and found lam-mpi.bom, which when
> queried produces:
>
> . 40775 501/0
> ./usr 40775 501/0
> ./usr/local 40775 501/0
> ./usr/local/bin 40775 501/0
> ./usr/local/bin/hboot 100755 501/0 16988 1066722712
> .
> .
> .
> ./usr/local/man/man7/libmpi.7 100644 501/0 7967 1516485839
> ./usr/local/man/man7/mpi.7 100644 501/0 7961 3186708915
> ./usr/local/man/mans 40775 501/0
> ./usr/local/man/mans/mpi.share 100644 501/0 7965 2451356181
> ./usr/local/share 40775 501/0
> ./usr/local/share/lam 40775 501/0
> ./usr/local/share/lam/doc 40775 501/0
> ./usr/local/share/lam/doc/APPLE_LICENSE 100644 501/0 19829 335628692
>
> Now, I've rebooted bp08, unmounted /usr/local, and find nothing in any of
> these directories. So there's nothing I can remove, save the lam-mpi.pkg
> directory from /Library/Receipts, I suppose. But somewhere on this box
> there's some information since the laminfo is so radically different
> between the two machines!
Since it sounds like /usr/local on bp08 was an NFS mount from bp01 when
you installed the LAM package, the data was probably all written out
into that NFS share. So my guess is that it was all overwritten when
you installed the (not so healthy) build from scratch that you were
having trouble with earlier. Other than that, I don't really know what
to tell you. I might recommend using 'which laminfo' to make sure you
don't have multiple copies of laminfo in your path or anything like
that.
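Since 'which' only reports the first match, something like the following lists every copy along your PATH in search order. It is a minimal sketch; find_in_path is a name I made up, not a LAM/MPI tool.

```shell
#!/bin/sh
# List every executable file named $1 found along $PATH, in search order.
# find_in_path is a hypothetical helper, not part of LAM/MPI.
find_in_path() {
    name=$1
    old_ifs=$IFS
    IFS=:
    for dir in $PATH; do
        # An empty PATH entry means the current directory.
        [ -n "$dir" ] || dir=.
        if [ -f "$dir/$name" ] && [ -x "$dir/$name" ]; then
            printf '%s\n' "$dir/$name"
        fi
    done
    IFS=$old_ifs
}

# Usage: find_in_path laminfo
```

If it prints more than one path, the first one listed is the copy your shell actually runs.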
As to why laminfo is so different, I can't really say without seeing
the config.log from your build (which it sounds like will eventually
become the problem you have...).
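If you want a quick look yourself first, the system type that configure detected should be recorded in config.log. The sketch below assumes a standard autoconf-generated config.log, where cached results show up as ac_cv_build=... and ac_cv_host=... lines; show_config_host is just an illustrative name.

```shell
#!/bin/sh
# Print the build/host triplets that configure cached in config.log.
# $1 is the path to config.log (defaults to ./config.log).
# show_config_host is a hypothetical helper, not part of LAM/MPI.
show_config_host() {
    grep -E '^ac_cv_(build|host)=' "${1:-config.log}"
}

# Usage: show_config_host /path/to/build/config.log
```

On a G5 running OS X I would expect both values to start with powerpc-apple-darwin.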
Hope this helps,
Brian
--
Brian Barrett
LAM/MPI developer and all around nice guy
Have a LAM/MPI day: http://www.lam-mpi.org/