LAM/MPI General User's Mailing List Archives

From: Jeff Squyres (jsquyres_at_[hidden])
Date: 2004-11-21 09:30:19


A few people have rightfully pointed out to me in off-list e-mails two
main themes that I should be more clear about:

1. I am NOT trying to bash libtool. I'm actually quite a strong
proponent of AC/AM/LT (indeed, I just donated code to LT about the
Portland compilers the other day). More importantly, Libtool is NOT
the only bottleneck in our build process -- it's just an easy one to
talk about. The post that I replied to specifically referred to the
build time, so the libtool bottleneck was an easy thing to reply about.
  Although AC/AM are fine tools as well, they also have limitations that
we have had to spend a considerable amount of effort to work around
(read: kludge) because what we want/need to do is outside the normal
AC/AM model. These workarounds create large bottlenecks of their own
(e.g., run ./configure in LAM and time it; also, the bootstrapping
process needed before ./configure can be run, which only developers
see, can take in excess of 5 minutes on some of our development
machines). All of these steps have to be run fairly often (remember
that we're still in a state of rapid, active development), which leaves
developers sitting around waiting for the configure/build system rather
than working on code.

2. There's the argument, "it ain't broke, so why fix it? Just be
patient." Let me explain a little more of our motivation. We're not
looking to revamp this system *now*. It may not even occur until a few
months from now (i.e., this may wait until after the first stable
release). So we're taking the time to rethink the entire
configure/build process. Specifically: should we just optimize the
current [AC/AM/LT] setup, or should we change to a new system? Hence,
this information-gathering phase. We also have some new requirements
-- the most notable of which is to be able to build natively on Windows
platforms with the Microsoft cl compiler. AC's current generation just
does not support this at all (although, in fairness, I have not [yet]
posted to the AC list asking if there are plans to do so), so we
currently have a wholly separate configure/build system for Windows.
This is *quite* unattractive from a software engineering and
maintenance point of view. (I think LT would also need to have some
knowledge of the MS cl compiler. The biggest "gotcha" is that all
of cl's command line options start with "/", not "-" -- so things like
"-g" are "/g", which usually entails an overhaul of an entire
configure/build system to parameterize on the leading option
indicator.)
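
To make "parameterize on the leading option indicator" concrete, here is
a minimal sketch in Python. It is not taken from any existing build
system: the names OPTION_PREFIX and render_flags are made up for
illustration, and the sketch only swaps the prefix character. It does
not claim that the resulting strings are the correct cl spelling of any
particular option.

```python
# Hypothetical sketch: store option names without a prefix and add the
# leading indicator only when the command line is assembled.
OPTION_PREFIX = {
    "unix": "-",  # the usual Unix compiler convention
    "msvc": "/",  # Microsoft cl, as described above
}


def render_flags(style, names):
    """Return each option name with the prefix for the given compiler style."""
    prefix = OPTION_PREFIX[style]
    return [prefix + name for name in names]


if __name__ == "__main__":
    print(render_flags("unix", ["g", "c"]))
    print(render_flags("msvc", ["g", "c"]))
```

The point of the design is that the prefix is chosen in exactly one
place, so the rest of the build description never hard-codes "-" or "/".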

So we're evaluating our current system and looking at other systems --
trying to gauge what would be the best for a long-term solution.

Hope this helps clarify our position.

On Nov 20, 2004, at 8:02 AM, Jeff Squyres wrote:

> On Nov 18, 2004, at 5:39 PM, Anthony J. Ciani wrote:
>
>>> 1. If Open MPI uses a build system that requires extra tools (such
>>> as cmake
>>> or jam or ...) to be installed in order to be built from source,
>>> would this
>>> be a deterrent to you installing Open MPI from a source tarball?
>>
>> It wouldn't be a deterrent, but why should I have to build and install
>> three packages instead of one? I haven't seen other projects
>> adopting jam
>> and cmake, so they would be sort of "LAM-only" tools. This really
>> isn't a time conservation issue, because the build time difference
>> between
>> JA/CM and AC/AM/LT would only be a minute or two. In fact, I would
>> need
>> to configure LAM many times to make up for the time spent installing
>> jam
>> and cmake. Now, if this is about portability or code maintenance,
>> that's
>> another issue, but the choice between JA/CM and AC/AM/LT really only
>> affects how the package is configured.
>
> We are thinking of changing for these reasons (portability and code
> maintenance) *and* for linking speed. I agree that compiling speed
> won't change, but GNU Libtool is actually pretty slow. Consider
> building libmpi. In Open MPI, it's several hundred .c files. This
> results in several hundred .o files, and therefore several hundred .lo
> files ("Libtool Object" files, which are text files containing meta
> data about their corresponding .o files). Libtool has to analyze each
> of these .lo files (and/or .la files, if they were previously rolled
> up into Libtool libraries) and then do the final link.
>
> Currently, Open MPI uses AC/AM/LT. I did an admittedly quite
> unscientific benchmark on a cluster with reasonably fast nodes that we
> use for Open MPI development. This was done on the head node, which
> is also the NFS server.
>
> 1. Open MPI rolls up .c files into 38 .la files, and then rolls those
> up into a single libmpi.la (there is only one level of rolling). I
> removed all .la files and invoked "make". This entailed Libtool
> rolling up all the .la files into libmpi.la, and took just under 2
> minutes (1:57, IIRC).
>
> 2. With some creative find's and grep's, I found all the .lo files
> necessary to build libmpi.la and manually issued a single libtool
> command to build libmpi.la from them (i.e., no .la rollup). This took
> just over 1 minute (1:05, IIRC).
>
> 3. I then took the same list of files and s/.lo$/.o/ and manually
> issued a single "ar" command to create libmpi.a (followed by ranlib,
> of course). This took 3 seconds.
>
> So Libtool is actually quite slow -- remember that it's a Bourne shell
> script. Don't get me wrong; I'm not knocking Libtool. It's a great
> product and we use it because it's able to make shared libraries on a
> wide variety of platforms. But *every* time we developers need to
> recompile libmpi, it means a 1 or 2 minute link -- when it really could
> be 3 seconds. This overhead exists because libtool a) is written in
> shell, and b) analyzes all of its meta dependencies. Multiply
> this by a million (which is roughly the number of times a day that an
> Open MPI developer relinks libmpi) and the result actually does add up
> to quite a significant amount of time that we're just waiting for
> Libtool. From a performance standpoint, this is a no-brainer -- this
> is at least one bottleneck that can be fixed / changed.
>
> In all fairness, I have *not* tried the Libtool 1.9/2.x betas to see
> if there is any speed improvement.
>
> Just trying to provide some concrete reasoning behind our rationale...
>
> --
> {+} Jeff Squyres
> {+} jsquyres_at_[hidden]
> {+} http://www.lam-mpi.org/
>
> _______________________________________________
> This list is archived at http://www.lam-mpi.org/MailArchives/lam/
>
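
Step 3 of the benchmark quoted above (s/.lo$/.o/ on the file list,
then a single "ar" followed by ranlib) can be sketched in Python as
below. This is an illustration, not the commands actually used in the
benchmark: the helper names lo_to_o and archive_commands, the "cr" flags
passed to ar, and the example file names are all assumptions.

```python
import re


def lo_to_o(paths):
    """Apply s/.lo$/.o/ to each path; other names are left unchanged."""
    return [re.sub(r"\.lo$", ".o", p) for p in paths]


def archive_commands(archive, lo_files):
    """Build the argument lists for one 'ar' call followed by 'ranlib'."""
    objects = lo_to_o(lo_files)
    return [["ar", "cr", archive] + objects, ["ranlib", archive]]


if __name__ == "__main__":
    # Example file names are made up for illustration.
    for argv in archive_commands("libmpi.a", ["src/send.lo", "src/recv.lo"]):
        print(" ".join(argv))
```

Each returned list could be handed to subprocess.run to perform the
actual archive step; the sketch only builds the commands so that it can
be read and checked without touching any files.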

-- 
{+} Jeff Squyres
{+} jsquyres_at_[hidden]
{+} http://www.lam-mpi.org/