This is certainly a fair point.
However, my concern here is that the configure/build system is an
entire sub-system in its own right -- it's rather large and complex,
which means that it can harbor plenty of bugs of its own. For example,
the current configure/build system is more than 20,000 lines of code.
The thought of creating and maintaining two sub-systems of this size
that perform roughly the same function is daunting, to say the least.
On Nov 22, 2004, at 12:04 PM, Bussoletti, John E wrote:
> Jeff,
>
> It seems you have two problems that you're trying to address with a
> single solution.
>
> On the one hand, for the development process, you want fast builds. On
> the other hand you want universal ability to build LAM on a wide
> variety
> of systems.
>
> Why insist on a single answer to two problems? It seems to me that you
> could use cmake etc internally and, prior to a release, revert back to
> the more-widely available toolset automake, autoconf etc.
>
> John Bussoletti
>
>
> -----Original Message-----
> From: Jeff Squyres [mailto:jsquyres_at_[hidden]]
> Sent: Sunday, November 21, 2004 6:30 AM
> To: General LAM/MPI mailing list
> Subject: Re: LAM: Poll for LAM users
>
>
> A few people have rightfully pointed out to me in off-list e-mails two
> main themes that I should be more clear about:
>
> 1. I am NOT trying to bash libtool. I'm actually quite a strong
> proponent of AC/AM/LT (indeed, I just donated code to LT about the
> Portland compilers the other day). More importantly, Libtool is NOT
> the only bottleneck in our build process -- it's just an easy one to
> talk about. The post that I replied to specifically referred to the
> build time, so the libtool bottleneck was an easy thing to reply about.
> Although AC/AM are fine tools as well, they also have limitations that
> we have had to spend a considerable amount of effort to work around
> (read: kludge) because what we want/need to do is outside the normal
> AC/AM model. These workarounds introduce bottlenecks of their own (e.g., run
> ./configure in LAM and time it; also, the bootstrapping process for
> running ./configure, which only developers see, can take in excess of 5
> minutes on some of our development machines). All of these
> bottlenecks, which need to be invoked not infrequently (remember that
> we're still in a state of rapid, active development), cause developers
> to be sitting around waiting for the configure/build system rather than
> working on code.
>
> 2. There's the argument, "it ain't broke, so why fix it? Just be
> patient." Let me explain a little more of our motivation. We're not
> looking to revamp this system *now*. It may not even occur until a few
> months from now (i.e., this may wait until after the first stable
> release). So we're taking the time to rethink the entire
> configure/build process. Specifically: should we just optimize the
> current [AC/AM/LT] setup, or should we change to a new system? Hence,
> this information-gathering phase. We also have some new requirements
> -- the most notable of which is to be able to build natively on
> Windows
> platforms with the Microsoft cl compiler. AC's current generation just
> does not support this at all (although, in fairness, I have not [yet]
> posted to the AC list asking if there are plans to do so), so we
> currently have a wholly separate configure/build system for Windows.
> This is *quite* unattractive from a software engineering and
> maintenance point of view (I think LT would also need to have some
> knowledge of the MS cl compiler. The biggest "gotcha" is that all
> of cl's command line options start with "/", not "-" -- so things like
> "-g" are "/g", which usually entails an overhaul of an entire
> configure/build system to parameterize on the leading option
> indicator).
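The parameterization described above might be sketched like this (a
minimal illustration; the helper name and mapping are hypothetical, not
taken from any actual build system):

```shell
#!/bin/sh
# Hypothetical helper: choose the leading option indicator for a given
# compiler. Assumes, per the discussion above, that Microsoft cl wants
# "/" where GNU-style compilers want "-".
opt_prefix() {
  case "$1" in
    cl|cl.exe) printf '/' ;;      # Microsoft cl command-line style
    *)         printf '%s' '-' ;; # GNU-style compilers
  esac
}

# Build a debug flag for each compiler style.
echo "gcc debug flag: $(opt_prefix gcc)g"  # prints: gcc debug flag: -g
echo "cl debug flag: $(opt_prefix cl)g"    # prints: cl debug flag: /g
```

The point is that every flag in the build system would have to be
routed through such a helper, which is why it amounts to an overhaul.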
>
> So we're evaluating our current system and looking at other systems --
> trying to gauge what would be the best for a long-term solution.
>
> Hope this helps clarify our position.
>
>
> On Nov 20, 2004, at 8:02 AM, Jeff Squyres wrote:
>
>> On Nov 18, 2004, at 5:39 PM, Anthony J. Ciani wrote:
>>
>>>> 1. If Open MPI uses a build system that requires extra tools (such
>>>> as cmake
>>>> or jam or ...) to be installed in order to be built from source,
>>>> would this
>>>> be a deterrent to you installing Open MPI from a source tarball?
>>>
>>> It wouldn't be a deterrent, but why should I have to build and
>>> install three packages instead of one? I haven't seen other projects
>>> adopting jam and cmake, so they would be sort of "LAM-only" tools.
>>> This really isn't a time conservation issue, because the build time
>>> difference between
>>> JA/CM and AC/AM/LT would only be a minute or two. In fact, I would
>>> need
>>> to configure LAM many times to make up for the time spent installing
>>> jam
>>> and cmake. Now, if this is about portability or code maintenance,
>>> that's
>>> another issue, but the choice between JA/CM and AC/AM/LT really only
>>> affects how the package is configured.
>>
>> We are thinking of changing for these reasons (portability and code
>> maintenance) *and* for linking speed. I agree that compiling speed
>> won't change, but GNU Libtool is actually pretty slow. Consider
>> building libmpi. In Open MPI, it's several hundred .c files. This
>> results in several hundred .o files, and therefore several hundred .lo
>> files ("Libtool Object" files, which are text files containing
>> metadata about their corresponding .o files). Libtool has to analyze
>> each of these .lo files (and/or .la files, if they were previously
>> rolled up into Libtool libraries) and then do the final link.
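For reference, a .lo file is just a small shell-style fragment that
records where the objects live; a sketch (contents illustrative, not
verbatim from any particular Libtool version) looks roughly like:

```shell
# foo.lo - a Libtool object file (illustrative sketch)
# Generated by libtool; records the PIC and non-PIC object locations
# so that later link steps can pick the right one.
pic_object='.libs/foo.o'
non_pic_object='foo.o'
```

Parsing hundreds of these in a shell script is where much of the link
time goes.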
>>
>> Currently, Open MPI uses AC/AM/LT. I did an admittedly quite
>> unscientific benchmark on a cluster with reasonably fast nodes that
>> we use for Open MPI development. This was done on the head node,
>> which is also the NFS server.
>>
>> 1. Open MPI rolls up .c files into 38 .la files, and then rolls those
>> up into a single libmpi.la (there is only one level of rolling). I
>> removed all .la files and invoked "make". This entailed Libtool
>> rolling up all the .la files into libmpi.la, and took just under 2
>> minutes (1:57, IIRC).
>>
>> 2. With some creative finds and greps, I found all the .lo files
>> necessary to build libmpi.la and manually issued a single libtool
>> command to build libmpi.la from them (i.e., no .la rollup). This took
>> just over 1 minute (1:05, IIRC).
>>
>> 3. I then took the same list of files and s/.lo$/.o/ and manually
>> issued a single "ar" command to create libmpi.a (followed by ranlib,
>> of course). This took 3 seconds.
>>
>> So Libtool is actually quite slow -- remember that it's a Bourne shell
>> script. Don't get me wrong; I'm not knocking Libtool. It's a great
>> product and we use it because it's able to make shared libraries on a
>> wide variety of platforms. But *every* time we developers need to
>> recompile libmpi means a 1 or 2 minute link -- when it really could be
>> 3 seconds. Hence, this is overhead because libtool is (a) written as
>> a shell script, and (b) analyzing all of its meta dependencies. Multiply
>> this by a million (which is roughly the number of times a day that an
>> Open MPI developer relinks libmpi) and the result actually does add up
>> to quite a significant amount of time that we're just waiting for
>> Libtool. From a performance standpoint, this is a no-brainer -- this
>> is at least one bottleneck that can be fixed / changed.
>>
>> In all fairness, I have *not* tried the Libtool 1.9/2.x betas to see
>> if there is any speed improvement.
>>
>> Just trying to provide some concrete reasoning behind our rationale...
>>
>> --
>> {+} Jeff Squyres
>> {+} jsquyres_at_[hidden]
>> {+} http://www.lam-mpi.org/
>>
>> _______________________________________________
>> This list is archived at http://www.lam-mpi.org/MailArchives/lam/
>>
>
> --
> {+} Jeff Squyres
> {+} jsquyres_at_[hidden]
> {+} http://www.lam-mpi.org/
>
> _______________________________________________
> This list is archived at http://www.lam-mpi.org/MailArchives/lam/
>
--
{+} Jeff Squyres
{+} jsquyres_at_[hidden]
{+} http://www.lam-mpi.org/