LAM does provide shared libraries, but they are not the default option
(mainly for historical rather than technical reasons). But shared
libraries are not the entire issue.
The bigger issue is that the MPI layer and the LAM run-time environment use
startup network protocols, and not all of that protocol code is contained
within the shared libraries (e.g., some of it is in the executables
mpirun, lamboot, etc.). These network protocols have changed over time. At
run-time, if the protocol in one executable doesn't match the protocol in
another (e.g., an MPI application and mpirun), Bad Things can happen (hangs,
segfaults, etc.).
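
To make the failure mode concrete, here is a minimal sketch of one common way
to turn a protocol mismatch into a clean error instead of a hang or crash: tag
the startup message with a version and check it on receipt. This is purely
illustrative and is not LAM's actual wire format or code; the message layout
and the names HELLO, MAGIC, pack_hello, and parse_hello are all assumptions
made up for this example.

```python
import struct

# Hypothetical startup message layout (an assumption, not LAM's format):
# 4-byte magic, 2-byte protocol version, 2-byte rank, network byte order.
HELLO = struct.Struct("!4sHH")
MAGIC = b"HELO"


class ProtocolMismatch(Exception):
    """Raised when the peer speaks a different startup protocol."""


def pack_hello(version, rank):
    """Build the bytes a process would send when it starts up."""
    return HELLO.pack(MAGIC, version, rank)


def parse_hello(data, expected_version):
    """Validate a received hello message and return the sender's rank."""
    if len(data) != HELLO.size:
        raise ProtocolMismatch(
            f"expected {HELLO.size} bytes, got {len(data)}")
    magic, version, rank = HELLO.unpack(data)
    if magic != MAGIC:
        raise ProtocolMismatch("bad magic value")
    if version != expected_version:
        raise ProtocolMismatch(
            f"peer speaks v{version}, we speak v{expected_version}")
    return rank
```

For example, parse_hello(pack_hello(3, 5), 3) returns the rank 5, while
parse_hello(pack_hello(2, 5), 3) raises ProtocolMismatch. Note that a check
like this only reports the mismatch; it does nothing to make the two versions
interoperate.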
We never made the attempt to keep the protocols compatible between versions
(or add alternate code paths to preserve old versions of network protocols)
because, as Brian mentioned, LAM is a research project -- not a
customer-paid-for software product with professional support. The software
engineering effort required to maintain backwards compatibility across all
these different revisions is quite large (and it dramatically increases the
testing matrix); as a small research project at a university, we were not
equipped to handle it. We did the best that we could, but long ago made the
executive decision that binary compatibility between versions was one
feature that we were not going to support.
On 8/22/06 9:15 PM, "Liu Xuezhao" <lxz_at_[hidden]> wrote:
>
>> Mainly because at its core, LAM has always been a research project and
>> given the choice between binary compatibility and new research, we had
>> to be able to do the research. Binary compatibility is also hard and
>> time consuming.
>>
>> With Open MPI, which is funded by people interested in production
>> environments, we are trying much harder to guarantee binary
>> compatibility. It isn't always possible, but we're trying.
>>
>> Brian
>>
> Thank you for the reply!
> I am still puzzled about the problem. I think that if applications link
> dynamically with the MPI library at run-time, then they can be binary
> compatible with different versions of MPI. Am I wrong? Why does LAM not
> provide a dynamic shared library for applications to link against? Is it
> for performance reasons?
> Can you say more about why binary compatibility is hard to guarantee and
> how you plan to handle it?
> Thanks again. Liu.
>
>
>
> _______________________________________________
> lam-devel mailing list
> lam-devel_at_[hidden]
> http://www.lam-mpi.org/mailman/listinfo.cgi/lam-devel
--
Jeff Squyres
Server Virtualization Business Unit
Cisco Systems