We actually added this to LAM a few versions back. The problem is that
many of the versions still out there don't have it. So if LAM had
continued development, we'd eventually have gotten to the point where
these problems at least gave useful error messages (assuming people
retired old versions of LAM). Of course, if you are mixing a version
of LAM with version checking and a version without, you still might not
always get useful error messages :/.
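
In the meantime, something like the rough sketch below could catch the
most common case by hand. It is not part of LAM and is only an
illustration: the function names and host names are made up, and it
assumes passwordless ssh to every node. It runs the same "which lamboot"
check discussed further down on each node and reports any node whose
answer differs from the home node's. Note that it compares install
paths, not version numbers, so it only helps if each LAM version lives
under its own prefix.

```python
import subprocess


def remote_lamboot_path(host, timeout=15):
    """Return the lamboot path that a non-interactive ssh session sees on host.

    Returns None if ssh fails, times out, or lamboot is not in the PATH.
    """
    # Equivalent to "ssh <host> which lamboot". BatchMode keeps ssh from
    # prompting for a password, so a misconfigured node fails quickly.
    try:
        result = subprocess.run(
            ["ssh", "-o", "BatchMode=yes", host, "which", "lamboot"],
            capture_output=True, text=True, timeout=timeout,
        )
    except (OSError, subprocess.TimeoutExpired):
        return None
    if result.returncode != 0:
        return None
    return result.stdout.strip() or None


def find_mismatches(paths, home):
    """Return {host: path} for every host whose path differs from home's."""
    expected = paths[home]
    return {host: path for host, path in paths.items() if path != expected}


def report(paths, home):
    """Build a readable summary naming the home path and each mismatch."""
    if paths[home] is None:
        return f"lamboot not found on home node {home}"
    bad = find_mismatches(paths, home)
    if not bad:
        return f"all nodes use {paths[home]}"
    lines = [f"home node {home} uses {paths[home]}"]
    for host in sorted(bad):
        lines.append(f"MISMATCH on {host}: {bad[host] or 'lamboot not found'}")
    return "\n".join(lines)
```

To use it, build the dictionary with something like
{h: remote_lamboot_path(h) for h in hosts} and print
report(paths, hosts[0]), where hosts is your own list of node names with
the home node first.
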
Thanks for the feedback!
Brian
On Jun 22, 2005, at 9:12 AM, Pierre Valiron wrote:
> Hi folks,
>
> I feel these lam/mpi version mismatch problems are *very* frequent on
> the list.
>
> After all, it is not so obvious for the end user to be sure he has set
> his PATH correctly (as it may show up differently in a normal login and
> in an rsh call). Also, when using poor man's clusters, a version mismatch
> in the same PATH is always possible.
>
> I wonder whether it would be possible to somehow automate the version
> checking during the lamboot process? That might convert most of these
> errors into a clear message indicating the version found on the home
> node and on the mismatching ones.
>
> This would imply that the preliminary boot phase would use an additional
> version-independent ack, which would fail with an explicit message for
> all newer versions and would state a less explicit version mismatch with
> older versions.
>
> Ideally, the library itself should also recognize that all executables
> have been linked to the same version, because a mismatch may result in
> a later hang or even in wrong results.
>
> At least that automatic consistency check might be a useful requirement
> for the Open-MPI developments.
>
> Best.
> Pierre.
>
>
>
>
> Brian Barrett wrote:
>
>> On Jun 21, 2005, at 10:02 PM, Madhurjya P. Bora wrote:
>>
>>
>>
>>> I have lam-7.0.3 successfully running on my Fedora Core 2 standalone
>>> machine (localhost), which I use for test purposes. This lam-7.0.3 came
>>> as an RPM along with the system.
>>>
>>> When I built lam-7.1.1 from the .tar.bz2 package for the
>>> Lahey-Fujitsu FORTRAN 95 compiler, the build went through successfully.
>>> But lamboot from the newly built package complains about TCP random
>>> ports. However, recon is successful!
>>>
>>> My configure option was just with a prefix dir i.e. ./configure
>>> -prefix=/usr/local/lam/lf95.
>>> The old lam still boots! I'm using SSH-2. Kindly help!
>>>
>>>
>>
>> Tim is right - you should read the error message before posting ;).
>> Unfortunately, you didn't include enough information for me to be able
>> to help you. There are a couple of different things that could be the
>> problem. First, the default RPM install is going to be in /usr/bin.
>> Make sure that /usr/local/lam/lf95/bin appears in your path before
>> /usr/bin. That means you should be able to do:
>>
>> ssh localhost which lamboot
>>
>> And see "/usr/local/lam/lf95/bin/lamboot". If you see
>> "/usr/bin/lamboot", you do not have your path set up correctly. Please
>> see the LAM FAQ for more information about the requirements for setting
>> up your path.
>>
>> It's also possible that there is a problem with the installation of
>> LAM. If you are still having problems once you are sure you have your
>> path set up correctly, please send the output of lamboot with the "-d
>> -v" flags specified (in addition to your normal arguments). This will
>> give a bunch of diagnostic information that can be useful in figuring
>> out what is going on. Post that information to this list and we should
>> be able to track down the problem.
>>
>> Hope this helps,
>>
>> Brian
>>
>>
>>
>
>
> --
> Soutenez le mouvement SAUVONS LA RECHERCHE :
> http://recherche-en-danger.apinc.org/
>
> _/_/_/_/ _/ _/ Dr. Pierre VALIRON
> _/ _/ _/ _/ Laboratoire d'Astrophysique
> _/ _/ _/ _/ Observatoire de Grenoble / UJF
> _/_/_/_/ _/ _/ BP 53 F-38041 Grenoble Cedex 9 (France)
> _/ _/ _/ http://www-laog.obs.ujf-grenoble.fr
> _/ _/ _/ mail: Pierre.Valiron_at_[hidden]
> _/ _/ _/ Phone: +33 4 7651 4787 Fax: +33 4 7644 8821
> _/ _/_/
>
>
>
> _______________________________________________
> This list is archived at http://www.lam-mpi.org/MailArchives/lam/
>
--
Brian Barrett
LAM/MPI developer and all around nice guy
Have a LAM/MPI day: http://www.lam-mpi.org/