Hello,
Great news! I have found the cause of the error, and it is a tricky one.
It comes from the build process: the compilation options are not the
same for the two machines (PA-RISC and Itanium). The "configure, make
and install" procedures all went through on both machines, with
practically the same warnings. It means that working with HP-UX is
quite hard!
Anyway, many thanks for your suggestions (I have verified that the
machines are indeed reachable).
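
For anyone who wants to repeat that kind of check without NetPIPE, here
is a minimal Python sketch. It is not the exact test I ran and not
anything taken from LAM itself. The function names are my own, and it
assumes IPv4 and that the address under test is configured on the local
machine. It opens a listening socket on the given address with a port
chosen by the OS, then connects back to it from the same host.

```python
import socket


def can_connect_back(ip, timeout=5.0):
    """Return True if this host can open a TCP connection to `ip` on an
    OS-chosen port where it is itself listening (hypothetical helper)."""
    try:
        with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as server:
            server.bind((ip, 0))            # port 0: let the OS pick a free port
            server.listen(1)
            server.settimeout(timeout)
            port = server.getsockname()[1]  # the port the OS actually assigned
            with socket.create_connection((ip, port), timeout=timeout):
                conn, _ = server.accept()   # complete the connection server-side
                conn.close()
        return True
    except OSError:
        # Covers a failed bind (address not local), refused connections
        # and timeouts.
        return False


def local_ipv4_addresses():
    """List the IPv4 addresses the resolver reports for this hostname
    (hypothetical helper; may not show every configured interface)."""
    return socket.gethostbyname_ex(socket.gethostname())[2]


if __name__ == "__main__":
    print("addresses for this host:", local_ipv4_addresses())
    # 125.1.2.17 is the address from the lamd error discussed below.
    print("125.1.2.17 reachable:", can_connect_back("125.1.2.17"))
```

If the second line prints False on the machine that owns the address,
the socket back to the local host is probably the place to look.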
Best regards,
V.Khiem Truong
Onera - France
>On Apr 3, 2007, at 10:31 AM, Van-Khiem Truong wrote:
>
>> You are right about the number of IP addresses on the machine. The
>> cluster has two IP addresses. I thought that the single-processor
>> machine had only one, but I just asked the system administrator, who
>> told me that it also has two. He will remove one on the
>> single-processor machine, but that is not possible for the cluster.
>>
>> How can I make LAM/MPI cope with a machine that has two IP
>> addresses?
>LAM should be able to handle machines with multiple IP addresses.
>It's been a long, long time since I've looked at this code in LAM,
>but as long as your machine can accept the connection on arbitrary
>ports on IP address 125.1.2.17, then it should work fine...?
>
>Can you verify that this is the correct IP address for your machine,
>and that arbitrary incoming socket connections can be made to that
>address from the localhost? Perhaps try running NetPIPE's TCP
>bandwidth tester on your localhost using that IP address...?
>
>> Thank you and best regards,
>>
>> V. Khiem Truong
>> Onera -France
>>
>>
>>> On Apr 2, 2007, at 9:35 AM, Van-Khiem Truong wrote:
>>>
>>>> Hello Jeff Squyres,
>>>>
>>>> Thank you for your quick response. That is really odd! I spent
>>>> some time looking into the problem.
>>>>
>>>> (1) You are right about the configuration without "memory-manager";
>>>>
>>>> (2) There is no firewall software running;
>>>>
>>>> (3) Instead of using the multiprocessor machine, I installed
>>>> LAM/MPI on a single-processor machine with the same Itanium
>>>> processor. I then ran lamboot with only that Itanium station
>>>> (with two workstations the result is the same). It fails with
>>>> the same error message as before, as you can see in the
>>>> following file:
>>>
>>> It's actually not hboot that is failing, but the lamd (hboot is
>>> mainly a wrapper around fork/exec'ing the lamd). The lamd is trying
>>> to open a socket back to 125.1.2.17 port 62915 (which *should* be the
>>> same as the local host).
>>>
>>> Do you, perchance, have multiple IP addresses on this machine? I'm
>>> wondering if LAM is using the "wrong" IP address such that it can't
>>> open a socket back to 125.1.2.17 properly.
>>
>>>> --
>>> Jeff Squyres
>>> Cisco Systems
>> _______________________________________________
>> This list is archived at http://www.lam-mpi.org/MailArchives/lam/
>
>--
>Jeff Squyres
>Cisco Systems