Bogdan Costescu wrote:
> On Mon, 9 Jul 2007, Filipe Garrett wrote:
>
>> the "lamnodes" directly on the shell correctly reports both nodes
>> but when run from within PBS just the client is reported.
>
> This means that you have started the LAM/MPI daemons on the nodes.
>
> But just to make sure: have you started lamd on the nodes outside of
> PBS and expect to only run 'mpirun' from inside the PBS batch script ?
No, I started "lamd" from inside the PBS script.
>
>> I've tried to run it with -v $PBS_NODEFILE but nothing changed. Does
>> this file have to include all the computing nodes or just the client
>> nodes? I mean, do I include both nodes or just the client?
>
> The LAM/MPI daemons have to be started on each node that you plan to
> use for the MPI job. So, if you want to use both nodes, then both
> their names should be in the file.
>
Ok
>> Configured on: Mon Jun 12 18:27:10 EDT 2006
>> Configure host: ls20-bc2-14.build.redhat.com
>
> I guess that the installation of LAM/MPI that you are using is the
> pre-packaged one from Red Hat (on some version of RHEL). This is not
> configured with TM support, so the only chance for you to have native
> PBS support in LAM/MPI is to get the LAM/MPI source and
> compile/install it yourself. But before doing that, please remove the
> existing LAM/MPI installation to avoid any bad effects from having 2
> versions installed side-by-side.
>
Thanks a lot for your help. Since this installation doesn't support "tm boot",
I simply changed it to "rsh" and added the "-v nodefile" option. And it works
perfectly! Thanks a lot!
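
For the archives, here is a minimal sketch of the rsh-based approach described
above. It is not the actual script: the function names make_bhost and
run_lam_job are made up for illustration, and the sketch assumes that rsh
between the nodes works without a password and that $PBS_NODEFILE lists each
host once per allocated CPU. Collapsing the node file into "host cpu=N" lines
is optional tidying, not something the fix depends on.

```shell
#!/bin/sh
# Sketch only: helper functions for a PBS batch script that boots
# LAM/MPI with the rsh boot module instead of the tm module.

# Collapse a PBS node file (assumed: one line per allocated CPU) into a
# LAM boot schema with one "host cpu=N" line per node.
make_bhost() {
    sort "$1" | uniq -c | awk '{ print $2 " cpu=" $1 }'
}

# Boot LAM over rsh, run the program, then shut the daemons down.
#   $1 = PBS node file, $2 = MPI program (placeholder)
run_lam_job() {
    bhost=$(mktemp) || return 1
    make_bhost "$1" > "$bhost"
    # -ssi boot rsh selects the rsh boot module; -v is verbose output
    lamboot -ssi boot rsh -v "$bhost" || { rm -f "$bhost"; return 1; }
    lamnodes              # should list every node in the boot schema
    mpirun C "$2"         # "C" = one process per CPU in the boot schema
    status=$?
    lamhalt               # stop the LAM daemons before the job ends
    rm -f "$bhost"
    return $status
}
```

Inside the batch script, after the #PBS directives, this would be called as
run_lam_job "$PBS_NODEFILE" ./my_mpi_prog, where ./my_mpi_prog is a
placeholder for the real program.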
FG
PS - I'll leave it like this for now, but later I'll give "tm boot" a try.