Since LAM came installed with your OS, we can't know exactly what is
wrong (e.g., it may have been packaged oddly).
Is hboot a binary executable or a script? If it's a script, is it
trying to invoke the mkdir command and failing because mkdir(1) does
not exist on the back-end node?
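
One rough way to answer the script-vs-binary question is to look at the
first few bytes of the file. The sketch below is only an illustration,
not part of LAM: classify_executable is a hypothetical helper name, and
it assumes hboot and mkdir can be found on the PATH of the account you
run it under on the back-end node.

```python
import shutil


def classify_executable(path):
    """Guess whether a file is a script or an ELF binary from its first bytes."""
    with open(path, "rb") as f:
        head = f.read(4)
    if head.startswith(b"#!"):
        return "script"   # shebang line, so an interpreter runs it
    if head == b"\x7fELF":
        return "elf"      # ELF magic number, a compiled binary
    return "unknown"


if __name__ == "__main__":
    hboot = shutil.which("hboot")
    if hboot is None:
        print("hboot is not on PATH")
    else:
        print(hboot, "->", classify_executable(hboot))
        # If hboot is a script, a missing mkdir(1) would matter.
        print("mkdir ->", shutil.which("mkdir"))
```

Run it on the back-end node itself; a result of None for mkdir would
point at the missing-command theory.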
Does /tmp exist and is it writable? What does running "hboot -d"
manually on the back-end node (interactively or via ssh) result in?
Do you get a lamd running?
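
To check the temporary directory, something like the following could be
run on the back-end node. It is a minimal sketch that mirrors the lookup
order described in the quoted text below ($TMPDIR, then /tmp); the
helper names and the "lam-probe-" prefix are made up for illustration,
and the directory name hboot actually creates is not assumed here.

```python
import os
import tempfile


def pick_tmpdir(env=None):
    """Return $TMPDIR if it is set and non-empty, otherwise /tmp."""
    env = os.environ if env is None else env
    return env.get("TMPDIR") or "/tmp"


def check_tmpdir(path):
    """Return a list of problems that would make mkdir() in path fail."""
    problems = []
    if not os.path.isdir(path):
        problems.append(f"{path} does not exist or is not a directory")
        return problems
    if not os.access(path, os.W_OK | os.X_OK):
        problems.append(f"{path} is not writable by uid {os.getuid()}")
        return problems
    try:
        # Create and remove a throwaway subdirectory, as a stand-in for hboot's mkdir().
        probe = tempfile.mkdtemp(prefix="lam-probe-", dir=path)
        os.rmdir(probe)
    except OSError as exc:
        problems.append(f"mkdir in {path} failed: {exc}")
    return problems


if __name__ == "__main__":
    tmpdir = pick_tmpdir()
    print(tmpdir, check_tmpdir(tmpdir) or "looks usable")
```

An empty list means a directory could be created there under the
current account, which would rule out the simplest explanation.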
LAM does not require any identity information beyond what you
mentioned -- a valid shell, UID, and $HOME are good enough.
Also -- is there a reason you're not using Open MPI? All of our work
is centered around Open MPI these days. LAM/MPI is in maintenance mode.
On Nov 23, 2007, at 1:06 AM, An Ching wrote:
> I tried setting $TMPDIR with export, but the "mkdir: No such
> file or directory" error still occurs.
>
> Then I used the original ssh to log in to the remote node and look at
> its environment variables. I found that $TMPDIR did not actually
> exist in this case. However, when I ran hboot in a shell (opened by
> the original ssh client), I got a different message (it seems fine
> and only says some parameters are missing) rather than "mkdir: No
> such file or directory".
>
> Any ideas on what to try next?
>
> I don't know what default configuration parameters hboot needs to
> run. The modified ssh uses a login account that does not actually
> exist on the remote node; the MPI application knows the account
> because it holds the necessary information in struct passwd.
> The struct passwd can only provide uid, gid, home directory, and
> shell path. I don't know whether hboot needs other prerequisites
> besides identity-related information.
>
> Thanks again,
>
> Ann
>
> The "No such file or directory" error means that hboot called the
> mkdir() system call and it failed. hboot tries to make a directory
> in the temporary directory (found by looking at $TMPDIR, then /tmp).
> If the directory specified in $TMPDIR doesn't exist, or /tmp doesn't
> exist, then you could get this error.
>
>
>
> _______________________________________________
> This list is archived at http://www.lam-mpi.org/MailArchives/lam/
--
Jeff Squyres
Cisco Systems