On Mon, 24 Oct 2005, Austin Leach wrote:
> LAM is booting on the nodes successfully, so that isn't a problem.
> However, I can only get the command above to work if I remove
> 'schedule=no' from the machine file. The -s option is the only way I
> have been able to get a job to run, and I'm relatively sure that if -s is
> used, then the node MUST be schedulable. My guess is that there is a
> path problem. I have read through the FAQs and haven't been able to
> figure out the problem. The directory from which I execute mpirun is
> present on all nodes, and the permissions of and inside of said
> directory are rwxrwxrwx.
>
> With the headnode set to schedule=no,
>
> When I do: "mpirun C ../../myprogram < myinfile"
> I get: mpirun: cannot start ../../myprogram on n1: No such file or
> directory
Did you try turning on debugging for recon, lamboot, etc. (the -d option)?
It has helped me out a few times already. I'm guessing you should
start the job on a node that is actually part of the list of working
nodes. Starting the jobs with a batch scheduler will do this for you.
I'm new to LAM/MPI, so I could be completely wrong...
Good luck! joost.
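
P.S. To make the setup concrete, here is a rough sketch of a boot schema
with a non-schedulable head node. The hostnames and CPU counts are made
up, and I am assuming the usual LAM 7.x boot schema syntax. With a file
like this, C in "mpirun C ..." should expand only to the compute nodes,
so ../../myprogram would presumably be looked up on n1 and n2 rather
than on the head node.

```
# Hypothetical boot schema (hostnames and cpu counts are illustrative)
# n0: head node, part of the LAM universe but excluded from C and N
head.example.com schedule=no
# n1 and n2: compute nodes that C and N expand to
node1.example.com cpu=2
node2.example.com cpu=2
```

Running recon -d and lamboot -d against the same file, and then lamnodes
once LAM is up, should show which nodes are actually in the universe.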