I have been looking into your suggestion of checking whether I am using LAM's mpirun.
When I type which mpirun I get:
/usr/bin/mpirun
It sounded like I should get a path that mentions LAM.
I did in fact load a LAM module and ran lamboot to acquire the nodes, so I am not sure what is going on.
I tried using the laminfo command, but that did not seem to work; perhaps I am not using it correctly (I tried to mimic what is on the MPITB web page).
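In case it helps, this is roughly what I ran after logging in (the module name below is only my guess at what our site calls it):

    module load lam    # or whatever the LAM module is actually named here
    which mpirun       # still prints /usr/bin/mpirun
    laminfo            # should print the LAM/MPI version and install prefix, but fails for me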
I also checked to make sure that dld=1 in octave_config_info and it is.
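For the record, I checked that from the shell with something like the line below (I believe this matches what the MPITB instructions describe, though the exact field name may differ between Octave versions):

    octave -q --eval 'disp (octave_config_info ("dld"))'   # prints 1 here, so dynamic loading of .oct files is enabled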
Perhaps I will send a message to the techs at the supercomputer I am trying to run this on. I sent them one before I posted on this list, but they do not know much about Octave. If you could give me some pointers as to what I could tell them to check, that would be very helpful.
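One concrete thing I could ask them for is LAM's install prefix, so I can bypass the PATH question and call its mpirun explicitly, along these lines (the path below is just a placeholder):

    /opt/lam/bin/mpirun -c 2 octave -q --eval Hello   # placeholder path; substitute LAM's real prefix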
Thanks,
Mike
> Date: Sun, 30 Sep 2007 12:30:35 +0200
> From: javier_at_[hidden]
> To: lam_at_[hidden]
> Subject: Re: LAM: Running Octave Programs with MPI
>
> lam-request_at_[hidden] wrote:
> >> Can you run octave without the mpirun part? This usually means that
> >> the command you are trying to run either can't properly execute on
> >> the current architecture or is not found in your path.
> >
> > Hi, Mike,
> > this is a log of how Brian's test works on my system:
> >> [javier_at_xxx Hello]$ octave -q --eval Hello
> >> [...]
> >> Help on MPI: help mpi
> >> Help(this demo): help Hello
> >> Hello, MPI_COMM_world! I'm rank 0/1 (xxx)
>
> > I can start octave without mpirun, but when I try:
> > [marin][~/octave/mpitb/Hello]> octave -q --eval Hello
> > [...]
> > Help(this demo): help Hello
> > -----------------------------------------------------------------------------
> > It seems that at least one rank invoked some MPI function before
> > invoking MPI_INIT. The only information that I can give is that it
> > was PID 27255 on host marin.
> > -----------------------------------------------------------------------------
> >
> > So, Hello just fails
> Well, proving it's Hello's fault would be rather involved, since this is
> Hello's code:
> > function Hello
> > % The classical greeting: "Hello (MPI_COMM_) world! I'm rank m/n"
> > [...]
> > info = MPI_Init;
> > [info rank] = MPI_Comm_rank (MPI_COMM_WORLD);
> > [info size] = MPI_Comm_size (MPI_COMM_WORLD);
> > [info name] = MPI_Get_processor_name;
> >
> > fprintf("Hello, MPI_COMM_world! I'm rank %d/%d (%s)\n", rank, size,
> > name);
> >
> > info = MPI_Finalize;
> >
> > [...]
> So there is not a single line of code before MPI_Init. If every comment
> were deleted, the resulting text would be:
> > function Hello
> > info = MPI_Init;
>
>
> > n0 localhost:3:origin,this_node
> That worked fine. You mentioned 3 CPUs in your single localhost node,
> and there they are.
> > The output I get when I run: mpirun -np 2 octave -q --eval Hello
> > is:
> > MPI: marin: 0x132c0000464b9eb5: octave: Command not found.
> > MPI: could not run executable (case #4)
> > Killed
> Hmpf, I tried to find the text "case #4" in mpirun sources to no avail.
> > I had to put -np instead of -c. with -c I just get:
> > MPI: bad process count
> Ouch! That's probably not LAM's mpirun, but other people's mpirun.
> > It was mentioned that maybe the architecture is not good. The system
> > I am running it on is an Itanium2 SGI Altix 4700.
> > Any ideas?
> Double-check you have indeed LAM/MPI installed. Try the "which" and
> "laminfo" commands suggested in MPITB web.
> http://atc.ugr.es/~javier/mpitb.html#Installing
> My guess is you are not using LAM's mpirun.
>
> Brian can probably provide not only a better guess on the "case #4" and
> "bad process count" error messages, but also better instructions based
> on LAM's FAQ pages. If he suggests something different, do follow his
> advice instead of mine :-)
>
> -javier
>
> _______________________________________________
> This list is archived at http://www.lam-mpi.org/MailArchives/lam/