LAM/MPI General User's Mailing List Archives

From: 460853_at_[hidden]
Date: 2006-11-22 12:17:57


Thank you all! It's already fixed. As Tim Prins pointed out, I was trying to
receive the message through a pointer to a string constant, which is not
writable. After making it a writable array, everything worked.

Instead of

char *message="Hello world";

I changed it to

char message[12];
strncpy (message, "Hello world", 12);

and that worked.

Thank you very much again for your answers! :)
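
For anyone finding this in the archive, here is a minimal sketch of why the
buffer has to be a writable array. It does not use MPI at all: fake_recv,
receive_hello and MSG_LEN are hypothetical names made up for this
illustration, and fake_recv only stands in for what a receive call does,
namely writing the incoming bytes into the caller's buffer.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* 11 characters of "Hello world" plus the terminating NUL. */
#define MSG_LEN 12

/* Hypothetical stand-in for a receive call: it writes up to `count`
 * incoming bytes into `buf`, so `buf` must point at writable storage. */
void fake_recv(char *buf, size_t count, const char *incoming, size_t incoming_len)
{
    assert(buf != NULL && incoming != NULL);
    size_t n = incoming_len < count ? incoming_len : count;
    memcpy(buf, incoming, n);
}

/* What a receiving rank would do: fill a caller-owned array.
 *
 * Declaring the buffer as
 *     char *message = "Hello world";
 * makes it point at a string literal. Writing through that pointer is
 * undefined behavior in C, and on many systems the literal lives in
 * read-only memory, so the write crashes the process. */
void receive_hello(char out[MSG_LEN])
{
    static const char wire[MSG_LEN] = "Hello world"; /* pretend network data */
    fake_recv(out, MSG_LEN, wire, sizeof wire);
}
```

In the real program the same array, declared as char message[12], is what
gets passed to MPI_Recv (message, 12, MPI_CHAR, 0, tag, MPI_COMM_WORLD,
&status). With the pointer-to-literal version, the write on rank 1 most
likely killed that process, which would explain the "process in local group
is dead" report.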

Quoting "Alastuey, Lucas" <Lucas.Alastuey_at_[hidden]>:

> How can we debug this kind of error?
>
> This message is not very descriptive:
>> MPI_Recv: process in local group is dead (rank 1,
>> MPI_COMM_WORLD)
>> Rank (1, MPI_COMM_WORLD): Call stack within LAM:
>> Rank (1, MPI_COMM_WORLD): - MPI_Recv()
>> Rank (1, MPI_COMM_WORLD): - main()
>
> GDB? Valgrind?
>
> -----Original Message-----
> From: lam-bounces_at_[hidden] [mailto:lam-bounces_at_[hidden]] On
> Behalf Of Nam Hoang
> Sent: Tuesday, November 21, 2006 11:39 p.m.
> To: lam_at_[hidden]
> Subject: Re: LAM: lam Digest, Vol 813, Issue 1
>
> Hi Hector,
> I think your program is not written correctly.
> First, you should allocate memory for message (for
> example: char message[12], or use malloc).
> Second, the initialization of message should be placed
> in the code that only node 0 (the sending node) runs:
> if (rank == 0)
> {
>     /* 11 characters plus the terminating NUL fit in message[12] */
>     strcpy(message, "Hello world");
>     for (i = 1; i < size; i++)
>     {
>         MPI_Send (message, 12, MPI_CHAR, i, tag,
>                   MPI_COMM_WORLD);
>     }
> }
>
> Hope this helps :)
> --- lam-request_at_[hidden] wrote:
>
>> From: 460853_at_[hidden]
>> To: lam_at_[hidden]
>> Date: Mon, 20 Nov 2006 18:24:24 +0100
>> Subject: Re: LAM: Unable to boot Lam in a remote machine
>>
>> Hello everyone
>>
>> First of all, thank you for answering. I'd also like to apologize for
>> not having been able to write earlier, but some family duties kept me
>> away from all this for a while.
>>
>> Next, I'd like to say that the problem I asked about in my previous
>> mail was solved by disabling the firewall, so that was certainly the
>> cause. The thing is that now I'm having another problem.
>>
>> After disabling the firewall and managing to set the environment up,
>> I looked on the Internet for a very simple program (actually, a
>> "Hello World") written with MPI:
>>
>>
>> ---------------------prueba.c ------------------
>> /* C Example */
>> #include <stdio.h>
>> #include <mpi.h>
>> #include <math.h>
>>
>>
>> void
>> main (argc, argv)
>> int argc;
>> char *argv[];
>> {
>>   char *message = "Hello world";
>>   int rank, size, i, tag, node;
>>   MPI_Status status;
>>
>>   MPI_Init (&argc, &argv);                /* starts MPI */
>>   MPI_Comm_rank (MPI_COMM_WORLD, &rank);  /* get current process id */
>>   MPI_Comm_size (MPI_COMM_WORLD, &size);  /* get number of processes */
>>   tag = 100;
>>
>>   if (rank == 0)
>>     {
>>       for (i = 1; i < size; i++)
>>         {
>>           MPI_Send (message, 12, MPI_CHAR, i, tag, MPI_COMM_WORLD);
>>         }
>>     }
>>   else
>>     {
>>       MPI_Recv (message, 12, MPI_CHAR, 0, tag, MPI_COMM_WORLD, &status);
>>     }
>>
>>   printf ("node:%d %s\n", rank, message);
>>   MPI_Finalize ();
>> }
>> --------------------------------------------
>>
>> I compile it with: mpicc -o prueba.exe prueba.c
>> (It's a Linux system, so I know the .exe suffix is unnecessary, but I
>> added it to make it easy to tell which file is the executable.)
>> Then I place a copy of that executable in a folder that is in the PATH
>> on both computers (specifically, $HOME/bin/).
>>
>> Next, I start the environment properly (well... properly, I guess):
>> ---------------------------------------------
>> hector_at_rdp13:~/Pa aprendé/Pruebas MPI> lamboot -v lamhosts
>>
>> LAM 7.1.1/MPI 2 C++/ROMIO - Indiana University
>>
>> n-1<26498> ssi:boot:base:linear: booting n0 (155.210.155.67)
>> n-1<26498> ssi:boot:base:linear: booting n1 (155.210.155.70)
>> n-1<26498> ssi:boot:base:linear: finished
>> ----------------------------------------------
>>
>> But when I try to execute with mpirun, I get the
>> following output:
>> ---------------------------------------------
>> hector_at_rdp13:~/bin> mpirun -v -np 2 prueba.exe
>> 26535 prueba.exe running on n0 (o)
>> 4861 prueba.exe running on n1
>> node:0 Hello world
>> MPI_Recv: process in local group is dead (rank 1, MPI_COMM_WORLD)
>> Rank (1, MPI_COMM_WORLD): Call stack within LAM:
>> Rank (1, MPI_COMM_WORLD): - MPI_Recv()
>> Rank (1, MPI_COMM_WORLD): - main()
>> ---------------------------------------------
>>
>> It seems that node 1 (the remote node) is not working. It says it's
>> "dead". I searched for this error message on Google and understood
>> that it means the process is not running on the remote machine. It
>> was also said that this can happen when MPI_Finalize () is executed
>> too soon. I don't think that is the case here, because this is an
>> absolutely simple program downloaded from an example web page, so I
>> guess it should work.
>>
>> I would also like to say that on the remote machine, after setting up
>> the environment with the lamboot command, "ps aux" shows (among many
>> other things) a lamd daemon running:
>>
>> -----------------------------------
>> hector_at_venus2:~/bin> ps aux
>> USER      PID %CPU %MEM   VSZ  RSS TTY  STAT START  TIME COMMAND
>> root        1  0.0  0.0   776  304 ?    S    17:24  0:00 init [5]
>> root        2  0.0  0.0     0    0 ?    SN   17:24  0:00 [ksoftirqd/0]
>> [. . .]
>> hector   3743  0.0  0.0  6484 1148 ?    S    17:26  0:00 /usr/bin/lamd -
>> -----------------------------------
>>
>> So the environment seems to be set up properly... The thing is that
>> the program doesn't execute properly.
>>
>> I imagine that the solution will be quite simple,
>> but I can't see it :(
>>
>> Thank you very much in advance!!
>> //Hector
>>
>> >> 460853_at_[hidden] wrote:
>> >>> I know there's a firewall on each machine that only opens the SSH
>> >>> (22) port, so I guess the problem comes from that. So, which ports
>> >>> do I have to open in order to boot LAM?
>> >>>
>> >>> Running lamboot with the -d option, I read (among many other
>> >>> things) this:
>> >>>
>> >>> lamd -H 155.210.155.67 -P 6459 -n 1 -o 0 -d
>> >>>
>> >>> So I guess this means that the .155.70 machine should be able to
>> >>> reach port 6459 on the .155.67 machine. Am I right? So is the
>> >>> solution to open port 6459 on the .155.67 machine? Should I also
>> >>> open this port on the .155.70 machine? Otherwise, which ports
>> >>> should I open? I don't know whether opening only these ports will
>> >>> be enough.
>> >>
>> >> All non-system (> 1024) TCP ports are needed to boot and run LAM. In
>> >> more detail - LAM does not use any specific port numbers, but instead
>> >> requests any random open port from the OS. Check out FAQs 17 and 18
>> >> here for some more info:
>> >>
>> >> http://www.lam-mpi.org/faq/category4.php3
>> >>
>> >> Hope this helps!
>> >>
>> >> Andrew
>>
> _______________________________________________
> This list is archived at http://www.lam-mpi.org/MailArchives/lam/