LAM/MPI General User's Mailing List Archives

From: James Fang (cf8e_at_[hidden])
Date: 2004-03-20 10:18:02


Hi

I have given up on using lam_spawn_sched_round_robin directly; however, I have
found that if I specify my root to be 3, and only 3, I can create the
round-robin effect, which spawns a worker on every CPU. If the root is not
specified as 3, the manager program only creates workers on its own CPU and
not on the other nodes. Could this be a hardware issue? The only thing
special about my node 3 that I can think of is that it is the node I use to
log on to the cluster.
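
To be concrete, the spawn pattern I am describing looks roughly like this
(the worker name "t.o", the "n2" value, and the maxprocs of 5 are taken from
my listing below; as far as I can tell, the info and command arguments of
MPI_Comm_spawn are only significant at the root, which is hard-coded to 3
here):

    /* Sketch of the spawn call under discussion.  The info (and
       command) arguments are significant only at the root rank,
       which may be why the choice of root seems to matter. */
    MPI_Info info;
    MPI_Comm everyone;
    int errcodes[5];

    MPI_Info_create(&info);
    MPI_Info_set(info, "lam_spawn_sched_round_robin", "n2");
    MPI_Comm_spawn("t.o", MPI_ARGV_NULL, 5, info, /* root */ 3,
                   MPI_COMM_WORLD, &everyone, errcodes);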

That, however, is not really an issue for the program I am trying to create,
as long as the round-robin effect actually takes place. The real issue is
that after spawning, every node's rank becomes zero:

Parent 0 is viscomp01.vision and its process id is 18360
Parent 2 is visdata2.vision and its process id is 4585
Parent 1 is visback.vision and its process id is 10962
Parent 3 is vision.sys.virginia.edu and its process id is 11011
Parent 0 is viscomp01.vision and its process id is 18360 after calling spawn
Parent 0 is visback.vision and its process id is 10962 after calling spawn
Parent 0 is visdata2.vision and its process id is 4585 after calling spawn
Children 4 is vision.sys.virginia.edu has 11012 as the pid
Children 3 is visdata2.vision has 4586 as the pid
Children 1 is visback.vision has 10963 as the pid
Children 2 is visback.vision has 10964 as the pid
Children 0 is viscomp01.vision has 18361 as the pid
Parent 0 is vision.sys.virginia.edu and its process id is 11011 after calling spawn

This has become very problematic, especially when I try to send the pid of
the manager to the workers, since only node 3 (named vision.sys.virginia.edu)
passes its pid to the workers. It also doesn't appear to be possible to save
the pids of the other managers ahead of time. Is there any way I can get
around this? Thank you very much for helping out. I have included my manager
code below:

#include <stdio.h>
#include <stdlib.h>
#include <mpi.h>
#include <string>
#include <iostream>
#include <sys/types.h>
#include <unistd.h>
#include <fstream>
#include <sys/times.h>
#include <time.h>
#include <sys/resource.h>

using namespace std;

int main(int argc, char *argv[])
{
    int universe_size, *universe_sizep, flag;
    MPI_Comm everyone;                 /* intercommunicator to the spawned workers */
    char worker_program[100] = "t.o";

    MPI_Info info;

    char name[MPI_MAX_PROCESSOR_NAME];
    int namelen;

    int pid = getpid();

    MPI::Init(argc, argv);

    int rank = MPI::COMM_WORLD.Get_rank();
    int world_size = MPI::COMM_WORLD.Get_size();
    MPI_Get_processor_name(name, &namelen);
    cout << "Parent " << rank << " is " << name
         << " and its process id is " << pid << endl;

    /* Ask LAM to schedule the spawned workers round-robin. */
    MPI_Info_create(&info);
    MPI_Info_set(info, "lam_spawn_sched_round_robin", "n2");

    // if (world_size != 1)
    //     cout << "Top heavy with management" << endl;

    MPI_Comm_get_attr(MPI_COMM_WORLD, MPI_UNIVERSE_SIZE,
                      &universe_sizep, &flag);

    if (!flag) {
        printf("This MPI does not support UNIVERSE_SIZE. ");
        printf("How many processes total? ");
        scanf("%d", &universe_size);
    }
    else
        universe_size = *universe_sizep;

    if (universe_size == 1)
        cout << "no room to start workers" << endl;

    /* Spawn the workers.  The last argument must be an array of at
       least maxprocs ints; I have to use a real array since I get a
       "using void*" error with MPI_ERRCODES_IGNORE. */
    int errcodes[5];

    MPI_Comm_spawn(worker_program, MPI_ARGV_NULL, 5, info, 3,
                   MPI_COMM_WORLD, &everyone, errcodes);

    pid = getpid();

    cout << "Parent " << rank << " is " << name
         << " and its process id is " << pid
         << " after calling spawn" << endl;

    /*
    MPI_Send(&pid, 1, MPI_INT, 0, 1, everyone);
    MPI_Send(&pid, 1, MPI_INT, 1, 1, everyone);
    MPI_Send(&pid, 1, MPI_INT, 2, 1, everyone);
    MPI_Send(&pid, 1, MPI_INT, 3, 1, everyone);
    MPI_Send(&pid, 1, MPI_INT, 4, 1, everyone);
    */
    // cout << "Parent " << rank << " is " << name
    //      << " and its process id is " << pid
    //      << " after sending" << endl;

    MPI::Finalize();

    return 0;
}
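
For completeness, here is a rough sketch of the worker side I have in mind
(the receive tag and the choice to receive only from manager rank 0 are
assumptions on my part, matching the commented-out sends above; it is only a
sketch, not verified code). Each worker reaches the managers through
MPI_Comm_get_parent, whose remote group contains all of the spawning
processes, not just the root:

#include <stdio.h>
#include <mpi.h>
#include <sys/types.h>
#include <unistd.h>

int main(int argc, char *argv[])
{
    MPI_Comm parent;
    MPI_Status status;
    int rank, nparents, mgr_pid;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    /* Intercommunicator back to the processes that called
       MPI_Comm_spawn; MPI_COMM_NULL if we were not spawned. */
    MPI_Comm_get_parent(&parent);
    MPI_Comm_remote_size(parent, &nparents);

    /* Receive a pid from manager rank 0 in the remote group
       (tag 1 matches the commented-out sends in the manager). */
    MPI_Recv(&mgr_pid, 1, MPI_INT, 0, 1, parent, &status);
    printf("Worker %d (pid %d) sees %d managers; got manager pid %d\n",
           rank, (int) getpid(), nparents, mgr_pid);

    MPI_Finalize();
    return 0;
}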