LAM/MPI General User's Mailing List Archives

From: Jeff Squyres (jsquyres_at_[hidden])
Date: 2003-09-22 19:54:52


Oops -- Rich -- this is my fault for not catching your mail earlier (Nihar
asked me about this the other day and I forgot that we had fixed this in
CVS).

Yes, you are correct -- this is most likely the problem of compiler/linker
flags not being passed down to the SSI modules when they are compiled. The
previous post about adding -mt was for the Solaris native compilers.
Since you're using gcc, the workaround flags are different.

You basically have 2 choices:

1. Set the CFLAGS, CXXFLAGS, FFLAGS, and/or LIBS environment variables to
proper values before running LAM's configure. I see that you're running
gcc 2.95.x on Solaris -- I'm not sure offhand what the right flags will
be. It'll be one of the following:

        a. set CFLAGS, CXXFLAGS, and FFLAGS all to "-pthread"
        b. set CFLAGS, CXXFLAGS, and FFLAGS all to "-D_REENTRANT" and LIBS
           to "-lpthread"

Make sure you "make clean all install" -- you need to recompile all of LAM
to ensure that it gets these flags properly.
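
For example, choice (b) might look something like the following. This is
only a sketch: it assumes a Bourne-style shell (sh, ksh, bash) and that you
are in the top-level directory of the LAM source tree; add whatever
configure options you normally use. For choice (a), set the three *FLAGS
variables to "-pthread" instead and leave LIBS unset.

```shell
# Choice (b): -D_REENTRANT for the compilers, -lpthread for the linker.
# Assumption: run from the top-level LAM source directory.
CFLAGS="-D_REENTRANT"
CXXFLAGS="-D_REENTRANT"
FFLAGS="-D_REENTRANT"
LIBS="-lpthread"
export CFLAGS CXXFLAGS FFLAGS LIBS

# Re-run configure so the new flags are picked up, then rebuild everything.
./configure
make clean all install
```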

2. Grab the latest 7.0.1 beta tarball from http://www.lam-mpi.org/beta/.

Hopefully, 7.0.1 will be out shortly, which will contain a fix for this
problem.

On Mon, 22 Sep 2003, Bailey, Richard T (US SSA) wrote:

> The main thread does the setup work, including creating the
> communicator. Only after this is completed does the worker thread begin
> its work, and while the worker thread is working, the main thread does
> not make any MPI calls. Therefore, even though I am using 2 threads,
> there are never any simultaneous calls of MPI functions. I also read
> section 12.4 of the user's manual, but it sheds no light on why
> MPI_Irecv becomes a blocking call if I call it from the worker thread
> instead of the main thread.
>
> I found references to another threading problem in the archives which
> has some similarities to mine, except that the end result is a seg
> fault, not the conversion of a nonblocking call to a blocking call.
> This is discussed at
> http://www.lam-mpi.org/MailArchives/lam/msg06479.php and
> http://www.lam-mpi.org/MailArchives/lam/msg06505.php ("This is a LAM
> build bug to do with threads and errno") etc. The solution was
> recompilation using the Sun compilers with a special CFLAGS setting.
> Before going to all this work, I would like to know if it is really
> going to help.
>
> I would expect LAM/MPI to work in this scenario because it is very
> common in GUI-based applications to have a worker thread do
> time-consuming work so that the main thread can keep the GUI refreshed.
> For example, it is very common to use a main and worker thread so that
> the worker thread can report its progress to a global variable and have
> the main thread read the progress value and display it in a progress
> bar.
>
>
> -----Original Message-----
> From: Nihar Sanghvi [mailto:nsanghvi_at_[hidden]]
> Sent: Thursday, September 18, 2003 7:15 PM
> To: General LAM/MPI mailing list
> Subject: Re: LAM: MPI_Irecv() blocks when called from a pthread
>
>
> Hi Richard,
> From your email it is not clear if you are trying to use MPI in
> different threads at the same time.
> Currently, LAM does not support multi-threading. The only
> multi-threaded support it provides is that it puts a lock on the MPI
> library. So, if your code is expecting to run multiple threads in MPI
> simultaneously, it won't work as you would want it to.
> Whatever your first call to the MPI library is, it will be blocking, and
> MPI_Irecv won't be able to enter the MPI library until the first call
> returns.
>
> You could find more details about this in the User Manual at
> http://www.lam-mpi.org/download/files/7.0-user.pdf in section 12.4.
>
>
> Nihar
>
>
> On Thu, 18 Sep 2003, Bailey, Richard T (US SSA) wrote:
>
> - I developed a C++ app w/ a master/slave architecture for splitting an
> - array in the master and having the slaves each process part of the
> - array. The master sets up the slaves, including the communicator, when
> - the master object is constructed. The Master->ProcessArray() member
> - splits the array and, depending on input params, either does or does not
> - create a separate thread to send and receive the subarrays from the
> - previously-established slaves. In either case, I use nonblocking
> - MPI_Irecv() and MPI_Isend() calls for master/slave communication. When I
> - do not use a separate thread, all is well. However, if I do use a
> - separate thread (so the client code can do other work while the thread
> - spawned by Master->ProcessArray() is running), then MPI_Irecv() blocks,
> - with the effect that only 1 slave ever gets work and all processing is
> - sequential. I put 1 second sleeps in the slaves and put fprintfs around
> - the code to verify that it is the MPI_Irecv() that is causing the
> - problem:
> -
> -     fprintf(
> -         stderr,
> -         "V: Starting AsyncRecvSubarrayParamsOutFromSlave %d @T %8.3f\n",
> -         TileJob->SlaveStatus->Rank,
> -         GetSecs()
> -     );
> -
> -     MPI_Irecv(
> -         & TileJob->SubarrayParamsOut,
> -         sizeof(SubarrayParamsOutType) / sizeof(int),
> -         MPI_INT,
> -         TileJob->SlaveStatus->Rank, // Source
> -         SubarrayParamsOutTag,
> -         AllComms,
> -         & TileJob->SlaveStatus->MPIRequest
> -     );
> -
> -     fprintf(
> -         stderr,
> -         "V: Done w/ AsyncRecvSubarrayParamsOutFromSlave %d @T %8.3f\n",
> -         TileJob->SlaveStatus->Rank,
> -         GetSecs()
> -     );
> -
> - Here is the output:
> -
> - V: Starting AsyncRecvSubarrayParamsOutFromSlave 1 @T 1.755
> - V: Done w/ AsyncRecvSubarrayParamsOutFromSlave 1 @T 2.763
> -
> - The same output from a non-threaded run shows that MPI_Irecv() is
> - nonblocking.
> -
> - I am running LAM v. 7.0 compiled w/ gcc 2.95.3 on a Sun workstation OS
> - 5.7. The thread is created with pthread_create(). I tried initializing
> - MPI 3 different ways with the same result:
> -
> - MPI_Init(NULL, NULL);
> -
> - MPI_Init_thread(NULL, NULL, MPI_THREAD_SINGLE, &DontCare);
> -
> - MPI_Init_thread(NULL, NULL, MPI_THREAD_SERIALIZED, &DontCare);
> -
> -
> -
> - What do I need to do to get this to work in the multithreaded mode?
> -
> -
> -
> -
> -
> - R. T. Bailey
> -
> -
>
> ---------------------------------------
> Nihar Sanghvi
> LAM-MPI Team
> Graduate Student (Indiana University)
> http://www.lam-mpi.org
> --------------------------------------
>
> _______________________________________________
> This list is archived at http://www.lam-mpi.org/MailArchives/lam/
>

-- 
{+} Jeff Squyres
{+} jsquyres_at_[hidden]
{+} http://www.lam-mpi.org/