LAM/MPI General User's Mailing List Archives

From: Jonathan Herriott (jherriott_at_[hidden])
Date: 2005-02-03 17:11:31


When I use gdb, it seems to stop on line 823 of mpirun.c. The line
reads "if (rpwait(&nodeid, &pid, &status))".

--
Jonathan Herriott
Architecture and Performance Group
Apple Computer, Inc.
On Feb 3, 2005, at 7:18 AM, Jeff Squyres wrote:
> Something sounds quite wrong here -- the lam_tv_load_type_defs() 
> function is a dummy function that is essentially a no-op, and is only 
> included so that the linker pulls in relevant symbols.  Indeed, here's 
> the code for that function:
>
> -----
> void *
> lam_tv_load_type_defs(void)
> {
>   static void *dummy[11];
>
>   /* Referencing the above variables needed for loading type
>      definitions in TotalView so that compiler does not optimize them
>      out. */
>
>   dummy[0] = &dummy_req;
>   dummy[1] = &dummy_comm;
>   dummy[2] = &dummy_group;
>   dummy[3] = &dummy_proc;
>   dummy[4] = &dummy_gps;
>   dummy[5] = &dummy_ah_desc;
>   dummy[6] = &dummy_al_desc;
>   dummy[7] = &dummy_al_head;
>   dummy[8] = &dummy_msg;
>   dummy[9] = &dummy_cid;
>   dummy[10] = &dummy_envl;
>
>   return dummy;
> }
> -----
>
> All the "dummy" variables are instantiated earlier in the file.
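The pattern in the function quoted above can be sketched in isolation. This is a minimal illustration, not LAM/MPI source: the struct types, the `demo_` names, and the two-entry array are hypothetical stand-ins, and the return type is `void **` here (rather than `void *`) only to make the result easier to inspect. The point it shows is that such a function only stores addresses and returns, so it contains nothing a thread could block on.

```c
#include <stddef.h>

/* Hypothetical stand-ins for a library's internal types (assumption:
   the real types are defined elsewhere and are far larger). */
struct demo_req  { int id; };
struct demo_comm { int rank; int size; };

/* One instance of each type, so the object file carries the type
   definitions a debugger wants to load. */
static struct demo_req  dummy_req;
static struct demo_comm dummy_comm;

#define DEMO_NUM_TYPES 2

/* Take the address of each dummy variable so the compiler cannot
   discard the otherwise unused objects.  There is no loop, lock,
   or I/O here: the function stores two pointers and returns. */
void **
demo_load_type_defs(void)
{
  static void *dummy[DEMO_NUM_TYPES];

  dummy[0] = &dummy_req;
  dummy[1] = &dummy_comm;

  return dummy;
}
```

Calling `demo_load_type_defs()` from any code that is linked into the final binary is enough to keep the dummy objects alive; the returned pointer can be ignored.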
>
> So if a thread is blocking in this function, there is something wrong 
> with the installation.  Can you attach a debugger to see where exactly 
> it is blocking?
>
>
> On Feb 2, 2005, at 3:42 PM, Jonathan Herriott wrote:
>
>> Well, you were right about it being a spinlock issue (95% of the 
>> profile) when running two threads.  The time is being spent in the 
>> function lam_tv_load_type_defs.  I'll include the Shark profile.  I 
>> also tried leaving the program running overnight on two threads; it 
>> should finish in around 430 s, but after 17 hours it was still 
>> running.  Both processors are being used, but only one thread is 
>> active, and it is being passed between the two processors.  The 
>> other thread starts up and then doesn't do anything.  There was no 
>> use in trying it with one thread, since the other thread stays 
>> inactive anyway.  On another note, which version of LAM/MPI, if 
>> any, uses the mpirun_ssh command?
>>
>> <LAM_Thr2.mshark>
>>
>> --
>> Jonathan Herriott
>> Architecture and Performance Group
>> Apple Computer, Inc.
>> 408-974-5931
>> _______________________________________________
>> This list is archived at http://www.lam-mpi.org/MailArchives/lam/
>
> -- 
> {+} Jeff Squyres
> {+} jsquyres_at_[hidden]
> {+} http://www.lam-mpi.org/
>