When I use gdb, it seems to stop on line 823 of mpirun.c. The line
reads "if (rpwait(&nodeid, &pid, &status))"
--
Jonathan Herriott
Architecture and Performance Group
Apple Computer, Inc.
On Feb 3, 2005, at 7:18 AM, Jeff Squyres wrote:
> Something sounds quite wrong here -- the lam_tv_load_type_defs()
> function is a dummy function that is essentially a no-op, and is only
> included so that the linker pulls in relevant symbols. Indeed, here's
> the code for that function:
>
> -----
> void *
> lam_tv_load_type_defs(void)
> {
>   static void *dummy[11];
>
>   /* Referencing the above variables needed for loading type
>      definitions in TotalView so that compiler does not optimize them
>      out. */
>
>   dummy[0] = &dummy_req;
>   dummy[1] = &dummy_comm;
>   dummy[2] = &dummy_group;
>   dummy[3] = &dummy_proc;
>   dummy[4] = &dummy_gps;
>   dummy[5] = &dummy_ah_desc;
>   dummy[6] = &dummy_al_desc;
>   dummy[7] = &dummy_al_head;
>   dummy[8] = &dummy_msg;
>   dummy[9] = &dummy_cid;
>   dummy[10] = &dummy_envl;
>
>   return dummy;
> }
> -----
>
> All the "dummy" variables are instantiated earlier in the file.
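
For readers unfamiliar with the pattern, the following is a minimal, self-contained sketch of the same idea. It is not the LAM/MPI source: the struct types and the demo_* names are hypothetical stand-ins, since the real declarations of the "dummy" variables are not shown in this thread. It only illustrates that a function of this shape takes addresses and returns, with no locking or I/O that could block.

```c
#include <stddef.h>

/* Hypothetical stand-ins for the library's internal types; the real
   LAM/MPI structures are not shown in this thread. */
struct demo_req  { int id; };
struct demo_comm { int rank; };

/* File-scope instances whose only purpose is to make the type
   definitions appear in the debug information. */
static struct demo_req  dummy_req;
static struct demo_comm dummy_comm;

/* Take the address of each instance so the compiler cannot discard
   them as unused. The function does no locking and no I/O, so it
   cannot block on its own. */
void *
demo_load_type_defs(void)
{
    static void *dummy[2];

    dummy[0] = &dummy_req;
    dummy[1] = &dummy_comm;

    return dummy;
}
```

Calling demo_load_type_defs() once from anywhere in the program is enough to keep the references alive; the returned pointer is normally ignored.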
>
> So if a thread is blocking in this function, there is something wrong
> with the installation. Can you attach a debugger to see where exactly
> it is blocking?
>
>
> On Feb 2, 2005, at 3:42 PM, Jonathan Herriott wrote:
>
>> Well, you were right about it being a spinlock issue (95% of the
>> profile) when running two threads. The time is being spent in the
>> function lam_tv_load_type_defs. I'll include the Shark profile. I
>> also tried leaving the program running overnight on two threads;
>> it should finish in around 430s, but after 17 hours it was still
>> running. Both processors are being used, but only one thread is
>> active and being passed between the two. The other thread starts up
>> and then doesn't do anything. There was no use in trying to do it
>> with one thread since the thread stays inactive. On another note,
>> which version of LAM/MPI, if any, uses the mpirun_ssh command?
>>
>> <LAM_Thr2.mshark>
>>
>> --
>> Jonathan Herriott
>> Architecture and Performance Group
>> Apple Computer, Inc.
>> 408-974-5931
>> _______________________________________________
>> This list is archived at http://www.lam-mpi.org/MailArchives/lam/
>
> --
> {+} Jeff Squyres
> {+} jsquyres_at_[hidden]
> {+} http://www.lam-mpi.org/