
LAM/MPI General User's Mailing List Archives


From: Brian Barrett (brbarret_at_[hidden])
Date: 2006-04-14 12:18:30


On Apr 13, 2006, at 9:40 AM, Timothy G Thompson wrote:

> I'm a developer working on a research effort using LAM/MPI with
> multi-objective genetic algorithms. I've developed an ‘asynchronous
> island’ parallelization model using one-sided communication with
> post-start-complete-wait as the synchronization mechanism.
> We’re running in a heterogeneous environment with Linux boxes
> (Red Hat 9), Sun SPARC boxes (Solaris 2.9), and Sun x86 boxes
> (Solaris 2.10). We’re using LAM 7.1.1, with the downloaded Linux
> RPM; the Sun executables were built from source.
>
> I’ve got detailed timers in place (wall; virtual, Sun only (thread
> local); and rusage (process virtual and system)). These timers show
> that the post-start-complete-wait calls on the Sun boxes have very
> poor performance (a large amount of virtual AND system time is
> consumed during these four LAM calls), whereas the Linux boxes
> show very low (good) performance overhead.
>
> I’d welcome any insight you can provide that might explain
> these differing results. Are the differences in the OS, my LAM
> build, and/or the Linux RPM install causing this? Is the LAM code
> somehow blocking on Linux and polling on Solaris?
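
For reference, the post-start-complete-wait pattern described above looks roughly like the sketch below in MPI one-sided terms. This is illustrative only (buffer sizes, ranks, and data are made up, not the poster's actual code); the point is just which of the four synchronization calls runs on which side.

```c
/* Hedged sketch of MPI post-start-complete-wait (PSCW) one-sided
 * synchronization. Rank 0 is the target (exposure epoch); rank 1 is
 * the origin (access epoch). All names here are illustrative. */
#include <mpi.h>

int main(int argc, char **argv)
{
    int rank;
    double buf[64] = {0};          /* memory exposed through the window */
    MPI_Win win;
    MPI_Group world, peer;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_group(MPI_COMM_WORLD, &world);

    MPI_Win_create(buf, sizeof(buf), sizeof(double),
                   MPI_INFO_NULL, MPI_COMM_WORLD, &win);

    if (rank == 0) {
        /* Target: expose the window to rank 1, then wait until the
         * origin's access epoch has completed. */
        int r1 = 1;
        MPI_Group_incl(world, 1, &r1, &peer);
        MPI_Win_post(peer, 0, win);
        MPI_Win_wait(win);
        MPI_Group_free(&peer);
    } else if (rank == 1) {
        /* Origin: start an access epoch on rank 0, put one value,
         * then complete the epoch. */
        int r0 = 0;
        double val = 42.0;
        MPI_Group_incl(world, 1, &r0, &peer);
        MPI_Win_start(peer, 0, win);
        MPI_Put(&val, 1, MPI_DOUBLE, 0 /* target rank */,
                0 /* target disp */, 1, MPI_DOUBLE, win);
        MPI_Win_complete(win);
        MPI_Group_free(&peer);
    }

    MPI_Win_free(&win);
    MPI_Group_free(&world);
    MPI_Finalize();
    return 0;
}
```

It is the `MPI_Win_wait` and `MPI_Win_complete` calls (the blocking ends of the two epochs) where time would accumulate if the implementation polls rather than blocks.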

That's an unusual finding. We don't do anything differently for the
post/start/complete/wait synchronization on Solaris and Linux. Both
are implemented over our point-to-point communication routines. Is
LAM/MPI built the same way on both machines? (You might want to look
at the output of laminfo to see whether the same set of components is
available.) You might also want to try using tcp instead of usysv or
sysv for the transport engine - it might help calm the Solaris boxes
down (but that's just a guess). Can you use the profiling tools
available with Solaris to see where LAM is spending all its time when
it is behaving badly?
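
Concretely, the two checks suggested above would look something like this on a stock LAM 7.1 install (the program name is made up):

```shell
# Compare the component sets of the Linux and Solaris builds:
# run this on one node of each type and diff the module lists.
laminfo

# Force the tcp RPI instead of the shared-memory (sysv/usysv) one
# via the SSI parameter mechanism; "C" means one process per CPU.
mpirun -ssi rpi tcp C ./my_ga_program
```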

Brian

-- 
   Brian Barrett
   LAM/MPI developer and all around nice guy
   Have a LAM/MPI day: http://www.lam-mpi.org/