Hello -

There is a beta of XMPI that works with LAM 7.0 - see the XMPI web
page for more information. Depending on what you are looking for, it
may give you some very useful information. XMPI is best at showing
problems with point-to-point communication - places where there are
long periods of blocking or heavy activity. Due to some limitations,
it is not particularly good at exposing problems with the collective
operations, so if you are running into performance issues with
collectives, it probably won't help.

I believe there are a number of profiling tools built on the MPI
profiling interface that may shed some light on any performance issues
- MPE from Argonne is one that comes to mind. Depending on what you
are having issues with, something as simple as running gprof on some of
the nodes may give you a useful performance data point.

I think the short answer to your question is "it depends". Without
knowing exactly what problems you are seeing and what you expected,
it's really hard to give specific suggestions. Hopefully, this
is enough to get you started.

Brian

On Wednesday, August 27, 2003, at 10:21 AM, Robin Laing wrote:
> Hello
>
> We have a small cluster running LAM/MPI 7 and I am trying to answer
> some questions from the scientist running his processes.
>
> When we received our licences for LS-DYNA, the scientist ran some
> benchmarks to test the cluster and find any hidden problems. While
> running the program, some questions came up about how different
> aspects of the cluster orientation were affecting run times, and he
> posed those questions to me.
>
> We are running Ganglia Cluster Toolkit as a general monitor, but this
> package (as far as I have had a chance to learn) won't give me the
> metrics that I feel I require, especially in small time chunks.
>
> XMPI looks promising but there is no mention of a version for LAM 7.x
> as of yet. Any word on when there will be a version? I don't want to
> subscribe to the xmpi list just for this question.
>
> Any suggestions or links would be appreciated on how I could possibly
> gather metrics on the LAM or MPI processes.
>
> --
> Robin Laing
> Instrumentation Technologist    Voice: 1.403.544.4762
> Military Engineering Section    FAX:   1.403.544.4704
> Defence R&D Canada - Suffield   Email: Robin.Laing_at_[hidden]
> PO Box 4000, Station Main       WWW:   http://www.suffield.drdc-rddc.gc.ca
> Medicine Hat, AB, T1A 8K6
> Canada
>
> _______________________________________________
> This list is archived at http://www.lam-mpi.org/MailArchives/lam/
>
--
Brian Barrett
LAM/MPI developer and all around nice guy
Have a LAM/MPI day: http://www.lam-mpi.org/