LAM/MPI General User's Mailing List Archives

From: Bogdan Costescu (bogdan.costescu_at_[hidden])
Date: 2004-06-22 03:38:09


On Mon, 21 Jun 2004, Tobias Wenzel wrote:

> A summary of the tests we made, you can find on
> http://www.tu-chemnitz.de/informatik/HomePages/RA/projects/cocgrid/viarpi/tests.php
> It is Pallas and LAMs conformance Suite with up to 8 nodes.

My question had a good reason: applications that run for hours or days
might expose bugs that don't appear when doing only synthetic tests.
It's often the less likely code paths and the non-homogeneity of the
communication pattern that can occur at these time scales that can
bite. I have already had a negative experience in this direction with
the GAMMA project (http://www.disi.unige.it/project/gamma/) the last
time I tried it (1.5-2 years ago).

Another point is testing on a larger number of nodes. There
are some projects (the above-mentioned GAMMA being one of them) that
started with the assumption that network congestion can never happen;
but as soon as the project met the real world, with packets being
dropped or delayed, they realized that a flow-control/recovery
mechanism was needed, and this complicated matters a lot and decreased
performance. Up until very close to the final 1.2 release, M-VIA also
did not support reliable delivery, which pushed the reliability side
into the upper layer; I don't know if it made it into the final
release, I got bored of waiting for it...
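
To make the cost of recovery concrete, here is a toy simulation, not
code from GAMMA or M-VIA. It assumes the simplest possible scheme
(stop-and-wait: resend each packet until it gets through) and a link
that drops each transmission independently. The function name, the
parameters and the drop rates are all made up for illustration, and it
models only retransmission, not flow control or timeouts.

```python
import random


def send_stop_and_wait(num_packets, drop_rate, seed=0):
    """Count transmissions needed to deliver num_packets over a lossy link.

    Hypothetical model: every transmission is dropped independently with
    probability drop_rate, and the sender retransmits until delivery.
    """
    if not 0.0 <= drop_rate < 1.0:
        # drop_rate == 1.0 would never deliver anything and loop forever
        raise ValueError("drop_rate must be in [0, 1)")
    rng = random.Random(seed)  # fixed seed so runs are repeatable
    transmissions = 0
    for _ in range(num_packets):
        delivered = False
        while not delivered:
            transmissions += 1
            # A draw below drop_rate means the packet was lost; the
            # sender sees no acknowledgement and sends it again.
            delivered = rng.random() >= drop_rate
    return transmissions


if __name__ == "__main__":
    for rate in (0.0, 0.01, 0.1):
        sent = send_stop_and_wait(10_000, rate)
        print(f"drop rate {rate:.2f}: {sent} transmissions for 10000 packets")
```

With no loss the count equals the number of packets; any loss adds
retransmissions on top, and a real implementation also pays for
acknowledgements, timers and buffering, which is where the extra
complexity and the performance hit come from.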

I don't want to sound like I'm bashing your project. On the contrary,
and as Brian wrote, it's very nice that a group independent of the LAM
developers created a new RPI. I was actually keeping my eyes on the
ParMa2 project (http://www.ce.unipr.it/research/parma2/via/via.html)
until they stopped updating their webpage. I will test your code
sometime later this summer, after I finish setting up our new
(still TCP/IP/Ethernet) cluster.

-- 
Bogdan Costescu
IWR - Interdisziplinaeres Zentrum fuer Wissenschaftliches Rechnen
Universitaet Heidelberg, INF 368, D-69120 Heidelberg, GERMANY
Telephone: +49 6221 54 8869, Telefax: +49 6221 54 8868
E-mail: Bogdan.Costescu_at_[hidden]