
LAM/MPI General User's Mailing List Archives


From: Brian Barrett (brbarret_at_[hidden])
Date: 2008-05-07 08:36:35


On May 7, 2008, at 1:38 AM, richard pan wrote:

> Hi, I'm studying about Parallel processing right now. And I'm kinda
> new to this world.
> When I test my cluster implementation using inverse matrix, 1 mill
> times 1 mill, the MPI_Barrier always fails with an error. Is there a
> way to remove this error? I can assure you that the source code I've
> been using doesn't have any error, since I used it to test the same
> inverse matrix but only until about 6000 times 6000.

It's kind of hard to help when you don't include the error message
that MPI_BARRIER caused. That being said, generally errors in barrier
are caused by a previous problem, such as memory corruption from
overwriting an array. You might want to use a memory debugger such as
valgrind to make sure you don't have any issues in your code. Just
because something works at one matrix size does not mean that it's
correct -- we've seen many cases where one matrix size works and
another doesn't, simply because of what was placed directly after the
array, depending on the whims of the compiler / allocator.
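
As a rough sketch of the kind of bug I mean (the helper names
matrix_alloc and matrix_set are made up for illustration and are not
part of MPI or LAM/MPI, and I'm assuming a square matrix of doubles
stored row-major in one block), a checked allocator and a checked store
make an out-of-range write fail loudly instead of silently overwriting
whatever sits after the array:

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical helper: allocate a zero-filled n x n matrix of doubles.
 * Returns NULL if n * n would overflow size_t or if the allocation
 * itself fails, so the caller can stop before touching bad memory. */
double *matrix_alloc(size_t n)
{
    if (n != 0 && n > SIZE_MAX / n) {
        return NULL;                    /* n * n does not fit in size_t */
    }
    /* calloc checks the count * sizeof(double) multiplication itself. */
    return calloc(n * n, sizeof(double));
}

/* Hypothetical helper: bounds-checked store into a row-major matrix.
 * Returns 0 on success, -1 if the matrix is NULL or an index is out
 * of range (for example, an off-by-one loop bound). */
int matrix_set(double *a, size_t n, size_t row, size_t col, double value)
{
    if (a == NULL || row >= n || col >= n) {
        return -1;
    }
    a[row * n + col] = value;
    return 0;
}
```

With an unchecked a[row * n + col] = value, a write at row == n lands
past the end of the block and may or may not crash, depending on what
happens to be there. Building with -g and running each rank under
valgrind, as suggested above, is the way to catch the cases that a
check like this does not cover.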

Brian

-- 
   Brian Barrett
   LAM/MPI Developer
   Make today a LAM/MPI day!