On May 7, 2008, at 1:38 AM, richard pan wrote:
> Hi, I'm studying about Parallel processing right now. And I'm kinda
> new to this world.
> When I test my cluster implementation using inverse matrix, 1 mill
> times 1 mill, the MPI_Barrier always error. Is there a way to remove
> this error? I can assure you that the source code I've been using
> dont have any error, since I used it to test the same inverse matrix
> but only until about 6000 times 6000.
It's kind of hard to help when you don't include the error message
that MPI_BARRIER caused. That being said, generally errors in barrier
are caused by a previous problem, such as memory corruption from
overwriting an array. You might want to use a memory debugger such as
valgrind to make sure you don't have any issues in your code. Just
because something works at one matrix size does not mean that it's
correct -- we've seen many cases where one matrix size works and
another doesn't, simply because of what happened to be placed directly
after the array, depending on the whims of the compiler / allocator.
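
As one illustration of how a latent bug can stay hidden at 6000 x 6000
and only surface at a larger size, here is a minimal sketch of a checked
allocation. It assumes the matrix is a dense array of doubles, which the
original post does not say, and matrix_bytes / alloc_matrix are
hypothetical helper names, not part of any MPI implementation. For
scale: a dense 1,000,000 x 1,000,000 matrix of 8-byte doubles needs
8 * 10^12 bytes (about 8 TB), so an unchecked malloc is one plausible
place for things to go wrong long before the barrier is reached.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper (not from the original post): compute the number
 * of bytes needed for a dense n x n matrix of doubles.
 * Returns 1 and stores the size in *out on success, or returns 0 if the
 * computation would overflow size_t. */
static int matrix_bytes(size_t n, size_t *out)
{
    assert(out != NULL);

    /* n * n must fit in size_t */
    if (n != 0 && n > SIZE_MAX / n)
        return 0;
    size_t elems = n * n;

    /* elems * sizeof(double) must fit in size_t as well */
    if (elems > SIZE_MAX / sizeof(double))
        return 0;

    *out = elems * sizeof(double);
    return 1;
}

/* Hypothetical helper: allocate an n x n matrix of doubles.
 * Returns NULL (after printing a diagnostic) instead of handing back a
 * pointer that the caller would later write past. */
static double *alloc_matrix(size_t n)
{
    size_t bytes;

    if (!matrix_bytes(n, &bytes)) {
        fprintf(stderr, "size of %zu x %zu matrix overflows size_t\n", n, n);
        return NULL;
    }

    double *m = malloc(bytes);
    if (m == NULL)
        fprintf(stderr, "could not allocate %zu bytes for %zu x %zu matrix\n",
                bytes, n, n);
    return m;
}
```

A caller would check the result of alloc_matrix(n) on every rank and
abort cleanly if it is NULL, rather than continuing into the
computation. This only covers the allocation step; out-of-bounds
indexing elsewhere in the code is still something a memory debugger
such as valgrind is better placed to find.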
Brian
--
Brian Barrett
LAM/MPI Developer
Make today a LAM/MPI day!