I think Neil replied about this already...?

You aren't doing the dynamic allocation properly. In the second code
below, C is never allocated (there is no C = new int **[thk] before you
write to C[i]), so the first store through C segfaults. The dimensions
are also mismatched: you allocate A and C as [thk][ln][br] but index
them as A[k][i][j] with i < br and j < ln. And because every row comes
from its own new, the data is not one contiguous block, so
MPI_Send(&A[1][0][0], 20, ...) cannot send a whole plane the way it
does with the static array.

Static allocation is fine as long as your sizes don't get too big. If
they get too big, you'll run out of stack space for arrays that large
(e.g., you'll see symptoms like seg faulting when you run your program,
but it doesn't even get to main(), or perhaps it seg faults as soon as
you call the function with the massive arrays on the stack, etc.).
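
To show what I mean by doing the dynamic allocation "properly", here is
a rough, *untested* sketch of the one-big-allocation approach I
described in my last mail, using the same 3 x 4 x 5 sizes as your test
code (alloc3d / free3d are just names I made up for illustration). All
of the ints live in a single new[], and the pointer tables are built on
top of that block, so A[k][i][j] indexing still works and a whole
br x ln plane is contiguous:

#include <mpi.h>

// Allocate a thk x br x ln array of ints as ONE contiguous block,
// with pointer tables on top so A[k][i][j] indexing still works.
int ***alloc3d(int thk, int br, int ln)
{
    int *data  = new int[thk * br * ln];   // all the actual ints, contiguous
    int **rows = new int *[thk * br];      // row-pointer table
    int ***A   = new int **[thk];          // plane-pointer table
    for (int k = 0; k < thk; k++) {
        A[k] = &rows[k * br];
        for (int i = 0; i < br; i++)
            A[k][i] = &data[(k * br + i) * ln];
    }
    return A;
}

void free3d(int ***A)
{
    delete [] &A[0][0][0];   // the contiguous data block
    delete [] A[0];          // the row-pointer table
    delete [] A;             // the plane-pointer table
}

int main(int argc, char **argv)
{
    int myrank, thk = 3, br = 4, ln = 5;
    MPI_Status status;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &myrank);

    int ***A = alloc3d(thk, br, ln);
    for (int k = 0; k < thk; k++)
        for (int i = 0; i < br; i++)
            for (int j = 0; j < ln; j++)
                A[k][i][j] = myrank * 100 + k;

    // One whole br x ln plane is contiguous, so a count of br*ln works.
    if (myrank == 1)
        MPI_Send(&A[1][0][0], br * ln, MPI_INT, 0, 99, MPI_COMM_WORLD);
    if (myrank == 0)
        MPI_Recv(&A[2][0][0], br * ln, MPI_INT, 1, 99, MPI_COMM_WORLD,
                 &status);

    free3d(A);
    MPI_Finalize();
    return 0;
}

Run it with at least 2 processes. Because the block is contiguous,
MPI_Send(&A[k][0][0], br*ln, MPI_INT, ...) really does ship one whole
plane, just like in your static-array version.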
On Mar 10, 2005, at 1:45 PM, Kumar, Ravi Ranjan wrote:
> Hello Jeff,
>
> Thanks a lot for your reply!
>
> I wrote a test code to see how the data is stored. The static
> allocation of the 3D array works fine, but when I try to allocate the
> arrays dynamically, I get an error. Please see the code and the
> results.
>
> #include<iostream.h>
> #include<math.h>
> #include<fstream.h>
> #include<time.h>
> #include<iomanip.h>
> #include<mpi.h>
> #include<stdio.h>
> #include<stdlib.h>
>
> int main( int argc, char **argv )
> {
> int myrank, count = 0;
> int A[3][4][5], B[60],C[3][4][5],i,j,k,m;
> MPI_Status status;
> MPI_Init( &argc, &argv );
> MPI_Comm_rank( MPI_COMM_WORLD, &myrank );
>
>
> for(k=0,m=0;k<3;k++)
> for(i=0;i<4;i++)
> for(j=0;j<5;j++,m++)
> B[m] = C[k][i][j] = A[k][i][j] = 0;
>
>
> if(myrank == 1)
> for(k=0;k<3;k++)
> {
> for(i=0;i<4;i++)
> for(j=0;j<5;j++)
> {
> A[k][i][j] = count++;
> cout<<A[k][i][j]<<" ";
> }
> cout<<endl;
> }
>
>
> if (myrank == 1) /* code for process 1 */
> MPI_Send(&A[1][0][0], 20, MPI_INT, 0, 99, MPI_COMM_WORLD);
>
> if (myrank == 0) /* code for process 0 */
> MPI_Recv(&B[0], 20, MPI_INT, 1, 99, MPI_COMM_WORLD, &status);
>
> if (myrank == 1) /* code for process 1 */
> MPI_Send(&A[2][0][0], 20, MPI_INT, 2, 99, MPI_COMM_WORLD);
>
> if (myrank == 2) /* code for process 2 */
> MPI_Recv(&C[1][0][0], 20, MPI_INT, 1, 99, MPI_COMM_WORLD, &status);
>
>
> 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19
> 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39
> 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59
>
> Rank = 1
> Rank = 0
>
> A[1][i][j] = 20 A[1][i][j] = 21 A[1][i][j] = 22 A[1][i][j] = 23 A[1][i][j] = 24
> A[1][i][j] = 25 A[1][i][j] = 26 A[1][i][j] = 27 A[1][i][j] = 28 A[1][i][j] = 29
> A[1][i][j] = 30 A[1][i][j] = 31 A[1][i][j] = 32 A[1][i][j] = 33 A[1][i][j] = 34
> A[1][i][j] = 35 A[1][i][j] = 36 A[1][i][j] = 37 A[1][i][j] = 38 A[1][i][j] = 39
>
> B[0] = 20
> B[1] = 21
> B[2] = 22
> B[3] = 23
> B[4] = 24
> B[5] = 25
> B[6] = 26
> B[7] = 27
> B[8] = 28
> B[9] = 29
> B[10] = 30
> B[11] = 31
> B[12] = 32
> B[13] = 33
> B[14] = 34
> B[15] = 35
> B[16] = 36
> B[17] = 37
> B[18] = 38
> B[19] = 39
> Rank = 2
> B[0] = 0 C[1][0][0] = 40
> B[1] = 0 C[1][0][1] = 41
> B[2] = 0 C[1][0][2] = 42
> B[3] = 0 C[1][0][3] = 43
> B[4] = 0 C[1][0][4] = 44
> B[5] = 0 C[1][1][0] = 45
> B[6] = 0 C[1][1][1] = 46
> B[7] = 0 C[1][1][2] = 47
> B[8] = 0 C[1][1][3] = 48
> B[9] = 0 C[1][1][4] = 49
> B[10] = 0 C[1][2][0] = 50
> B[11] = 0 C[1][2][1] = 51
> B[12] = 0 C[1][2][2] = 52
> B[13] = 0 C[1][2][3] = 53
> B[14] = 0 C[1][2][4] = 54
> B[15] = 0 C[1][3][0] = 55
> B[16] = 0 C[1][3][1] = 56
> B[17] = 0 C[1][3][2] = 57
> B[18] = 0 C[1][3][3] = 58
> B[19] = 0 C[1][3][4] = 59
>
> The other code uses dynamic allocation:
>
> #include<iostream.h>
> #include<math.h>
> #include<fstream.h>
> #include<time.h>
> #include<iomanip.h>
> #include<mpi.h>
> #include<stdio.h>
> #include<stdlib.h>
>
> int main( int argc, char **argv )
> {
> int myrank, count = 0,ln,br,thk;
> int ***A, *B,***C,i,j,k,m;
> MPI_Status status;
> MPI_Init( &argc, &argv );
> MPI_Comm_rank( MPI_COMM_WORLD, &myrank );
>
>
> ln = 5;
> br = 4;
> thk = 3;
>
> B = new int[ln*br];
>
> A = new int **[thk];
>
> for(i=0;i<thk;i++)
> {
> A[i] = new int *[ln];
> C[i] = new int *[ln];
> for(j=0;j<ln;j++){
> A[i][j] = new int [br];
> C[i][j] = new int [br];
> }
> }
>
> //if(myrank == 1)
> for(k=0,m=0;k<thk;k++)
> for(i=0;i<br;i++)
> for(j=0;j<ln;j++,m++)
> B[m] = C[k][i][j] = A[k][i][j] = 0;
>
>
> if(myrank == 1)
> for(k=0;k<thk;k++)
> {
> for(i=0;i<br;i++)
> for(j=0;j<ln;j++)
> {
> A[k][i][j] = count++;
> cout<<A[k][i][j]<<" ";
> }
> cout<<endl;
> }
>
>
> if (myrank == 1) /* code for process 1 */
> MPI_Send(&A[1][0][0], 20, MPI_INT, 0, 99, MPI_COMM_WORLD);
>
> if (myrank == 0) /* code for process 0 */
> MPI_Recv(&B[0], 20, MPI_INT, 1, 99, MPI_COMM_WORLD, &status);
>
> if (myrank == 1) /* code for process 1 */
> MPI_Send(&A[2][0][0], 20, MPI_INT, 2, 99, MPI_COMM_WORLD);
>
> if (myrank == 2) /* code for process 2 */
> MPI_Recv(&C[1][0][0], 20, MPI_INT, 1, 99, MPI_COMM_WORLD, &status);
>
> if(myrank == 2)
> {
> cout<<"Rank = "<<myrank<<endl;
> for(i=0,m=0;i<br;i++)
> for(j=0;j<ln;j++,m++)
> cout<<"B["<<m<<"] = "<<B[m]<<" C[1]["<<i<<"]["<<j<<"] =
> "<<C[1][i][j]
> <<endl;
> }
>
>
> if(myrank == 0)
> {
> cout<<"Rank = "<<myrank<<endl;
> for(i=0;i<ln*br;i++)
> cout<<"B["<<i<<"] = "<<B[i]<<endl;
> }
>
> if(myrank == 1)
> {
> cout<<"Rank = "<<myrank<<endl;
> for(i=0;i<br;i++)
> for(j=0;j<ln;j++)
> cout<<"A[1][i][j] = "<<A[1][i][j]<<" ";
> }
>
> for(i=0;i<thk;i++)
> {
> for(j=0;j<ln;j++){
> delete [] A[i][j];
> delete [] C[i][j];
> }
> delete [] A[i];
> delete [] C[i];
> }
>
> delete [] A;
> delete [] B;
> delete [] C;
>
> MPI_Finalize();
> return 0;
> }
>
> [rrkuma0_at_kfc1s1 SOR]$ mpirun -np 3 Diff3D_DynamicAllocation_SendRecv
> MPI process rank 0 (n0, p11384) caught a SIGSEGV.
> MPI process rank 1 (n0, p11385) caught a SIGSEGV.
> MPI process rank 2 (n0, p11386) caught a SIGSEGV.
> -----------------------------------------------------------------------------
>
> One of the processes started by mpirun has exited with a nonzero exit
> code. This typically indicates that the process finished in error.
> If your process did not finish in error, be sure to include a "return
> 0" or "exit(0)" in your C code before exiting the application.
>
> PID 11384 failed on node n0 with exit status 1.
> -----------------------------------------------------------------------------
>
>
> Kindly clarify why dynamic allocation is not working. Thanks again!
>
> Ravi R. Kumar
>
>
> Quoting Jeff Squyres <jsquyres_at_[hidden]>:
>
>> On Mar 9, 2005, at 12:41 PM, Kumar, Ravi Ranjan wrote:
>>
>>> Thank you for the reply! I'll apply non-blocking send-recv to check
>>> if the problem still exists.
>>>
>>> I have another question on sending contiguous data. I want to know
>>> which index is traversed first among the 3 indices (z, y & x) in
>>> T[Nz][Nx][Ny]. I wrote my code in C++ and MPI. Suppose I want to
>>> send Nx*Ny data items, so I simply use T[2][0][0] (if I want to
>>> send the 3rd plane out of Nz planes of data), and I receive the
>>> data on another processor at location (say) T[3][0][0]. I have a
>>> doubt about the x-y coordinates of the data. Will it be received in
>>> the same fashion as it was sent? Do I need to check which index
>>> (x or y) is traversed first?
>>
>> Before answering that, you need to check how you allocated this
>> memory. Generally, C arrays are contiguous starting at the lowest
>> level (e.g., foo[a][b][c] will be adjacent to foo[a][b][c+1]). But
>> whether the last element of one row in the 3rd dimension is adjacent
>> to the first element of the next row (e.g., whether foo[a][b][max]
>> is adjacent to foo[a][b+1][0]) depends on how you allocated it.
>> Similarly for traversing up to the 2nd dimension.
>>
>> Did you allocate the data in T with a single malloc, for example
>> (malloc(sizeof(double) * Nz * Nx * Ny))? And then set up pointers
>> for T[][]? Or did you malloc each row and/or plane separately?
>>
>> In your case, it is *probably* best to use a single malloc and get
>> the pointer arrays to match, because then (viewing the data as
>> T[a][b][c]), you can send contiguous c rows and bXc planes (which is
>> what I'm assuming you want). You can also set up datatypes to do
>> more interesting behavior if you need it (e.g., aXc planes). But you
>> really can only do these datatypes if you use the one-big-malloc
>> approach; you need to be able to guarantee repetitive absolute
>> differences in addresses (which you cannot if, for example, you
>> malloc each row separately).
>>
>> BTW, I say *probably* because I don't know your exact application
>> and hardware and whatnot. Your mileage may vary depending on your
>> specific setup.
>>
>> Note that I'm not entirely sure how
>>
>> double ***foo = new[a][b][c]
>>
>> lays out the memory. I *think* it's just like the one-big-malloc
>> approach, but I'm not 100% sure of that...
>>
>> All that being said, as long as you alloc the memory the same way in
>> both the sender and the receiver, what you send is what you'll
>> receive.
>> So if you use the one-big-malloc approach on both sides and send b*c
>> doubles starting at &T[a1][0][0], and receive b*c doubles starting at
>> &T[a2][0][0], you'll be ok.
>>
>> Make sense?
>>
>> --
>> {+} Jeff Squyres
>> {+} jsquyres_at_[hidden]
>> {+} http://www.lam-mpi.org/
>>
>>
>>
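
One more follow-up on the datatype comment in my quoted mail above: if
you use the one-big-malloc layout, one way to describe an aXc plane
(fixed middle index) is MPI_Type_vector. Again a rough, *untested*
sketch with 3 x 4 x 5 sizes; the variable names are just for
illustration:

#include <mpi.h>

int main(int argc, char **argv)
{
    int myrank;
    const int A = 3, B = 4, C = 5;   // dimensions, as in T[A][B][C]
    MPI_Status status;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &myrank);

    // One contiguous block; element [k][i][j] lives at data[(k*B + i)*C + j].
    int *data = new int[A * B * C];
    for (int n = 0; n < A * B * C; n++)
        data[n] = myrank * 1000 + n;

    // An A x C plane at fixed middle index i = b0 is A blocks of C
    // contiguous ints, spaced B*C ints apart -- i.e., an MPI_Type_vector.
    MPI_Datatype plane;
    MPI_Type_vector(A, C, B * C, MPI_INT, &plane);
    MPI_Type_commit(&plane);

    int b0 = 2;                      // which middle-index plane to ship
    if (myrank == 1)
        MPI_Send(&data[b0 * C], 1, plane, 0, 99, MPI_COMM_WORLD);
    if (myrank == 0)
        MPI_Recv(&data[b0 * C], 1, plane, 1, 99, MPI_COMM_WORLD, &status);

    MPI_Type_free(&plane);
    delete [] data;
    MPI_Finalize();
    return 0;
}

The contiguous bXc planes don't need this -- a plain count of B*C ints
is enough; the vector type is only for the strided case.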
--
{+} Jeff Squyres
{+} jsquyres_at_[hidden]
{+} http://www.lam-mpi.org/