LAM boots successfully on node 0, which has 2 processors.
The code works when it is run on a single processor.
When it is run on 2 processors on node 0, an error occurs, although the same
run worked before.
According to the error message, the problem may be system-related.
How can I fix this problem?
Error message:
[cycchou_at_matrx demos]$ make test
lamboot -v hostfile
LAM 6.5.4/MPI 2 C++/ROMIO - University of Notre Dame
Executing hboot on n0 (matrx.engr.ucdavis.edu - 2 CPUs)...
topology done
mpirun c0-1 calpi
-----------------------------------------------------------------------------
It seems that some error has occurred during MPI_INIT. This will
cause your process to abort. These kinds of errors are usually
system-related, such as running out of disk space, running out of
memory, or something more serious such as data not being passed
between processes properly. That is, you should not be seeing this
error message; if you are, somethings is likely Very Wrong with your
system. :-(
Perhaps this Unix error message will help:
Unix errno: 1252
Unknown error 1252
-----------------------------------------------------------------------------
-----------------------------------------------------------------------------
One of the processes started by mpirun has exited with a nonzero exit
code. This typically indicates that the process finished in error.
If your process did not finish in error, be sure to include a "return
0" or "exit(0)" in your C code before exiting the application.
PID 20156 failed on node n0 with exit status 228.
-----------------------------------------------------------------------------
bufferd (getroute): invalid node
make: *** [test] Broken pipe
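
As a quick check on the "Unix errno: 1252" line above, here is a minimal sketch that looks a value up in the operating system's own errno table. It assumes Python is available on the node. The helper name `describe_errno` is hypothetical and is not part of LAM or of the demo code.

```python
import errno
import os


def describe_errno(code):
    """Return (symbolic_name_or_None, message) for an errno value.

    Hypothetical helper for illustration only; not part of LAM.
    """
    # errno.errorcode maps the values this OS defines to names such as 'EPERM'.
    name = errno.errorcode.get(code)  # None if the OS does not define the value
    try:
        message = os.strerror(code)   # wording is platform-dependent
    except ValueError:
        # Some platforms raise instead of returning a generic string.
        message = None
    return name, message


if __name__ == "__main__":
    # 1 and 32 are ordinary system errnos; 1252 is the value from the log.
    for code in (1, 32, 1252):
        name, message = describe_errno(code)
        print(code, name, message)
```

If the name for 1252 comes back as `None`, the value is not one the operating system defines, which would be consistent with the "Unknown error 1252" text in the log. Whether the value is then set by LAM itself is an inference that the message does not confirm.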