LAM/MPI General User's Mailing List Archives

From: zkis_at_[hidden]
Date: 2006-01-10 12:59:22


Hi,

I have been fighting with a problem for a while and have not found a
solution so far. The problem is that my program regularly exits with the
error message pasted at the end of this message. My system is fairly new:
it consists of machines with AMD Athlon K7 and Intel Xeon processors,
connected by 100 Mbit Ethernet, running Debian Linux with kernel 2.6.10-14.
I have lam-7.1.1 installed from a Debian package. Besides the LAM/MPI
libraries, my program also uses parallel HDF5, installed from the
libhdf5-lam-1.6.2-0 package plus the necessary header files. The strange
thing is that the program sometimes ends correctly, but most of the time it
exits with an error. I have tested the connections between the machines and
found no problem. There are no error messages in the log files either. No
other application complains, only my MPI programs. The program itself seems
correct: under MPICH no such error occurred.

I would very much appreciate any suggestions.

Best wishes,

Zsolt Kis

PS: Sorry for double posting!!

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

zsolt_at_sas:/bird/pool/zsolt$ mpirun -np 15 ppu alma
File locking failed in ADIOI_Set_lock. If the file system is NFS, you need
to use NFS version 3 and mount the directory with the 'noac' option (no
attribute caching).
File locking failed in ADIOI_Set_lock. If the file system is NFS, you need
to use NFS version 3 and mount the directory with the 'noac' option (no
attribute caching).
-----------------------------------------------------------------------------
One of the processes started by mpirun has exited with a nonzero exit
code. This typically indicates that the process finished in error.
If your process did not finish in error, be sure to include a "return
0" or "exit(0)" in your C code before exiting the application.

PID 23175 failed on node n3 (192.168.1.33) with exit status 1.
-----------------------------------------------------------------------------
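
The message above points at file locking on the file system rather than at
the network. One way to narrow this down outside of MPI might be a small
check like the sketch below, run on each node against the shared directory.
It assumes that the MPI-IO layer takes fcntl-style byte-range locks (which
is what the ADIOI_Set_lock message suggests); the helper name can_lock is
made up for illustration and is not part of LAM, ROMIO or HDF5.

```python
import fcntl
import os
import sys
import tempfile


def can_lock(directory):
    """Return True if an exclusive fcntl-style lock can be taken on a
    scratch file created in `directory`, False if the lock call fails.

    Hypothetical helper for illustration only.
    """
    fd, path = tempfile.mkstemp(dir=directory)
    try:
        try:
            # Non-blocking exclusive lock on the whole scratch file,
            # released again immediately.
            fcntl.lockf(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
            fcntl.lockf(fd, fcntl.LOCK_UN)
            return True
        except OSError:
            # The file system (or its lock daemon) refused the lock.
            return False
    finally:
        os.close(fd)
        os.unlink(path)


if __name__ == "__main__":
    # Directory to test; defaults to the current directory.
    target = sys.argv[1] if len(sys.argv) > 1 else "."
    print(target, "locking ok" if can_lock(target) else "locking FAILED")
```

If this reports a failure on some nodes only (for example n3 in the log
above), the NFS mount options or lock services on those nodes would be the
first thing to compare. A success here does not prove that the MPI-IO
locking will work, since this only tests a single uncontended lock.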