Hi,
We are on a 50-node G5 (Mac OS X 10.4.6) cluster and had been using LAM/MPI b30
for a while. We recently tried b34 (both built from source) and still
get the same errors in lamtests, only for tcp.
It passes most tests but then starts failing with:
PASS: client_server
mpirun -x TEST -ssi cr none -s h C -ssi rpi crtcp
/Users/bjm/lamtests-7.1.2b34/dynamic/./comm_join
[**ERROR**]: LAM/MPI MPI_COMM_WORLD rank 1, file comm_join.c:143:
ERROR: Client could not connect() properly
-----------------------------------------------------------------------------
One of the processes started by mpirun has exited with a nonzero exit
code. This typically indicates that the process finished in error.
If your process did not finish in error, be sure to include a "return
0" or "exit(0)" in your C code before exiting the application.
Are these failures fixable, or should we ignore them?
Most user programs work fine with LAM/MPI b30 or b34.
[mac27:~/lamtests-7.1.2b34] bjm% laminfo
LAM/MPI: 7.1.2b34
Prefix: /sw/lammpi-xlf.34
Architecture: powerpc-apple-darwin8.6.0
Configured by: bjm
Configured on: Tue May 16 15:08:20 MDT 2006
Configure host: mac27.cdc.noaa.gov
Memory manager: darwin7malloc
C bindings: yes
C++ bindings: yes
Fortran bindings: yes
C compiler: gcc-3.3
C++ compiler: g++-3.3
Fortran compiler: /opt/ibmcmp/xlf/8.1/bin/f77
Fortran symbols: plain
C profiling: yes
C++ profiling: yes
Fortran profiling: yes
C++ exceptions: no
Thread support: yes
ROMIO support: yes
IMPI support: no
Debug support: no
Purify clean: no
SSI boot: globus (API v1.1, Module v0.6)
SSI boot: rsh (API v1.1, Module v1.1)
SSI boot: slurm (API v1.1, Module v1.0)
SSI coll: lam_basic (API v1.1, Module v7.1)
SSI coll: shmem (API v1.1, Module v1.0)
SSI coll: smp (API v1.1, Module v1.2)
SSI rpi: crtcp (API v1.1, Module v1.1)
SSI rpi: lamd (API v1.0, Module v7.1)
SSI rpi: sysv (API v1.0, Module v7.1)
SSI rpi: tcp (API v1.0, Module v7.1)
SSI rpi: usysv (API v1.0, Module v7.1)
SSI cr: self (API v1.0, Module v1.0)
[mac27:~/lamtests-7.1.2b34] bjm% .
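
Since the failing message comes from a connect() call in comm_join.c, one way to narrow this down might be to check a plain TCP connect outside of LAM. Below is a minimal sketch in Python. It is not part of lamtests, the function name check_tcp_connect is made up for illustration, and it only tests the loopback address by default. Which address the comm_join test actually uses is not known here, so passing a node hostname instead is an assumption.

```python
import socket
import threading


def check_tcp_connect(host="127.0.0.1", timeout=5.0):
    """Listen on an ephemeral port on `host`, connect to it, echo one message.

    Returns True if connect() and a round trip succeed, False otherwise.
    Hypothetical helper for diagnosis only; unrelated to LAM/MPI itself.
    """
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server.settimeout(timeout)
    try:
        try:
            server.bind((host, 0))  # port 0: let the OS pick a free port
            server.listen(1)
        except OSError:
            return False
        port = server.getsockname()[1]

        def accept_and_echo():
            try:
                conn, _ = server.accept()
                with conn:
                    conn.settimeout(timeout)
                    conn.sendall(conn.recv(16))
            except OSError:
                pass  # the client side reports the failure

        worker = threading.Thread(target=accept_and_echo)
        worker.start()
        try:
            with socket.create_connection((host, port), timeout=timeout) as client:
                client.sendall(b"ping")
                ok = client.recv(16) == b"ping"
        except OSError:
            ok = False
        worker.join()
        return ok
    finally:
        server.close()


if __name__ == "__main__":
    print("loopback connect ok:", check_tcp_connect())
```

If this succeeds on the node where comm_join fails, the problem is more likely in how the test picks its address or port than in basic TCP connectivity, though that is only a guess.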