
LAM/MPI General User's Mailing List Archives


From: Brian Barrett (brbarret_at_[hidden])
Date: 2006-04-07 10:01:28


On Apr 7, 2006, at 9:55 AM, Peeyush Jain wrote:

> I am running ARMCI with lammpi-7.1.1 on two processors. When I run
> test.x on one processor, it works fine, but when I run it on more
> than two computers, I get the following error:
>
> mpirun -np 2 -v ./test.x
> 7605 ./test.x running on n0 (o)
> 3168 ./test.x running on n1
> ARMCI configured for 2 cluster nodes. Network protocol is 'TCP/IP Sockets'.
> 1:trying connect to host=peeyush, port=37912 t=5 111
> trying to connect:: Connection refused
> 1:armci_CreateSocketAndConnect: connect failed: -1
> Last System Error Message from Task 1:: Connection refused
> 1:armci_CreateSocketAndConnect: connect failed: -1
> 0:trying connect to host=localhost, port=32911 t=5 111
> trying to connect:: Connection refused
> 0:armci_CreateSocketAndConnect: connect failed: -1
> Last System Error Message from Task 0:: Connection refused
> 0:armci_CreateSocketAndConnect: connect failed: -1
> -10001(s):armci_AcceptSockAll:timeout waiting for connection: 0
> -10001(s):armci_AcceptSockAll:timeout waiting for connection: 0
> -10000(s):armci_AcceptSockAll:timeout waiting for connection: 0
> -10000(s):armci_AcceptSockAll:timeout waiting for connection: 0
> -----------------------------------------------------------------------------
> One of the processes started by mpirun has exited with a nonzero exit
> code. This typically indicates that the process finished in error.
> If your process did not finish in error, be sure to include a "return
> 0" or "exit(0)" in your C code before exiting the application.
>
> PID 7605 failed on node n0 (172.26.117.167) with exit status 1.
> -----------------------------------------------------------------------------
>
> I have configured my LAM/MPI with the ifort and gcc compilers, and ARMCI
> with mpif77 and mpicc. Can anyone please tell me what the problem is
> with my TCP/IP connection?

I'm afraid we can't help you here. All of the errors (with the exception
of the one reporting that your processes died unexpectedly) come from
ARMCI, not from LAM/MPI. You should probably ask the ARMCI developers
through their support channels.
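
This is not an ARMCI diagnosis, but one thing that can be ruled out
independently: the log shows task 0 trying to connect to host=localhost
even though the job spans n0 and n1. If ARMCI advertises each node by
its hostname (an assumption; the log does not say how the name is
chosen), a name that resolves only to a loopback address would be
unreachable from the other node. A minimal Python sketch for checking
how a name resolves on each node follows. The function names are made
up for illustration and are not part of LAM/MPI or ARMCI.

```python
import ipaddress
import socket


def resolved_addresses(host):
    """Return the sorted, de-duplicated IP addresses that `host` resolves to."""
    infos = socket.getaddrinfo(host, None, proto=socket.IPPROTO_TCP)
    return sorted({info[4][0] for info in infos})


def loopback_only(addresses):
    """True if the list is non-empty and every address is a loopback address."""
    addresses = list(addresses)
    if not addresses:
        return False
    # Strip any IPv6 scope suffix such as "%eth0" before parsing.
    return all(
        ipaddress.ip_address(a.split("%")[0]).is_loopback for a in addresses
    )


def report(host):
    """Return a one-line, human-readable summary of how `host` resolves."""
    try:
        addrs = resolved_addresses(host)
    except socket.gaierror as exc:
        return f"{host}: does not resolve ({exc})"
    note = "loopback only" if loopback_only(addrs) else "has a non-loopback address"
    return f"{host}: {', '.join(addrs)} [{note}]"
```

Running something like print(report(socket.gethostname())) on each node
shows whether that node's own name maps only to 127.0.0.1 or ::1. If it
does, the hosts file or DNS setup is worth a look before going to the
ARMCI list.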

Brian

-- 
   Brian Barrett
   LAM/MPI developer and all around nice guy
   Have a LAM/MPI day: http://www.lam-mpi.org/