
LAM/MPI General User's Mailing List Archives


From: Brian Barrett (brbarret_at_[hidden])
Date: 2006-02-28 09:08:21


On Feb 28, 2006, at 4:58 AM, Shenbaganathan M , Bangalore wrote:

> Good morning. We want to cluster machines on the SUSE Linux
> platform. Our system configuration is listed below. We use
> Secure Shell (ssh) to connect the main and client systems. We have
> added the client systems in hosts and set up authorization keys so
> that guest users can log in without being asked for a password.
>
> System: HP XW8200 (workstation)
> OS : SUSE9.1
> LAM-MPI: lam-7.0.4
> ABAQUS : V 6.5.3
>
> When we solve the explicit problem, the main and client systems
> connect over ssh and the files are split across the client systems
> as well, but after 10 seconds we get the following error:
>
> ------------------------------------------------------------------
> Lnc: /abaqus/EES/trial3/beam_1.lnc linux2 1079 -p4amslave -p4yourname linux1 -p4rmrank 1
> Lnc: /abaqus/EES/trial3/beam_1.lnc linux2 1079 -p4amslave -p4yourname linux1 -p4rmrank 2
> p2_5044: p4_error: interrupt SIGSEGV: 11
> p0_7412: p4_error: net_recv read: probable EOF on socket: 1
> p0_7412: (0.936817) net_send: could not write to fd=7, errno = 32
> -------------------------------------------------------------------
>
> Please suggest how we can solve the above problem.

Bogdan is correct - these errors are coming from MPICH, not LAM/MPI.
I believe that Abaqus builds against only one MPI implementation, but
I'm only somewhat familiar with their distribution process. You
might want to talk to whoever is responsible for your Abaqus
installation and ask them which MPI implementation you should be
using. If you are that person, then you might want to ask Abaqus
that question.
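
As a rough way to check a log for this yourself, the sketch below
scans text for markers that are typical of MPICH's ch_p4 device
(the "p4_error" prefix and the "-p4..." command-line flags seen in
the output above). The marker list and the function names are
assumptions made for illustration; they are not part of LAM/MPI,
MPICH, or Abaqus, and a match is only a hint, not proof.

```python
import re

# Assumption: these strings are characteristic of MPICH's ch_p4 device.
# The list is illustrative, not exhaustive.
P4_MARKERS = re.compile(r"p4_error|-p4amslave|-p4yourname|-p4rmrank")


def p4_lines(log_text):
    """Return the lines of log_text that contain a ch_p4 marker."""
    return [line for line in log_text.splitlines() if P4_MARKERS.search(line)]


def looks_like_mpich_p4(log_text):
    """Return True if any line of log_text contains a ch_p4 marker."""
    return bool(p4_lines(log_text))


# Sample taken from the error output quoted in this message.
sample = """\
Lnc: /abaqus/EES/trial3/beam_1.lnc linux2 1079 -p4amslave -p4yourname linux1 -p4rmrank 1
p2_5044: p4_error: interrupt SIGSEGV: 11
p0_7412: p4_error: net_recv read: probable EOF on socket: 1
p0_7412: (0.936817) net_send: could not write to fd=7, errno = 32
"""

if __name__ == "__main__":
    for line in p4_lines(sample):
        print("ch_p4 marker:", line)
```

If the check fires on your own output, the job is most likely being
launched through MPICH rather than through LAM/MPI's mpirun.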

Hope this helps,

Brian

-- 
   Brian Barrett
   LAM/MPI developer and all around nice guy
   Have a LAM/MPI day: http://www.lam-mpi.org/