LAM/MPI General User's Mailing List Archives

From: Gustavo Seabra (gustavo.seabra_at_[hidden])
Date: 2008-10-10 12:54:26


Hi All,

I just installed LAM/MPI (LAM 7.1.3) on my computer, using Cygwin and
g95. The installation seems to have completed normally, although for
some reason LAM keeps creating a 'libtool' executable in the
installation root directory, even though libtool is already installed
on my system ("$ which libtool" returns "/usr/bin/libtool").

However, when I try to use LAM, I get a "bad file descriptor"
error whenever I execute a *second* mpirun (see the example below).
This happens when using the same program twice or using a second
(different) program. The first execution is always OK, the second
crashes the LAM environment.

If, instead, I start LAM with 'lamboot -d', some programs (e.g.
trivialf.exe) can execute multiple times with no problems. But others
still crash in the second execution. (The program I originally want to
use LAM with - AMBER - crashes on the second execution, just like
before). The 'hello.exe' example also crashes after the second
attempt.

Just some extra information that may (or may not) be relevant:
1. I installed LAM under a regular user account. However, all the
executables are in a directory in my account, and I have full rights
to all files.

2. During the installation, the Windows Firewall blocked "conftest",
but installation did not stop and went all the way to the end, no
problems.

3. The first time I used LAM, the Windows Firewall again blocked
"lamd" and "lamboot". After that, I manually added exceptions for lamd
and lamboot, so it doesn't complain about them anymore. The firewall
keeps blocking any program that uses LAM, but the execution still runs
to the end with no problems (at least the first time, as I reported).

     This blocking from the Windows firewall is unlikely to be the
problem: I created an exception for "sander.MPI" (the executable for
the "AMBER" program) so that, when I use amber I now get no messages
from the firewall. Still, the same problem happens (it can be executed
only once).

4. I also tried issuing 'lamclean' between executions, but it doesn't
help (nothing changes).

If there is any more information that can be helpful in tracking this
down, please let me know.

What can be causing this? Any suggestions?

Thanks a lot.
Gustavo.

 Example: This is representative of the error I'm seeing.
=========================================
$ lamboot

LAM 7.1.3 - Indiana University

$ mpirun -np 2 trivialf.exe
 rank 1 received message
 rank 0 sent message
$ mpirun -np 2 trivialf.exe
 rank 1 received message
 rank 0 sent message
lamd kernel: problem with select() (1): Bad file descriptor
=========================================
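
For anyone who wants to reproduce this without retyping the commands, here is a minimal sketch of a loop that runs the same command several times and reports the first run that fails. It assumes a POSIX shell (such as the one Cygwin provides) and that lamboot has already been run. The name repeat_run is a hypothetical helper, not part of LAM. It only looks at the exit status of the command, and I have not confirmed whether mpirun returns non-zero when lamd prints the select() error, so the output should still be checked by eye.

```shell
#!/bin/sh
# Hypothetical helper (not part of LAM): run a command up to N times and
# stop at the first run that exits with a non-zero status.
# Usage: repeat_run N command [args...]
repeat_run() {
    count=$1
    shift
    i=1
    while [ "$i" -le "$count" ]; do
        echo "--- run $i: $*"
        if ! "$@"; then
            echo "run $i failed"
            return 1
        fi
        i=$((i + 1))
    done
    echo "all $count runs succeeded"
}

# Intended use against an already booted LAM universe (assumption: the
# failing run is visible through mpirun's exit status):
#   repeat_run 2 mpirun -np 2 trivialf.exe
```

Running it with 2 as the count mirrors the two mpirun calls in the example above; a larger count may help show whether the failure always happens on the second run.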