Hi Anju,
Thank you for your response; I have attached some config logs for you. Are
you using Cygwin 1.5.10-3?
I ran lamboot:
$ lamboot -v -ssi boot rsh /usr/local/etc/lam-hostmap.txt
LAM 7.1b18/MPI 2 C++ - Indiana University
n-1<3068> ssi:boot:base:linear: booting n0 (xxxxx)
n-1<3068> ssi:boot:base:linear: finished
I can run mpirun without any arguments, and I can run lamnodes as well:
$ lamnodes
n0 xxxxx.xxxxxxxxxx.xxxx.xxxxxxx.xxx:1:origin,this_node
Here's what I get when I run hello with mpirun:
xxxx_at_xxxxx /mpi/lam-7.1b16/examples/hello
$ mpirun C ./hello
----------------------------------------------------------------------------
It seems that [at least] one of the processes that was started with
mpirun did not invoke MPI_INIT before quitting (it is possible that
more than one process did not invoke MPI_INIT -- mpirun was only
notified of the first one, which was on node n0).
mpirun can *only* be used with MPI programs (i.e., programs that
invoke MPI_INIT and MPI_FINALIZE). You can use the "lamexec" program
to run non-MPI programs over the lambooted nodes.
----------------------------------------------------------------------------
xxxx_at_xxxxx /mpi/lam-7.1b16/examples/hello
$
Nothing seems to happen when I run hello without mpirun:
xxxx_at_xxxxx /mpi/lam-7.1b16/examples/hello
$ hello
xxxx_at_xxxxx /mpi/lam-7.1b16/examples/hello
$
However, I built a debug 1.5.10-3 cygwin1.dll from source and used gdb to
discover a segfault:
$ gdb hello
GNU gdb 2003-09-20-cvs (cygwin-special)
Copyright 2003 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain
conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB. Type "show warranty" for details.
This GDB was configured as "i686-pc-cygwin"...
(gdb) run
Starting program: /mpi/lam-7.1b16/examples/hello/hello.exe
Program received signal SIGSEGV, Segmentation fault.
0x61011e24 in my_findenv(char const*, int*) (
name=0x45b2d6 "MALLOC_TRIM_THRESHOLD_", offset=0x22eff4)
at ../../../winsup/cygwin/environ.cc:177
177 for (p = cur_environ (); *p; ++p)
Current language: auto; currently c++
(gdb)
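My reading of this backtrace, and it is only a guess: MALLOC_TRIM_THRESHOLD_
is one of the environment variables that ptmalloc2 looks up when it
initializes, so the malloc bundled in libmpi may be calling getenv before
Cygwin has set up its environment table, and the loop at environ.cc:177 then
dereferences a bad pointer. The sketch below is not Cygwin's my_findenv;
lookup_env is a hypothetical helper that shows the same kind of table walk
with a guard for a table that does not exist yet.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (an illustration, NOT Cygwin's my_findenv):
   look up NAME in a NULL-terminated table of "NAME=value" strings.
   Returns a pointer to the value, or NULL if NAME is not present or
   the table has not been set up yet. */
const char *lookup_env(char *const *envp, const char *name)
{
    assert(name != NULL);

    /* Guard: an allocator that runs very early may ask for a variable
       before the environment table exists. */
    if (envp == NULL)
        return NULL;

    size_t len = strlen(name);
    for (char *const *p = envp; *p != NULL; ++p) {
        /* Match "NAME=" exactly, so "MALLOC" does not match
           "MALLOC_TRIM_THRESHOLD_=...". */
        if (strncmp(*p, name, len) == 0 && (*p)[len] == '=')
            return *p + len + 1;
    }
    return NULL;
}
```

With this guard, lookup_env(NULL, "MALLOC_TRIM_THRESHOLD_") returns NULL
instead of faulting on the first dereference.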
I tried linking in the dmalloc library, but I got multiple definitions of
symbols; the details are below. Will these go away if I use -DWITH_DMALLOC?
Making all in laminfo
make[2]: Entering directory `/mpi/lam-7.1b18/tools/laminfo'
if g++ -DHAVE_CONFIG_H -I. -I. -I../../share/include -DLAM_PREFIX="\"/usr/local\"" -DLAM_BINDIR="\"/usr/local/bin\"" -DLAM_LIBDIR="\"/usr/local/lib\"" -DLAM_INCDIR="\"/usr/local/include\"" -DLAM_PKGLIBDIR="\"/usr/local/lib/lam\"" -DLAM_SYSCONFDIR="\"/usr/local/etc\"" -I../../share/include -DLAM_BUILDING=1 -D_REENTRANT -g -Wall -Wundef -Wno-long-long -MT laminfo.o -MD -MP -MF ".deps/laminfo.Tpo" -c -o laminfo.o laminfo.cc; \
then mv -f ".deps/laminfo.Tpo" ".deps/laminfo.Po"; else rm -f ".deps/laminfo.Tpo"; exit 1; fi
/bin/bash ../../libtool --mode=link g++ -g -Wall -Wundef -Wno-long-long -ldmalloc -L/usr/local/lib -o laminfo.exe laminfo.o ../../share/libmpi/libmpi.la ../../share/liblam/liblam.la
mkdir .libs
g++ -g -Wall -Wundef -Wno-long-long -o laminfo.exe laminfo.o -L/usr/local/lib ../../share/libmpi/.libs/libmpi.a ../../share/liblam/.libs/liblam.a -ldmalloc
/usr/local/lib/libdmalloc.a(malloc.o)(.text+0xba0): In function `malloc':
/mpi/dmalloc/dmalloc-5.3.0/malloc.c:1050: multiple definition of `_malloc'
../../share/libmpi/.libs/libmpi.a(malloc.o)(.text+0x24e9):/mpi/lam-7.1b18/share/memory/ptmalloc2/malloc.c:3286: first defined here
/usr/local/lib/libdmalloc.a(malloc.o)(.text+0xbe0): In function `calloc':
/mpi/dmalloc/dmalloc-5.3.0/malloc.c:1079: multiple definition of `_calloc'
../../share/libmpi/.libs/libmpi.a(malloc.o)(.text+0x2b6f):/mpi/lam-7.1b18/share/memory/ptmalloc2/malloc.c:3516: first defined here
/usr/local/lib/libdmalloc.a(malloc.o)(.text+0xc20): In function `realloc':
/mpi/dmalloc/dmalloc-5.3.0/malloc.c:1111: multiple definition of `_realloc'
../../share/libmpi/.libs/libmpi.a(malloc.o)(.text+0x26f7):/mpi/lam-7.1b18/share/memory/ptmalloc2/malloc.c:3367: first defined here
/usr/local/lib/libdmalloc.a(malloc.o)(.text+0xca0): In function `memalign':
/mpi/dmalloc/dmalloc-5.3.0/malloc.c:1176: multiple definition of `_memalign'
../../share/libmpi/.libs/libmpi.a(malloc.o)(.text+0x28ae):/mpi/lam-7.1b18/share/memory/ptmalloc2/malloc.c:3441: first defined here
/usr/local/lib/libdmalloc.a(malloc.o)(.text+0xce0): In function `valloc':
/mpi/dmalloc/dmalloc-5.3.0/malloc.c:1206: multiple definition of `_valloc'
../../share/libmpi/.libs/libmpi.a(malloc.o)(.text+0x2a44):/mpi/lam-7.1b18/share/memory/ptmalloc2/malloc.c:3486: first defined here
/usr/local/lib/libdmalloc.a(malloc.o)(.text+0xd90): In function `free':
/mpi/dmalloc/dmalloc-5.3.0/malloc.c:1272: multiple definition of `_free'
../../share/libmpi/.libs/libmpi.a(malloc.o)(.text+0x2650):/mpi/lam-7.1b18/share/memory/ptmalloc2/malloc.c:3326: first defined here
/usr/local/lib/libdmalloc.a(malloc.o)(.text+0xdc0): In function `cfree':
/mpi/dmalloc/dmalloc-5.3.0/malloc.c:1303: multiple definition of `_cfree'
../../share/libmpi/.libs/libmpi.a(malloc.o)(.text+0x2fa2):/mpi/lam-7.1b18/share/memory/ptmalloc2/malloc.c:3670: first defined here
collect2: ld returned 1 exit status
make[2]: *** [laminfo.exe] Error 1
make[2]: Leaving directory `/mpi/lam-7.1b18/tools/laminfo'
make[1]: *** [all-recursive] Error 1
make[1]: Leaving directory `/mpi/lam-7.1b18/tools'
make: *** [all-recursive] Error 1
xxxx_at_xxxxx /mpi/lam-7.1b18
$
Thanks again,
Mark
-----Original Message-----
From: lam-bounces_at_[hidden] [mailto:lam-bounces_at_[hidden]] On Behalf Of
Prabhanjan Kambadur
Sent: Monday, August 30, 2004 11:29 PM
To: General LAM/MPI mailing list
Subject: Re: LAM: lam7.1b16 doesn't work on Cygwin 1.5.10 Windows XP SP1
Hi,
Sorry for the late reply. I tried replicating the issue, but could not.
Could you do a couple of things for me:
1. Could you send me the config logs for the build?
2. Were you able to run any other LAM executables, such as lamboot or
mpirun? Could you try running a simple hello world program and let me know
whether you see similar results?
Anju
On Thu, 26 Aug 2004, Mark wrote:
> I sent this to the developers' list, but perhaps this list is more
> appropriate for this issue.
>
> I built lam-7.1b16 on Cygwin 1.5.10 Windows XP SP 1, and when I type in
> laminfo at the prompt the cursor returns without printing any information,
> and I get the following message when I try to run hello:
>
> xxxx_at_xxxxx /mpi/lam-7.1b16/examples/hello
> $ mpirun C hello
> -----------------------------------------------------------------------------
> It seems that [at least] one of the processes that was started with
> mpirun did not invoke MPI_INIT before quitting (it is possible that
> more than one process did not invoke MPI_INIT -- mpirun was only
> notified of the first one, which was on node n0).
>
> mpirun can *only* be used with MPI programs (i.e., programs that
> invoke MPI_INIT and MPI_FINALIZE). You can use the "lamexec" program
> to run non-MPI programs over the lambooted nodes.
> -----------------------------------------------------------------------------
>
> When I debugged laminfo, I got the following:
>
> $ gdb laminfo
> GNU gdb 2003-09-20-cvs (cygwin-special)
> Copyright 2003 Free Software Foundation, Inc.
> GDB is free software, covered by the GNU General Public License, and you are
> welcome to change it and/or distribute copies of it under certain
> conditions.
> Type "show copying" to see the conditions.
> There is absolutely no warranty for GDB. Type "show warranty" for details.
> This GDB was configured as "i686-pc-cygwin"...
> (gdb) break 136
> Breakpoint 1 at 0x4010b8: file laminfo.cc, line 136.
> (gdb) run
> Starting program: /usr/local/bin/laminfo.exe
>
> Program received signal SIGSEGV, Segmentation fault.
> 0x6101fd44 in dlfork () from /usr/bin/cygwin1.dll
> (gdb)
>
> Does anyone have any insight into this problem?
>
> Thanks,
> Mark
_______________________________________________
This list is archived at http://www.lam-mpi.org/MailArchives/lam/
- application/octet-stream attachment: libtool