What's new in Subversion vs. the current release?

Last updated: Friday, March 28, 2008 at 10:51PM

This page gives a brief overview of what's in Subversion that's not in the current stable release of LAM/MPI (listed in more-or-less reverse chronological order):

  • Add gfortran to the list of Fortran compilers searched in configure.
  • Work around apparent alignment issue in OS X 10.5's CMSG_DATA() macro when compiling on Intel Macs in 64 bit mode, which would cause mpirun to fail with stdio errors. Also properly initialize msg_flags in the sfh_send_fd() and sfh_recv_fd() calls.
  • Fix a number of places in configure where main() was being improperly declared due to Autoconf's square bracket eating. Thanks to Jeff Squyres for the patch.
  • Add -Wl,-search_paths_first to wrapper LDFLAGS on OS X to combat an issue where the OS X linker finds Open MPI's libmpi.dylib in /usr/lib instead of LAM/MPI's libmpi.a due to the dynamic-first search policy.
  • Fix Fortran interface functions for MPIL_Trace_on and MPIL_Trace_off.
  • Clean up shared library dependencies to support use of --as-needed linker option on GNU systems. Thanks to Justin Bronder for bringing this to our attention.
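
The CMSG_DATA() and msg_flags item above concerns passing an open file descriptor between processes over a Unix-domain socket. The following is a minimal C sketch of that technique, not LAM's sfh_send_fd() / sfh_recv_fd() code; the names send_fd and recv_fd are hypothetical. It zeroes the whole struct msghdr so that msg_flags is initialized, and it wraps the control buffer in a union so that the buffer is aligned for struct cmsghdr.

```c
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/uio.h>

/* Control buffer sized for exactly one int (one file descriptor).
 * The union forces alignment suitable for struct cmsghdr. */
union fd_control {
    struct cmsghdr hdr;
    char buf[CMSG_SPACE(sizeof(int))];
};

/* Hypothetical helper: send `fd` over the Unix-domain socket `sock`.
 * Returns 0 on success, -1 on failure. */
int send_fd(int sock, int fd)
{
    union fd_control control;
    char payload = 'F';            /* at least one data byte must be sent */
    struct iovec iov;
    struct msghdr msg;
    struct cmsghdr *cmsg;

    memset(&msg, 0, sizeof(msg));  /* also initializes msg_flags */
    memset(&control, 0, sizeof(control));
    iov.iov_base = &payload;
    iov.iov_len = 1;
    msg.msg_iov = &iov;
    msg.msg_iovlen = 1;
    msg.msg_control = control.buf;
    msg.msg_controllen = sizeof(control.buf);

    cmsg = CMSG_FIRSTHDR(&msg);
    cmsg->cmsg_level = SOL_SOCKET;
    cmsg->cmsg_type = SCM_RIGHTS;
    cmsg->cmsg_len = CMSG_LEN(sizeof(int));
    memcpy(CMSG_DATA(cmsg), &fd, sizeof(int));  /* memcpy avoids alignment assumptions */

    return sendmsg(sock, &msg, 0) == 1 ? 0 : -1;
}

/* Hypothetical helper: receive a descriptor sent by send_fd().
 * Returns the new descriptor, or -1 on failure. */
int recv_fd(int sock)
{
    union fd_control control;
    char payload = 0;
    struct iovec iov;
    struct msghdr msg;
    struct cmsghdr *cmsg;
    int fd = -1;

    memset(&msg, 0, sizeof(msg));
    memset(&control, 0, sizeof(control));
    iov.iov_base = &payload;
    iov.iov_len = 1;
    msg.msg_iov = &iov;
    msg.msg_iovlen = 1;
    msg.msg_control = control.buf;
    msg.msg_controllen = sizeof(control.buf);

    if (recvmsg(sock, &msg, 0) != 1)
        return -1;
    cmsg = CMSG_FIRSTHDR(&msg);
    if (cmsg == NULL || cmsg->cmsg_level != SOL_SOCKET ||
        cmsg->cmsg_type != SCM_RIGHTS ||
        cmsg->cmsg_len != CMSG_LEN(sizeof(int)))
        return -1;
    memcpy(&fd, CMSG_DATA(cmsg), sizeof(int));
    return fd;
}
```

Copying the descriptor in and out with memcpy() rather than casting the CMSG_DATA() pointer keeps the sketch independent of how a given platform aligns the data area.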

  • Released LAM/MPI 7.1.4
    These changes are available both on the Subversion trunk and the tags/lam-7-1-4 Subversion tag.
  • Prevent some batch schedulers (BJS, LANL's BProc + MOAB) from killing the lamds when lamboot exits by keeping a child of lamboot around for the life of the lamds.
  • Properly escape SSI parameters and pass SSI parameters to lamboot when using mpiexec. Also use /tmp or $TMPDIR for the app schema. Thanks to Sam Steingold for bringing this to our attention.
  • Allow user to disable building the TM or SLURM boot ssi module, even if the libraries are available on the system. Thanks to Jens Klostermann for bringing this to our attention.
  • Fix compile issue on NetBSD 3.0 and later. Thanks to Aleksey Cheusov for the patch.
  • Properly handle SLURM clusters where not all nodes have the same prefix in their hostnames. Thanks to Moe Jette for the patch.

  • Released LAM/MPI 7.1.3
    These changes are available both on the Subversion trunk and the tags/lam-7-1-3 Subversion tag.
  • A number of man page cleanups suggested by Eric Raymond.
  • Search for tkill, in addition to the default install location and /bin. Also, do not segfault if tkill is not found after searching all these locations. Thanks to Josh Lehan and Jeff Squyres for the patch.
  • Abort rather than hang if lamboot is unable to get the list of local network devices.
  • Fix for hangs in 64-bit builds on Mac OS X systems (Intel and PowerPC).
  • Correct check for localhost in hostfiles during lamboot to check for 127.0.0.0/8 instead of 127.0.0.1/32, to meet RFC 1700. Thanks to Martin Knoblauch for the patch.
  • Added support for Fortran types MPI_REAL{4,8,16} for predefined reduction operations supported by floating point types (MPI_MAX, MPI_MIN, MPI_SUM, MPI_PROD).
  • Fixed error in IB rpi that could cause compiler errors with some compilers. Thanks to Jens Klostermann for bringing this to our attention.
  • Fixed error with MPI_COMM_ACCEPT on Fedora 4 that would cause a "bad address error". Thanks to Orion Poplawski for the fix.
  • Renamed internal strtonum function lam_strtonum to avoid clashes with a function of the same name in FreeBSD.
  • Fixed installation issue on Cygwin when trying to make symlinks to executables (such as lamwipe -> wipe).
  • Fix bug with comments in hostfiles where a comment in the middle of a line would cause the entire line to be ignored. Thanks to Christian Siebert for bringing this to our attention.
  • Build the TotalView queue debugging shared object as a dynamically loaded shared object instead of a shared library. Fixes an issue on Mac OS X where the TotalView library could not be found.
  • Clean up restart logic in the 'cr self' module. Add a bunch of documentation regarding this module to the man page and user docs. Thanks to Jeff Squyres for helping in this effort.
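
The localhost item above widens the loopback test from the single address 127.0.0.1 to the whole 127.0.0.0/8 block. A minimal C sketch of such a test is shown here; the function name is_loopback_v4 is hypothetical and this is not LAM's hostfile code. It compares only the top 8 bits of the address, which is what a /8 prefix means.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <stdint.h>

/* Hypothetical helper: returns 1 if the dotted-quad string lies in
 * 127.0.0.0/8, 0 if it is some other IPv4 address, and -1 if it
 * cannot be parsed as an IPv4 address at all. */
int is_loopback_v4(const char *dotted)
{
    struct in_addr addr;
    uint32_t host_order;

    if (inet_pton(AF_INET, dotted, &addr) != 1)
        return -1;

    /* Convert to host byte order so the mask applies to the first octet. */
    host_order = ntohl(addr.s_addr);
    return (host_order & 0xFF000000u) == 0x7F000000u;
}
```

With this test, an address such as 127.0.1.1 counts as local, whereas a comparison against 127.0.0.1 alone would treat it as a remote host.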

  • Released LAM/MPI 7.1.2
    These changes are available both on the Subversion trunk and the tags/lam-7-1-2 Subversion tag.
  • Fix MPI_COMM_SPAWN problem with app schema info keys.
  • Fix assembly problem for AIX in 64 bit mode with usysv.
  • Fix bad cast in C++ bindings with MPI::Win::Create() that was causing an invalid MPI_Comm to be passed to the underlying MPI_Win_create() call.
  • Workaround for newer MVAPI implementations that call free() in VAPI_deregister_mr(), which was causing hangs in certain situations. As a result, sbrk() is not called with a negative value, so for a very small number of applications, memory usage might be slightly higher.
  • Fixed references to cr_base_dir in user docs -- the SSI parameter is actually cr_blcr_base_dir.
  • Fixed error that resulted in wipe not properly using the session directory prefix/suffix options.
  • Fix two errors in ptmalloc2 code. The "TSD hack" was not properly enabled, causing an infinite loop leading to a segfault. Also, we were not properly intercepting munmap().
  • Fix a bug deep within MPI_INIT that prevented using IMPI.
  • Updated to GNU Libtool 1.5.22.
  • Add another setsid() in hboot to facilitate working in SGE environments.
  • Fixed a command-line parsing problem with mpiexec.
  • Re-enable a few virtual destructors in the C++ bindings.
  • Updated to GNU Automake 1.9.6.
  • Don't add external declarations for the PMPI_W{TICK,TIME} functions if profiling isn't enabled. It appears that some compilers (g95) will try to resolve the symbols if they are prototyped.
  • Added workaround for Apple's misinterpretation of the use of semctl's 4th argument, as specified in IEEE Std 1003.1, 2004 Edition. A correct reading says that the 4th argument when cmd is SETVAL should be the specified union, with the val member having meaning. Apple interpreted it to mean the 4th argument should be an integer. There is a significant difference on big-endian LP64 machines. Note that every other 64-bit big-endian Unix (including Linux, Solaris, AIX, and IRIX) takes the first interpretation.
  • Added support for Fortran types MPI_INTEGER{1,2,4,8} for predefined reduction operations supported by Fortran integer types (MPI_MAX, MPI_MIN, MPI_SUM, MPI_PROD, MPI_BAND, MPI_BOR, MPI_BXOR).
  • Fix problem where singletons and jobs launched via mpirun could not MPI_COMM_CONNECT / MPI_COMM_ACCEPT each other.
  • Fix silly putenv() mistake in hboot.c.
  • Work around bug in net/if.h header file on OS X 10.4 in 64bit mode that was preventing ioctl(..., SIOCGIFCONF, ...) from working.
  • Fixed corner case in the rsh boot SSI module where we were invoking .profile on the remote side for Bash shells, when it really wasn't necessary (because bash will invoke .bashrc automatically).
  • Properly handle MPI_*_NULL for the MPI_*_c2f functions.
  • Added missing MPI_ROOT definition to mpif.h.in.
  • Always populate mpirun's MPIR_proctable structure so that parallel debuggers can find all processes in the job. Previously, the table was only populated if mpirun was also starting a parallel debugger daemon on all the nodes (i.e., -tv was given as an option to mpirun).
  • Added access to the Fortran datatypes from the C mpi.h.
  • Change default behavior of lamhalt to wait until all lamds are dead before returning. Add -i (immediate) option that replicates older (deprecated) behavior -- lamhalt returns immediately, most likely before the universe is completely halted.
  • Added some missing man pages to LAM tarballs (lamnodes, lamhalt).
  • Make the lack of a PATH environment variable in hboot not be an error.
  • Added and to various configure tests that link Fortran executables so that systems with icky Fortran installations can add additional linker flags / libraries. Documented that OS X Tiger (10.4) users should probably add LIBS=-lSystemStubs to their configure line because gfortran doesn't do it automatically, and not having it will cause several of our tests to incorrectly fail due to missing symbols.
  • MPI_GET_VERSION checked to see if MPI_INIT had already been called, which is erroneous (MPI-2, 3.1).
  • Somehow the processing for lamboot's -b option was removed; fixed.
  • Per advice from the ROMIO maintainers, remove some NFS locking tests from romio/configure.
  • Fixed a problem with the upper and lower bounds when creating DARRAY datatypes.
  • Fixed number of prefix 0's generated in SLURM host lists.
  • Fixed MPI_ALLTOALLW fortran wrapper.
  • Fixed a potential infinite loop in tkill if some system calls returned bogus values.
  • Make BSD systems have a default of "none" for the memory manager.
  • The SLURM boot SSI module help messages all accidentally had the wrong filename, so if anything ever went wrong, no help message would be printed. Fixed.
  • Workaround for a bug in some versions of gcc that masked the debugging definition of struct _proc, which caused problems when using the TotalView debugger.
  • Changed LAM Basic MPI_Bcast binomial tree algorithm to complete send to one process before starting the next send, resulting in much better performance in some situations.
  • Fixed bug in smp collective module that would cause corrupted collective operation if multiple communicators with different sizes were created.
  • Fixed bug that allowed multi-word values to accidentally propagate to incorrect places (like WRAPPER_EXTRA_LDFLAGS).
  • Fix ROMIO test for Fortran linking convention on OS X by using nm instead of strings on that platform.
  • Update to the TotalView docs for TV v6.6.
  • Fixed problem with zero padding in Slurm host list parsing.
  • Add support for BProc implementations without bproc_vexecmove support.
  • Fixed bad egrepping in configure to snarf LDFLAGS and LIBS from generated Makefiles.
  • Fixed errors in pthread tests that could result in incorrect flags being set for threading when f77 is used for linking. Also fixed an error where the Linux pthreads test could give a harmless false positive.
  • Converted IMPI coll module to coll API v1.1.0.
  • Fixed verbosity in coll selection to print the module that was selected, not the last module that was examined.
  • Fix problem with the usysv RPI on the Apple G5 platform. The G5 can reorder writes to improve memory performance, which was causing failures in the synchronization routines. Added sync instruction to force data / lock writes to be ordered.
  • Expanded ib RPI configure tests to look for vapi.h and the VAPI libraries in odd places for some IB implementations.
  • Fix some forgotten / bit-rot compile errors in the impi coll module.
  • Set a number of LAM daemon sockets to be close on exec to eliminate wasted file descriptors in clients
  • Reordered tkill shutdown to better support platforms with /tmp in NFS.
  • Patch libtool to recognize Portland C compilers so that snarfing flags from libtool does the right thing in the MPI wrapper compilers.
  • Fixed compile problem with recent gcc versions (missing #include in a really old source file).
  • Fix a problem with some ancient F77 compilers and remove all single and double quotes from mpif.h.
  • Fix a problem inadvertently caused by bug 682: instead of trying to rectify crmpi modules that are sent by MPI processes to the spawning agent, simply disallow MPI_COMM_SPAWN'ed processes from being checkpointable.
  • LAM no longer examines the (argc, argv) that comes in from MPI_INIT because it can cause problems in some scenarios.
  • Fix an uninitialized variable that can cause seg faults in the rsh boot SSI.
  • Re-enable stdin for rank 0; this was accidentally disabled in 7.1.
  • Add a configure test for inet_ntop() in the slurm boot module so that environments that do not have that function (e.g., Cygwin) will not attempt to compile the slurm module.
  • Fix wrapper compiler LDFLAGS and LIBS.
  • Updated User Guide to clarify the ib RPI module scalability restrictions.
  • Re-enable a virtual destructor for MPI::Comm_null.
  • Escape linker flags added to the wrapper compilers' LDFLAGS to support the OS X malloc intercept code with -Wl,. The XL compilers were getting confused by the -u _lam_darwin_malloc_linker_hack option when it was passed to them.
  • Add zsh to the list of shells that do not have the .profile script explicitly run for lamboot. This list includes csh-derived shells and bash, as both have a set of scripts run for non-interactive logins.
  • Fix missing space in test for the existence of a .profile script when using an sh-derived shell.
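
The semctl() item above turns on what is passed as the fourth argument for SETVAL. A minimal C sketch of the union-based reading follows; it is an illustration under stated assumptions, not LAM's code. The union is given the hypothetical name example_semun (with the member layout that POSIX describes for union semun) because some systems already define union semun in <sys/sem.h> while others leave the definition to the application. The helper names sem_set_value and sem_get_value are likewise hypothetical.

```c
#include <sys/types.h>
#include <sys/ipc.h>
#include <sys/sem.h>

/* Same members as the "union semun" described by POSIX; renamed in this
 * sketch (an assumption) to avoid clashing with systems that define
 * union semun themselves. */
union example_semun {
    int val;                 /* value for SETVAL */
    struct semid_ds *buf;    /* buffer for IPC_STAT and IPC_SET */
    unsigned short *array;   /* array for GETALL and SETALL */
};

/* Set semaphore `semnum` of set `semid` to `value`.
 * The fourth argument is the union, with only `val` meaningful. */
int sem_set_value(int semid, int semnum, int value)
{
    union example_semun arg;

    arg.val = value;
    return semctl(semid, semnum, SETVAL, arg);
}

/* Read the current value back; GETVAL takes no fourth argument. */
int sem_get_value(int semid, int semnum)
{
    return semctl(semid, semnum, GETVAL);
}
```

A caller would typically create a set with semget(), use these helpers, and remove the set with semctl(id, 0, IPC_RMID) when finished.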

  • Released LAM/MPI 7.1.1
    These changes are available both on the Subversion trunk and the tags/lam-7-1-1 Subversion tag.
  • Upgraded to Libtool v1.5.8.
  • Added rpi_ib_mtu SSI param (see User Guide for more info).
  • Fixed minor problem with ib RPI startup code that prevented it from working on some vendor IB stacks.
  • Fixed problem where --export-dynamic showed up in the wrapper compilers' underlying commands.
  • Don't emit warning on stderr and abort if we get a permission denied when killing a process with tkill. If the lamd dies uncleanly, it is possible for another process (possibly owned by another user) to end up with that lamd's pid, which will cause tkill to have problems later (if the pid is another user's).

  • Released LAM/MPI 7.1
    These changes are available both at the Subversion trunk and the tags/lam-7-1 Subversion tag.
  • Add the --with-memory-manager=external flag that allows LAM to be configured to allow external triggering of its sbrk() interception code. See the LAM/MPI Installation Guide release notes on Myrinet and Infiniband for more details.
  • Added first version of Infiniband RPI module (ib).
  • Fix a problem where $includedir/lam_config.h may end up with permissions affected by the installer's default umask instead of being set to 0644.
  • Added preliminary support for the upcoming BProc 4.0 release.
  • Added ability for mpirun to start applications that have execute but not read permissions. Only works if the -s option is not given to mpirun. Also fixed path searching problem when ./test was specified as command to mpirun.
  • mpirun is now better about returning non-zero in the cases where the launched job aborts before calling MPI_INIT.
  • Add support for ptmalloc2 and Apple Darwin/OS X memory managers when catching deallocations for unpinning user memory.
  • Added possibility of using IMPI_HOST_NAME environment variable for external name publishing.
  • Added support for optional MPI datatypes MPI_INTEGER1, MPI_INTEGER2, MPI_INTEGER4, MPI_REAL4, and MPI_REAL8. Added support for non-existent MPI datatypes (!) MPI_INTEGER8, MPI_REAL16.
  • Added boot_rsh_ignore_stderr SSI parameter for users too lazy to fix their "dot" files. :-)
  • Added SLURM boot SSI module.
  • Added support for run-time dynamically loaded SSI modules. A LAM installation can therefore be extended by simply adding a shared library SSI module into a specific directory.
  • Various gm RPI fixes:
    • Added --with-rpi-gm-lib option to specify a non-default location for the GM library.
    • Fix for incorrectly handling when gm dropped packets.
    • Performance improvements in the gm RPI; no more "short" message protocol -- only "tiny" and "long".
    • Added "fast" support for the gm rpi module, although it's unreliable for communication-intense applications (and therefore disabled by default).
    • Support for building the gm rpi module dynamically.
    • The gm RPI module now supports checkpoint/restart (must set the rpi_gm_cr SSI parameter to 1).
    • Enable experimental use of the gm 2.x gm_get() function for long messages when explicitly asked for with the --with-rpi-gm-get configure switch.
  • Added smp-aware collective algorithms for the following MPI functions: MPI_ALLGATHER, MPI_ALLGATHERV, MPI_REDUCE_SCATTER, MPI_SCAN
  • Added new MPI functions: MPI_EXSCAN and MPI_ALLTOALLW.
  • Added mpi_hostmap SSI parameter to transform the IP addresses supplied by the LAM run-time environment to an alternate set of addresses that will be used for MPI communications.
  • Added option "-prefix " in lamboot and lamwipe to allow users to switch between LAM installations without having to modify their local environments.
  • Added prefix parameter for the rsh boot module boot schema files to allow users to specify different LAM installation paths on different nodes.
  • Added a new MPI_COMM_SPAWN info key (lam_no_root_node_schedule) to disallow processes to be spawned on the root node.
  • Wrapper compilers now do not add any additional flags unless there is at least one argv that does not begin with "-" (e.g., "mpicc -v" will not add any additional LAM/MPI-specific flags).
  • Added options:cxx_exceptions output in laminfo to indicate whether LAM was configured --with-cxx-exceptions or not.
  • Added -param option to laminfo to display available SSI parameters and their default values.
  • Added -showme:compile and -showme:link flags to the wrapper compilers for printing out the compiler and linker flags, respectively. For example "cc foo.c `mpicc -showme:compile`" and "cc foo.o `mpicc -showme:link` -o foo".
  • Renamed "wipe" command to "lamwipe" per request from the Mandrake Cooker team. The name "wipe" is now deprecated, and will be removed in some future release.
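
The wrapper compiler item above describes a simple rule: LAM/MPI-specific flags are added only when at least one argument does not begin with "-". A minimal C sketch of that rule is shown here; the function name wrapper_should_add_flags is hypothetical and the real wrapper compilers may apply further conditions.

```c
#include <stddef.h>

/* Hypothetical helper: returns 1 if any argument after argv[0] does not
 * start with '-', meaning the wrapper would add its extra flags; returns
 * 0 for invocations made up only of options, such as "mpicc -v". */
int wrapper_should_add_flags(int argc, char *const argv[])
{
    int i;

    for (i = 1; i < argc; ++i) {
        if (argv[i] != NULL && argv[i][0] != '-')
            return 1;
    }
    return 0;
}
```

Under this rule "mpicc -v" is passed to the underlying compiler untouched, while "mpicc -O2 foo.c" receives the additional flags.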

  • LAM/MPI 7.0.7 (unreleased; all included in 7.1)
    These changes are available both at the Subversion trunk and the branches/branch-7-0 Subversion branch.
  • Removed the reset of the MAKE macro in romio/Makefile.in that disallowed using a make other than what is found at configure time.
  • Fixed some missing header files that caused unresolved symbols on some platforms.
  • Added possibility of --without-exflags to force not using any special C++ exception compiler flags.
  • Fix man page sections.
  • Only execute .profile if it exists in the rsh module.

  • Released LAM/MPI 7.0.6
    These changes are available both at the Subversion trunk and the tags/lam-7-0-6 Subversion tag.
  • Fixed error in lamnet code used to find available interfaces when we don't pre-allocate enough space.
  • Fixed ordering of LAM_SESSION_SUFFIX and batch system ID evaluation when determining the session directory suffix.

  • Released LAM/MPI 7.0.5
    These changes are available both at the Subversion trunk and the tags/lam-7-0-5 Subversion tag.
  • Fix an obscure race condition that could occur if running in a LAM universe with more than 255 nodes.
  • Make getorigin() and getnodeid() return proper value in _kio, if they are called before kenter(), based on the pids.
  • Fix the calculation of upper and lower bound of datatype which is used for the calculation of extent. The fix handles the cases where the block size is 0 and it is the first or the last block of the datatype.
  • Fix the value of MPI_ERRCODES_IGNORE to be a (int *) 0 instead of (void *) 0.
  • Add fix to set TCP socket buffer size to run-time value of ssi_rpi_tcp_short / ssi_rpi_crtcp_short in all rpi modules, as relevant.
  • Change the lam-helpfile to correct the error in lamboot synopsis. Add -s and delineate the options -bdhHlsvVx. Also correct all those cases where all options were lumped together.
  • Fix minor prototype problem with lam_ksignal().
  • Make network interface code allow for arbitrary numbers of interfaces on the localhost.
  • Add dependent libraries for the PBS TM library on Solaris.
  • Fixes to the SGE detection logic for the session directory.
  • Updates to documentation about Globus module.

  • Released LAM/MPI 7.0.4
    These changes are available both at the Subversion trunk and the tags/lam-7-0-4 Subversion tag.
  • Update docs to reflect true behavior of LAM_MPI_SESSION_PREFIX.
  • Do not propagate LAM_MPI_SESSION_PREFIX via mpirun.
  • Fixed crtcp rpi deadlock handling for deferred writes during a checkpoint in the presence of other blocking reads.
  • Better fix for Libtool 1.5 broken icc -c/-o test; patch the generated configure script to remove the bad commands.
  • Fixed minor typo in blcr cr module configure scripts.

  • Released LAM/MPI 7.0.3
    These changes are available both at the Subversion trunk and the tags/lam-7-0-3 Subversion tag.
  • Minor fixes with bad printf() formats in the kenyad and blcr/crlam.
  • Workaround for libtool 1.5 bug with the Intel compiler (libtool didn't think that icc supported -c and -o at the same time).
  • Changed LAM_CONFIGURE_* macros from -D command line arguments to #define's to prevent problems with some compilers that don't like -D values with embedded spaces.
  • Changed search order for Fortran compilers to look for GNU g77 before f77 so that the default matches the defaults for the C and C++ compilers.
  • Removed LAM_NEED_SYS_SELECT_H, instead including sys/select.h any time it is available.
  • Updated SYS V semaphore and shmem tests to check for functionality. Adds -lrt (Solaris) and -lcygipc (Cygwin) if needed.
  • Added configure switch --with-fd-setsize to increase the size of an FD_SET and increase the soft per-process file descriptor limit on platforms that support such things. This should allow larger TCP LAM jobs on. Be sure to read the release notes for your platform before using this option.
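
The --with-fd-setsize item above mentions raising the soft per-process file descriptor limit. A minimal C sketch of that half of the change, using getrlimit() and setrlimit(), is shown here. The function name raise_nofile_limit is hypothetical and this is not LAM's implementation. Raising the limit does not by itself enlarge an fd_set, which is why the source item pairs it with a larger FD_SETSIZE.

```c
#include <sys/resource.h>

/* Hypothetical helper: raise the soft RLIMIT_NOFILE limit toward `wanted`,
 * capped at the hard limit. The soft limit is never lowered.
 * Returns 0 on success, -1 on failure. */
int raise_nofile_limit(rlim_t wanted)
{
    struct rlimit lim;

    if (getrlimit(RLIMIT_NOFILE, &lim) != 0)
        return -1;

    /* An unprivileged process cannot exceed the hard limit. */
    if (lim.rlim_max != RLIM_INFINITY && wanted > lim.rlim_max)
        wanted = lim.rlim_max;

    /* Only act when the request is actually an increase. */
    if (lim.rlim_cur != RLIM_INFINITY && wanted > lim.rlim_cur) {
        lim.rlim_cur = wanted;
        if (setrlimit(RLIMIT_NOFILE, &lim) != 0)
            return -1;
    }
    return 0;
}
```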

  • Released LAM/MPI 7.0.2
    These changes are available both at the Subversion trunk and the tags/lam-7-0-2 Subversion tag.
  • Fixed a problem in LAM's distribution scripts that accidentally left out the gm RPI from the 7.0.1 tarballs.

  • Released LAM/MPI 7.0.1
    These changes are available both at the Subversion trunk and the tags/lam-7-0-1 Subversion tag.
  • Removed legacy function panic() because it conflicts with a function in OS X's system headers with the same name.
  • Fixed a problem with the sbrk() declaration in ptmalloc.c and the Portland C compiler.
  • Fixed a problem with the boot_rsh_agent SSI parameter not being recognized properly.
  • Fixed a problem with mpirun's default running with tracing enabled. Tracing is now only enabled if -t, -ton, or -toff is specified on the mpirun command line (see mpirun(1) for more information).
  • Fixed a memory leak when freeing a datatype created by MPI_Type_create_hindexed.
  • Fixed a minor problem with the cr_base_dir SSI parameter.
  • Fixed a couple of problems with duplicate symbols on OS X when using the Fortran bindings.
  • Fixed thread configure tests to test a much wider variety of thread compiler and linker flags.
  • Ensure that relevant compiler and linker flags are propagated properly to SSI configure scripts so that we compile all of LAM with the same flags.
  • Added support for GM-2.x in the gm rpi module.
  • Removed errant "-" typo in MPI_Intercomm_merge.
  • Minor #include fixes for FreeBSD 4.x.
  • Made the tests for getsockopt() and recvfrom() more robust.
  • Fixed a problem with opening unix sockets with really long filenames (e.g., in PBS Pro environments).
  • Add --with-romio-libs=LIBS to allow passing of arbitrary LDFLAGS/LIBS args down to the environment of ROMIO's configure script and also into the wrapper compilers. e.g., when building ROMIO with PVFS support, "-lpvfs" needs to be added in both places.
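
The Unix socket item above involves filenames that are too long for a socket address. The source does not say how LAM fixed it, so the following C sketch only illustrates the underlying limit: the path must fit, with its terminating NUL, in the fixed-size sun_path field of struct sockaddr_un. The function name make_unix_addr is hypothetical.

```c
#include <errno.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/un.h>

/* Hypothetical helper: fill `addr` for the Unix-domain socket at `path`.
 * Returns 0 on success; returns -1 and sets errno to ENAMETOOLONG when
 * the path (plus its terminating NUL) does not fit in sun_path. */
int make_unix_addr(struct sockaddr_un *addr, const char *path)
{
    size_t len = strlen(path);

    if (len >= sizeof(addr->sun_path)) {
        errno = ENAMETOOLONG;
        return -1;
    }

    memset(addr, 0, sizeof(*addr));
    addr->sun_family = AF_UNIX;
    memcpy(addr->sun_path, path, len + 1);  /* includes the NUL */
    return 0;
}
```

Checking the length up front reports the problem explicitly instead of binding to a silently truncated name.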

  • Released LAM/MPI 7.0
    These changes are available both at the Subversion trunk and the tags/lam-7-0 Subversion tag.
  • Allow the internal "name" (argv[0] to underlying MPI_Init) for FORTRAN programs to be overridden by the environment variable LAM_MPI_PROCESS_NAME.
  • Fixed file descriptor leak for non-MPI processes (and MPI procs that did not exit properly) in the lamd.
  • Added mpiexec for portable MPI process startup (described in the MPI-2 standard). mpiexec also has support for "one shot" lamboot, mpirun and lamhalt.
  • Restore umask to original value when launching application from the lamd, as the lamd runs with a umask of 077.
  • Updated ROMIO to v1.2.5.1. Revamped ROMIO configure/build process to be better integrated with LAM.
  • bproc boot SSI support added; can now lamboot on bproc clusters (still launches a lamd on every node). Added bonus that "mpirun C|N foo" will, by default, not run on the bproc head node.
  • lamnodes now reports per-node flags, such as "origin", "this_node", and "no_schedule".
  • Re-activated long-unused feature in LAM to not schedule MPI and serial processes on selected nodes. For example, you can lamboot on a head node and some compute nodes and have "mpirun C foo" only run on the compute nodes.
  • Added new laminfo command to get detailed information about LAM's configuration, including available SSI modules and their various version numbers.
  • Added support for attaching TotalView debugger to MPI processes launched by mpirun, including support for the partial-attach feature provided by TotalView. Also includes support for examining message queues.
  • MPI collectives have been SSI-ized. The LAM collective algorithms have been moved into a module named lam_basic. See lamssi_coll(7).
  • Increased the number of MPI tags and communicator contexts available in all RPIs where this was possible (i.e., everything except lamd). MPI jobs that do not use the lamd RPI will now automatically get use of more MPI tags and simultaneous communicators. Additionally, increased the efficiency of the communicator context ID allocation algorithm (at the expense of communication efficiency during communicator construction).
  • Try to use -pthread when compiling with POSIX threads and GNU compilers, since many Linux / BSD-flavored distributions include this flag in the local configurations. Failing that, fall back to -D_REENTRANT and -lpthread.
  • When the LAM daemon is killed by SIGTERM, it will gracefully kill all of its sub-processes, release all of its resources, and die nicely (as opposed to just dying).
  • LAM will use the $TMPDIR environment variable to determine where to create temporary files.
  • Added "promiscuous" and "expected" modes for base SSI boot protocols, where connections are accepted from any IP address or only from the IP addresses listed in the boot schema, respectively.
  • The back-end process for lamboot (and friends) have been SSI-ized with the "boot" SSI kind. See lamssi_boot(7). Currently have two boot modules available: rsh (which also does ssh) and tm (for PBS).
  • Added the MPI-2 C++ bindings implementation for MPI::Win.
  • Added --with-wrapper-extra-ldflags option to configure that parses the output of libtool to get the extra compiler/linker flags and put them into the wrapper compilers (e.g., shared library run-time search path).
  • The memcpy() in glibc performs poorly if the copy size is not divisible by 4. Added a workaround to significantly increase LAM's shmem RPIs and unexpected message buffering performance in these cases, as well as command line configure switches to enable/disable this behavior (--with-prefix-memcpy and --without-prefix-memcpy).
  • Changed the bit mapping in error codes that are used in MPI because the field specifying the MPI function was only 8 bits, yet there are 300+ functions in MPI. This unfortunately changes the bit mapping of the errorcode argument in MPI_ABORT; see the MPI_Abort(3) man page for more information.
  • Added functionality per MPI-2:4.8 -- attributes added to MPI_COMM_SELF will be deleted as nearly the first thing in MPI_FINALIZE, effectively allowing user-specified functions during MPI_FINALIZE.
  • Updated BSD4.4 file descriptor passing to fit expected use.
  • Removed MPIL_Spawn (LAM-specific, pre-MPI-2 spawn call).
  • MPI thread support now MPI_THREAD_SERIALIZED. We don't enforce any distinction between FUNNELED or SERIALIZED, so it is possible to write a threaded application that runs fine on LAM but causes issues on other platforms.
  • Add checks for running under an LSF job, and automatically set the socket suffix to be the LSF job ID (a la how PBS jobs are already handled).
  • Print out friendly error message from wrapper compilers if underlying compiler isn't found.
  • Update for MPI 2.1 errata: MPI_GET_COUNT behavior with respect to 0 byte datatypes now returns 0 (vs. MPI_UNDEFINED) when 0 data bytes have been transferred.
  • There are now lots of run-time tunable parameters for the various RPIs. See the lamssi_rpi(7) man page for a list of the tunable parameters that can be passed in to each RPI.
  • The first System Services Interface (SSI) kind has been added -- the RPI layers have been converted to SSI. Now all available RPI's are compiled in simultaneously and you can choose which to use at run-time. See the mpirun(1), lamssi(7), and lamssi_rpi(7) man pages.
  • Fixed a problem where the IMPI client was not properly endianizing IMPI_CMD_FINI before sending it to the IMPI server.
  • Fixed a problem where if $prefix is /usr, hf77 would complain that it could not find the ROMIO and MPI-2 C++ libraries. This isn't too important for 6.6.x since we've totally re-written the wrapper compilers, but we record the bug fix anyway.
  • Completely rewrote the Myri/gm RPI. It's smaller, faster, and generally mo' better.
  • Only install lam-bhost.def if one does not exist in $(sysconfdir).
  • Renamed lam-conf.lam and lam-conf.otb to lam-conf.lamd and lam-conf.separate to make the meanings more obvious and less confusing with the corresponding lam-bhost.* files. Renamed lam-conf.lam to be lam-conf.example to make its purpose more obvious, and no longer install it under $(sysconfdir).
  • Added the MPI-2 C++ bindings implementation for MPI::Info.
  • Fixed a problem in the main lamd kernel on NetBSD where select() will zero out fd_sets even on "accepted" failures.
  • Fixed minor issue with show_help() that could cause problems for help messages with large numbers of arguments.
  • Added the "C++ only" datatypes specified in the MPI-2 standard: MPI_BOOL, MPI_COMPLEX, MPI_DOUBLE_COMPLEX, and MPI_LONG_DOUBLE_COMPLEX, as well as the built-in operands specified in the standard. Note that the complex types will *only* work if the implementation of complex<float> allows casting to struct { float r ; float i; } ; (and likewise for double and long double). This seems to be the case everywhere we have seen.
  • Fixed a couple small problems that prevented running the lamd as a group of processes. Moved -b from $inet_topo to $socket_suffix in the lam-conf files.
  • Add version checking into the LAM commands and MPI_INIT. If a user attempts to run a LAM or MPI program that does not match the version of the lamd that is running, a warning message will be displayed and the program will bail.
  • Changed name of binary for C++ compiler to mpic++. On most systems, there will be a symlink from mpiCC -> mpic++. On systems without a case sensitive file system (like HFS+ on Mac OS X), this symlink will not be created, as it conflicts with mpicc.
  • Removed linking to the C++ bindings when using mpicc and mpif77 because this creates a problem when using gcc 3.0, and it doesn't make sense anyway.
  • Able to finally remove the automake_bogosity.(c|h) files and extra noinst_HEADERS/noinst_PROGRAMS rules from various directories/Makefile.am's.
  • Add -nn and -np options to lamboot, recon, and wipe to prevent adding "-n" to the remote agent command line and to prevent the execution of $HOME/.profile on the remote side, even if the remote shell is Bourne.
  • In addition to the syslog, send lamd debugging output to the lam-debug-log.txt file in the LAM session directory. This is particularly helpful since many Linux distributions do not allow normal users to view the syslog.
  • Related to the note below (LAM session directory located on a networked filesystem), add a workaround in the flatd when attempting to open a new flatd temp file in the LAM session directory with O_APPEND. If the first attempt to open a new file fails, try again without O_APPEND.
  • Fix the lamd kernel to set the kill file to be close-on-exec so that it is not inherited by child processes (this can cause a problem during lamhalt if the LAM session directory is on NFS -- tkill will inherit the open file descriptor and then remove the file. NFS will then create a ".nfsXXXXX" cache file entry, which will prevent the removal of the directory).
  • Change LAM's registry to not depend on the O_EXCL flag to open(). Use an alternative locking mechanism if it is determined (at run time) that O_EXCL will not work in the LAM session directory. This can happen when the LAM session directory is on a networked filesystem.
  • Pass "-d" to tkill during lamboot (through hboot) if lamboot was invoked with "-d".
  • Some fixes to the gm RPI, particularly with respect to allocating and freeing memory.
  • Add specific error message for the case where the gm RPI is unable to allocate a gm port. This is much more helpful than an amorphous "something went wrong during MPI_INIT" message.
  • Made lamhalt more robust: it will now time out (after 15 seconds) if it does not receive all the HALT ACKs that it expects, and it prints an error message indicating which nodes did not send ACKs.
  • Various minor improvements in the build system.
  • Integrated the C++ bindings into the configure/build system better.
  • Revamped the configure system for future extensibility. Updated build system to use Autoconf 2.52, Automake 1.5, and Libtool 1.4.2 (or higher).
  • In share/etc/kill.c, kill off LAM directory with rmdir(), not remove(); it appears that Mac OS X will not allow remove() to be called on a directory.
  • Removed use of .so nroff "include" directive in man pages; it didn't work on all platforms. Also updated some text in the mpicc and mpif77 man pages.
  • Added a specific check to ensure MPI_INIT is not called after MPI_FINALIZE. This is a special case of the check that no MPI function was called after MPI_FINALIZE, as new users tend not to realize that you can't re-INIT a process.
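
The two session-directory fixes above (the O_APPEND retry in the flatd and the close-on-exec kill file) both follow common POSIX patterns. The following is a minimal sketch of those patterns, not LAM's actual code: the function names open_append_with_fallback and set_close_on_exec are hypothetical, and only the standard open() and fcntl() calls are assumed.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <fcntl.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: open (creating if needed) a write-only file,
 * preferring O_APPEND. If that attempt fails, retry once without
 * O_APPEND, mirroring the flatd workaround described above.
 * Returns a file descriptor, or -1 if both attempts fail. */
int open_append_with_fallback(const char *path, mode_t mode)
{
    assert(path != NULL);

    int fd = open(path, O_WRONLY | O_CREAT | O_APPEND, mode);
    if (fd < 0) {
        /* Some networked filesystems may reject O_APPEND here. */
        fd = open(path, O_WRONLY | O_CREAT, mode);
    }
    return fd;
}

/* Hypothetical helper: set FD_CLOEXEC on a descriptor so that it is
 * closed automatically in any program the process later exec()s.
 * Returns 0 on success, -1 on error. */
int set_close_on_exec(int fd)
{
    int flags = fcntl(fd, F_GETFD);
    if (flags == -1) {
        return -1;
    }
    return (fcntl(fd, F_SETFD, flags | FD_CLOEXEC) == -1) ? -1 : 0;
}
```

A caller would typically open the file and mark the descriptor close-on-exec right away, before starting any child process, so that no child keeps a file in the session directory open.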

  • Released LAM 6.6b1
  • Removed the LAM-version-checking code from the mpi2c++ bindings; they're really not necessary since we're inside LAM anyway.
  • Fixed ambiguity of RTF_KENYA flag being used for two purposes (forked from the kenyad and attached to the kenyad), and split it into RTF_KENYA_CHILD and RTF_KENYA_ATTACH.
  • Changed the behavior of the --with-rsh option in configure. Now, rather than always putting the full path in lam_config.h, it only adds the full path when an absolute or relative path was given (as opposed to just a binary name).
  • First public release of Myrinet/gm support in the gm RPI.
  • Fixed a problem where two different flags on a request accidentally had the same value, which led to truncation errors in one-sided communications in lamd mode when the daemons were compiled separately.
  • Added better support to mpirun and the kenyad to catch when an MPI process dies without first detaching (i.e., calling MPI_FINALIZE).
  • Re-added hooks to create/remove the "impirun" symlink in $(bindir) during "make install"/"make uninstall". These were lost when we converted to an automake-style build.
  • Fixed a bug in dlo_inet in fault tolerant mode. On some OSes, recvfrom() can return ECONNREFUSED, which should not cause an abort in FT mode.
  • Fixed a problem when a process sends a LAM signal to itself via kdoom(); the signal handler would erroneously get triggered twice.
  • Added the MPI_Info key "lam_spawn_sched_round_robin" on MPI_COMM_SPAWN to allow finer-grained control on the placement of spawned MPI processes without the need to write an app schema to a temporary file (and allows functionality that you can't really do with an app schema, anyway). See MPI_Comm_spawn(1) for more information on this key.
  • Renamed the MPI_Info key name on MPI_COMM_SPAWN "file" to "lam_spawn_file". Since it is a LAM-specific key, it should have a LAM-specific name. While the "file" key still exists for backwards compatibility, its use is deprecated.
  • Added two predefined attributes on MPI_COMM_WORLD: LAM_UNIVERSE_NCPUS and LAM_UNIVERSE_NNODES. They return the number of CPUs in the current LAM universe and the number of nodes in the current LAM universe (respectively). Note that these values can be larger than their corresponding counts from the application's MPI_COMM_WORLD.
  • Increase the default optimization flags in configure to be -O3 for gcc/g++, -O for all other compilers.
  • Moved the handling of signals in user code from signal handlers installed by MPI_INIT to the lamd and mpirun. That is, the lamd will now detect that a process died due to a signal and send back that information to mpirun. mpirun will print out the appropriate error messages. This has the side effect of allowing the OS default signal handlers to be used in user programs rather than the LAM signal handlers. In at least some cases, this is a good thing -- some users want core dumps, for example. Two new options have been added to mpirun -- "-sigs" and "-nsigs", to enable / disable the LAM signal handlers from MPI_INIT. "-nsigs" is now the default, since the lamd/mpirun make these signal handlers redundant. However, "-sigs" will enable the old behavior for backwards compatibility.
  • Fixed a bunch of potential signed / unsigned comparison problems. This was a real bug in at least one case, which could effectively result in garbage being sent to the lamd, which would cause the lamd to eventually die.
  • Fixed up lamnodes to print more intelligible error messages when you specify an illegal node/CPU.
  • Ensure that the directory where the lamd named socket lives is not left around if you invoke a LAM command when there is no lamd running. Moved the function lam_rmdir() from tkill.c to share/etc/kill.c, and renamed it to be lam_rmsocknamedir(), and ensured that it is called when kinit() fails because there is no lamd.
  • Made LAM's signal handler a bit smarter by checking to see if it is already in the signal handler. For example, if a callback function has been registered via atexit()/onexit() and causes a seg fault after the signal handler has been triggered the first time, this can cause a loop of seg faults which is quite difficult to kill. LAM's signal handler will now detect this situation and gracefully abort().
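
The re-entrancy check described in the signal-handler item above can be sketched as follows. This is an illustration under assumptions rather than LAM's handler: the names mark_handler_entered, guarded_handler, and install_guarded_handler are hypothetical, and the handler calls exit() only to reproduce the atexit() scenario from the item (exit() is not async-signal-safe, so production code would be more careful).

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>
#include <stdlib.h>
#include <string.h>

/* Set once the handler has run; sig_atomic_t is safe to touch in a handler. */
static volatile sig_atomic_t handler_entered = 0;

/* Hypothetical helper: returns 0 on the first call and 1 on every later
 * call, i.e. 1 means "we were already inside the handler". */
int mark_handler_entered(void)
{
    if (handler_entered) {
        return 1;
    }
    handler_entered = 1;
    return 0;
}

/* Hypothetical handler: on first entry, exit normally (which runs any
 * atexit() callbacks). If one of those callbacks faults and the signal
 * arrives again, stop immediately with abort() instead of looping. */
void guarded_handler(int signo)
{
    (void)signo;
    if (mark_handler_entered()) {
        abort();
    }
    exit(EXIT_FAILURE);
}

/* Hypothetical installer. SA_NODEFER leaves the signal unblocked while
 * the handler runs, so a second fault is delivered to the handler again
 * and reaches the guard. Returns 0 on success, -1 on error. */
int install_guarded_handler(int signo)
{
    struct sigaction sa;

    assert(signo > 0);
    memset(&sa, 0, sizeof(sa));
    sa.sa_handler = guarded_handler;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = SA_NODEFER;
    return sigaction(signo, &sa, NULL);
}
```

A program would call install_guarded_handler(SIGSEGV) once at startup; the guard then turns a potential loop of faults into a single abort().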