-*- text -*-

Ideas to think about for future releases:

1-- The ability to specify which ports LAM should use.

2-- Implement mpiexec -- command line version of mpi_comm_spawn.

3-- Don't need to run all the no_romio and no_mpi2cpp ones; only need
    to do that on one or two architectures to ensure that the
    configure and Make stuff is right; those packages are already
    known to work everywhere (perhaps we need to put stuff in
    configure to stop them from building on systems where they are
    known to not work...?)

4-- cmd line option to mpirun/lamexec to run each command in a
    separate xterm with the DISPLAY piped back to the invoking node
    (e.g., mpirun N -xterm gdb myprogram); check to ensure DISPLAY is
    set properly (i.e., check that it is not empty and does not start
    with ":") -- also give option to not export DISPLAY -- e.g., for
    ssh.

5-- How about an mpirun option that makes Send/Rsend act as Ssend,
    Isend/Irsend act as Issend, and Send_init/Rsend_init act as
    Ssend_init?  MPI_Init() sets a global flag for this, and the
    various entry-point routines check it and switch the LAM_RQ*SEND
    flag as needed.

6-- Improve NULL checking for choice arguments (e.g., MPI_Send).
    It's in lambuf.c:lam_bufinit().

7-- From Raja:

    With the same changes, all went well on HP-UX 10.01 for LAM (1
    and 2-node clusters).  I didn't compile the C++ bindings or
    ROMIO: no C++ compiler; and ROMIO by default doesn't build on
    systems that don't accept "long long", which is the case for the
    C compiler on 10.01.

    I noticed that on both 10.01 and 10.20, wipe is much slower than
    in previous LAM versions, even for a 1-node cluster.  Anything
    you know about, or should this be investigated (low priority,
    doesn't hold the release)?

8-- Add ability for LAM executables to check version number upon
    startup.  This would prevent mismatched lamboot/hboot's, for
    example, as well as attempting to run user programs compiled with
    the "wrong" version of LAM (i.e., not matching the current lamd
    that is running).

9-- Add implicit "." in PATH on remote nodes; per messages on LLAMAS
    and LAM list, if "." is not in the PATH, "mpirun N foo" will not
    work, causing confusion for the user.  Not having "." in the PATH
    is the default on some Linux systems, for example, and Linux is a
    big target OS for LAM.

10-- How about integrating MagPIe into LAM?  The collectives are
     falling behind in LAM, still treating all processes as
     equidistant.  Many of the collectives are heavily used by
     commercial codes.  Thilo has made his code layerable; LAM could
     provide the routines to determine the cluster topology, using
     shmem-vs-tcp as the divide level.

     It could be done in two steps:

     - Provide a patch for 6.3b that adds these two routines, test
       the combo as is, and announce it (mailing list + web page).
     - For 6.4 slurp it into LAM, with Thilo's blessing, so users
       don't have to rely on PMPI and have it work out-of-the-box
       (or tarfile).

     If testing shows that MagPIe is not an across-the-board win,
     then decide where the big win is and cut-n-paste the parts
     needed.

     You could check:

     - If Thilo is willing to help by being part of the LAM extended
       team.
     - If somebody on the LAM mailing list wants to help in testing
       and performance measurement on the variety of clusters out
       there.

     I don't know if somebody has enough info to provide the two
     LAM-specific routines; you may have to do that in-house.

     *** Followup: Thilo Kielmann says that it already works with
     LAM, and he would love to have it become an [optional] part of
     LAM.  I told him it would probably take a while (IMPI first),
     but we would get to it someday.  JMS

11-- Add ./configure and make stuff to build multiple MPI transports
     simultaneously (which would effectively do multiple compiles at
     least on the share/mpi dir -- but also would need multiple
     executables for lamd, mpitask, mpimsg, ...and others?  Ew --
     this doesn't sound nice).

     Would need to rename resulting libmpi.a to be libmpi-RPI.a,
     where RPI is tcp, sysv, usysv (or something like that).
     hcc/etc. can decide at run time which libmpi to link to --
     perhaps something like:

         hcc foo.c -o foo -- -tcp
         hcc foo.c -o foo -- -sysv -impi

     So we would need to build 6 libraries: tcp/sysv/usysv, each with
     and without impi.

     Would need a nice way in ./configure to specify building
     multiple RPI's -- perhaps a la gcc, ./configure --with-rpi="tcp
     sysv usysv tcp-impi".  What would happen to "--with-impi"?
     Would that imply the "-impi" versions of all the RPI's selected
     to be built?  Who knows.  Food for thought.

12-- Make XMPI work with LAM 6.3.
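
Sketch for item 4 (the DISPLAY sanity check): a minimal C helper that
applies the two conditions named there, i.e. DISPLAY must not be empty
and must not start with ":".  The function name display_is_exportable
is made up for illustration and is not an existing LAM routine; treating
an unset (NULL) DISPLAY the same as an empty one is also an assumption.

```c
#include <stddef.h>

/*
 * Hypothetical helper (not part of LAM): decide whether the invoking
 * node's DISPLAY value can be exported to remote nodes unchanged.
 * Returns 1 if it can, 0 if it is unset, empty, or starts with ":"
 * (a display name with no host part, such as ":0").
 */
int
display_is_exportable(const char *display)
{
  /* Assumption: an unset DISPLAY is handled like an empty one. */
  if (display == NULL || display[0] == '\0')
    return 0;

  /* No hostname before the colon: remote xterms could not reach it. */
  if (display[0] == ':')
    return 0;

  return 1;
}
```

mpirun/lamexec could call this on getenv("DISPLAY") before starting
any xterms and print an error (or skip exporting DISPLAY) when it
returns 0.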