Help:MPI
MPI stands for Message Passing Interface.
See also a simple qsub and MPI example for a C programming example.
MPI is not just a piece of software but an application programming interface, a protocol, and a library. MPI has been implemented for many operating systems and parallel architectures by many different vendors. It is likely that more than one version is available on any given cluster.
For usage examples on the clusters, please see Qsub and MPI example.
Software availability
NOTE: You must choose a version of MPI compatible with the compiler you are using, and you must use the correct version of mpirun to match.
- Roman cluster
- mpich version 1 or 2 is on most cluster nodes
- lam mpi is on all upgraded nodes
- MMAE student cluster
- Sun MPI was at one time installed on the Sun machines, but it is not currently set up. Ask if you wish to use it.
- i2 / Euler cluster
- mpich version 1 is installed, but does not seem to work between nodes
- LAM MPI is installed and compiled to use the Intel Fortran compiler. To use it, add /opt/lam/bin to your path. NOTE: you must add it to the START of your path, and you must do so near the top of your .bashrc, before the conditional for interactive shells.
- i2 / deli cluster (Hilbert)
- LAM MPI, MPICH, and LAM MPI for the Intel compilers
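For the Euler cluster note above, the top of a .bashrc might look roughly like the sketch below. Only /opt/lam/bin comes from this page; the form of the interactive-shell test is an assumption, and your own .bashrc may use a different one. The likely reason for the ordering is that the remote shells started by lamboot are not interactive, so anything placed after that conditional may never run for them.

```shell
# Sketch of the top of ~/.bashrc on the Euler cluster.
# /opt/lam/bin is the location given on this page; the rest is illustrative.

# Put LAM first so that its mpirun and mpicc are found before any other MPI.
# This line sits ABOVE the interactive-shell conditional on purpose.
export PATH="/opt/lam/bin:$PATH"

# Assumed form of the conditional for interactive shells.
if [ -n "$PS1" ]; then
    # Interactive-only settings (prompt, aliases, and so on) go here.
    :
fi
```

After logging in again, `which mpirun` should report a path under /opt/lam/bin if the change took effect.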
MPI web links
- Using MPI-2: Advanced Features of the Message-Passing Interface
- wikipedia:Message Passing Interface
- MPICH2 documentation
- Notes On Using Mpi (Fortran)
- MPI Forum standards group
- mcs.anl.gov
- MPI home
- MPICH home, a popular MPI implementation
- tutorials
- Quick overview of SEND
- http://www10.informatik.uni-erlangen.de/~mohr/MPI/
- MPI tutorials
- Parallel Programming with MPI
- LLNL mpi tutorials
Cluster use
Please be considerate of other users. Check the system load with the uptime command or look at ganglia on the head node's web server and check the load on the machines you are using. Ask if you are having trouble finding this.
Using LAM mpi
The cluster machines running newer versions of RedHat, Fedora Core, and CentOS have LAM mpi installed, and should use the same commands as above with some additions.
Some LAM mpi commands are:
hcc hcp hf77 laminfo lamnodes lamshrink lamtrace mpiCC mpic++ mpicc mpiexec mpif77 mpimsg mpirun mpitask tkill tping
To start LAM mpi, type:
lamboot hostfile
Here, hostfile is a file that lists the hosts to use. To use more than one CPU on a host, you can either list that host multiple times or add cpu=2 after its hostname. If ssh asks for a password AND rsh works on the cluster, you can instead try
lamboot hostfile
or
LAMRSH=rsh lamboot hostfile
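As an illustration of the hostfile format described above, the sketch below writes a small hostfile showing both ways of asking for two CPUs on a host. The hostnames node01 to node03 are hypothetical; substitute the names of the machines you have been given.

```shell
# Write an example LAM hostfile into the current directory.
# node01, node02, and node03 are hypothetical hostnames.
cat > hostfile <<'EOF'
# One host per line; cpu=2 asks LAM to use two CPUs on that host.
node01 cpu=2
node02 cpu=2
# Alternative: list the same host once per CPU.
node03
node03
EOF

# Show the result; boot it afterwards with: lamboot hostfile
cat hostfile
```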
Once lamboot has successfully run, to run a.out, try any of these:
- mpirun n0-3 a.out
- run on 4 nodes, 0 to 3
- mpirun -np 4 a.out
- run on first 4 nodes
- mpirun -np 8 n1-4 a.out
- run on 4 nodes (1-4) but start 8 processes
To check which nodes mpi is correctly running on, type
lamnodes
To clean up things you might have left running without shutting down your LAM subcluster, type
lamclean
To stop LAM mpi, type
lamhalt
SPECIAL NOTE: The first node in the hosts file will be the master node, not the one you start from!
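Put together, the commands above form one session from boot to halt. The following is only a sketch: it assumes a file named hostfile in the current directory and a compiled a.out, and the function name run_lam_session is made up for this example.

```shell
#!/bin/sh
# Sketch of one complete LAM session using the commands from this page.
# Assumes ./hostfile and ./a.out exist; run_lam_session is a made-up name.

run_lam_session() {
    lamboot hostfile || return 1   # start LAM on the hosts in hostfile
    lamnodes                       # list the nodes that actually joined
    mpirun -np 4 a.out             # run 4 copies of the program
    status=$?
    lamclean                       # clean up anything left running
    lamhalt                        # stop LAM
    return $status
}

# Only attempt the session where LAM is actually installed.
if command -v lamboot >/dev/null 2>&1; then
    run_lam_session
else
    echo "lamboot not found; check that LAM is on your PATH"
fi
```

Halting LAM even when the program fails avoids leaving LAM processes behind on nodes that other users share.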
LAM documentation is available at http://www.lam-mpi.org/. Some reference documentation is in the following man pages:
- lam
- overview and introduction to LAM
- introu
- lam user commands