Creating a LINUX Cluster

Introduction:
Cluster computing is a very economical form of parallel computing. The configuration described in this document is based on the concept of a Beowulf cluster, although the software used is not entirely the same. These instructions are intended to make the process as easy as possible.

Parts of a Cluster:
The cluster consists of four major parts: 1) the network, 2) the nodes, 3) the server, and 4) the gateway. Each part has a specific function. The last line of each description gives the special requirements (if any) that the hardware must meet to perform its function.

  1. Network:
  2. Nodes:
  3. Server:
  4. Gateway:

The Shared File System:
The shared file system on the server will contain the home directories for the users as well as the MPI programs. To avoid data loss from hard drive failure, it is highly recommended that the shared drive be a RAID array. A RAID array consists of several drives that work as one. Data is striped or mirrored across these drives, and at the redundant RAID levels this prevents data loss when a single hard disk fails. (How the computer does this is determined by the RAID level. For a description of the different RAID levels see the RAID setup file.) There are two ways to accomplish this: hardware RAID and software RAID. Hardware RAID uses a special controller card to link the disks together, which makes this option a bit more expensive. A less expensive way to RAID your disks is software RAID, in which the OS handles the RAID arrangement rather than a controller card. All of the RAID levels can be accomplished through software RAID, and the process to set up software RAID is described in the RAID setup file.
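
As a rough illustration of how the RAID level trades capacity for redundancy, the sketch below computes the usable space for three common levels. The helper name raid_usable_gb and the disk counts and sizes are made up for illustration; the RAID setup file remains the reference for the actual configuration.

```shell
#!/bin/sh
# Usable capacity, in GB, of an array built from identical disks.
# usage: raid_usable_gb LEVEL NUM_DISKS DISK_SIZE_GB   (hypothetical helper)
raid_usable_gb() (
    level=$1 disks=$2 size=$3
    case "$level" in
        0) echo $(( disks * size )) ;;        # striping only: full capacity, no redundancy
        1) echo "$size" ;;                    # mirroring: every disk holds the same data
        5) if [ "$disks" -lt 3 ]; then
               echo "RAID 5 needs at least 3 disks" >&2
               exit 1
           fi
           echo $(( (disks - 1) * size )) ;;  # one disk's worth of space holds parity
        *) echo "level $level is not covered by this sketch" >&2
           exit 1 ;;
    esac
)

# Compare the three levels for four 100 GB disks.
for level in 0 1 5; do
    echo "RAID $level, 4 x 100 GB: $(raid_usable_gb "$level" 4 100) GB usable"
done
```

With four 100 GB disks this reports 400 GB for RAID 0, 100 GB for RAID 1, and 300 GB for RAID 5. Only the last two survive the loss of a disk.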

Hardware Considerations:
This is the hardware configuration that was used to create this documentation:

Network:

Nodes:

Server:

Gateway:

If you use the same or a very similar setup, you should be able to use the documentation without making any changes. Differences (especially in the ethernet card) will not render this documentation worthless, but you will need to be able to select the correct drivers for your hardware, especially while building the minimal kernel. Every effort has been made to make this documentation as generic as possible; even so, hardware-specific items may still be lurking within.

What Hardware do I need?
The hardware list is pretty flexible. To make sure that installation goes as smoothly as possible it is preferable to use hardware that you know will work easily with LINUX.

General Hardware:

Computers:
For the computers a general rule applies: the faster the processor and the more memory, the better. SCSI is also a good idea (especially for the server). If you intend to store information on a scratch partition on the nodes, the nodes will need larger drives. If you do not, you can get away with much smaller hard drives.

Other Considerations:
You are going to set up a rather large number of computers in an enclosed space. The average engineer/architect does not anticipate 18+ computers in a room. (They usually anticipate only one or two.) This means that there may be electrical and heat dissipation issues to consider. Before constructing your cluster, check with someone to make sure that the room you will build it in is suitable.

What Software do I need?
In order to make your cluster run you will need several software packages: RedHat LINUX (your operating system), ssh (your communications package), MPICH (the software that allows you to run parallel programs), and an ntp timeserver (not essential, but it helps keep the time on the cluster synchronized). These packages are provided with this documentation.

Setting up the Cluster:
Once you have your hardware assembled, the setup of the cluster can be divided into four parts. Each step is explained in documents located in the remainder of this documentation.

  1. Server Setup
    1. Installing RedHat
    2. Network Driver and Hosts File
    3. DHCPD Setup
    4. Net Install Config
    5. Ramdisk creation and Explanation
    6. NFS Setup
    7. Time Server Setup
    8. Install and Configure SSH
  2. Gateway Setup
    1. Installing RedHat
    2. Install and Configure SSH
  3. Node Setup
    1. Installing RedHat
    2. Node Configuration
    3. Moving/Creating Node Image
    4. Booting Nodes
  4. MPI Setup

General Information:

ADDING USERS
When you add users you will need to make sure that they can ssh unchallenged from the server to any node. To accomplish this you will need to do the following.

  1. Make the home directories reside on the shared partition of the server (the home directory needs to be accessible to all machines)
    • to do this, edit the home directory field for the user in the /etc/passwd file
    • create the new home directory for the user, and change its owner so that the user can actually work in it: "chown -R username:usergroup /new/home/dir"
    • once edited, push the /etc/passwd file out to all of the nodes and the gateway so that these machines know where the new home directory is
  2. Every user needs to be able to ssh unchallenged to the nodes from the server
    • become the new user "su username"
    • type "ssh-keygen" and hit return three times (at the prompts)
    • go to ~/.ssh
    • copy the identity.pub file to authorized_keys
    • change server to * in authorized_keys
    • run tstmachines (located in mpich/util) until any error messages disappear. You may have to run this program twice.
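
The steps above can be sketched as two shell functions. This is a minimal sketch under several assumptions: the helper names set_home_dir and setup_ssh_keys and the path /shared/home are hypothetical, the passwd edit is shown on a sample file rather than the live /etc/passwd, and a current OpenSSH ssh-keygen is assumed, which writes id_rsa.pub rather than the identity.pub named above. The chown, the push to the nodes, and the tstmachines run are left out.

```shell
#!/bin/sh
# Step 1: point a user's home directory (field 6 of a passwd-format file)
# at the shared partition. Prints the edited file; the input is untouched.
# usage: set_home_dir PASSWD_FILE USERNAME NEW_HOME   (hypothetical helper)
set_home_dir() {
    awk -F: -v OFS=: -v user="$2" -v home="$3" \
        '$1 == user { $6 = home } { print }' "$1"
}

# Step 2: create a passphrase-less key pair and authorize it for logins.
# Because the home directory is shared, every node sees the same files.
# usage: setup_ssh_keys HOME_DIR   (hypothetical helper; run as the new user)
setup_ssh_keys() (
    ssh_dir=$1/.ssh
    mkdir -p "$ssh_dir" && chmod 700 "$ssh_dir" || exit 1
    # -N "" is the scripted equivalent of hitting return at the passphrase prompts
    [ -f "$ssh_dir/id_rsa" ] || ssh-keygen -q -t rsa -N "" -f "$ssh_dir/id_rsa" || exit 1
    # Copy the public key to authorized_keys, as in the manual steps
    cp "$ssh_dir/id_rsa.pub" "$ssh_dir/authorized_keys"
    chmod 600 "$ssh_dir/authorized_keys"
)

# Demo of step 1 on a sample file with a made-up user
sample=$(mktemp)
printf '%s\n' 'alice:x:500:500::/home/alice:/bin/bash' > "$sample"
set_home_dir "$sample" alice /shared/home/alice
rm -f "$sample"
```

The demo prints the sample entry with its home field changed to /shared/home/alice. After reviewing the output, the same edit can be made to the real file by hand, and setup_ssh_keys can be run as the new user with the new home directory as its argument.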

COMPILING PROGRAMS
You will need to add the following lines to your makefile:

IFLAGS = -I$(INCLDIR) -I/usr/local/mpich/include -I/usr/local/mpich/build/LINUX/ch_p4/include

LFLAGS = -L./timing/ -L/usr/local/mpich/build/LINUX/ch_p4/lib -lm -lmpich -ltiming

Another way to get around this is to compile with mpicc (or mpiCOMPILERNAME), which should take care of everything as well. The MPI compilers should be located in mpich/bin.

RUNNING PROGRAMS
To run programs in parallel you will need to add "mpirun -np Insert#OfNodesHere" before the regular command line expression for the program.
For example, if you would normally type "hello++ -ivqrh" to run your program, then on a cluster configured with 493 nodes you would type "mpirun -np 493 hello++ -ivqrh" to run the parallel version of the program.
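
Rather than typing the node count by hand, it can be derived from a list of hosts. The sketch below assumes a plain-text machines file with one host per line and # comments; the helper names count_nodes and parallel_cmd and the host names are hypothetical, and the command is only printed, not executed.

```shell
#!/bin/sh
# Count the hosts in a machines file, skipping blank lines and # comments.
# usage: count_nodes MACHINES_FILE   (hypothetical helper)
count_nodes() {
    awk '!/^[ \t]*(#|$)/ { n++ } END { print n + 0 }' "$1"
}

# Print the parallel form of a command line, one process per listed host.
# usage: parallel_cmd MACHINES_FILE PROGRAM [ARGS...]   (hypothetical helper)
parallel_cmd() {
    nodes=$(count_nodes "$1") || return 1
    shift
    echo "mpirun -np $nodes $*"
}

# Demo with three made-up hosts
machines=$(mktemp)
printf '%s\n' '# compute nodes' 'node01' 'node02' '' 'node03' > "$machines"
parallel_cmd "$machines" hello++ -ivqrh
rm -f "$machines"
```

For the three hosts in the demo this prints "mpirun -np 3 hello++ -ivqrh". Printing the command first makes it easy to check the count before anything is launched.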

IMPORTANT NOTE:
In order to run programs in parallel, they have to be written in parallel. This means that you must have inserted MPI calls into the program. If you do not, you will not see any gain from running the program in parallel.
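
To make "written in parallel" concrete, the sketch below writes a minimal MPI program in C and builds it with mpicc when that wrapper is on the PATH. The file name hello_mpi.c, the temporary directory, and the process count of 4 are arbitrary choices for illustration; the MPI calls themselves (MPI_Init, MPI_Comm_rank, MPI_Comm_size, MPI_Finalize) are standard.

```shell
#!/bin/sh
workdir=$(mktemp -d)
src=$workdir/hello_mpi.c

# Every process runs this same program; the rank tells each copy which one it is.
cat > "$src" <<'EOF'
#include <stdio.h>
#include <mpi.h>

int main(int argc, char **argv)
{
    int rank, size;

    MPI_Init(&argc, &argv);                 /* start MPI before any other MPI call */
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);   /* this process's number, 0..size-1 */
    MPI_Comm_size(MPI_COMM_WORLD, &size);   /* total number of processes */

    printf("Hello from process %d of %d\n", rank, size);

    MPI_Finalize();                         /* shut MPI down before exiting */
    return 0;
}
EOF

# Build only where an MPI installation is available.
if command -v mpicc >/dev/null 2>&1; then
    mpicc -o "$workdir/hello_mpi" "$src" &&
        echo "built; run with: mpirun -np 4 $workdir/hello_mpi"
else
    echo "mpicc not found; source left in $src"
fi
```

When run with mpirun, each process prints one line with its own rank, in no guaranteed order. A real program would use the rank to decide which share of the work each process takes.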