Though the hardware is expensive, all of the software listed in this how-to is free, and most of it is open source. If you would like to see how fast your supercomputer would theoretically be, use this tool: http://hpl-calculator.sourceforge.net/
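To get a feel for what such a calculator estimates, here is a minimal sketch of the usual theoretical-peak arithmetic: nodes times sockets times cores times clock speed times floating-point operations per cycle. The function name and the example hardware figures are assumptions made up for illustration, not values taken from the tool above, and the FLOPs-per-cycle figure depends on your exact CPU model.

```python
def theoretical_peak_gflops(nodes, sockets_per_node, cores_per_socket,
                            clock_ghz, flops_per_cycle):
    """Return the theoretical peak performance (Rpeak) in GFLOPS.

    flops_per_cycle is the number of double-precision operations one core
    can complete per clock cycle; look it up for your specific CPU.
    """
    total_cores = nodes * sockets_per_node * cores_per_socket
    return total_cores * clock_ghz * flops_per_cycle


# Hypothetical cluster: 12 nodes, 2 sockets per node, 8 cores per socket,
# 2.5 GHz clock, 8 double-precision FLOPs per cycle per core.
peak = theoretical_peak_gflops(12, 2, 8, 2.5, 8)
print(f"Theoretical peak: {peak:.1f} GFLOPS")  # Theoretical peak: 3840.0 GFLOPS
```

Real benchmark results will land below this number, since no program keeps every core busy on every cycle.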

Choose a computer server chassis that maximizes space, cooling, and energy efficiency. Alternatively, use a dozen or so used, outdated servers, whose whole will outweigh the sum of their parts while saving you a sizable lump of cash. All processors, network adapters, and motherboards should be identical for the whole system to play together nicely. Of course, don’t forget about RAM and storage for each node and at least one optical drive for the head node.

Begin by installing the latest version of the motherboard BIOS and firmware, which should be the same on all nodes. Install your preferred Linux distro on each node, with a graphical UI for the head node. Popular choices include CentOS, openSUSE, Scientific Linux, Red Hat, and SLES. This author highly recommends using the Rocks Cluster Distribution. In addition to installing all the tools necessary for a compute cluster to function, Rocks uses a great method for ‘distributing’ many instances of itself to the nodes very quickly using PXE boot and the Red Hat ‘Kickstart’ procedure.

First you will need a portable batch system, such as the Torque Resource Manager, which lets you break up tasks and distribute them to multiple machines. Pair Torque with the Maui Cluster Scheduler to complete the setup. Next, install a message passing interface (MPI) library, which the individual processes on the separate compute nodes need in order to exchange data. Open MPI is a no-brainer. Don’t forget the multi-threaded math libraries and compilers to build your parallel computing programs. Did I mention that you should just install Rocks?
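To make the batch system less abstract, here is a rough sketch of what a job handed to Torque can look like. The helper below only builds the text of a job script; you would save the result to a file and submit it with Torque's qsub command. The function name, the job name, and the ./hello_mpi program are hypothetical, and the mpirun line assumes an MPI installation such as Open MPI is on the compute nodes.

```python
def make_pbs_script(job_name, nodes, procs_per_node, walltime, command):
    """Build the text of a simple Torque/PBS job script.

    The '#PBS' lines are directives read by the batch system:
    -N sets the job name, -l requests resources.
    """
    total_procs = nodes * procs_per_node
    lines = [
        "#!/bin/bash",
        f"#PBS -N {job_name}",
        f"#PBS -l nodes={nodes}:ppn={procs_per_node}",
        f"#PBS -l walltime={walltime}",
        # Torque sets PBS_O_WORKDIR to the directory the job was submitted from.
        "cd $PBS_O_WORKDIR",
        # Launch one MPI process per requested processor (assumes mpirun exists).
        f"mpirun -np {total_procs} {command}",
    ]
    return "\n".join(lines) + "\n"


# Hypothetical job: 4 nodes, 8 processes per node, 10-minute time limit.
script = make_pbs_script("hello_mpi", 4, 8, "00:10:00", "./hello_mpi")
print(script)
```

The scheduler (Maui, in this setup) then decides when and on which nodes the job actually runs.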

Use a private Ethernet network to connect all the nodes in the cluster. The head node can also act as an NFS, PXE, DHCP, TFTP, and NTP server over the Ethernet network. Keep this network separate from public networks so that its broadcast packets don’t interfere with other networks on your LAN.
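As a small planning aid for the private network, the sketch below uses Python's standard ipaddress module to confirm that a chosen range really is private and to hand out addresses to the head node and compute nodes. The 10.1.0.0/24 range, the node names, and the function name are assumptions chosen for illustration; use whatever range and naming your cluster tools expect.

```python
import ipaddress


def plan_node_addresses(cidr, node_count, prefix="compute-"):
    """Assign the first usable address to the head node and the next
    node_count addresses to compute nodes. Returns (name, address) pairs."""
    network = ipaddress.ip_network(cidr)
    if not network.is_private:
        raise ValueError(f"{cidr} is not a private address range")
    usable = network.num_addresses - 2  # minus network and broadcast addresses
    if node_count + 1 > usable:
        raise ValueError(f"{cidr} is too small for {node_count} nodes plus a head node")
    hosts = iter(network.hosts())
    plan = [("head", str(next(hosts)))]
    for i in range(node_count):
        plan.append((f"{prefix}{i}", str(next(hosts))))
    return plan


# Hypothetical layout: one head node and three compute nodes on 10.1.0.0/24.
for name, address in plan_node_addresses("10.1.0.0/24", 3):
    print(f"{address}\t{name}")
```

The printed lines follow the address-then-name layout of a hosts file, which makes them easy to paste into the head node's configuration.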

You must, of course, compile from source with all possible optimization options for your platform. For example, if using AMD CPUs, compile with Open64 at the -Ofast optimization level. Then check your results against TOP500.org to see how your cluster compares to the fastest 500 supercomputers in the world!
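One simple way to judge your own result is to compare the measured benchmark figure with the theoretical peak; the TOP500 list reports both as Rmax and Rpeak. The sketch below computes that ratio. The function name and the example numbers are made up for illustration and are not measurements from any real cluster.

```python
def hpl_efficiency(rmax_gflops, rpeak_gflops):
    """Return measured performance as a fraction of theoretical peak."""
    if rpeak_gflops <= 0:
        raise ValueError("rpeak_gflops must be positive")
    return rmax_gflops / rpeak_gflops


# Hypothetical result: 2880 GFLOPS measured against a 3840 GFLOPS peak.
efficiency = hpl_efficiency(2880.0, 3840.0)
print(f"Efficiency: {efficiency:.0%}")  # Efficiency: 75%
```

If the ratio is much lower than you expected, the compiler options, math libraries, and network setup are the first places to look.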