UNSW - Science

UNSW - Science - HPC

What is a cluster?

A cluster is a collection of many computers, called nodes. When you log into a cluster you are actually accessing its head node. The head node is the public face of the cluster: it is the only node that is directly visible and accessible from the network. The other nodes in the cluster, including the storage node and all the compute nodes, communicate with the head node and each other via a private network.

The head node is a shared resource for all cluster users. It is used for preparing, submitting and managing jobs, as well as for transferring files. Never run computationally intensive processes on the head node. Jobs are submitted from the head node, but they actually run on one or more of the compute nodes. Allocating jobs to compute nodes and managing them during their lifetime is the responsibility of the resource manager and the job scheduler. Katana uses a resource manager known as Torque (based on an older one called PBS) and a job scheduler called Maui.

Jobs are submitted using a queue submission command. There are two types of job that will be accepted: interactive jobs and batch jobs. An interactive job provides a login session on a compute node. This enables you to interact directly with the compute node by issuing any sequence of commands within the login session. Consequently, interactive jobs are useful for experimentation and debugging. In contrast, a batch job is a scripted job that runs from start to finish without any user intervention. The vast majority of jobs on the cluster are batch jobs. This type of job is appropriate for production runs of several hours or days.
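
As a rough illustration of a batch job, the following is a minimal sketch of a Torque job script. The job name, the resource requests (one core on one node, one hour of walltime) and the file name myjob.pbs are hypothetical values chosen for the example; the limits and queues that actually apply on Katana are not described here and may differ.

```shell
#!/bin/bash
# Lines starting with "#PBS" are read by Torque as job options when the
# script is submitted; bash itself treats them as ordinary comments.
#PBS -N example_job
#PBS -l nodes=1:ppn=1
#PBS -l walltime=01:00:00
#PBS -j oe

# Torque sets PBS_O_WORKDIR to the directory the job was submitted from.
# Fall back to the current directory so the script also runs outside a job.
cd "${PBS_O_WORKDIR:-$PWD}"

# Placeholder workload: record which node the job ran on.
# Replace this with the real (hypothetical) program you want to run.
run_host="$(uname -n)"
echo "Job running on ${run_host}"
```

Assuming the script is saved as myjob.pbs, it would typically be submitted from the head node with `qsub myjob.pbs`, and its progress checked with `qstat -u $USER`. An interactive job is requested with `qsub -I`, which opens a login session on a compute node instead of running a script.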
