Nodes, Cores, Slots

From IBERS Bioinformatics and HPC Wiki

To use a HPC effectively, it is important to understand several concepts and terms.

CPU Core

Computers have a Central Processing Unit (CPU), which may have one or more CPU cores. The cores run the code written either by yourself or by the writer of the application you're using. A CPU core can only do one thing at a time, so if it is given many things to do at once it becomes overloaded and each task runs more slowly. Multiple CPU cores can run multiple applications at the same time.
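As a rough illustration, the sketch below asks the operating system how many cores it can see, using only Python's standard library. Note that `os.cpu_count()` reports logical cores, so on a hyper-threaded machine the figure may be higher than the number of physical cores. The function name `visible_cores` is made up for this example.

```python
import os


def visible_cores():
    """Return the number of logical CPU cores the operating system reports."""
    # os.cpu_count() returns None when the count cannot be determined,
    # so fall back to 1 in that case.
    return os.cpu_count() or 1


if __name__ == "__main__":
    print("Logical CPU cores on this machine:", visible_cores())
```

Running this on a login node and on a compute node may give different answers, since the hardware can differ between nodes.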

Node

A node is a physical machine within the HPC. A node will have one or more CPUs, lots of memory (RAM), hard disks, network connections etc. A HPC will have several types of node: a storage node, which holds home directories or research data; a master node, which deals with the operations of the HPC such as scheduling; and, most importantly, compute nodes, which actually run your applications.

The Scheduler

As you now know, the HPC comprises various compute nodes, which have a variety of resources such as RAM and CPU cores. The scheduler, which in the case of the IBERS HPC is a piece of software called Sun Grid Engine, ensures that the available resources are assigned to users correctly.
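To make the idea of assigning resources concrete, here is a toy model of a scheduler. It is only a sketch: it places each job on the first node with enough free cores and RAM, which is far simpler than what Sun Grid Engine actually does. The `Node` class, the `assign` function and the node names are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Node:
    name: str
    free_cores: int
    free_ram_gb: int


def assign(job_cores, job_ram_gb, nodes):
    """Place a job on the first node with enough free cores and RAM.

    Returns the chosen node's name, or None if the job has to wait.
    """
    for node in nodes:
        if node.free_cores >= job_cores and node.free_ram_gb >= job_ram_gb:
            # Reserve the resources so later jobs cannot claim them too.
            node.free_cores -= job_cores
            node.free_ram_gb -= job_ram_gb
            return node.name
    return None


if __name__ == "__main__":
    # Hypothetical compute nodes, for illustration only.
    nodes = [Node("node001", 4, 16), Node("node002", 8, 64)]
    print(assign(2, 8, nodes))   # fits on the first node
    print(assign(4, 32, nodes))  # too big for the first node, goes to the second
    print(assign(8, 8, nodes))   # no node has 8 free cores left, so it must wait
```

A real scheduler also has to consider things like job priorities, run time limits and fairness between users, none of which are modelled here.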

Jobs

A job is what you want the HPC to compute for you. You submit a job to the scheduler by writing a small script called a job submission script. This contains information such as the amount of memory you require and how long your job will run for.
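The sketch below shows roughly what a job submission script might contain, by building one as text. Lines starting with `#$` are read by Grid Engine as options. The resource names `h_vmem` (memory) and `h_rt` (run time) are standard Grid Engine attributes, but whether the IBERS HPC expects exactly these is an assumption, as are the helper name `build_job_script`, the job name and the command.

```python
def build_job_script(name, command, memory="4G", runtime="01:00:00"):
    """Return the text of a Grid Engine style job submission script."""
    lines = [
        "#!/bin/bash",
        f"#$ -N {name}",           # job name shown in the queue
        "#$ -cwd",                 # run the job from the current directory
        f"#$ -l h_vmem={memory}",  # memory request (assumed resource name)
        f"#$ -l h_rt={runtime}",   # run time limit, HH:MM:SS (assumed resource name)
        command,                   # the work the job should actually do
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # "./my_analysis.sh" is a placeholder for your own program.
    print(build_job_script("example_job", "./my_analysis.sh"))
```

The resulting text would be saved to a file and handed to the scheduler with Grid Engine's `qsub` command.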

Slot

A slot is the unit the scheduler uses to represent CPU cores. Normally, one slot represents one CPU core.

NOTE: The reason that slots are not simply referred to as cores is that one may wish to have two CPU cores per slot in the case of hyper-threaded processors.
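The relationship between cores and slots is simple arithmetic, sketched below. The function name `slots_needed` and the `cores_per_slot` parameter are hypothetical. The default of 1 matches the usual one slot per core, and a value of 2 corresponds to the hyper-threaded case mentioned in the note.

```python
import math


def slots_needed(cores, cores_per_slot=1):
    """Return how many slots cover the requested number of CPU cores."""
    if cores < 1 or cores_per_slot < 1:
        raise ValueError("cores and cores_per_slot must be at least 1")
    # Round up: a partly used slot still has to be reserved in full.
    return math.ceil(cores / cores_per_slot)


if __name__ == "__main__":
    print(slots_needed(8))     # one core per slot
    print(slots_needed(8, 2))  # two cores per slot
    print(slots_needed(3, 2))  # rounds up to a whole slot
```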

Queue

A queue is made up of several slots on one or more nodes. On the IBERS HPC we have three queues, which denote the different types of hardware available: large.q, amd.q and intel.q. Other HPCs often set up queues for the types of job to be run rather than the type of hardware, e.g. mcore.q and score.q for multi-core and single-core jobs.
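One way to picture a queue is as a table of nodes and the slots each one contributes. The sketch below uses the three queue names from this page, but the node names and slot counts are invented purely for illustration and do not describe the real IBERS hardware.

```python
# Hypothetical layout: queue name -> {node name: slots on that node}.
# Node names and slot counts are made up for this example.
QUEUES = {
    "intel.q": {"node001": 8, "node002": 8},
    "amd.q": {"node003": 64},
    "large.q": {"node004": 32},
}


def total_slots(queue_name, queues=QUEUES):
    """Return the total number of slots a queue offers across its nodes."""
    return sum(queues[queue_name].values())


if __name__ == "__main__":
    for queue_name in QUEUES:
        print(queue_name, total_slots(queue_name))
```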