Slurm is a free, open source workload manager originally designed for the demanding requirements of national laboratories. Its functionality, scalability, performance and modular architecture have resulted in widespread adoption, with active development occurring at numerous organizations.
Simple configurations can be installed in only a few minutes, while use of optional plugins can support sophisticated scheduling and reporting requirements with easy extensibility. Slurm currently manages the workload on many of the world’s most powerful computers.
Slurm was designed to manage heterogeneous clusters with up to millions of processors. It is the workload manager on Lawrence Livermore National Laboratory’s Sequoia supercomputer with 98,304 compute nodes, and has managed emulated systems over 20 times larger. Daemons and commands are extensively multi-threaded. Fault-tolerant, hierarchical communications are used for high performance.
Architectural changes in Slurm version 2.5 have dramatically increased performance. Up to 1000 job submissions per second and throughput up to 600 jobs per second are possible.
Slurm supports sophisticated and highly flexible scheduling policies including advanced reservations. A highly configurable Quality of Service (QoS) mechanism is available to satisfy Service Level Agreements (SLAs), for example to preempt lower priority jobs on demand. Resource limits can be applied to hierarchical accounts down to the level of individual users.
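The idea of limits applied through a hierarchy of accounts can be pictured with the small sketch below. It is not Slurm's implementation: the `Assoc` class, the `cpu_limit` field and the account and user names are all hypothetical, chosen only to show a request being checked against every level from the individual user up to the root.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Assoc:
    """Hypothetical node in an account tree (root, account, or user)."""
    name: str
    cpu_limit: Optional[int] = None   # None means no limit at this level
    cpus_in_use: int = 0
    parent: Optional["Assoc"] = None


def can_run(assoc: Assoc, cpus: int) -> bool:
    """Return True if `cpus` more CPUs fit under every limit up the tree."""
    node = assoc
    while node is not None:
        if node.cpu_limit is not None and node.cpus_in_use + cpus > node.cpu_limit:
            return False
        node = node.parent
    return True


def charge(assoc: Assoc, cpus: int) -> None:
    """Record usage at the user and at every ancestor account."""
    node = assoc
    while node is not None:
        node.cpus_in_use += cpus
        node = node.parent


# Illustrative hierarchy: root -> physics -> {alice, bob}
root = Assoc("root", cpu_limit=1000)
physics = Assoc("physics", cpu_limit=100, parent=root)
alice = Assoc("alice", cpu_limit=64, parent=physics)
bob = Assoc("bob", parent=physics)  # no per-user limit

charge(bob, 80)  # bob's usage also counts against physics and root
```

In this sketch, once bob holds 80 CPUs, alice can still start a 16-CPU job but not a 32-CPU one, because the account-level limit is exhausted before her own per-user limit is reached.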
Multi-factor prioritization of jobs
Slurm considers many configurable factors when determining a job’s priority. When used with native accounting, Slurm can also utilize hierarchical fair-share as a contributing factor. Coordinators can be delegated to manage administration of their sub-trees and allotted limits in the hierarchy to lessen the load on system administrators.
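A weighted sum is one common way to combine several factors into a single priority, and the sketch below illustrates that idea in simplified form. It assumes each factor is normalized to the range 0.0 to 1.0 and each weight is an integer chosen by the site. The factor names and weight values are illustrative, not a site's actual configuration, and the function is not Slurm's code.

```python
from typing import Mapping


def job_priority(factors: Mapping[str, float], weights: Mapping[str, int]) -> int:
    """Combine normalized factors (0.0 to 1.0) into one integer priority.

    A factor missing from `factors` contributes nothing.
    """
    total = 0.0
    for name, weight in weights.items():
        value = factors.get(name, 0.0)
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"factor {name!r} must be within [0.0, 1.0]")
        total += weight * value
    return int(total)


# Illustrative weights: fair-share dominates, age and job size matter less.
weights = {"age": 1000, "fairshare": 10000, "job_size": 500}
factors = {"age": 0.5, "fairshare": 0.25}

priority = job_priority(factors, weights)
```

Raising one weight relative to the others makes that factor dominate the ordering, which is how a site would express, for example, that fair-share matters more than time spent waiting in the queue.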
Resource allocations are optimized with respect to the topology on a node (NUMA, sockets, cores and threads) with task binding. Resource allocations spanning multiple nodes will also be optimized with respect to the network topology between nodes. For example, the number of leaf switches used can be minimized or an allocation’s locality can be optimized with respect to a 3-dimensional interconnect.
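To make the leaf-switch example concrete, the sketch below picks nodes so that as few leaf switches as possible are involved. It uses a simple largest-first greedy choice and is not the algorithm Slurm itself uses. The switch and node names are hypothetical.

```python
from typing import Dict, List


def pick_nodes(free_by_switch: Dict[str, List[str]], wanted: int) -> Dict[str, List[str]]:
    """Choose `wanted` free nodes while touching as few leaf switches as possible.

    Switches with the most free nodes are used first; ties are broken by name
    so the result is deterministic.
    """
    ordered = sorted(free_by_switch.items(), key=lambda kv: (-len(kv[1]), kv[0]))
    chosen: Dict[str, List[str]] = {}
    remaining = wanted
    for switch, nodes in ordered:
        if remaining == 0:
            break
        take = nodes[:remaining]
        if take:
            chosen[switch] = take
        remaining -= len(take)
    if remaining > 0:
        raise ValueError("not enough free nodes for this request")
    return chosen


# Illustrative cluster state: free nodes grouped by the leaf switch they hang off.
free = {
    "s1": ["n1", "n2"],
    "s2": ["n3", "n4", "n5", "n6"],
    "s3": ["n7", "n8", "n9"],
}

allocation = pick_nodes(free, 6)
```

Here a six-node request is satisfied from two switches (`s2` and part of `s3`) rather than being spread across all three.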
Generic consumable resources can be managed, including GPUs.
Size and time ranges for jobs
Job size and time limits can be specified as a range. Jobs may be granted less than the maximum size and/or time specification if doing so results in earlier job initiation.
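As a rough illustration, a range request might be assembled as below. The sketch assumes the `sbatch` options `--nodes=<min>-<max>`, `--time` and `--time-min` as described in the sbatch man page, so check them against the installed version. The script name `job.sh` is hypothetical, and the function only builds the argument list without submitting anything.

```python
from typing import List


def sbatch_args(script: str, min_nodes: int, max_nodes: int,
                time_limit: str, time_min: str) -> List[str]:
    """Build an sbatch command line asking for a node range and a time range."""
    if min_nodes < 1 or min_nodes > max_nodes:
        raise ValueError("need 1 <= min_nodes <= max_nodes")
    return [
        "sbatch",
        f"--nodes={min_nodes}-{max_nodes}",   # acceptable job size range
        f"--time={time_limit}",               # preferred (maximum) time limit
        f"--time-min={time_min}",             # shortest acceptable time limit
        script,
    ]


# Hypothetical job: 2 to 8 nodes, ideally 4 hours but at least 1 hour.
cmd = sbatch_args("job.sh", 2, 8, "04:00:00", "01:00:00")
```

The resulting list could be passed to `subprocess.run(cmd, check=True)` on a host where Slurm is installed.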
Jobs can grow and shrink on demand.
Simple configurations can be operational in a couple of minutes, while use of optional plugins provides all of the functionality required at major HPC sites. These plugins support a wide variety of architectures and configurations with easy extensibility. For example, plugins for IBM Blue Gene and Cray systems provide a system-specific interface to those systems in one place while preserving a common code kernel. New plugins have been written by customers for site-specific requirements, for example to optimize use of green energy.
Slurm can be configured to operate across multiple clusters at a site. Workload information is available and jobs can be submitted across clusters.
Graphical User Interface
Slurm’s sview tool provides a rich topology-aware system interface.
Arbitrary scripts can be executed when events occur. For example, system administrators can configure a program to notify them when nodes fail.
Jobs can specify their desired CPU frequency, and power use by job is recorded. Idle resources can be powered down until needed to reduce energy consumption and cost.
Accounting records for jobs and job steps plus node events can be recorded in a database, with various tools available to generate accounting reports. These accounting records can be used to control scheduling to impose usage limits or dynamic fair share policy.
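A small report built from accounting records might look like the sketch below. It assumes pipe-delimited text in the shape produced by a command such as `sacct --allocations --parsable2 --noheader --format=JobID,User,Elapsed,State`. The sample records are invented for illustration and do not come from a real system.

```python
from collections import defaultdict
from typing import Dict


def elapsed_seconds(text: str) -> int:
    """Convert an elapsed time of the form [DD-]HH:MM:SS to seconds."""
    days = 0
    if "-" in text:
        day_part, text = text.split("-", 1)
        days = int(day_part)
    hours, minutes, seconds = (int(part) for part in text.split(":"))
    return ((days * 24 + hours) * 60 + minutes) * 60 + seconds


def usage_by_user(sacct_output: str) -> Dict[str, int]:
    """Sum elapsed wall-clock seconds per user from pipe-delimited records."""
    totals: Dict[str, int] = defaultdict(int)
    for line in sacct_output.splitlines():
        if not line.strip():
            continue
        _job_id, user, elapsed, _state = line.split("|")
        totals[user] += elapsed_seconds(elapsed)
    return dict(totals)


# Invented sample records: JobID|User|Elapsed|State
SAMPLE = """\
1001|alice|01:00:00|COMPLETED
1002|bob|00:30:15|FAILED
1003|alice|1-00:00:10|COMPLETED
"""

report = usage_by_user(SAMPLE)
```

On a real system the text would come from running the command, for example with `subprocess.run(..., capture_output=True, text=True)`, rather than from a literal string.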
Status of running jobs
Slurm can report the status of jobs as they run, gathering information at the level of individual tasks to help identify load imbalance or other anomalies.
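Once per-task figures are in hand, spotting load imbalance is a small calculation. The sketch below assumes per-task CPU seconds have already been collected (for example from the output of Slurm's `sstat` command). The task IDs, values and the 50% threshold are all illustrative.

```python
from statistics import mean
from typing import Dict, List


def imbalance_ratio(cpu_seconds: Dict[int, float]) -> float:
    """Return max/mean of per-task CPU time; 1.0 means perfectly balanced."""
    if not cpu_seconds:
        raise ValueError("no task data")
    average = mean(cpu_seconds.values())
    if average == 0:
        return 1.0
    return max(cpu_seconds.values()) / average


def lagging_tasks(cpu_seconds: Dict[int, float], threshold: float = 0.5) -> List[int]:
    """List tasks that used less than `threshold` times the busiest task's CPU time."""
    busiest = max(cpu_seconds.values())
    return sorted(task for task, used in cpu_seconds.items()
                  if used < threshold * busiest)


# Illustrative CPU seconds for a four-task job step.
sample = {0: 100.0, 1: 100.0, 2: 40.0, 3: 160.0}

ratio = imbalance_ratio(sample)
slow = lagging_tasks(sample)
```

A ratio well above 1.0, together with a short list of lagging tasks, points to work being unevenly distributed across the job's tasks.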
Web-based configuration tools
Build slurm.conf configuration files with a web interface.
Because Slurm is open source and very modular, it can be used as a teaching tool: students can test different scheduling algorithms or other constructs related to workload management without having to deal with the other components of a workload manager.
No single point of failure
Each of the Slurm daemons has an optional backup to increase reliability and decrease down time.
Free and open source
Slurm is available under the GNU General Public License v2.