|Date||Mar. 14, 2014|
|Speaker||Morris Jette (CTO, SchedMD LLC)|
|Topic||Slurm Workload Manager|
|Abstract:|| Slurm is an open-source, fault-tolerant, and highly scalable workload management framework. Slurm includes an extensive suite of plugins to support a wide range of architectures and use cases, ranging from managing the processes and cores on a single microprocessor to managing the workload on many of the largest computers in the world. Some of Slurm's advanced features include resource allocations optimized for network topology, gang scheduling (time-slicing of parallel jobs), hot-spare resources for failure management, energy management, and the ability to resize running jobs. An overview of Slurm's architecture and capabilities will be presented, along with future development plans to satisfy the needs of exascale computing.