COMPUTATIONAL RESEARCH in BOSTON and BEYOND (CRIBB)

Date	April 2, 2021
Speakers	Siddharth Samsi and Vijay Gadepally, MIT Lincoln Laboratory
Topic	An Open Datacenter Dataset for AI Enabled Optimization
Abstract	The first step in training an AI is to get the right data. To apply AI to datacenter optimization problems, such as identifying faults in servers, energy systems, or cooling systems before they become critical, the MIT Lincoln Laboratory Supercomputing Center is developing a state-of-the-art dataset. The dataset contains rich information, including physical information about building management; system information such as scheduler and filesystem logs; and node-level information such as utilization, memory, GPU activity (both job-level statistics and time-series monitoring collected via NVIDIA’s DCGM tool), and energy utilization. In this talk, we will describe the dataset, detail how developers can access the data, and discuss a number of open problems in datacenter analytics.
Biography

Acknowledgements

We thank the MIT Department of Mathematics, Student Chapter of SIAM, ORCD, and LLSC for their generous support of this series.