Date: Nov. 1, 2013
Speaker: Albert Reuther (MIT Lincoln Laboratory)
Topic: MIT SuperCloud: HPA, Clouds, and Databases for Diverse Rapid Prototyping
Abstract: The supercomputing and enterprise computing arenas come from very different lineages. However, the advent of commodity computing servers has brought the two arenas closer than they have ever been. Within enterprise computing, commodity computing servers have resulted in the development of a wide range of new cloud capabilities: elastic computing, virtualization, and data hosting. Similarly, the supercomputing community has developed new capabilities in heterogeneous, massively parallel hardware and software. Merging the benefits of enterprise clouds and supercomputing has been a challenging goal. Significant effort has been expended in trying to deploy supercomputing capabilities on cloud computing systems. These efforts have resulted in unreliable, low-performance solutions that require enormous expertise to maintain.

Over the past ten years, the LLGrid at MIT Lincoln Laboratory has evolved from a four-node prototype cluster running single-user MatlabMPI jobs to a constellation of systems serving well over 300 users each year. LLGrid was developed to meet the rapid-prototyping computational needs of MIT Lincoln Laboratory, providing interactive, on-demand parallel and distributed simulation, data processing, and algorithm exploration and development capabilities across a wide range of DoD mission areas including ballistic missile defense, radar digital signal processing algorithm development, aircraft collision avoidance algorithm verification, communication channel reliability modeling, and satellite propagation simulations. With this goal in mind, the LLGrid team continues to explore novel ways to accommodate various high-performance computing requirements on shared HPC hardware systems. MIT SuperCloud provides a novel solution to the problem of merging enterprise cloud and supercomputing technology. More specifically, LLSuperCloud reverses the traditional paradigm of attempting to deploy supercomputing capabilities on a cloud and instead deploys cloud capabilities on a supercomputer. The result is a system that can handle heterogeneous, massively parallel workloads while also providing high-performance elastic computing, virtualization, and databases. The benefits of LLSuperCloud are highlighted using a mixed workload of C MPI, parallel MATLAB, Java, databases, and virtualized web services.