Recent Advances in Using mOS for ML Workloads
The mOS multi-kernel at Intel Corp. was initially designed to support traditional supercomputing workloads, providing them with the performance, deterministic behavior, and scalability of lightweight kernels within a familiar Linux environment. These applications have been carefully crafted for the highest possible performance on supercomputers. Tailoring a small subset of system services for these applications promises performance and throughput gains.
The Aurora exascale system will support machine learning (ML) and data science workloads in addition to traditional modeling and simulation workloads. From a system software and OS kernel point of view, ML applications behave very differently from what lightweight kernels have supported in the past. In this presentation we explore some of these differences and look at early performance results from running ML workloads on mOS.