Software Engineering Towards Exascale: Domain Specific Libraries, Communication Optimality, and Machine Learning
The value of HPC infrastructure is directly related to the productivity of its end users. Key to this productivity is an ecosystem of user-oriented software, namely scientific applications and workflows. ETH Zurich invests significantly in application development, for example through the PASC programme. CSCS collaborates with domain scientists to re-engineer their scientific software stacks, readying them for exascale and introducing a separation of concerns between the domain scientist and the scientific software engineer: the former can focus on scientific progress, the latter on performance portability.
Here, we illustrate this approach with several examples from the materials science domain. In particular, we present the concept of domain-specific libraries, which have a well-defined, narrow scope and can be reused across application codes. We demonstrate the application-level impact of re-engineering basic computational primitives, such as distributed matrix multiplication, for communication optimality. Finally, we show how machine learning techniques can become part of the performance engineer's toolbox for developing highly efficient kernels optimized for a particular domain.