Programming bare-metal accelerators with heterogeneous threading models: a case study of Matrix-3000
Jianbin FANG, Peng ZHANG, Chun HUANG, Tao TANG, Kai LU, Ruibo WANG, Zheng WANG
As the hardware industry moves toward using specialized heterogeneous many-core processors to avoid the effects of the power wall, software developers are finding it hard to deal with the complexity of these systems. In this paper, we share our experience of developing a programming model and its supporting compiler and libraries for Matrix-3000, which is designed for next-generation exascale supercomputers but has a complex memory hierarchy and processor organization. To assist its software development, we have developed a software stack from scratch that includes a low-level programming interface and a high-level OpenCL compiler. Our low-level programming model offers native programming support for using the bare-metal accelerators of Matrix-3000, while the high-level model allows programmers to use the OpenCL programming standard. We detail our design choices and highlight the lessons learned from developing system software to enable the programming of bare-metal accelerators. Our programming models have been deployed in the production environment of an exascale prototype system.
Heterogeneous computing / Parallel programming models / Programmability / Compilers / Runtime systems