Exploring high-performance processor architecture beyond the exascale
Xiang-hui XIE, Xun JIA
The ever-increasing need for high performance in scientific computation and engineering applications will push high-performance computing beyond the exascale. As an integral part of a supercomputing system, high-performance processors and their architecture designs are crucial in improving system performance. In this paper, three architecture design goals for high-performance processors beyond the exascale are introduced: effective performance scaling, efficient resource utilization, and adaptation to diverse applications. Then a high-performance many-core processor architecture with scalar processing and application-specific acceleration (Massa) is proposed, which aims to achieve these three goals by employing distributed computational resources and application-customized hardware. Finally, some future research directions regarding the Massa architecture are discussed.
High-performance computing / Beyond the exascale / Processor architecture / Application-customized hardware / Distributed computational resources