Abstract
The ever-increasing need for high performance in scientific computation and engineering applications will push high-performance computing beyond the exascale. As an integral part of a supercomputing system, high-performance processors and their architecture designs are crucial to improving system performance. In this paper, three architecture design goals for high-performance processors beyond the exascale are introduced: effective performance scaling, efficient resource utilization, and adaptation to diverse applications. A high-performance many-core processor architecture with scalar processing and application-specific acceleration (Massa) is then proposed, which aims to achieve these three goals by employing distributed computational resources and application-customized hardware. Finally, some future research directions regarding the Massa architecture are discussed.
Keywords
High-performance computing / Beyond the exascale / Processor architecture / Application-customized hardware / Distributed computational resources
Cite this article
Xiang-hui XIE, Xun JIA. Exploring high-performance processor architecture beyond the exascale. Front. Inform. Technol. Electron. Eng., 2018, 19(10): 1224-1229. DOI: 10.1631/FITEE.1800424
RIGHTS & PERMISSIONS
Zhejiang University and Springer-Verlag GmbH Germany, part of Springer Nature