Exploring high-performance processor architecture beyond the exascale

Xiang-hui XIE, Xun JIA

Front. Inform. Technol. Electron. Eng, 2018, Vol. 19, Issue 10: 1224-1229. DOI: 10.1631/FITEE.1800424
Perspectives

Abstract

The ever-increasing demand for performance in scientific computing and engineering applications will push high-performance computing beyond the exascale. As integral parts of a supercomputing system, high-performance processors and their architecture designs are crucial to improving system performance. In this paper, three architecture design goals for high-performance processors beyond the exascale are introduced: effective performance scaling, efficient resource utilization, and adaptation to diverse applications. A high-performance many-core processor architecture with scalar processing and application-specific acceleration (Massa) is then proposed, which aims to achieve these three goals by employing distributed computational resources and application-customized hardware. Finally, some future research directions for the Massa architecture are discussed.

Keywords

High-performance computing / Beyond the exascale / Processor architecture / Application-customized hardware / Distributed computational resources

Cite this article

Xiang-hui XIE, Xun JIA. Exploring high-performance processor architecture beyond the exascale. Front. Inform. Technol. Electron. Eng, 2018, 19(10): 1224‒1229 https://doi.org/10.1631/FITEE.1800424

RIGHTS & PERMISSIONS

2018 Zhejiang University and Springer-Verlag GmbH Germany, part of Springer Nature