Analyzing time-dimension communication characterizations for representative scientific applications on supercomputer systems
Juan CHEN, Wenhao ZHOU, Yong DONG, Zhiyuan WANG, Chen CUI, Feihao WU, Enqiang ZHOU, Yuhua TANG
Exascale computing is one of the major challenges of this decade, and several studies have shown that communication is becoming one of the bottlenecks for scaling parallel applications. Analyzing the characteristics of communication can effectively help to improve the performance of scientific applications. In this paper, we focus on the statistical regularity in the time-dimension communication characteristics of representative scientific applications on supercomputer systems, and prove that the distribution of communication-event intervals has a power-law decay, a pattern that is common in phenomena of scientific interest and in human activities. We verify that the distribution of communication-event intervals does have a power-law decay on the Tianhe-2 supercomputer, and also on six other parallel systems with three different network topologies and two routing policies. To study the power-law distribution quantitatively, we exploit two groups of statistics: burstiness vs. memory and periodicity vs. dispersion. Our results indicate that the communication events show a “strong-bursty and weak-memory” characteristic and that the communication-event intervals show both periodicity and dispersion. Finally, our research provides insight into the relationship between communication optimizations and time-dimension communication characteristics.
power-law distributions / supercomputer systems / time-dimension communication characteristics / Tianhe-2
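As an illustration of the "burstiness vs. memory" statistics named in the abstract, the following is a minimal sketch that assumes the commonly used definitions of Goh and Barabási: the burstiness coefficient B = (σ − μ)/(σ + μ) of the inter-event intervals, and the memory coefficient M as the correlation between consecutive intervals. The abstract does not state which exact estimators the paper uses, so the function names (`intervals`, `burstiness`, `memory`) and the sample timestamps are hypothetical.

```python
import math


def intervals(timestamps):
    """Return the gaps between consecutive event times (sorted first)."""
    ts = sorted(timestamps)
    return [b - a for a, b in zip(ts, ts[1:])]


def _mean_std(values):
    """Population mean and standard deviation of a non-empty list."""
    n = len(values)
    mean = sum(values) / n
    std = math.sqrt(sum((v - mean) ** 2 for v in values) / n)
    return mean, std


def burstiness(taus):
    """Assumed definition: B = (std - mean) / (std + mean).

    B is -1 for perfectly periodic events, near 0 for a Poisson
    process, and approaches +1 for very bursty, heavy-tailed intervals.
    """
    mean, std = _mean_std(taus)
    if mean + std == 0:
        return 0.0  # all intervals are zero; B is undefined, report 0
    return (std - mean) / (std + mean)


def memory(taus):
    """Assumed definition: correlation between consecutive intervals.

    M > 0 means long gaps tend to follow long gaps; M near 0 means
    successive intervals are uncorrelated ("weak memory").
    """
    x, y = taus[:-1], taus[1:]
    mx, sx = _mean_std(x)
    my, sy = _mean_std(y)
    if sx == 0 or sy == 0:
        return 0.0  # correlation is undefined for constant sequences
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y)) / len(x)
    return cov / (sx * sy)


# Hypothetical event times: nine short gaps followed by one long gap.
event_times = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 109]
gaps = intervals(event_times)
print(burstiness(gaps), memory(gaps))
```

Under these assumed definitions, a "strong-bursty and weak-memory" trace would show B well above 0 together with M close to 0.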