Storage wall for exascale supercomputing

Wei HU; Guang-ming LIU; Qiong LI; Yan-huang JIANG; Gui-lin CAI

doi:10.1631/FITEE.1601336

Front. Inform. Technol. Electron. Eng, 2016, Vol. 17, Issue 11: 1154–1175. DOI: 10.1631/FITEE.1601336

Article

Abstract

The mismatch between compute performance and I/O performance has long been a stumbling block as supercomputers evolve from petaflops to exaflops. Currently, many parallel applications are I/O intensive, and their overall running times are typically limited by I/O performance. To quantify the I/O performance bottleneck and highlight the significance of achieving scalable performance in peta/exascale supercomputing, in this paper, we introduce for the first time a formal definition of the ‘storage wall’ from the perspective of parallel application scalability. We quantify the effects of the storage bottleneck by providing a storage-bounded speedup, defining the storage wall quantitatively, presenting existence theorems for the storage wall, and classifying the system architectures depending on I/O performance variation. We analyze and extrapolate the existence of the storage wall by experiments on Tianhe-1A and case studies on Jaguar. These results provide insights on how to alleviate the storage wall bottleneck in system design and achieve hardware/software optimizations in peta/exascale supercomputing.

Keywords

Storage-bounded speedup / Storage wall / High performance computing / Exascale computing

Cite this article

Wei HU, Guang-ming LIU, Qiong LI, Yan-huang JIANG, Gui-lin CAI. Storage wall for exascale supercomputing. Front. Inform. Technol. Electron. Eng, 2016, 17(11): 1154–1175. https://doi.org/10.1631/FITEE.1601336
