Techniques, Tricks, and Algorithms for Efficient GPU-Based Processing of Higher Order Hyperbolic PDEs

Sethupathy Subramanian, Dinshaw S. Balsara, Deepak Bhoriya, Harish Kumar

Communications on Applied Mathematics and Computation, 2023, Vol. 6, Issue 4: 2336-2384. DOI: 10.1007/s42967-022-00235-9
Original Paper


Abstract

GPU computing is expected to play an integral part in all modern Exascale supercomputers. It is also expected that higher order Godunov schemes will make up a significant fraction of the application mix on such supercomputers. It is, therefore, very important to prepare the community of users of higher order schemes for hyperbolic PDEs for this emerging opportunity.

Not every algorithm that is used in the space-time update of the solution of hyperbolic PDEs will take well to GPUs. However, we identify a small core of algorithms that take exceptionally well to GPU computing. Based on an analysis of available options, we have been able to identify weighted essentially non-oscillatory (WENO) algorithms for spatial reconstruction along with arbitrary derivative (ADER) algorithms for time extension followed by a corrector step as the winning three-part algorithmic combination. Even when a winning subset of algorithms has been identified, it is not clear that they will port seamlessly to GPUs. The low data throughput between CPU and GPU, as well as the very small cache sizes on modern GPUs, implies that we have to think through all aspects of the task of porting an application to GPUs. For that reason, this paper identifies the techniques and tricks needed for making a successful port of this very useful class of higher order algorithms to GPUs.
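
As an illustration of the reconstruction step named above, the following is a minimal sketch of the classical fifth-order WENO reconstruction with the Jiang-Shu smoothness indicators, written in C. It is not taken from the paper: the specific WENO variant, the function names `weno5_face_value` and `reconstruct_faces`, and the value of `eps` are assumptions chosen for illustration. The sketch shows that each face value depends only on a fixed five-zone stencil, so every zone can be processed independently of the others.

```c
/* Sketch only: classical fifth-order WENO (Jiang-Shu weights), assumed here
   for illustration; the paper's own WENO variant may differ.
   Inputs are the zone averages in zones i-2 .. i+2; the return value is the
   reconstructed value at the right face i+1/2. */
double weno5_face_value(double um2, double um1, double u0,
                        double up1, double up2)
{
    const double eps = 1.0e-6;          /* guards against division by zero */

    /* Third-order candidate values from the three sub-stencils. */
    double p0 = ( 2.0 * um2 - 7.0 * um1 + 11.0 * u0 ) / 6.0;
    double p1 = (      -um1 + 5.0 * u0  +  2.0 * up1) / 6.0;
    double p2 = ( 2.0 * u0  + 5.0 * up1 -        up2) / 6.0;

    /* Smoothness indicators: large where the sub-stencil crosses a jump. */
    double a0 = um2 - 2.0 * um1 + u0,  b0 = um2 - 4.0 * um1 + 3.0 * u0;
    double a1 = um1 - 2.0 * u0  + up1, b1 = um1 - up1;
    double a2 = u0  - 2.0 * up1 + up2, b2 = 3.0 * u0 - 4.0 * up1 + up2;
    double beta0 = (13.0 / 12.0) * a0 * a0 + 0.25 * b0 * b0;
    double beta1 = (13.0 / 12.0) * a1 * a1 + 0.25 * b1 * b1;
    double beta2 = (13.0 / 12.0) * a2 * a2 + 0.25 * b2 * b2;

    /* Nonlinear weights built from the linear weights 1/10, 6/10, 3/10. */
    double w0 = 0.1 / ((eps + beta0) * (eps + beta0));
    double w1 = 0.6 / ((eps + beta1) * (eps + beta1));
    double w2 = 0.3 / ((eps + beta2) * (eps + beta2));

    return (w0 * p0 + w1 * p1 + w2 * p2) / (w0 + w1 + w2);
}

/* Fill face[i] (value at i+1/2) for all zones that have a full stencil.
   Iterations are independent of one another. */
void reconstruct_faces(int n, const double *u, double *face)
{
    for (int i = 2; i < n - 2; ++i) {
        face[i] = weno5_face_value(u[i - 2], u[i - 1], u[i],
                                   u[i + 1], u[i + 2]);
    }
}
```

For linear data such as u[j] = j, all three candidate values agree, so the sketch returns the exact face value i + 1/2 regardless of the weights.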

Application codes face a further challenge: the GPU results need to be practically indistinguishable from the CPU results, so that the legacy knowledge bases embedded in these application codes are preserved during the port to GPUs. This requirement often makes a complete code rewrite impossible. For that reason, it is safest to use an approach based on OpenACC directives, so that most of the code remains intact (as long as it was originally well-written). This paper is intended to be a one-stop shop for anyone seeking to make an OpenACC-based port of a higher order Godunov scheme to GPUs.
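
To make the directive-based approach concrete, here is a minimal sketch in C of a zone-wise conservative update annotated with a single OpenACC directive. It is an illustration under stated assumptions, not code from the paper: the function name `conservative_update`, the one-dimensional layout, and the choice of data clauses are all hypothetical. The loop body is ordinary serial code; a compiler without OpenACC support ignores the pragma and runs the same loop on the CPU, which is why most of the original code can stay intact.

```c
/* Sketch only (hypothetical names): one-dimensional finite-volume update
   u_new[i] = u[i] - (dt/dx) * (flux[i+1] - flux[i]),
   where flux[i] is the flux through the left face of zone i. */
void conservative_update(int n, const double *u, const double *flux,
                         double *u_new, double dt, double dx)
{
    const double dtdx = dt / dx;

    /* The directive asks an OpenACC compiler to run the loop on the GPU.
       copyin/copyout state which arrays move to and from the device. */
    #pragma acc parallel loop copyin(u[0:n], flux[0:n+1]) copyout(u_new[0:n])
    for (int i = 0; i < n; ++i) {
        u_new[i] = u[i] - dtdx * (flux[i + 1] - flux[i]);
    }
}
```

In this sketch the data clauses move the arrays on every call. Given the low CPU-GPU throughput noted in the abstract, a real port would more likely keep the arrays resident on the device across many time steps, for example with an enclosing OpenACC data region.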

We focus on three broad and high-impact areas where higher order Godunov schemes are used. The first area is computational fluid dynamics (CFD). The second is computational magnetohydrodynamics (MHD), which has an involution constraint that has to be mimetically preserved. The third is computational electrodynamics (CED), which has involution constraints and also extremely stiff source terms. Together, these three diverse uses of higher order Godunov methodology cover many of the most important application areas. In all three cases, we show that the optimal use of algorithms, techniques, and tricks, along with the use of OpenACC, yields superlative speedups on GPUs. As a bonus, we find a most remarkable and desirable result: some higher order schemes, with their larger operations count per zone, show better speedup than lower order schemes on GPUs. In other words, the GPU is an optimal stratagem for overcoming the higher computational complexities of higher order schemes. Several avenues for future improvement have also been identified. A scalability study is presented for a real-world application using GPUs and comparable numbers of high-end multicore CPUs. It is found that GPUs offer a substantial performance benefit over a comparable number of CPUs, especially when all the methods designed in this paper are used.

Keywords

PDEs / Numerical schemes / Mimetic / High performance computing

Cite this article

Sethupathy Subramanian, Dinshaw S. Balsara, Deepak Bhoriya, Harish Kumar. Techniques, Tricks, and Algorithms for Efficient GPU-Based Processing of Higher Order Hyperbolic PDEs. Communications on Applied Mathematics and Computation, 2023, 6(4): 2336‒2384 https://doi.org/10.1007/s42967-022-00235-9

Funding
National Science Foundation (NSF-AST-2009776); National Aeronautics and Space Administration (80NSSC22K0628)
