We introduce a data distribution scheme for $\mathcal{H}$-matrices and a distributed-memory algorithm for $\mathcal{H}$-matrix-vector multiplication. Our data distribution scheme avoids an expensive $\Omega(P^2)$ scheduling procedure used in previous work, where $P$ is the number of processes, while still preserving a balanced distribution of the data. Based on this distribution, our algorithm evenly divides all computations among the $P$ processes and adopts a novel tree-communication algorithm to reduce the latency cost. The overall complexity of our algorithm is $\mathcal{O}\left(\frac{N\log N}{P}+\alpha \log P+\beta \log^{2} P\right)$ for $\mathcal{H}$-matrices under the weak admissibility condition, where $N$ is the matrix size, $\alpha$ denotes the latency, and $\beta$ denotes the inverse bandwidth. Numerically, our algorithm is applied to both two- and three-dimensional problems of various sizes on various numbers of processes, and good parallel efficiency is still observed on thousands of processes.