A comprehensive survey on graph neural network accelerators
Jingyu LIU, Shi CHEN, Li SHEN
Front. Comput. Sci., 2025, Vol. 19, Issue 2: 192104
Deep learning has achieved superior accuracy on Euclidean-structured data. Non-Euclidean data, such as graphs, carries richer structural information and can likewise be processed by neural networks to address more complex, practical problems. However, real-world graph data obeys a power-law distribution, so the adjacency matrix of a graph is irregular and sparse. Graph processing accelerators (GPAs) were designed to handle such workloads, but traditional graph computing operates only on one-dimensional vertex properties, whereas graph neural networks (GNNs) attach a multi-dimensional feature vector to each vertex. Consequently, GNN execution combines traditional graph processing, which exhibits irregular memory access, with neural-network computation, which is regular and compute-intensive. To extract more information from graph data and achieve better generalization, GNN models grow deeper, making both memory-access and computation overheads considerable. GNN accelerators have been proposed to address this issue. In this paper, we present a systematic survey of the design and implementation of GNN accelerators. Specifically, we review the challenges GNN accelerators face and examine in detail the existing works that address them. Finally, we evaluate prior works and suggest future directions in this rapidly developing field.
graph neural network / accelerators / graph convolutional networks / design space exploration / deep learning / domain-specific architecture
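To make the two execution phases concrete, the following minimal Python sketch (not from the paper; the toy graph, sizes, and variable names are illustrative) expresses one GCN-style layer as a sparse aggregation (SpMM over the adjacency matrix) followed by a dense combination (GEMM with the weight matrix):

```python
import numpy as np
import scipy.sparse as sp

# Toy graph with 4 vertices; real graphs are power-law and highly sparse,
# so the adjacency matrix A is stored in a compressed sparse format.
rows = np.array([0, 0, 1, 2, 3])
cols = np.array([1, 2, 0, 3, 2])
A = sp.coo_matrix((np.ones(len(rows)), (rows, cols)), shape=(4, 4)).tocsr()

X = np.random.rand(4, 8)   # multi-dimensional vertex features (unlike 1-D graph data)
W = np.random.rand(8, 16)  # dense layer weights

# Aggregation phase: sparse-dense multiply. Accesses follow graph edges,
# producing the irregular memory-access pattern that GPAs target.
H_agg = A @ X

# Combination phase: dense GEMM plus ReLU, the regular computation that
# conventional neural-network accelerators handle well.
H_out = np.maximum(H_agg @ W, 0)

print(H_out.shape)  # (4, 16)
```

The mismatch between these two phases, one memory-bound and irregular, the other compute-bound and regular, is the central tension that the surveyed GNN accelerators are designed to resolve.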