Optimizing non-coalesced memory access for irregular applications with GPU computing

Ran ZHENG; Yuan-dong LIU; Hai JIN

doi:10.1631/FITEE.1900262

Front. Inform. Technol. Electron. Eng ›› 2020, Vol. 21 ›› Issue (9) : 1285-1301. DOI: 10.1631/FITEE.1900262

Original Article

Abstract

General-purpose graphics processing units (GPGPUs) can considerably improve computing performance for regular applications. However, irregular memory access exists in many applications, and the benefits of graphics processing units (GPUs) are less substantial for irregular applications. In recent years, several studies have presented solutions to remove static irregular memory access, but eliminating dynamic irregular memory access with software remains a serious challenge. A pure software solution without hardware extensions or offline profiling is proposed to eliminate dynamic irregular memory access, especially indirect memory access. Data reordering and index redirection are used to reduce the number of memory transactions, thereby improving the performance of GPU kernels. To improve the efficiency of data reordering, the reordering operation is offloaded to the GPU, which reduces both overhead and data transfer. By concurrently executing the compute unified device architecture (CUDA) streams for data reordering and for the data processing kernel, the overhead of data reordering can be reduced further. After these optimizations, the volume of memory transactions is reduced by 16.7%–50% compared with CUSPARSE-based benchmarks, and the performance of irregular kernels is improved by 9.64%–34.9% on an NVIDIA Tesla P4 GPU.
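
The effect of data reordering and index redirection on an indirect access such as x[idx[i]] can be illustrated with a small host-side model. The sketch below is not the authors' implementation: it assumes a warp of 32 threads, 4-byte elements, and 128-byte memory segments, and it counts one transaction per distinct segment touched by a warp. The function names count_transactions and reorder_and_redirect are hypothetical, chosen only for this illustration.

```python
def count_transactions(indices, warp_size=32, elem_bytes=4, segment_bytes=128):
    """Sum, over all warps, the number of distinct memory segments touched.

    Assumed model: one transaction per distinct segment per warp.
    """
    elems_per_segment = segment_bytes // elem_bytes
    total = 0
    for start in range(0, len(indices), warp_size):
        warp = indices[start:start + warp_size]
        total += len({i // elems_per_segment for i in warp})
    return total


def reorder_and_redirect(data, indices):
    """Pack the accessed elements in first-use order and remap the indices."""
    new_position = {}   # old index -> position in the reordered array
    reordered = []
    for old in indices:
        if old not in new_position:
            new_position[old] = len(reordered)
            reordered.append(data[old])
    redirected = [new_position[old] for old in indices]
    return reordered, redirected


# Hypothetical workload: thread i reads element 32 * i, so every thread
# in a warp lands in a different 128-byte segment.
data = [float(i) for i in range(4096)]
indices = [32 * i for i in range(128)]

before = count_transactions(indices)
reordered, redirected = reorder_and_redirect(data, indices)
after = count_transactions(redirected)

# The redirected gather must return the same values as the original one.
same_values = [reordered[j] for j in redirected] == [data[i] for i in indices]

print(before, after, same_values)
```

In this model the strided pattern costs one transaction per thread before reordering and one per warp afterwards, while the gathered values stay identical. Real workloads fall between these extremes, and the reordering itself has a cost, which is why the abstract describes offloading it to the GPU and overlapping it with the processing kernel.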

Keywords

General purpose graphics processing units / Memory coalescing / Non-coalesced memory access / Data reordering

Cite this article

Ran ZHENG, Yuan-dong LIU, Hai JIN. Optimizing non-coalesced memory access for irregular applications with GPU computing. Front. Inform. Technol. Electron. Eng, 2020, 21(9): 1285-1301 https://doi.org/10.1631/FITEE.1900262