Multi-core optimization for conjugate gradient benchmark on heterogeneous processors

Lin Deng, Yong Dou

Journal of Central South University, 2011, Vol. 18, Issue 2: 490-498.

DOI: 10.1007/s11771-011-0722-6

Abstract

Developing parallel applications on heterogeneous processors faces the challenge of the "memory wall", which arises from the limited capacity of local storage, limited bandwidth, and long memory-access latency. To address this problem, a parallelization approach with six memory optimization schemes was proposed for the conjugate gradient (CG) benchmark; four of these schemes apply to all kinds of sparse matrix-vector multiplication (SPMV) operations. Evaluated on an IBM QS20, the approach reaches speedups of up to 21 and 133 times for problem sizes A and B, respectively, compared with a single Power Processor Element. Three conclusions are drawn: the peak memory-access bandwidth of the Cell BE can be obtained in SPMV; simple computation is more efficient on heterogeneous processors; and loop unrolling can hide local-storage access latency when scalar operations are executed on SIMD cores.
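
The SPMV kernel and the loop-unrolling remark can be illustrated with a minimal C sketch. It assumes compressed sparse row (CSR) storage and an unroll factor of 4; the function names, the array layout, and the unroll factor are illustrative assumptions and are not taken from the article. The authors' Cell BE implementation would additionally involve transfers into local storage and SIMD code, which this sketch does not attempt to show.

```c
#include <stddef.h>

/* Reference CSR sparse matrix-vector product: y = A * x.
 * row_ptr has n_rows + 1 entries; the nonzeros of row i are
 * val[row_ptr[i]] .. val[row_ptr[i + 1] - 1], with their column
 * indices stored in col_idx at the same positions. */
void spmv_csr(size_t n_rows, const size_t *row_ptr, const size_t *col_idx,
              const double *val, const double *x, double *y)
{
    for (size_t i = 0; i < n_rows; ++i) {
        double sum = 0.0;
        for (size_t k = row_ptr[i]; k < row_ptr[i + 1]; ++k)
            sum += val[k] * x[col_idx[k]];
        y[i] = sum;
    }
}

/* Same product with the inner loop unrolled by 4 (assumed factor).
 * Four independent accumulators let the loads and multiplies of one
 * group overlap instead of waiting on a single dependency chain. */
void spmv_csr_unroll4(size_t n_rows, const size_t *row_ptr,
                      const size_t *col_idx, const double *val,
                      const double *x, double *y)
{
    for (size_t i = 0; i < n_rows; ++i) {
        size_t k = row_ptr[i];
        size_t end = row_ptr[i + 1];
        double s0 = 0.0, s1 = 0.0, s2 = 0.0, s3 = 0.0;

        /* Main unrolled body: four nonzeros per iteration. */
        for (; k + 4 <= end; k += 4) {
            s0 += val[k]     * x[col_idx[k]];
            s1 += val[k + 1] * x[col_idx[k + 1]];
            s2 += val[k + 2] * x[col_idx[k + 2]];
            s3 += val[k + 3] * x[col_idx[k + 3]];
        }

        /* Remainder: up to three leftover nonzeros in this row. */
        double sum = (s0 + s1) + (s2 + s3);
        for (; k < end; ++k)
            sum += val[k] * x[col_idx[k]];

        y[i] = sum;
    }
}
```

Because the unrolled version sums the products in a different order, its floating-point result can differ from the reference in the last bits for general inputs; the two agree exactly only when every partial sum is exactly representable.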

Keywords

multi-core processor / NAS parallelization / CG / memory optimization

Cite this article

Lin Deng, Yong Dou. Multi-core optimization for conjugate gradient benchmark on heterogeneous processors. Journal of Central South University, 2011, 18(2): 490-498. DOI: 10.1007/s11771-011-0722-6
