Fast weighting method for plasma PIC simulation on GPU-accelerated heterogeneous systems

Can-qun Yang, Qiang Wu, Hui-li Hu, Zhi-cai Shi, Juan Chen, Tao Tang

Journal of Central South University, 2013, Vol. 20, Issue (6): 1527-1535.

DOI: 10.1007/s11771-013-1644-2

Abstract

The particle-in-cell (PIC) method has benefited greatly from GPU-accelerated heterogeneous systems. However, the performance of PIC is constrained by the interpolation operations in the weighting process on the GPU (graphics processing unit). To address this problem, a fast weighting method for PIC simulation on GPU-accelerated systems is proposed that avoids atomic memory operations during the weighting process. The method is implemented by exploiting the GPU's thread synchronization mechanism and partitioning the problem space appropriately. In addition, software-managed shared memory on the GPU is used to buffer the intermediate data. The experimental results show that the method achieves speedups of up to 3.5 times over previous work, and runs 20.08 times faster on one NVIDIA Tesla M2090 GPU than on a single core of an Intel Xeon X5670 CPU.
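
The general idea of replacing conflicting grid updates with a partitioned, buffered deposit can be illustrated with a minimal sketch. This is not the authors' GPU implementation: it is a sequential Python stand-in for a 1D grid with linear weighting, where a per-block list plays the role of shared memory and the boundary between two loops plays the role of a thread barrier. The function names `deposit_scatter` and `deposit_partitioned` and the parameter `cells_per_block` are hypothetical, and positions are assumed to lie in `[0, n_cells)` with unit cell width.

```python
import math

def deposit_scatter(positions, charges, n_cells):
    """Reference deposit: every particle adds directly into the shared grid.
    On a GPU, these concurrent += updates would require atomic operations."""
    grid = [0.0] * (n_cells + 1)
    for x, q in zip(positions, charges):
        i = int(x)              # index of the cell containing the particle
        frac = x - i            # fractional offset inside the cell
        grid[i] += q * (1.0 - frac)
        grid[i + 1] += q * frac
    return grid

def deposit_partitioned(positions, charges, n_cells, cells_per_block=4):
    """Conflict-free deposit (illustrative assumption, not the paper's code).
    Each block owns a contiguous range of cells and a private buffer."""
    n_blocks = math.ceil(n_cells / cells_per_block)

    # Step 1: assign each particle to the block that owns its cell.
    bins = [[] for _ in range(n_blocks)]
    for x, q in zip(positions, charges):
        bins[int(x) // cells_per_block].append((x, q))

    # Step 2: each block accumulates into its own buffer (stand-in for
    # shared memory). The buffer has one extra node for the right boundary.
    buffers = []
    for b, members in enumerate(bins):
        start = b * cells_per_block
        end = min(start + cells_per_block, n_cells)
        local = [0.0] * (end - start + 1)
        for x, q in members:
            i = int(x)
            frac = x - i
            local[i - start] += q * (1.0 - frac)
            local[i - start + 1] += q * frac
        buffers.append((start, local))

    grid = [0.0] * (n_cells + 1)
    # Phase A: each block stores the nodes it owns (all but its last node),
    # so every grid node has exactly one writer in this phase.
    for start, local in buffers:
        for k in range(len(local) - 1):
            grid[start + k] = local[k]
    # (A GPU version would synchronize here before the next phase.)
    # Phase B: each block adds its boundary node to the right neighbour's
    # first node; again each grid node has exactly one writer.
    for start, local in buffers:
        grid[start + len(local) - 1] += local[-1]
    return grid
```

Both functions should produce the same grid for the same input; the partitioned version simply arranges the work so that no two blocks write the same node within the same phase, which is the property that removes the need for atomics in a parallel setting.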

Keywords

GPU computing / heterogeneous computing / plasma physics simulations / particle-in-cell (PIC)

Cite this article

Can-qun Yang, Qiang Wu, Hui-li Hu, Zhi-cai Shi, Juan Chen, Tao Tang. Fast weighting method for plasma PIC simulation on GPU-accelerated heterogeneous systems. Journal of Central South University, 2013, 20(6): 1527-1535. DOI: 10.1007/s11771-013-1644-2
