A graphics processing unit-based parallel simplified swarm optimization algorithm for enhanced performance and precision

Zhu Wenbo , Huang Shang-Ke , Yeh Wei-Chang , Liu Zhenyao , Huang Chia-Ling

International Journal of Systematic Innovation ›› 2025, Vol. 9 ›› Issue (5) : 23 -42.

PDF (1241KB)
International Journal of Systematic Innovation ›› 2025, Vol. 9 ›› Issue (5) :23 -42. DOI: 10.6977/IJoSI.202510_9(5).0003
ARTICLE
research-article

A graphics processing unit-based parallel simplified swarm optimization algorithm for enhanced performance and precision

Author information +
History +
PDF (1241KB)

Abstract

Graphics processing units (GPUs) have emerged as powerful platforms for parallel computing, enabling personal computers to solve complex optimization tasks effectively. Although swarm intelligence algorithms naturally lend themselves to parallelization, a GPU-based implementation of the simplified swarm optimization (SSO) algorithm has not been reported in the literature. This paper introduces a compute CUDA-SSO algorithm on the CUDA platform, with a time complexity analysis of O (Ngen × Nsol × Nvar), where Ngen is the number of iterations, Nsol is the population size (i.e., number of fitness function evaluations), and Nvar represents the required pairwise comparisons. By eliminating resource preemption of personal best and global best updates, CUDA-SSO significantly reduces the overall complexity and prevents concurrency conflicts. Numerical experiments demonstrate that the proposed approach achieves an order-of-magnitude improvement in run time with superior solution precision relative to central processing unit-based SSO, making it a compelling methodology for large-scale, data-parallel optimization tasks.

Keywords

Compute Unified Device Architecture / Graphics Processing Unit / Parallelism / Simplified Swarm Optimization / Swarm Intelligence Algorithms

Cite this article

Download citation ▾
Zhu Wenbo, Huang Shang-Ke, Yeh Wei-Chang, Liu Zhenyao, Huang Chia-Ling. A graphics processing unit-based parallel simplified swarm optimization algorithm for enhanced performance and precision. International Journal of Systematic Innovation, 2025, 9(5): 23-42 DOI:10.6977/IJoSI.202510_9(5).0003

登录浏览全文

4963

注册一个新账户 忘记密码

References

[1]

Abbasi, M., Rafiee, M., Khosravi, M.R., Jolfaei, A., Menon, V.G., & Koushyar, J. M. (2020). An efficient parallel genetic algorithm solution for vehicle routing problem in cloud implementation of the intelligent transportation systems. Journal of Cloud Computing, 9(6), 6. https://doi.org/10.1186/s13677-020-0157-4

[2]

AlZubi, S., Shehab, M., Al-Ayyoub, M., Jararweh, Y., & Gupta, B. (2020). Parallel implementation for 3D medical volume fuzzy segmentation. Pattern Recognition Letters, 130, 312-318. https://doi.org/10.1016/j.patrec.2018.07.026

[3]

Chung, Y.Y., & Wahid, N. (2012). A hybrid network intrusion detection system using simplified swarm optimization (SSO). Applied Soft Computing, 12(9), 3014-3022. https://doi.org/10.1016/j.asoc.2012.04.020

[4]

Corley, H.W., Rosenberger, J., Yeh, W.C., & Sung, T.K. (2006). The cosine simplex algorithm. The International Journal of Advanced Manufacturing Technology, 27, 1047-1050. https://doi.org/10.1007/s00170-004-2278-1

[5]

Corral, J.M.R., Civit-Masot, J., Luna-Perejón, F., Díaz-Cano, I., Morgado-Estévez, A., & Domínguez-Morales, M. (2024). Energy efficiency in edge TPU vs. Embedded GPU for computer-aided medical imaging segmentation and classification. Engineering Applications of Artificial Intelligence, 127, 107298. https://doi.org/10.1016/j.engappai.2023.107298

[6]

Friedman, J.H. (1994). An overview of predictive learning and function approximation. In: From Statistics to Neural Networks. Vol. 136. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-79119-2_1

[7]

Gordon, V.S., & Whitley, D. (1993). Serial and parallel genetic algorithms as function optimizers. In: Proceedings of the 5th International Conference on Genetic Algorithms (ICGA). Urbana-Champaign, IL, USA.

[8]

Hachaj, T., & Piekarczyk, M. (2023). High-level hessian-based image processing with the frangi neuron. Electronics, 12(19), 4159. https://doi.org/10.3390/electronics12194159

[9]

Hadley, G. (1964). A Nonlinear and Dynamics Programming. Addison-Wesley Professional, Reading, MA, USA.

[10]

Hager, G., Zeiser, T., & Wellein, G. (2008). Data Access Optimizations for Highly Threaded Multi-Core Cpus with Multiple Memory Controllers. In: Proceedings of 2008 IEEE International Symposium on Parallel and Distributed Processing, p1-7. https://doi.org/10.1109/IPDPS.2008.4536341

[11]

Hussain, M.M., Hattori, H., & Fujimoto, N. (2016). A CUDA implementation of the standard particle swarm optimization. In: Proceedings of 2016 18th International Symposium on Symbolic and Numeric Algorithms for Scientific Computing (SYNASC) IEEE, Timisoara, Romania. p24-27. https://doi.org/10.1109/SYNASC.2016.043

[12]

Krömer, P., Platoš J., Snášel, V., & Abraham, A. (2011). A comparison of many-threaded differential evolution and genetic algorithms on CUDA. In: Proceedings of 2011 Third World Congress on Nature and Biologically Inspired Computing: IEEE, Salamanca, Spain, p19-21. https://doi.org/10.1109/NaBIC.2011.6089641

[13]

Lee, W.C., Chuang, M.C., & Yeh, W.C. (2012). Uniform parallel-machine scheduling to minimize makespan with position-based learning curves. Computers and Industrial Engineering, 63(4), 813-818. https://doi.org/10.1016/j.cie.2012.05.003

[14]

Li, W., & Zhang, Z. (2011). A Cuda-Based Multi-Channel Particle Swarm Algorithm. In: Proceedings of 2011 International Conference on Control, Automation and Systems Engineering (CASE): IEEE, Singapore. https://doi.org/10.1109/ICCASE.2011.5997786

[15]

Luo, C., Sun, B., Yang, K., Lu, T., & Yeh, W.C. (2019). Thermal infrared and visible sequences fusion tracking based on a hybrid tracking framework with adaptive weighting scheme. Infrared Physics and Technology, 99, 265-276. https://doi.org/10.1016/j.infrared.2019.04.017

[16]

Mittal, S., & Vetter, J.S. (2014). A survey of methods for analyzing and improving GPU energy efficiency. ACM Computing Surveys (CSUR), 47(2), 1-23. https://doi.org/10.1145/263634

[17]

Mortezazadeh, M., Wang, L.L., Albettar, M., & Yang, S. (2022). CityFED - city fast fluid dynamics for Urban microclimate simulations on graphics processing units. Urban Climate, 41, 101063. https://doi.org/10.1016/j.uclim.2021.101063

[18]

Mussi, L., Daolio, F., & Cagnoni, S. (2011). Evaluation of parallel particle swarm optimization algorithms within the CUDA™ architecture. Information Sciences, 181(20), 4642-4657. https://doi.org/10.1016/j.ins.2010.08.045

[19]

Navarro, C.A., Hitschfeld-Kahler, N., & Mateu, L. (2014). A survey on parallel computing and its applications in data-parallel problems using GPU architectures. Communications in Computational Physics, 15(2), 285-329. https://doi.org/10.4208/cicp.110113.010813a

[20]

NVIDIA, Corporation. (2012). CUDA C Best Practices Guide. Ver. 4.1. [Technical Report]. United States: NVIDIA Corporation.

[21]

NVIDIA. (n.d.). CUDA C Programming Guide. NVIDIA Documentation. Available from: https://docs.nvidia.com/cuda/cuda-c-programming-guide [Last accessed on 2024 Nov 01].

[22]

Pllana, S., & Xhafa, F. (2017). Programming Multicore and Many-Core Computing Systems. John Wiley and Sons, United States. https://doi.org/10.1002/9781119332015

[23]

Ruder, S. (2016). An Overview of Gradient Descent Optimization Algorithms. [arXiv Preprint].

[24]

Tan, Y., & Ding, K. (2015). A survey on GPU-based implementation of swarm intelligence algorithms. IEEE Transactions on Cybernetics, 46(9), 2028-2041. https://doi.org/10.1109/TCYB.2015.2460261

[25]

Tsutsui, S., & Fujimoto, N. (2009). Solving quadratic assignment problems by genetic algorithms with GPU computation:A case study. In: Proceedings of the 11th Annual Conference Companion on Genetic and Evolutionary Computation Conference: Late Breaking Papers. Montreal, Québec, Canada, 2009. p8-12. https://doi.org/10.1145/1570256.157035

[26]

Wolpert, D.H., & Macready, W.G. (1995). No Free Lunch Theorems for Search. Technical Report SFI-TR-95-02-010. Santa Fe Institute, United States.

[27]

Wong, M.L. (2009). Parallel multi-objective evolutionary algorithms on graphics processing units. In: Proceedings of the 11th Annual Conference Companion on Genetic and Evolutionary Computation Conference: Late Breaking Papers. Montreal, QC, Canada. https://doi.org/10.1145/1570256.1570354

[28]

Yeh, W.C. (2009). A two-stage discrete particle swarm optimization for the problem of multiple multi-level redundancy allocation in series systems. Expert Systems with Applications, 36(5), 9192-9200. https://doi.org/10.1016/j.eswa.2008.12.024

[29]

Yeh, W.C. (2012). Simplified swarm optimization in disassembly sequencing problems with learning effects. Computers and Operations Research, 39(9), 2168-2177. https://doi.org/10.1016/j.cor.2011.10.027

[30]

Yeh, W.C. (2013). New parameter-free simplified swarm optimization for artificial neural network training and its application in the prediction of time series. IEEE Transactions on Neural Networks and Learning Systems, 24(4), 661-665. https://doi.org/10.1109/TNNLS.2012.2232678

[31]

Yeh, W.C. (2014). Orthogonal simplified swarm optimization for the series-parallel redundancy allocation problem with a mix of components. Knowledge Based Systems, 64, 1-12. https://doi.org/10.1016/j.knosys.2014.03.011

[32]

Yeh, W.C. (2015). An improved simplified swarm optimization. Knowledge Based Systems, 82, 60-69. https://doi.org/10.1016/j.knosys.2015.02.022

[33]

Yeh, W.C. (2017). A new exact solution algorithm for a novel generalized redundancy allocation problem. Information Sciences, 408, 182-197. https://doi.org/10.1016/j.ins.2017.04.019

[34]

Yeh, W.C., & Wei, S.C. (2012). Economic-based resource allocation for reliable grid-computing service based on Grid Bank. Future Generation Computer Systems, 28(7), 989-1002. https://doi.org/10.1016/j.future.2012.03.005

[35]

Yildirim, E., Arslan, E., Kim, J., & Kosar, T. (2015). Application-level optimization of big data transfers through pipelining, parallelism and concurrency. IEEE Transactions on Cloud Computing, 4(1), 63-75. https://doi.org/10.1109/TCC.2015.2415804

[36]

Zhu, H., Guo, Y., Wu, J., Gu, J., & Eguchi, K. (2011). Paralleling euclidean particle swarm optimization in CUDA. In: Proceedings of 2011 4th International Conference on Intelligent Networks and Intelligent Systems: IEEE, Kuming, China. https://doi.org/10.1109/ICINIS.2011.35

[37]

Zhu, W. (2011). Massively parallel differential evolution-pattern search optimization with graphics hardware acceleration: An investigation on bound constrained optimization problems. Journal of Global Optimization, 50(3), 417-437. https://doi.org/10.1007/s10898-010-9590-0

PDF (1241KB)

100

Accesses

0

Citation

Detail

Sections
Recommended

/