Autonomous maneuver strategy of swarm air combat based on DDPG

Luhe Wang, Jinwen Hu, Zhao Xu, Chunhui Zhao

Autonomous Intelligent Systems, 2021, Vol. 1, Issue 1: 15. DOI: 10.1007/s43684-021-00013-z
Original Article

Abstract

Unmanned aerial vehicles (UAVs) have become significantly important in air combat, where intelligent UAV swarms can tackle tasks of high complexity and dynamics. The key to empowering UAVs with such capability is autonomous maneuver decision making. In this paper, an autonomous maneuver strategy for UAV swarms in beyond-visual-range air combat based on reinforcement learning is proposed. First, based on the process of air combat and the constraints of the swarm, the motion model of the UAV and the multi-to-one air combat model are established. Second, a two-stage maneuver strategy based on air combat principles is designed, which includes inter-vehicle collaboration and target-vehicle confrontation. Then, a swarm air combat algorithm based on the deep deterministic policy gradient (DDPG) strategy is proposed for online strategy training. Finally, the effectiveness of the proposed algorithm is validated by multi-scene simulations. The results show that the algorithm is suitable for UAV swarms of different scales.
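
The article itself provides no code, but a minimal sketch of the DDPG actor-critic update the abstract refers to may help orient the reader. The state and action dimensions, network sizes, and hyperparameters below are illustrative assumptions, not values taken from the paper.

```python
# Minimal DDPG actor-critic sketch in PyTorch (illustrative only;
# all dimensions and hyperparameters are assumptions, not from the paper).
import torch
import torch.nn as nn

STATE_DIM, ACTION_DIM = 12, 3   # assumed: relative-situation state, 3 maneuver controls
GAMMA, TAU = 0.99, 0.005        # common DDPG defaults

class Actor(nn.Module):
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(STATE_DIM, 64), nn.ReLU(),
            nn.Linear(64, ACTION_DIM), nn.Tanh())  # bounded continuous maneuver commands
    def forward(self, s):
        return self.net(s)

class Critic(nn.Module):
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(STATE_DIM + ACTION_DIM, 64), nn.ReLU(),
            nn.Linear(64, 1))                      # Q(s, a)
    def forward(self, s, a):
        return self.net(torch.cat([s, a], dim=-1))

actor, critic = Actor(), Critic()
actor_t, critic_t = Actor(), Critic()              # target networks
actor_t.load_state_dict(actor.state_dict())
critic_t.load_state_dict(critic.state_dict())
opt_a = torch.optim.Adam(actor.parameters(), lr=1e-4)
opt_c = torch.optim.Adam(critic.parameters(), lr=1e-3)

def ddpg_step(s, a, r, s2, done):
    """One DDPG update from a replay-buffer minibatch."""
    # Critic: regress Q(s, a) toward the bootstrapped target.
    with torch.no_grad():
        q_target = r + GAMMA * (1 - done) * critic_t(s2, actor_t(s2)).squeeze(-1)
    q_loss = nn.functional.mse_loss(critic(s, a).squeeze(-1), q_target)
    opt_c.zero_grad(); q_loss.backward(); opt_c.step()

    # Actor: deterministic policy gradient, ascend Q along the policy's actions.
    a_loss = -critic(s, actor(s)).mean()
    opt_a.zero_grad(); a_loss.backward(); opt_a.step()

    # Soft (Polyak) updates of the target networks.
    for p, pt in zip(actor.parameters(), actor_t.parameters()):
        pt.data.mul_(1 - TAU).add_(TAU * p.data)
    for p, pt in zip(critic.parameters(), critic_t.parameters()):
        pt.data.mul_(1 - TAU).add_(TAU * p.data)

# Example: one update on a random minibatch of 32 transitions.
B = 32
ddpg_step(torch.randn(B, STATE_DIM), torch.rand(B, ACTION_DIM) * 2 - 1,
          torch.randn(B), torch.randn(B, STATE_DIM), torch.zeros(B))
```

In the paper's setting each swarm member would train such an actor online, with the state encoding the multi-to-one combat situation; those details are beyond this sketch.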

Keywords

Deep reinforcement learning / Cooperative air combat / Swarm / Maneuver strategy

Cite this article

Luhe Wang, Jinwen Hu, Zhao Xu, Chunhui Zhao. Autonomous maneuver strategy of swarm air combat based on DDPG. Autonomous Intelligent Systems, 2021, 1(1): 15. https://doi.org/10.1007/s43684-021-00013-z

Funding
Foundation of CETC Key Laboratory of Data Link Technology (CLDL-20202101); National Natural Science Foundation of China (61803309); Key Research and Development Project of Shaanxi Province (2020ZDLGY06-02); Aeronautical Science Foundation of China (2019ZA053008); China Postdoctoral Science Foundation (2018M633574)
