UAV swarm communication networking and routing optimization for high-demand users: a graph attention multi-agent reinforcement learning approach

Zhaopeng Ning, Gang Li, Wei Li

Autonomous Intelligent Systems, 2026, Vol. 6, Issue 1: 9

DOI: 10.1007/s43684-026-00131-6
Original Article

Abstract

Unmanned aerial vehicle (UAV) swarms serving high-demand ground communication users in dynamic environments must simultaneously optimize three-dimensional trajectories, communication network topology, and routing strategies under limited energy, fluctuating link quality, and collision avoidance constraints. The problem poses three core challenges: routing decisions under a dynamic topology must adapt in real time to changes in vehicle position and channel conditions; optimizing end-to-end delay and throughput in multi-hop communication requires coordinated forwarding strategies across all vehicles; and high-dimensional continuous action spaces combined with partial observability make the problem difficult to solve with traditional optimization methods. This paper models the problem as a multi-agent partially observable Markov decision process and proposes a graph attention-based multi-agent deep deterministic policy gradient algorithm that jointly optimizes the velocity vector, communication power, and routing decisions of each vehicle. The reward function combines user quality of service, system throughput, end-to-end delay, and energy consumption, and enforces safety distance and energy margins through constraint penalties. Simulation results show that, compared with single-agent deep deterministic policy gradient and independent Q-learning baselines, the proposed method achieves an approximately 50% improvement in convergence speed, a 12% to 18% increase in user service satisfaction, a 25% to 40% improvement in system throughput, a 30% to 45% reduction in end-to-end delay, and a 39% to 102% improvement in energy efficiency. The framework dynamically adjusts network topology and routing strategies according to user demands, providing a deployable solution for large-scale vehicle swarm communication networks.
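
The reward structure described in the abstract can be illustrated with a minimal Python sketch: a weighted objective over quality of service, throughput, delay, and energy, minus penalties for violating the safety distance and the energy margin. The abstract does not give the actual formula, so the linear form, the field names in `StepMetrics`, the weights, the thresholds, and the penalty scale are all assumptions made for illustration and should not be read as the authors' reward.

```python
from dataclasses import dataclass


@dataclass
class StepMetrics:
    """Hypothetical per-step quantities for one UAV (not taken from the paper)."""
    qos: float             # user quality-of-service satisfaction, in [0, 1]
    throughput: float      # normalized system throughput, in [0, 1]
    delay: float           # normalized end-to-end delay, in [0, 1]
    energy_used: float     # normalized energy consumed this step, in [0, 1]
    min_distance_m: float  # distance to the nearest other UAV, in meters
    energy_left: float     # remaining battery fraction, in [0, 1]


def swarm_reward(m, w_qos=1.0, w_thr=1.0, w_delay=1.0, w_energy=1.0,
                 safe_distance_m=10.0, energy_margin=0.1, penalty=10.0):
    """Weighted objective minus constraint penalties (all parameters assumed)."""
    # Reward QoS and throughput; charge for delay and energy consumption.
    objective = (w_qos * m.qos + w_thr * m.throughput
                 - w_delay * m.delay - w_energy * m.energy_used)
    # Each penalty is zero when the constraint holds and grows linearly
    # with the relative size of the violation.
    distance_violation = max(0.0, safe_distance_m - m.min_distance_m) / safe_distance_m
    energy_violation = max(0.0, energy_margin - m.energy_left) / energy_margin
    return objective - penalty * (distance_violation + energy_violation)


safe_step = StepMetrics(0.8, 0.6, 0.2, 0.1, min_distance_m=25.0, energy_left=0.5)
unsafe_step = StepMetrics(0.8, 0.6, 0.2, 0.1, min_distance_m=5.0, energy_left=0.5)
print(swarm_reward(safe_step), swarm_reward(unsafe_step))
```

In this sketch the two steps differ only in the distance to the nearest UAV, so the gap between their rewards comes entirely from the safety penalty. A soft penalty of this kind steers the policy away from violations during training but does not by itself guarantee that the constraints are met.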

Keywords

Unmanned aerial vehicle swarms / Communication networking / Routing optimization / Multi-agent deep reinforcement learning / Graph attention network / Trajectory planning

Cite this article

Zhaopeng Ning, Gang Li, Wei Li. UAV swarm communication networking and routing optimization for high-demand users: a graph attention multi-agent reinforcement learning approach. Autonomous Intelligent Systems, 2026, 6(1): 9. DOI: 10.1007/s43684-026-00131-6

Funding

National Natural Science Foundation of China (Nos. 62273262, 62422314 and 62088101)

Science and Technology Commission of Shanghai Municipality (Nos. 24ZR1492700 and 2021SHZDZX0100)

Industry-University-Research Cooperation Fund of the Eighth Research Institute of China Aerospace Science and Technology Corporation (SAST2023-019)

RIGHTS & PERMISSIONS

The Author(s)
