Adaptive Reinforcement Learning with Multi-Modal Perception for Autonomous Formation Control and Exploration in Large-Scale Multi-UAV Swarms

Ziyuan Ma; Huajun Gong; Xinhua Wang

doi:10.15918/j.jbit1004-0579.2025.043

Journal of Beijing Institute of Technology, 2026, Vol. 35, Issue (1): 63-83. DOI: 10.15918/j.jbit1004-0579.2025.043

Abstract

To address the challenge of achieving decentralized, scalable, and adaptive control for large-scale multiple unmanned aerial vehicle (multi-UAV) swarms in dynamic urban environments with obstacles and wind perturbations, we propose a hybrid framework that integrates adaptive reinforcement learning (RL), multi-modal perception fusion, and enhanced pigeon flock optimization (PFO) with curiosity-driven exploration to enable robust autonomous control and formation control. The framework uses meta-learning to optimize RL policies for real-time adaptation, fuses sensor data for precise state estimation, and augments PFO with learned leader-follower dynamics and exploration rewards to maintain cohesive formations and explore uncertain areas. For swarms of 10–30 UAVs, it achieves 34% faster convergence, 61% lower stability root mean square error (RMSE), 88% fewer collisions, and success rates of 85.6%–92.3% in target detection and encirclement, outperforming standard multi-agent RL, pure PFO, and single-modality RL. Three-dimensional trajectory visualizations confirm cohesive formations, collision-free maneuvers, and efficient exploration in urban search-and-rescue scenarios. The innovations include meta-RL for rapid adaptation, multi-modal fusion for robust perception, and curiosity-driven PFO for scalable, decentralized control, which together advance real-world multi-UAV swarm autonomy and coordination.

Keywords

multiple unmanned aerial vehicle (multi-UAV) swarm / autonomous control / reinforcement learning (RL) / multi-modal perception / pigeon flock optimization (PFO)

Cite this article

Ziyuan Ma, Huajun Gong, Xinhua Wang. Adaptive Reinforcement Learning with Multi-Modal Perception for Autonomous Formation Control and Exploration in Large-Scale Multi-UAV Swarms. Journal of Beijing Institute of Technology, 2026, 35(1): 63-83. DOI: 10.15918/j.jbit1004-0579.2025.043