Learning-based cooperative decision-making and control for multiple autonomous vehicles in unsignalized intersections
Ronghua Zhang, Xincheng Xu, Yang Lu, Xin Xu, Xinglong Zhang, Qingwen Ma
Intelligence & Robotics, 2025, Vol. 5, Issue 3: 695-716.
Cooperative navigation of multiple autonomous vehicles (MAVs) at unsignalized intersections remains a core challenge in intelligent transportation systems. This paper proposes a learning-based cooperative decision-making and control (LCDMC) method for MAVs that improves policy learning efficiency and ensures safe and efficient cooperative navigation. The LCDMC algorithm decomposes the global value function into two components, a local utility function and a joint-action utility function among vehicles, and comprises an offline policy learning phase and an online deployment phase. In the offline phase, kernel-based least-squares policy iteration is used to learn local decision-making policies from high-dimensional samples. In the online deployment phase, a coordination graph for the MAVs is constructed, and a collaborative utility function characterizing joint-action performance is designed. The optimized decision actions are obtained by combining the local utility function with a message propagation mechanism, and are then converted into velocity commands. Furthermore, a receding-horizon reinforcement learning approach is designed for trajectory tracking control of each autonomous vehicle. Finally, numerical simulations of MAVs are performed to verify the proposed method, and the results show that LCDMC achieves superior traffic efficiency and safety for cooperative navigation at unsignalized intersections.
Multiple autonomous vehicles / unsignalized intersection / decision and control / reinforcement learning / coordination graphs
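The value decomposition and message propagation described in the abstract can be illustrated with a small sketch. It is not the authors' implementation: it assumes the standard max-plus algorithm as the message propagation mechanism (the abstract does not name one), and the three-vehicle chain graph, the two actions (0 = yield, 1 = go), and all utility values are hypothetical numbers chosen for illustration.

```python
import itertools
import numpy as np


def joint_value(actions, local_utils, pair_utils):
    """Global value = sum of local utilities + sum of pairwise (joint-action) utilities."""
    total = sum(local_utils[i][a] for i, a in enumerate(actions))
    total += sum(u[actions[i], actions[j]] for (i, j), u in pair_utils.items())
    return float(total)


def brute_force(local_utils, pair_utils):
    """Reference solution: enumerate every joint action (only feasible for tiny graphs)."""
    ranges = [range(len(u)) for u in local_utils]
    return max(itertools.product(*ranges),
               key=lambda a: joint_value(a, local_utils, pair_utils))


def max_plus(local_utils, pair_utils, n_iters=20):
    """Max-plus message passing on a coordination graph (assumed mechanism).

    pair_utils[(i, j)] is a matrix indexed [a_i, a_j] for the edge i-j.
    Exact on tree-structured graphs; only approximate on graphs with cycles.
    """
    n = len(local_utils)
    neighbors = {i: [] for i in range(n)}
    messages = {}
    for (i, j) in pair_utils:
        neighbors[i].append(j)
        neighbors[j].append(i)
        messages[(i, j)] = np.zeros(len(local_utils[j]))  # message i -> j, over a_j
        messages[(j, i)] = np.zeros(len(local_utils[i]))  # message j -> i, over a_i

    def pair(i, j):
        # Pairwise utility as a matrix indexed [a_i, a_j], whichever way the edge was stored.
        return pair_utils[(i, j)] if (i, j) in pair_utils else pair_utils[(j, i)].T

    for _ in range(n_iters):
        new_messages = {}
        for (i, j) in messages:
            # Sum of messages into i from all neighbors except the recipient j.
            incoming = sum((messages[(k, i)] for k in neighbors[i] if k != j),
                           np.zeros(len(local_utils[i])))
            m = np.max((local_utils[i] + incoming)[:, None] + pair(i, j), axis=0)
            new_messages[(i, j)] = m - m.mean()  # normalize to keep values bounded
        messages = new_messages

    actions = []
    for i in range(n):
        belief = local_utils[i] + sum((messages[(k, i)] for k in neighbors[i]),
                                      np.zeros(len(local_utils[i])))
        actions.append(int(np.argmax(belief)))
    return tuple(actions)


# Hypothetical example: three vehicles on a chain 0 - 1 - 2, actions 0 = yield, 1 = go.
local_utils = [np.array([0.0, 1.0]), np.array([0.0, 1.2]), np.array([0.0, 0.8])]
conflict = np.array([[0.0, 0.0], [0.0, -2.0]])  # penalty when both neighbors go
pair_utils = {(0, 1): conflict, (1, 2): conflict}

best = max_plus(local_utils, pair_utils)
print(best, joint_value(best, local_utils, pair_utils))
```

In this toy case the graph is a tree, so the message-passing result coincides with exhaustive enumeration. The practical appeal of the decomposition is that each vehicle only exchanges messages with its graph neighbors, so the cost grows with the number of edges rather than with the size of the joint action space.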