Multi-agent reinforcement learning for cooperative lane changing of connected and autonomous vehicles in mixed traffic

Wei Zhou, Dong Chen, Jun Yan, Zhaojian Li, Huilin Yin, Wanchen Ge

Autonomous Intelligent Systems, 2022, Vol. 2, Issue 1: 5. DOI: 10.1007/s43684-022-00023-5
Original Article


Abstract

Autonomous driving has attracted significant research interest in the past two decades, as it offers many potential benefits, including relieving drivers from exhausting driving and mitigating traffic congestion, among others. Despite promising progress, lane-changing remains a great challenge for autonomous vehicles (AVs), especially in mixed and dynamic traffic scenarios. Recently, reinforcement learning (RL) has been widely explored for lane-changing decision-making in AVs, with encouraging results demonstrated. However, the majority of these studies focus on a single-vehicle setting, and lane-changing in the context of multiple AVs coexisting with human-driven vehicles (HDVs) has received scarce attention. In this paper, we formulate the lane-changing decision-making of multiple AVs in a mixed-traffic highway environment as a multi-agent reinforcement learning (MARL) problem, where each AV makes lane-changing decisions based on the motions of both neighboring AVs and HDVs. Specifically, a multi-agent advantage actor-critic (MA2C) method is proposed with a novel local reward design and a parameter-sharing scheme. In particular, a multi-objective reward function is designed to incorporate fuel efficiency, driving comfort, and the safety of autonomous driving. A comprehensive experimental study shows that our proposed MARL framework consistently outperforms several state-of-the-art benchmarks in terms of efficiency, safety, and driver comfort.
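To make the idea of a multi-objective local reward concrete, the sketch below combines safety, efficiency, and comfort terms in a weighted sum, as the abstract describes. This is a minimal illustrative sketch only: the specific term definitions, thresholds, and weights (target_speed, min_headway, w_safety, etc.) are assumptions for illustration, not the reward function used in the paper.

```python
import math
from dataclasses import dataclass


@dataclass
class VehicleState:
    speed: float        # current longitudinal speed [m/s]
    accel: float        # current acceleration [m/s^2]
    prev_accel: float   # acceleration at the previous step [m/s^2]
    headway: float      # gap to the preceding vehicle [m]


def local_reward(state: VehicleState,
                 target_speed: float = 30.0,  # assumed desired speed [m/s]
                 min_headway: float = 10.0,   # assumed safe-gap threshold [m]
                 w_safety: float = 1.0,
                 w_efficiency: float = 0.5,
                 w_comfort: float = 0.2) -> float:
    """Weighted sum of safety, efficiency, and comfort terms for one AV."""
    # Safety: penalize headways shorter than the safe-gap threshold,
    # growing sharply as the gap closes.
    if state.headway >= min_headway:
        r_safety = 0.0
    else:
        r_safety = -math.exp((min_headway - state.headway) / min_headway)

    # Efficiency: reward tracking the target speed (a simple proxy for
    # throughput / fuel efficiency).
    r_efficiency = -abs(state.speed - target_speed) / target_speed

    # Comfort: penalize jerk (change in acceleration between steps).
    r_comfort = -abs(state.accel - state.prev_accel)

    return (w_safety * r_safety
            + w_efficiency * r_efficiency
            + w_comfort * r_comfort)


if __name__ == "__main__":
    s = VehicleState(speed=25.0, accel=0.5, prev_accel=0.2, headway=8.0)
    print(f"local reward: {local_reward(s):.3f}")
```

Under the parameter-sharing scheme mentioned in the abstract, every AV would evaluate a per-agent reward of this kind on its own local observation while all agents update a single shared set of actor-critic network weights.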

Keywords

Multi-agent deep reinforcement learning / Lane-changing / Connected autonomous vehicles / Mixed traffic

Cite this article

Wei Zhou, Dong Chen, Jun Yan, Zhaojian Li, Huilin Yin, Wanchen Ge. Multi-agent reinforcement learning for cooperative lane changing of connected and autonomous vehicles in mixed traffic. Autonomous Intelligent Systems, 2022, 2(1): 5 https://doi.org/10.1007/s43684-022-00023-5

Funding
National Natural Science Foundation of China (61701348)
