A Survey on Reinforcement Learning for Optimal Decision-Making and Control of Intelligent Vehicles

Yixing Lan, Xin Xu, Jiahang Liu, Xinglong Zhang, Yang Lu, Long Cheng

CAAI Transactions on Intelligence Technology, 2025, Vol. 10, Issue 6: 1593-1615. DOI: 10.1049/cit2.70073

REVIEW

Abstract

Reinforcement learning (RL) has been widely studied as an efficient class of machine learning methods for adaptive optimal control under uncertainties. In recent years, the application of RL to optimised decision-making and motion control of intelligent vehicles has received increasing attention. Because intelligent vehicles operate in complex and dynamic environments, the learning efficiency and generalisation ability of RL-based decision and control algorithms must be improved across different conditions. This survey systematically examines the theoretical foundations, algorithmic advances and practical challenges of applying RL to intelligent vehicle systems operating in complex and dynamic environments. The major algorithmic frameworks of RL are first introduced, and recent advances in RL-based decision-making and control of intelligent vehicles are reviewed. In addition to self-learning decision and control approaches that use state measurements, developments in deep RL (DRL) methods for end-to-end driving control of intelligent vehicles are summarised. Open problems and directions for further research are also discussed.

Keywords

adaptive dynamic programming / intelligent vehicles / learning control / optimal decision-making / reinforcement learning

Cite this article

Yixing Lan, Xin Xu, Jiahang Liu, Xinglong Zhang, Yang Lu, Long Cheng. A Survey on Reinforcement Learning for Optimal Decision-Making and Control of Intelligent Vehicles. CAAI Transactions on Intelligence Technology, 2025, 10(6): 1593-1615. DOI: 10.1049/cit2.70073

