Supervisory control of the hybrid off-highway vehicle for fuel economy improvement using predictive double Q-learning with backup models
Bin Shuai, Yan-fei Li, Quan Zhou, Hong-ming Xu, Shi-jin Shuai
Journal of Central South University, 2022, Vol. 29, Issue 7: 2266–2278.
This paper studies a supervisory control system for a hybrid off-highway electric vehicle operating under the charge-sustaining (CS) condition. A new predictive double Q-learning with backup models (PDQL) scheme is proposed to optimize engine fuel consumption in real-world driving and to improve energy efficiency through a faster and more robust learning process. Unlike existing “model-free” methods, which rely solely on on-policy or off-policy updates of the knowledge bases (Q-tables), the PDQL merges on-policy and off-policy learning by introducing a backup model (Q-table). Experimental evaluations are conducted on software-in-the-loop (SiL) and hardware-in-the-loop (HiL) test platforms built on real-time models of the studied vehicle. Compared with standard double Q-learning (SDQL), the PDQL needs only half the learning iterations to reach an energy efficiency better than that achieved by the SDQL at the end of its learning process. In the SiL tests over 35 learning rounds, the PDQL improves vehicle energy efficiency by 1.75% relative to the SDQL. Implemented in the HiL under four predefined real-world conditions, the PDQL robustly saves more than 5.03% energy compared with the SDQL scheme.
supervisory charge-sustaining control / hybrid electric vehicle / reinforcement learning / predictive double Q-learning
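The abstract builds on standard double Q-learning, in which two tables are updated against each other to curb the overestimation bias of single-table Q-learning, and adds a backup Q-table that lets the agent combine on-policy experience with off-policy updates. The sketch below shows the standard tabular double Q-learning update, plus a purely hypothetical `backup_model_update` step illustrating one way a backup table could blend the two online tables; the paper's actual PDQL update rule is not reproduced here, and all function and parameter names are illustrative assumptions.

```python
import random

ACTIONS = (0, 1)  # a toy two-action problem for illustration

def double_q_update(qa, qb, s, a, r, s_next, alpha=0.1, gamma=0.9):
    """One standard double Q-learning (SDQL) update on tabular values.

    qa, qb: dicts mapping (state, action) -> Q-value. On each step a fair
    coin decides which table is updated; the *other* table evaluates the
    updated table's greedy action, which reduces overestimation bias.
    """
    if random.random() < 0.5:
        a_star = max(ACTIONS, key=lambda x: qa[(s_next, x)])   # greedy w.r.t. A
        target = r + gamma * qb[(s_next, a_star)]              # evaluated by B
        qa[(s, a)] += alpha * (target - qa[(s, a)])
    else:
        b_star = max(ACTIONS, key=lambda x: qb[(s_next, x)])   # greedy w.r.t. B
        target = r + gamma * qa[(s_next, b_star)]              # evaluated by A
        qb[(s, a)] += alpha * (target - qb[(s, a)])

def backup_model_update(q_backup, qa, qb, s, a, beta=0.5):
    """Hypothetical backup-table step (NOT the paper's exact PDQL rule):
    smooth a blend of the two online tables into a backup Q-table that
    could later supply predicted targets for off-policy replay."""
    blended = 0.5 * (qa[(s, a)] + qb[(s, a)])
    q_backup[(s, a)] = (1.0 - beta) * q_backup[(s, a)] + beta * blended

# Tiny demo: two states, zero-initialized tables, one observed transition.
qa = {(s, x): 0.0 for s in (0, 1) for x in ACTIONS}
qb = {(s, x): 0.0 for s in (0, 1) for x in ACTIONS}
q_backup = {(s, x): 0.0 for s in (0, 1) for x in ACTIONS}

double_q_update(qa, qb, s=0, a=0, r=1.0, s_next=1)
backup_model_update(q_backup, qa, qb, s=0, a=0)
```

With zero-initialized tables the first target is simply the reward, so exactly one of the two tables gains `alpha * r = 0.1` at `(0, 0)` regardless of which side the coin picks; the backup table then absorbs half of the blended value.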