Multi-agent deep reinforcement learning based resource management in heterogeneous V2X networks

Zhao Junhui, Hu Fajin, Li Jiahang, Nie Yiwen

2025, Vol. 11, Issue 1: 182-190. DOI: 10.1016/j.dcan.2023.06.003

Original article


Abstract

In Heterogeneous Vehicle-to-Everything Networks (HVNs), multiple entities, such as vehicles, handheld devices, and infrastructure, can communicate with one another to obtain more advanced services. However, the growing number of entities accessing HVNs poses a major technical challenge for allocating the limited wireless resources. Traditional model-driven resource allocation approaches are no longer applicable because of the abundance of data and the interference caused by multiple communication modes reusing resources in HVNs. In this paper, we investigate a wireless resource allocation scheme, including power control and spectrum allocation, based on a resource block reuse strategy. To satisfy both the high capacity requirement of cellular users and the high reliability requirement of Vehicle-to-Vehicle (V2V) user pairs, we propose a data-driven Multi-Agent Deep Reinforcement Learning (MADRL) resource allocation scheme for the HVN. Simulation results demonstrate that, compared to existing algorithms, the proposed MADRL-based scheme achieves a high sum capacity and probability of successful V2V transmission, while providing close-to-limit performance.
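To make the setup above concrete, the sketch below illustrates the kind of joint spectrum-and-power decision each V2V agent faces under a resource block reuse strategy. This is a minimal illustrative sketch, not the paper's model: the resource block count, power levels, and the linear Q-function (standing in for the deep network) are all assumptions chosen for brevity.

```python
import numpy as np

# Assumed, illustrative constants -- not taken from the paper.
N_RB = 4                     # resource blocks shared with cellular users
POWER_LEVELS = [5, 10, 23]   # candidate transmit powers in dBm
N_ACTIONS = N_RB * len(POWER_LEVELS)   # joint (RB, power) action space


class V2VAgent:
    """Minimal epsilon-greedy agent over the joint (RB, power) action space.

    A linear Q-function approximator stands in for the deep network a
    MADRL scheme would use; the structure of the decision is the same.
    """

    def __init__(self, state_dim, lr=0.01, eps=0.1, seed=0):
        rng = np.random.default_rng(seed)
        self.W = rng.normal(0.0, 0.1, (N_ACTIONS, state_dim))
        self.lr, self.eps, self.rng = lr, eps, rng

    def act(self, state):
        # Explore with probability eps, otherwise pick the greedy action.
        if self.rng.random() < self.eps:
            return int(self.rng.integers(N_ACTIONS))
        return int(np.argmax(self.W @ state))

    def update(self, state, action, reward, next_state, gamma=0.9):
        # One-step TD update toward reward + gamma * max_a' Q(s', a').
        target = reward + gamma * np.max(self.W @ next_state)
        td_err = target - self.W[action] @ state
        self.W[action] += self.lr * td_err * state


def decode(action):
    """Map a flat action index back to (resource block index, power in dBm)."""
    return action // len(POWER_LEVELS), POWER_LEVELS[action % len(POWER_LEVELS)]
```

In a full MADRL scheme, each V2V pair would run one such agent, observe local channel and interference measurements as its state, and receive a reward balancing cellular sum capacity against V2V transmission success, with the linear approximator replaced by a deep Q-network.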

Keywords

Data-driven / Deep reinforcement learning / Resource allocation / V2X communications

Cite this article

Zhao Junhui, Hu Fajin, Li Jiahang, Nie Yiwen. Multi-agent deep reinforcement learning based resource management in heterogeneous V2X networks. Digital Communications and Networks, 2025, 11(1): 182-190. DOI: 10.1016/j.dcan.2023.06.003


Declaration of Competing Interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Acknowledgements

The work presented in this paper was funded in part by the National Key Research and Development Program of China (2020YFB1807204), in part by the National Natural Science Foundation of China (U2001213 and 61971191), in part by the Beijing Natural Science Foundation under Grant L201011, and in part by the Key Project of the Natural Science Foundation of Jiangxi Province (20202ACBL202006).

