AQROM: A quality of service aware routing optimization mechanism based on asynchronous advantage actor-critic in software-defined networks

Wei Zhou, Xing Jiang, Qingsong Luo, Bingli Guo, Xiang Sun, Fengyuan Sun, Lingyu Meng

2024, Vol. 10, Issue 5: 1405-1414. DOI: 10.1016/j.dcan.2022.11.016

Research article


Abstract

In Software-Defined Networks (SDNs), determining how to efficiently achieve Quality of Service (QoS)-aware routing is challenging but critical for significantly improving the performance of a network, where the metrics of QoS can be defined as, for example, average latency, packet loss ratio, and throughput. The SDN controller can use network statistics and a Deep Reinforcement Learning (DRL) method to resolve this challenge. In this paper, we formulate dynamic routing in an SDN as a Markov decision process and propose a DRL algorithm called the Asynchronous Advantage Actor-Critic QoS-aware Routing Optimization Mechanism (AQROM) to determine routing strategies that balance the traffic loads in the network. AQROM can improve the QoS of the network and reduce the training time via dynamic routing strategy updates; that is, the reward function can be dynamically and promptly altered based on the optimization objective, regardless of the network topology and traffic pattern. AQROM can be considered a one-step optimization and a black-box routing mechanism with high-dimensional input and output sets, for both discrete and continuous states and actions with respect to operations in the SDN. Extensive simulations were conducted using OMNeT++, and the results demonstrated that AQROM 1) achieved much faster and more stable convergence than the Deep Deterministic Policy Gradient (DDPG) and Advantage Actor-Critic (A2C), 2) incurred a lower packet loss ratio and latency than Open Shortest Path First (OSPF), DDPG, and A2C, and 3) resulted in higher and more stable throughput than OSPF, DDPG, and A2C.
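The abstract does not include code; as a rough illustration of the advantage actor-critic machinery that A3C-style mechanisms such as AQROM build on, the sketch below computes discounted n-step returns and advantages from a single rollout. The reward and value numbers are hypothetical placeholders (e.g. a per-step QoS reward such as negative latency), not the authors' implementation:

```python
import numpy as np

def advantages_from_rollout(rewards, values, bootstrap_value, gamma=0.99):
    """Compute discounted n-step returns R_t and advantages A_t = R_t - V(s_t),
    the quantities that drive the actor (policy-gradient) and critic (value
    regression) updates in A2C/A3C."""
    returns = np.zeros(len(rewards))
    running = bootstrap_value  # critic's estimate of the state after the rollout
    for t in reversed(range(len(rewards))):
        running = rewards[t] + gamma * running
        returns[t] = running
    advantages = returns - np.asarray(values, dtype=float)
    return returns, advantages

# Hypothetical per-step QoS rewards and critic value estimates for a 3-step rollout
rewards = [1.0, 0.5, 2.0]
values = [1.5, 1.2, 1.8]
returns, adv = advantages_from_rollout(rewards, values, bootstrap_value=0.0)
```

In an asynchronous setting, several workers would each run such a rollout against their own copy of the environment and apply the resulting gradients to shared actor and critic parameters, which is the source of the faster convergence the paper reports.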

Keywords

Software-defined networks / Asynchronous advantage actor-critic / QoS-aware routing optimization mechanism

Cite this article

Wei Zhou, Xing Jiang, Qingsong Luo, Bingli Guo, Xiang Sun, Fengyuan Sun, Lingyu Meng. AQROM: A quality of service aware routing optimization mechanism based on asynchronous advantage actor-critic in software-defined networks, 2024, 10(5): 1405-1414. DOI: 10.1016/j.dcan.2022.11.016


