Abstract
This paper investigates a multi-Unmanned Aerial Vehicle (UAV)-assisted wireless-powered Mobile Edge Computing (MEC) system, in which UAVs provide computation and wireless charging services to mobile terminals. We aim to maximize the number of completed computation tasks by jointly optimizing the offloading decisions of all terminals and the trajectory planning of all UAVs. The action space of this system is extremely large and grows exponentially with the number of UAVs, so single-agent learning would require an excessively large neural network and suffer from insufficient exploration. However, offloading decisions and trajectory planning are two subproblems handled by different executants, which offers a natural opportunity for decomposition. We therefore propose a 2-Tiered Multi-agent Soft Actor-Critic (2T-MSAC) algorithm that decomposes a single neural network into multiple small-scale networks. In the first tier, a single agent makes the offloading decisions, and an online pretrained model based on imitation learning is specially designed to accelerate its training. In the second tier, the UAVs use multiple agents to plan their trajectories; each agent influences the parameter updates of the other agents through its actions and rewards, thereby achieving joint optimization. Simulation results demonstrate that the proposed algorithm applies to scenarios with various terminal location distributions, outperforming existing benchmarks that perform well only in specific scenarios. In particular, 2T-MSAC increases the number of completed tasks by 45.5% in the scenario with unevenly distributed terminals. Moreover, the imitation-learning-based pretrained model reduces the convergence time of 2T-MSAC by 58.2%.
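The abstract's core argument — that the joint action space grows exponentially with the number of UAVs, while decomposition keeps each network small — can be illustrated with a short counting sketch. The sketch below is not from the paper; the terminal/UAV counts and the per-agent action sets are hypothetical, chosen only to make the size comparison concrete.

```python
from math import prod

def joint_action_space(num_terminals, offload_choices, num_uavs, moves):
    """Size of the joint action space a single monolithic agent must cover:
    every combination of offloading choices and UAV movement directions."""
    return (offload_choices ** num_terminals) * (moves ** num_uavs)

def decomposed_action_spaces(num_terminals, offload_choices, num_uavs, moves):
    """Per-network action-space sizes under a two-tier decomposition:
    one offloading agent (tier 1) plus one small agent per UAV (tier 2)."""
    return [offload_choices ** num_terminals] + [moves] * num_uavs

# 4 terminals with binary offloading, 3 UAVs with 5 movement directions each
single = joint_action_space(4, 2, 3, 5)          # 2^4 * 5^3 = 2000
tiers = decomposed_action_spaces(4, 2, 3, 5)     # [16, 5, 5, 5]

# The decomposed agents jointly cover the same space (product is unchanged),
# but the total output size across networks shrinks from 2000 to 31.
assert prod(tiers) == single
```

Adding a UAV multiplies the monolithic action space by another factor of 5, but only appends one more 5-way output head under the decomposition — the additive versus multiplicative growth the abstract appeals to.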
Keywords
Mobile-edge computing / Multi-agent reinforcement learning / Offloading decision / Trajectory planning / Unmanned aerial vehicle / Wireless power transfer
Cite this article
Xiaoyi Zhou, Liang Huang, Tong Ye, Weiqiang Sun. Decomposition-based learning in drone-assisted wireless-powered mobile edge computing networks. Digital Communications and Networks, 2024, 10(6): 1769-1781. DOI: 10.1016/j.dcan.2023.11.010