Federated reinforcement learning: techniques, applications, and open challenges

Jiaju Qi , Qihao Zhou , Lei Lei , Kan Zheng

Intelligence & Robotics ›› 2021, Vol. 1 ›› Issue (1): 18-57. DOI: 10.20517/ir.2021.02

Review


Abstract

This paper presents a comprehensive survey of federated reinforcement learning (FRL), an emerging and promising field in reinforcement learning (RL). Starting with a tutorial on federated learning (FL) and RL, we then introduce FRL as a new method with great potential, which leverages the basic idea of FL to improve the performance of RL while preserving data privacy. According to the distribution characteristics of the agents in the framework, FRL algorithms can be divided into two categories, i.e., horizontal federated reinforcement learning (HFRL) and vertical federated reinforcement learning (VFRL). We provide detailed definitions of each category by formulas, investigate the evolution of FRL from a technical perspective, and highlight its advantages over previous RL algorithms. In addition, the existing works on FRL are summarized by application field, including edge computing, communication, control optimization, and attack detection. Finally, we describe and discuss several key research directions that are crucial to solving the open problems within FRL.
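The HFRL setting described above — agents with the same task aggregating locally trained models, FL-style, so raw experience never leaves a device — can be illustrated with a minimal FedAvg-like sketch. This is a toy illustration under our own assumptions, not the paper's algorithm: the `local_update` and `fed_avg` helpers are hypothetical, and a random vector stands in for a real policy-gradient estimate.

```python
import numpy as np

def local_update(weights, grad, lr=0.1):
    # One simplified local RL step: the agent nudges its policy
    # parameters along a locally computed gradient estimate.
    return weights - lr * grad

def fed_avg(local_weights, sizes):
    # FedAvg-style aggregation: mean of local parameters,
    # weighted by each agent's amount of local experience.
    total = sum(sizes)
    return sum(w * (n / total) for w, n in zip(local_weights, sizes))

# Three agents facing the same task (the HFRL setting):
global_w = np.zeros(4)
experience_sizes = [100, 50, 150]
rng = np.random.default_rng(0)

for _ in range(5):  # communication rounds
    local_ws = []
    for n in experience_sizes:
        grad = rng.normal(size=4)  # stand-in for a policy-gradient estimate
        local_ws.append(local_update(global_w.copy(), grad))
    # The server aggregates parameters only; raw trajectories stay local.
    global_w = fed_avg(local_ws, experience_sizes)

print(global_w.shape)  # (4,)
```

Only model parameters cross the network in each round, which is the privacy-preserving mechanism FRL borrows from FL; VFRL differs in that agents observe different feature spaces of a shared environment rather than running identical tasks.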

Keywords

Federated Learning / Reinforcement Learning / Federated Reinforcement Learning

Cite this article

Jiaju Qi, Qihao Zhou, Lei Lei, Kan Zheng. Federated reinforcement learning: techniques, applications, and open challenges. Intelligence & Robotics, 2021, 1(1): 18-57. DOI: 10.20517/ir.2021.02


