A survey on federated learning: a perspective from multi-party computation
Fengxia LIU, Zhiming ZHENG, Yexuan SHI, Yongxin TONG, Yi ZHANG
Federated learning is a promising learning paradigm that allows multiple data owners to train models collaboratively without sharing their raw datasets. To enhance privacy in federated learning, multi-party computation can be leveraged for secure communication and computation during model training. This survey provides a comprehensive review of how to integrate mainstream multi-party computation techniques into diverse federated learning setups for guaranteed privacy, and of the corresponding optimization techniques that improve model accuracy and training efficiency. We also pinpoint future directions for deploying federated learning in a wider range of applications.
federated learning / multi-party computation / privacy-preserving data mining / distributed learning
Fengxia Liu received the PhD degree in mathematics from the China Academy of Engineering Physics, China in 2021. Her research interests include the complexity of privacy analysis, federated learning, and graph neural networks.
Zhiming Zheng received the PhD degree in mathematics from Peking University, China in 1987. He has long been engaged in research on network security, artificial intelligence, and blockchain, and has achieved a series of original research results. For example, in network security research, he has established the theory and method of dynamic cryptography-based cryptanalysis, built on the integration of algebra and dynamics, together with the related network security system, breaking through key technical bottlenecks of space and space information security.
Yexuan Shi received the BE and PhD degrees in computer science and technology from Beihang University, China in 2017 and 2022, respectively. He is currently a post-doctoral researcher in the School of Computer Science and Engineering, Beihang University, China. His research interests include big spatio-temporal data analytics, federated learning, and privacy-preserving data analytics.
Yongxin Tong received the PhD degree in computer science and engineering from The Hong Kong University of Science and Technology, China in 2014. He is currently a professor in the School of Computer Science and Engineering, Beihang University, China. His research interests include big spatio-temporal data analytics, federated learning, crowdsourcing, privacy-preserving data analytics, and uncertain data management.
Yi Zhang received the PhD degree in probability theory and mathematical statistics from the Renmin University of China, China in 2020. He is currently a postdoc with the Institute for Mathematical Science, Renmin University of China, China. His research interests include federated learning, supply chain finance, and optimization under uncertainty.