Federated learning-outcome prediction with multi-layer privacy protection

Yupei ZHANG, Yuxin LI, Yifei WANG, Shuangshuang WEI, Yunan XU, Xuequn SHANG

Front. Comput. Sci., 2024, Vol. 18, Issue 6: 186604. DOI: 10.1007/s11704-023-2791-8
Information Systems
RESEARCH ARTICLE


Abstract

Learning-outcome prediction (LOP) is a long-standing and critical problem in educational practice. Many studies have contributed effective models, yet they often suffer from data shortage and poor generalization across institutions because privacy concerns prevent data sharing. To this end, this study proposes a distributed grade-prediction model, dubbed FecMap, built on the federated learning (FL) framework, which keeps each client's data private and communicates with other clients only through a shared global model. FecMap introduces local subspace learning (LSL), which explicitly learns local features in contrast to the global features, and multi-layer privacy protection (MPP), which hierarchically protects private features by distinguishing model-shareable features from features that must not be shared, so that each institution obtains a high-performing client-specific LOP classifier. FecMap is trained iteratively with all datasets kept on their clients: each client trains a local neural network composed of a global part, a local part, and a classification head, and the server averages only the global parts collected from the clients. To evaluate FecMap, we collected three higher-education datasets of academic records from students in engineering majors. Experimental results show that FecMap benefits from the proposed LSL and MPP and achieves steady performance on LOP compared with state-of-the-art models. This study makes a fresh attempt to apply federated learning to learning-analytics tasks, potentially paving the way to personalized education with privacy protection.
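To make the training procedure concrete, the sketch below illustrates one possible reading of the abstract's description: each client holds a network with a shareable global part, a private local part, and a classification head, and the server averages only the global parts after each communication round. All module names, dimensions, and the synthetic data are illustrative assumptions, not the authors' released implementation.

```python
# Minimal sketch of a FecMap-style federated round (illustrative only).
import copy
import torch
import torch.nn as nn

class ClientModel(nn.Module):
    """Local network: shared global part + private local part + classifier head."""
    def __init__(self, in_dim=32, feat_dim=16, n_classes=5):
        super().__init__()
        self.global_part = nn.Linear(in_dim, feat_dim)   # shared with the server
        self.local_part = nn.Linear(in_dim, feat_dim)    # never leaves the client
        self.head = nn.Linear(2 * feat_dim, n_classes)   # client-specific classifier

    def forward(self, x):
        g = torch.relu(self.global_part(x))
        l = torch.relu(self.local_part(x))
        return self.head(torch.cat([g, l], dim=1))

def local_update(model, data, labels, epochs=1, lr=0.01):
    """One client's local training on its private academic records."""
    opt = torch.optim.SGD(model.parameters(), lr=lr)
    loss_fn = nn.CrossEntropyLoss()
    for _ in range(epochs):
        opt.zero_grad()
        loss = loss_fn(model(data), labels)
        loss.backward()
        opt.step()

def average_global_parts(models):
    """Server step: average only the model-shareable global parts."""
    avg = copy.deepcopy(models[0].global_part.state_dict())
    for key in avg:
        avg[key] = torch.stack(
            [m.global_part.state_dict()[key] for m in models]).mean(0)
    return avg

# Synthetic stand-in for three institutions' grade records (illustrative only).
clients = [ClientModel() for _ in range(3)]
data = [(torch.randn(64, 32), torch.randint(0, 5, (64,))) for _ in clients]

for rnd in range(5):                              # communication rounds
    for model, (x, y) in zip(clients, data):
        local_update(model, x, y)                 # private parts stay on the client
    shared = average_global_parts(clients)
    for model in clients:
        model.global_part.load_state_dict(shared) # broadcast averaged global part
```

In this reading, only the global-part weights cross the trust boundary; the local part and the classification head, which encode institution-specific information, never leave the client, mirroring the multi-layer privacy intent described in the abstract.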

Keywords

federated learning / local subspace learning / hierarchical privacy protection / learning outcome prediction / privacy-protected representation learning

Cite this article

Yupei ZHANG, Yuxin LI, Yifei WANG, Shuangshuang WEI, Yunan XU, Xuequn SHANG. Federated learning-outcome prediction with multi-layer privacy protection. Front. Comput. Sci., 2024, 18(6): 186604. https://doi.org/10.1007/s11704-023-2791-8

Yupei Zhang is currently an associate professor in the School of Computer Science at Northwestern Polytechnical University, China. He received the PhD degree in Computer Science from Xi’an Jiaotong University, China in 2017. He worked as a postdoctoral researcher at Emory University, USA from 2018 to 2020 and at the University of Pennsylvania, USA from 2020 to 2021. His research interests lie in machine learning, big data, and educational data mining. He has served on the committees of many international conferences and international journals.

Yuxin Li is currently a Master’s student in the School of Computer Science at Northwestern Polytechnical University, China. Her research interests lie in big data and educational data mining.

Yifei Wang is currently a Master’s student in the School of Computer Science at Northwestern Polytechnical University, China. Her research interests lie in federated optimization and educational data mining.

Shuangshuang Wei is currently a Master’s student in the School of Computer Science at Northwestern Polytechnical University, China. Her research interests lie in big data, brain cognition, and educational data mining.

Yunan Xu is currently a Master’s student in the School of Computer Science at Northwestern Polytechnical University, China. Her research interests lie in big data, contrastive learning, and educational data mining.

Xuequn Shang is currently a professor in the School of Computer Science at Northwestern Polytechnical University, China. She received the PhD degree in Computer Science from Otto-von-Guericke-University Magdeburg, Germany in 2005. Her research interests include data mining, bioinformatics, educational data mining, and data management. She has published many academic papers in international journals, including Nature Methods, Cell Reports, and Briefings in Bioinformatics. She has served as an editor for many international journals and as a committee member at many international conferences.


Acknowledgements

This study was supported in part by the National Natural Science Foundation of China (Grant Nos. 62272392, U1811262, 61802313), the Key Research and Development Program of China (2020AAA0108500), the Key Research and Development Program of Shaanxi Province (2023-YBGY-405), the Fundamental Research Funds for the Central Universities (D5000230088), and the Higher Research Funding on International Talent Cultivation at NPU (GJGZZD202202).

Competing interests

The authors declare that they have no competing interests or financial conflicts to disclose.

RIGHTS & PERMISSIONS

© 2024 Higher Education Press