Gradient purification: defense against data poisoning attack in decentralized federated learning

Bin LI, Xiaoye MIAO, Yan ZHANG, Jianwei YIN

Front. Comput. Sci., 2026, 20(9): 2009352. DOI: 10.1007/s11704-025-50240-3
Artificial Intelligence
RESEARCH ARTICLE


Abstract

Decentralized federated learning (DFL) is inherently vulnerable to data poisoning attacks, as malicious clients can transmit manipulated gradients to neighboring clients. Existing defense methods either reject suspicious gradients in every iteration or restart DFL aggregation after excluding all malicious clients. Both strategies neglect the potentially beneficial components within the contributions of malicious clients. In this paper, we propose a novel gradient purification defense, termed GPD, against data poisoning attacks in DFL. It mitigates the harm in gradients while retaining the benefits embedded in model weights, thereby enhancing overall model accuracy. In GPD, each benign client maintains, for each of its neighbors, a recording variable that tracks the gradients historically aggregated from that neighbor. This enables benign clients to precisely detect malicious neighbors and to mitigate all aggregated malicious gradients at once. After mitigation, benign clients optimize their model weights using the purified gradients. This optimization not only retains the previously beneficial components contributed by malicious clients but also exploits the canonical contributions of benign clients. We analyze the convergence of GPD as well as its ability to attain high accuracy. Extensive experiments demonstrate that GPD mitigates data poisoning attacks under both IID and non-IID data distributions, and that it significantly outperforms state-of-the-art defense methods in model accuracy.
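To make the recording mechanism concrete, below is a minimal Python sketch of how a benign client might track, and later roll back, a neighbor's aggregated gradients. It is our own reading of the abstract, not the authors' implementation: the class name BenignClient, the uniform-average aggregation rule, the norm-based detector, and the lr and threshold parameters are all assumptions for illustration.

```python
import numpy as np

class BenignClient:
    """Sketch of the per-neighbor gradient-recording idea from the abstract.
    The averaging rule, the norm-based detector, and all hyperparameters
    are illustrative assumptions, not the paper's exact algorithm."""

    def __init__(self, dim, neighbor_ids, lr=0.01, threshold=3.0):
        self.w = np.zeros(dim)          # local model weights
        self.lr = lr
        self.threshold = threshold      # detection threshold (assumed)
        # recording variable: neighbor id -> that neighbor's accumulated
        # share of every weight update applied so far
        self.record = {j: np.zeros(dim) for j in neighbor_ids}

    def aggregate(self, local_grad, neighbor_grads):
        """One DFL round: apply the uniform average of the local gradient
        and the neighbors' gradients, recording each neighbor's exact share."""
        n = len(neighbor_grads) + 1     # neighbors plus the client itself
        avg = local_grad.copy()
        for j, g in neighbor_grads.items():
            self.record[j] += self.lr * g / n   # j's share of this update
            avg += g
        self.w -= self.lr * avg / n

    def purify(self, j):
        """On detecting malicious neighbor j, undo its entire historical
        contribution to the weights in one step, then stop aggregating it."""
        self.w += self.record.pop(j)

    def detect(self):
        """Toy detector (assumption): flag neighbors whose accumulated
        contribution is far larger, in norm, than the neighborhood median."""
        norms = {j: np.linalg.norm(r) for j, r in self.record.items()}
        med = np.median(list(norms.values()))
        return [j for j, v in norms.items() if v > self.threshold * max(med, 1e-12)]
```

Note that this sketch simply rolls back a malicious neighbor's entire contribution at once; the paper's full scheme goes further, separating and retaining the beneficial components already embedded in the model weights while mitigating only the harmful gradients.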


Keywords

decentralized federated learning / data poisoning attack / security protocol

Cite this article

Bin LI, Xiaoye MIAO, Yan ZHANG, Jianwei YIN. Gradient purification: defense against data poisoning attack in decentralized federated learning. Front. Comput. Sci., 2026, 20(9): 2009352. DOI: 10.1007/s11704-025-50240-3


