1 Introduction
Owing to their effectiveness, graph neural networks (GNNs) have become the first choice for graph-based problems. In practice, however, graph data is often polluted in ways that are unknown in advance, such as deliberate adversarial attacks or accidental annotation mistakes [1]. For example, data is frequently annotated on crowd-sourcing platforms, where it is inevitable that annotators make mistakes and introduce label noise [2]. In this case, GNNs tend to over-fit the noisy graph data and suffer severe performance degradation [3,4]. In this paper, we focus on mitigating the negative influence of potential label noise in graph data.
To train robust graph neural networks in the presence of label noise, researchers have developed strategies such as sample re-weighting and label correction. Meanwhile, given the interdependence of nodes in graph data, it is important to consider the effects imposed by the underlying topology. To this end, recent works such as [5] and [6] incorporate neighborhood information into the label correction and sample re-weighting processes. Other approaches aim to protect the representations of nodes from the same class against label noise: [7] constructs positive node pairs, while [8] adjusts the graph topology by adding edges between labeled and unlabeled nodes. However, these methods may introduce additional noise by revising the already noisy graph data, which can cause a performance drop and even more incorrect predictions.
A feasible alternative is to select the right data for model training instead of revising the noisy graph data and introducing extra noise. The aforementioned methods still train GNNs with all the observed labels and try not to miss the correct part of the dataset. Let us think about it another way: to learn a better model, is it equally effective to use a subset of labeled nodes that covers as little noise as possible? We therefore estimate the impact of incorrect labels when training GCN [9] for the node classification task. As can be seen from Fig.1(a), the prediction accuracy on the test set keeps dropping as the noise rate increases, whereas training GCN on the subset of the training set obtained by masking all incorrect labels leads to only slight degradation.
Fig.1 (a) The performance of GCN under different levels of noise rate on Cora and Citeseer. The “-NF” dataset is derived from the noisy dataset by eliminating the noise labels; (b) the loss curve of GCN trained on Cora under noise rate=0.3
To this end, we look deeper into the training of GNNs on noisy graph data and find that mislabeled nodes and correctly labeled nodes behave differently during training. As observed in Fig.1(b), the average loss on mislabeled nodes is larger than that on correctly labeled nodes. Mislabeled nodes thus show a prediction deviation from correctly labeled nodes, and this deviation tends to grow as training proceeds. A similar phenomenon has been observed in the image field [10]. Unlike image data, graph data contains connections between nodes, and graph neural networks update node representations based on the graph structure. We therefore revisit the training differences between mislabeled and correctly labeled nodes from the perspective of node neighborhoods, analyzing both local and global structural views. 1) Local view: we measure the local relative relationship by estimating the distribution distance between the representations of a node and its surrounding neighbors. 2) Global view: we estimate the global relative relationship by measuring the change in a node's relative position ranking with respect to similar nodes in the original feature space. Fig.2(a) and Fig.2(b) show the changes in the node-level KL-divergence and NDCG during training, respectively. There are obvious deviations between mislabeled and correctly labeled nodes from both views, which we call the local deviation and the global deviation, respectively.
Fig.2 (a) The KL-divergence of the prediction distributions between nodes and local surrounding nodes; (b) the NDCG@10 of ranking similar nodes based on feature space in the output space. We get the results on Cora with GCN when the noise rate is 0.3
Based on the above analysis, we identify three types of deviations between mislabeled and correctly labeled nodes. These training statistics uncover the potential impact of label noise from the prediction view as well as the local and global neighborhood views, and can be further utilized for data selection.
Thus, we propose a novel framework that reduces the influence of label noise through self-adaptive data utilization. During training, we introduce a self-adaptive sample network that selects a subset of labeled nodes based on the current training state. Specifically, it learns a sample weight from the prediction, local, and global deviations and prunes labeled nodes according to the output weights. The selected nodes are then re-weighted with the learned weights, which serve as a loss correction for model training. Remarkably, the sample network realizes dynamic data selection and avoids the accumulation of selection errors. Moreover, it supports flexible training mechanisms; for example, selecting more low-weight nodes as training proceeds is effective for avoiding over-fitting.
In summary, the contributions are threefold:
● We analyze the training states of GNNs trained on noisy data and find noticeable deviations in the training states of mislabeled nodes.
● We propose a novel framework, denoted Soft-GNN, which introduces a self-adaptive sample network for dynamic data selection to mitigate the negative impact of label noise.
● We conduct extensive experiments to verify the effectiveness of the proposed framework and give new insights into the improvements.
The rest of this paper is organized as follows. Section 2 reviews the related works. Section 3 introduces the preliminary and addresses the challenge. Next, we provide the solution and illustrate the proposed framework in Section 4. Then, we present the details of the experiments and analyze the experimental results in Section 5. In the end, we conclude and discuss the future work in Section 6.
2 Related works
2.1 Label noise
Label noise is a common problem in classification tasks and can have serious consequences, such as performance degradation or model corruption [11]. Incorrect labels arise from inaccurate manual or machine annotation, for example in medical data [12,13], image data [14,15], and language corpora [16]. How to learn with noisy labels has therefore attracted increasing research interest. Some studies focus on noise-robust models, such as SVMs [17] and decision trees [18], while others concentrate on learning algorithms for noisy labels [19]. The typical learning paradigm is to distinguish the noisy labels and reduce their effect during training. Learning robust models under label noise has been widely studied in the image field, where the methods fall into two categories: data-based and model-based. The former filters noisy data from the perspective of data generation, for example with Bayesian approaches [20] or cluster-based methods [21]; these methods first identify the possibly noisy data and then learn the model on the filtered clean dataset [22].
Model-based approaches recognize potentially incorrect labels according to the model predictions [23-25] and can be applied to train deep neural networks under label noise in an end-to-end manner. There are three main categories of training strategies: sample re-weighting [26-30], label correction [10,31,32], and cooperative training [33-35]. Sample re-weighting treats noisy and correct samples differently and reassigns sample weights: [26] adopts a Bayesian approach for re-weighting, [27,28] introduce a neural network to learn the weight of each sample, and [29,30] abandon potentially noisy samples during training. Label correction revises the given labels: [10] updates the label with the prediction as a soft label and reassigns the sample weight at the same time, while [31] introduces a meta-model as the label correction network. Cooperative training trains two models that supervise each other: Decoupling [33] updates the models only when they make different predictions, and Co-teaching [34] updates one model with the small-loss samples selected by the other model.
2.2 GNNs with label noise
Recently, graph neural networks such as GCN [9] and APPNP [36] have achieved promising results on graph-based tasks. Generally, the given labels are treated as true labels when training GNNs, and little attention has been paid to training GNNs with label noise. For graph-level classification, some strategies from the image field can be used directly; for example, [3] adopts loss correction by estimating the correction matrix. For node-level classification, however, the nodes in graph data are connected and interdependent. Meanwhile, GNNs perform feature propagation and aggregation, so the influence of label noise can spread to other samples through the edges. For this reason, researchers have designed dedicated strategies for training GNNs under label noise [5-8,37,38]. UnionNET [5] constructs a support node set for each training node and proposes a unified framework with sample re-weighting and label correction. [6] corrects labels by pre-training the model and utilizes the revised labels to train the GNN. Pi-gnn [7] constructs node pairs with positive pairwise interaction among all nodes and constrains the embeddings of node pairs to resist label noise. NRGNN [8] introduces an edge predictor and adds edges between labeled and unlabeled nodes to reduce the effect of label noise. CLNode [37] explores curriculum learning for node classification, improving performance by sequentially training models on increasingly challenging subsets of the dataset. MTS-GNN [38] introduces a multi-teacher self-training strategy for semi-supervised node classification with noisy labels, enhancing performance by exploiting knowledge from multiple teachers.
In this paper, we focus on the node-level classification under label noise. We discover some deviations between the mislabeled nodes and clean nodes, including prediction deviation, local deviation, and global deviation. Based on these observations, we introduce a sample network by using the above-mentioned training statistics and realize the self-adaptive data utilization during training.
Compared with the state-of-the-art baselines, the proposed method neither corrects labels nor adjusts the graph topology. Label correction has a margin of error, and the error may accumulate as training continues; likewise, adjusting the graph topology is time-consuming and its effect is uncertain. By contrast, the proposed framework is flexible in sample selection and can tailor an appropriate learning strategy to the current training state. It promptly avoids error accumulation and achieves a steady performance improvement with little extra time overhead.
3 Preliminary and challenge
3.1 Preliminary
Given a graph $\mathcal{G}=(\mathcal{V},\mathcal{E})$, $\mathcal{V}$ is the set of nodes, $\mathcal{E}$ is the edge set, and $\mathbf{A}$ represents the adjacency matrix. For a node pair $(v_i, v_j)$, $\mathbf{A}_{ij}=1$ if $(v_i, v_j)\in\mathcal{E}$, and 0 otherwise. Graph neural networks are designed to solve graph-based analysis tasks and achieve promising results. For the node classification task, the graph is usually partially labeled and the labels might be noisy. We denote the GNN model as $f_{\theta}$ with parameters $\theta$, and the output of the GNN is derived as
$$\mathbf{Z} = f_{\theta}(\mathbf{A}, \mathbf{X}), \tag{1}$$
where $\mathbf{X}$ is the feature matrix, $\mathbf{Z}$ is the output of the final GNN layer, and the prediction is $\hat{\mathbf{Y}} = \mathrm{softmax}(\mathbf{Z})$. Since only part of the nodes are labeled, the GNN model is trained in a semi-supervised way. Thus, the objective is
$$\mathcal{L} = \sum_{v_i \in \mathcal{V}_L} \ell\big(\hat{\mathbf{y}}_i, \tilde{y}_i\big), \tag{2}$$
where $\mathcal{V}_L$ is the set of training nodes and $\tilde{y}_i$ is the observed label.
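As a concrete illustration of the formulation above, the following sketch implements a dense two-layer GCN producing $\mathbf{Z}=f_{\theta}(\mathbf{A},\mathbf{X})$ and the semi-supervised cross-entropy objective over the labeled nodes. It is a simplified re-implementation for illustration, not the authors' code; the dense adjacency and the variable names are assumptions.

```python
# Illustrative dense two-layer GCN and semi-supervised objective (Eqs. (1)-(2)).
import torch
import torch.nn as nn
import torch.nn.functional as F

def normalize_adj(A: torch.Tensor) -> torch.Tensor:
    """Symmetrically normalized adjacency with self-loops: D^{-1/2}(A+I)D^{-1/2}."""
    A_hat = A + torch.eye(A.size(0))
    d_inv_sqrt = A_hat.sum(dim=1).pow(-0.5)
    return d_inv_sqrt.unsqueeze(1) * A_hat * d_inv_sqrt.unsqueeze(0)

class GCN(nn.Module):
    def __init__(self, in_dim: int, hid_dim: int, num_classes: int):
        super().__init__()
        self.w1 = nn.Linear(in_dim, hid_dim, bias=False)
        self.w2 = nn.Linear(hid_dim, num_classes, bias=False)

    def forward(self, A_norm: torch.Tensor, X: torch.Tensor) -> torch.Tensor:
        H = F.relu(A_norm @ self.w1(X))   # first propagation + transformation
        return A_norm @ self.w2(H)        # logits Z of the final layer

def semi_supervised_loss(Z, y_observed, train_mask):
    # Only the (possibly noisy) labeled nodes contribute to the objective.
    return F.cross_entropy(Z[train_mask], y_observed[train_mask])
```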
3.2 Challenge
In this paper, our goal is to eliminate the negative influence of noisy labels during the training of GNNs. We consider the situation in which the observed labels may be mislabeled, i.e., $\tilde{y}_i \neq y_i$ is possible, where $y_i$ is the ground truth. To better illustrate the challenge while still capturing the essence of graph convolutions, we follow [4] and perform a linearization of the GNN. Taking a two-layer GNN as an example, we replace the non-linearity function with a simple linear function, that is,
$$\mathbf{Z} = \hat{\mathbf{A}}\,\hat{\mathbf{A}}\,\mathbf{X}\,\mathbf{W}_1\mathbf{W}_2, \tag{3}$$
where $\hat{\mathbf{A}}$ is the normalized adjacency matrix with self-loops. Since $\mathbf{W}_1$ and $\mathbf{W}_2$ are free learnable weights, they can be absorbed into a single weight $\mathbf{W}$. Typically, $\mathbf{W}$ is learned through back-propagation using the observed labels in a semi-supervised manner. However, because the observed labels are noisy and can be viewed as perturbations of the real labels, the distributions of the observed labels $\tilde{\mathbf{Y}}$ and the real labels $\mathbf{Y}$ may be different, i.e.,
$$P(\tilde{\mathbf{Y}}) \neq P(\mathbf{Y}) \;\Rightarrow\; \tilde{\mathbf{W}} \neq \mathbf{W}, \tag{4}$$
where $\tilde{\mathbf{W}}$ denotes the parameters learned from the observed labels.
Equation (4) tells us that a shift of the label distribution causes a synchronous shift of the model parameters; we call this phenomenon the distribution shift. Given this, we aim to use variable balancing [39] to jointly learn a global sample weight matrix $\mathbf{S}$ and optimize the model parameters $\mathbf{W}$. Following [39], we consider the stability of the predictions made by the network when it is trained on noisy graph data. However, instead of directly working with the stability of predictions, we focus on the stability of the graph representation learned by the network, which is where the sample weight matrix $\mathbf{S}$ comes into play. Thus, we obtain the formulation in Eq. (5), which jointly optimizes $\mathbf{W}$ and $\mathbf{S}$.
According to Eq. (5), $\mathbf{S}$ is a positive definite matrix. If the sample weights re-balance the observed labels so that their distribution matches that of the real labels, then $\tilde{\mathbf{W}} = \mathbf{W}$. In other words, the sample weights guarantee that the model is robust to noisy labels because they allow the GNN to prevent the distribution shift. Note that the label noise is agnostic, so in practice such an ideal $\mathbf{S}$ is not accessible. Moreover, the noise distribution depends on various factors, and directly learning it is unrealistic. Hence, the biggest challenge in learning a robust GNN is to estimate the sample weight matrix $\mathbf{S}$.
A better model is more likely to be learned when the label distribution is closer to the true distribution. According to Fig.1(a), directly masking the incorrect labels during training already yields better performance; in that case, there is no need to learn a sample weight matrix $\mathbf{S}$ that makes the weighted label distribution equal to the true one. Studies [40,41] suggest that there is little difference between the performance of a model learned on a subset of important samples and on the full dataset. [41] proposes a score function to identify important samples and finds that a pruned dataset can also reduce the effect of label noise. This provides the idea of selective data utilization to mitigate the data distribution shift caused by label noise while guaranteeing the performance of the GNN model.
4 Method
4.1 Overview
We propose a novel framework, named Soft-GNN, which introduces a sample network to select data and mitigate the influence of label noise through self-adaptive data utilization. To this end, we analyze the training states of mislabeled nodes and discover that they deviate from the other labeled nodes at three levels: prediction, local, and global deviation. Based on these deviations, we estimate sample weights from the training deviation features for data selection and re-weight the selected samples for model learning. The architecture of Soft-GNN is illustrated in Fig.3.
Fig.3 The overall framework of Soft-GNN
4.2 The deviation by label noise
4.2.1 Prediction deviation
Previous studies have found that mislabeled samples are difficult for deep neural networks to learn [10]. Fig.1(b) shows a similar phenomenon for GNNs: the loss on clean nodes falls more quickly than that on mislabeled nodes, and the gap becomes more pronounced as training proceeds. Thus, mislabeled nodes are more likely to be hard samples when training GNNs.
Definition 4.1 For each labeled node $v_i$, the prediction deviation, represented by $d^p_i$, is the cross-entropy between the model prediction and the observed label. It is calculated as
$$d^p_i = -\sum_{c=1}^{C} \tilde{y}_{ic}\,\log \hat{y}_{ic},$$
where $C$ is the number of classes, $\hat{y}_{ic}$ is the predicted probability of $v_i$ belonging to class $c$, and $\tilde{y}_{ic}$ is the one-hot indicator of the observed label.
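A hedged sketch of Definition 4.1: given the GCN logits and the observed labels, the per-node prediction deviation is just an unreduced cross-entropy; the variable names are assumptions.

```python
# Sketch of the prediction deviation: per-node cross-entropy between the model
# prediction (logits Z) and the observed, possibly noisy, label.
import torch
import torch.nn.functional as F

def prediction_deviation(Z: torch.Tensor, y_observed: torch.Tensor) -> torch.Tensor:
    """Per-node cross-entropy, shape [num_nodes]."""
    return F.cross_entropy(Z, y_observed, reduction="none")
```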
For graph data, the nodes are connected and interdependent. We further explore whether mislabeled nodes disrupt the inherent relationships with their neighboring nodes and cause deviations in the output space. Theoretically, GNNs follow the paradigm of feature propagation and aggregation, which leads to feature smoothing, meaning that a node can be reconstructed from its neighboring nodes. It has also been verified that feature smoothing theoretically guarantees label smoothing from the view of label propagation [42]. In other words, the reconstruction relations are consistent between the feature space and the output space generated by the learned model. Label noise, however, changes the relations with neighboring nodes in the output space. We find that mislabeled nodes exhibit both a local deviation and a global deviation.
4.2.2 Local deviation
From the topological perspective, Fig.2(a) shows that mislabeled nodes are far away from their surroundings in the output space.
To estimate the prediction divergence between a node and its surrounding nodes, we first generate the neighborhood embedding. If we considered all surrounding nodes, the divergence might be caused not by label noise but by the class variance among the surrounding nodes. Therefore, we select the nodes with high structural dependency as the structural neighborhood to reduce the influence of class variance: the higher the structural dependency between two nodes, the more surrounding nodes they have in common. We utilize mutual information to estimate the structural dependency [43]. Let $\{X_i\}$ denote a set of random variables for the nodes in the graph, where $X_i$ represents randomly selecting one node from the graph for node $v_i$. Let $N(v_i)$ denote the neighborhood of node $v_i$ and $\overline{N}(v_i)$ its complementary set. Then $X_i$ has two possible values $\{0,1\}$, where 0 indicates that the selected node is from $\overline{N}(v_i)$ and 1 indicates that it belongs to $N(v_i)$. Thus, $P(X_i{=}1)=|N(v_i)|/|\mathcal{V}|$ and $P(X_i{=}0)=1-P(X_i{=}1)$.
Hence, the structural dependency between $v_i$ and $v_j$ is estimated by the mutual information between $X_i$ and $X_j$, computed by
$$I(X_i; X_j) = \sum_{x_i \in \{0,1\}} \sum_{x_j \in \{0,1\}} P(x_i, x_j)\,\log \frac{P(x_i, x_j)}{P(x_i)\,P(x_j)},$$
where $P(x_i, x_j)$ is the joint distribution. Considering that both $X_i$ and $X_j$ have two possible values, $(x_i, x_j)$ has four combinations, whose probabilities are obtained by counting the nodes falling into $N(v_i)\cap N(v_j)$, $N(v_i)\cap \overline{N}(v_j)$, $\overline{N}(v_i)\cap N(v_j)$, and $\overline{N}(v_i)\cap \overline{N}(v_j)$, respectively, normalized by $|\mathcal{V}|$.
For a training node $v_i$, we compute the structural dependency between $v_i$ and each neighboring node $v_j$. The top-$k$ nodes with the highest structural dependency are selected as the structural neighborhood, denoted by $N_s(v_i)$. Then, we generate the neighborhood embedding by aggregating the representations of the nodes in $N_s(v_i)$; for example, with a mean aggregator, the class distribution of the neighborhood is derived as
$$\hat{\mathbf{y}}_{N_s(v_i)} = \mathrm{softmax}\Big(\frac{1}{k}\sum_{v_j \in N_s(v_i)} \mathbf{z}_j\Big),$$
where $\mathbf{z}_j$ is the representation of $v_j$ in the final GNN layer.
Definition 4.2 For each labeled node $v_i$, the local deviation, represented by $d^l_i$, is the node-to-neighborhood divergence, estimated by the KL-divergence between the class distributions of the node and its structural neighborhood:
$$d^l_i = \mathrm{KL}\big(\hat{\mathbf{y}}_i \,\|\, \hat{\mathbf{y}}_{N_s(v_i)}\big).$$
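The sketch below walks through the local-deviation pipeline under stated assumptions: `neighbors` maps each node to its neighbor set, `probs` holds the softmax class distributions of all nodes, and the joint probabilities of $(X_i, X_j)$ are estimated from neighborhood overlaps, which is one plausible instantiation of the estimator rather than the paper's exact one.

```python
# Sketch of the local deviation: mutual-information-based structural dependency,
# top-k structural neighborhood, mean aggregation, and KL divergence. The
# overlap-based joint probabilities are an assumption for illustration.
import numpy as np
import torch

def mutual_information(neigh_i: set, neigh_j: set, num_nodes: int) -> float:
    """MI between 'selected node is a neighbor of v_i' and '... of v_j'."""
    p = np.zeros((2, 2))
    p[1, 1] = len(neigh_i & neigh_j)
    p[1, 0] = len(neigh_i - neigh_j)
    p[0, 1] = len(neigh_j - neigh_i)
    p[0, 0] = num_nodes - len(neigh_i | neigh_j)
    p /= num_nodes
    mi = 0.0
    for a in (0, 1):
        for b in (0, 1):
            if p[a, b] > 0:
                mi += p[a, b] * np.log(p[a, b] / (p[a].sum() * p[:, b].sum()))
    return mi

def local_deviation(i: int, neighbors: dict, probs: torch.Tensor, k: int = 2) -> float:
    """KL divergence between node i and its top-k structural neighborhood."""
    num_nodes = probs.size(0)
    scores = [(j, mutual_information(neighbors[i], neighbors[j], num_nodes))
              for j in neighbors[i]]
    top_k = [j for j, _ in sorted(scores, key=lambda s: -s[1])[:k]]
    neigh_dist = probs[top_k].mean(dim=0)     # mean-aggregated class distribution
    eps = 1e-12
    p, q = probs[i] + eps, neigh_dist + eps
    return float((p * (p / q).log()).sum())   # KL(node || structural neighborhood)
```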
4.2.3 Global deviation
Furthermore, we explore whether mislabeled nodes deviate from neighboring nodes from a global view, i.e., we estimate the change of a node's relative placement from the feature space to the output space. Fig.2(b) shows that the changes in the relative relations of mislabeled nodes are much greater and more unstable than those of clean nodes. Specifically, we examine the ranking variance of similar-node lists. First, we obtain a node sequence for each sample node, where the nodes are ordered by their similarity in the feature space. Then, we reorder the sequence by the similarity in the output space of the learned model and obtain a new ordered node list. A smaller ranking score means a greater change of the node's relative relationships in the output space compared with the feature space. Thus, mislabeled nodes exhibit a ranking deviation.
For a target node $v_i$, we compute the distance between nodes by cosine similarity and obtain a node list in descending order of similarity to $v_i$ in the feature space. For simplicity, the length of the node sequence is limited to $l$, and the corresponding similarity sequence $\{s_1, \ldots, s_l\}$ in the feature space is denoted by $S_i$. For the target node $v_i$, the Discounted Cumulative Gain (DCG) of $S_i$ is calculated by
$$\mathrm{DCG}_i = \sum_{j=1}^{l} \frac{s_j}{\log_2(j+1)}.$$
Accordingly, we reorder the nodes in the list by their similarity to $v_i$ in the output space, which yields a reordered sequence $S'_i = \{s'_1, \ldots, s'_l\}$; its DCG is denoted by $\mathrm{DCG}'_i$.
Definition 4.3 For each labeled node $v_i$, the global deviation, denoted by $d^g_i$, is estimated by the relative position change of $v_i$ from the feature space to the output space. It is calculated as the normalized DCG:
$$d^g_i = \frac{\mathrm{DCG}'_i}{\mathrm{DCG}_i}.$$
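A sketch of the global deviation is given below: the $l$ most similar nodes in the feature space provide the relevance values, and the score compares the DCG of the output-space reordering with the feature-space (ideal) DCG. The standard log2 discount is assumed; the paper's exact gain and discount may differ.

```python
# Sketch of the global deviation: an NDCG-style ratio between the output-space
# ordering and the feature-space ordering of the l most similar nodes.
import numpy as np

def dcg(relevances: np.ndarray) -> float:
    """Standard DCG with a log2 position discount."""
    positions = np.arange(2, len(relevances) + 2)
    return float(np.sum(relevances / np.log2(positions)))

def global_deviation(x_feat: np.ndarray, z_out: np.ndarray, i: int, l: int = 10) -> float:
    """Ratio of output-space DCG to feature-space DCG for node i."""
    def cosine_to_i(M: np.ndarray) -> np.ndarray:
        v = M[i] / (np.linalg.norm(M[i]) + 1e-12)
        U = M / (np.linalg.norm(M, axis=1, keepdims=True) + 1e-12)
        sims = U @ v
        sims[i] = -np.inf                       # exclude the node itself
        return sims

    feat_sims = cosine_to_i(x_feat)
    top = np.argsort(-feat_sims)[:l]            # l most similar nodes in feature space
    ideal_dcg = dcg(feat_sims[top])             # ranked by feature-space similarity

    out_sims = cosine_to_i(z_out)
    reorder = top[np.argsort(-out_sims[top])]   # same nodes, reordered in output space
    return dcg(feat_sims[reorder]) / (ideal_dcg + 1e-12)
```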
4.3 Self-adaptive sample network
The aforementioned training deviations can serve as features to distinguish potentially mislabeled nodes in the noisy data. Thus, we introduce a sample network that selects data and realizes adaptive data utilization to guide the training of the GNN. Based on the previous training deviations, the sample network outputs a weight for each node and suggests the data utilization for the next training step.
At each training step $t$, we obtain the corresponding deviation features $\mathbf{d}_i^t = [\,d^p_i, d^l_i, d^g_i\,]$ for each training node $v_i$. We take the sequence of training features within the time window $[t-\tau, t]$ as the input of the sample network, denoted by $\mathbf{D}_i^t$. Furthermore, we want to capture the relative position of each node among the training set on each feature dimension. Inspired by [27], we compute $\Delta\mathbf{D}_i^t$, the difference between the original feature values and their percentiles. To normalize each training feature and account for its relative importance, we calculate its percentile independently based on the data distribution; this helps to equalize the influence of different features and improve the overall performance of the model. The final training feature of node $v_i$ is the concatenation of $\mathbf{D}_i^t$ and $\Delta\mathbf{D}_i^t$.
To aggregate information across the $\tau$ time steps, we adopt a recurrent neural network, such as an LSTM [44]. Meanwhile, we add another feature to indicate the training progress, whose value is the percentage of completed epochs, and use an embedding layer to generate the epoch embedding $\mathbf{e}_t$. Finally, the sample weight is obtained by a fully connected layer followed by a sigmoid layer. Let $g_{\phi}$ denote the sample network with parameters $\phi$:
$$w_i = \mathrm{sigmoid}\Big(\mathbf{W}_f\,\big[\,\mathrm{LSTM}\big([\mathbf{D}_i^t;\Delta\mathbf{D}_i^t]\big)\,;\ \mathbf{e}_t\,\big]\Big),$$
where $w_i$ is the sample weight of $v_i$ and $\mathbf{W}_f$ is the transform matrix of the fully connected layer. Then, the final loss function is
$$\mathcal{L} = \sum_{v_i \in \mathcal{V}_L} w_i\,\ell\big(\hat{\mathbf{y}}_i, \tilde{y}_i\big).$$
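The following PyTorch sketch shows one way to realize the sample network described above: an LSTM over the windowed deviation features, an embedding of the training progress, and a linear-plus-sigmoid head producing the weight, followed by the re-weighted loss. Dimensions, the feature layout, and the normalization constant are assumptions for illustration, not the authors' implementation.

```python
# Compact sketch of the self-adaptive sample network and the re-weighted loss.
import torch
import torch.nn as nn
import torch.nn.functional as F

class SampleNetwork(nn.Module):
    def __init__(self, feat_dim: int, hid_dim: int = 64, num_buckets: int = 100):
        super().__init__()
        self.lstm = nn.LSTM(feat_dim, hid_dim, batch_first=True)
        self.epoch_emb = nn.Embedding(num_buckets, hid_dim)   # training-progress feature
        self.head = nn.Linear(2 * hid_dim, 1)

    def forward(self, feats: torch.Tensor, progress: torch.Tensor) -> torch.Tensor:
        # feats: [num_labeled, window, feat_dim], deviations plus percentile gaps
        # progress: [num_labeled] integer buckets of the training percentage
        _, (h_n, _) = self.lstm(feats)
        h = torch.cat([h_n[-1], self.epoch_emb(progress)], dim=-1)
        return torch.sigmoid(self.head(h)).squeeze(-1)        # sample weights in (0, 1)

def weighted_loss(Z, y_observed, train_idx, weights):
    # Re-weighted cross-entropy over the selected labeled nodes.
    losses = F.cross_entropy(Z[train_idx], y_observed[train_idx], reduction="none")
    return (weights * losses).sum() / (weights.sum() + 1e-12)
```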
The sample network decides the data-utilization strategy. To pre-train it, we could pre-define the strategy based on previous observations, for example de-weighting nodes with higher prediction errors, since such nodes are more likely to be mislabeled. A better way, however, is to learn the strategy from the training data. To this end, we recheck a small set of data to obtain a clean subset, construct a noisy dataset from it to train a GNN model, and collect the corresponding training features. We then set the weight of each correctly labeled node to 1 and of each mislabeled node to 0 to pre-train the sample network. It is worth noting that only a small clean subset is needed, and the sample network pre-trained on one dataset can be transferred and co-trained with GNNs on another dataset.
In practice, the sample network is not introduced at the very beginning of training, to avoid injecting noise from the training statistics of the still unstable GNN model. Meanwhile, we select only part of the data for model training: in the beginning we randomly drop out a fraction of the samples, and after some training steps we keep the samples with higher weights and set the weights of the remaining samples to 0. The procedure of training GNNs with the sample network is shown in Algorithm 1.
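An illustrative training loop following the schedule just described is sketched below: random sub-sampling during warm-up, then selection and re-weighting by the pre-trained sample network. The median-based pruning rule and the `deviation_fn` callable (expected to return the windowed deviation features) are hypothetical placeholders, not the procedure of Algorithm 1 itself.

```python
# Illustrative Soft-GNN training loop; `weighted_loss` and `SampleNetwork` refer
# to the previous sketch, and `deviation_fn` is a user-supplied placeholder.
import torch

def train_soft_gnn(gnn, sample_net, deviation_fn, A_norm, X, y_observed, train_idx,
                   epochs=200, warmup=50, keep_ratio=0.9, lr=0.01):
    opt = torch.optim.Adam(gnn.parameters(), lr=lr)
    for epoch in range(epochs):
        gnn.train()
        Z = gnn(A_norm, X)
        if epoch < warmup:
            # Warm-up: randomly drop a fraction of the labeled nodes.
            weights = (torch.rand(len(train_idx)) < keep_ratio).float()
        else:
            feats = deviation_fn(Z, train_idx)   # [num_labeled, window, feat_dim]
            progress = torch.full((len(train_idx),), int(100 * epoch / epochs),
                                  dtype=torch.long)
            with torch.no_grad():
                weights = sample_net(feats, progress)
            # Keep higher-weight samples, zero out the rest (median split assumed).
            weights = torch.where(weights >= weights.median(), weights,
                                  torch.zeros_like(weights))
        loss = weighted_loss(Z, y_observed, train_idx, weights)
        opt.zero_grad()
        loss.backward()
        opt.step()
    return gnn
```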
4.4 Time complexity
In this part, we analyze the time complexity of Soft-GNN. The additional training time comes from collecting the training statistics and computing the sample network. For simplicity, we assume the GNN and the sample network have the same number of hidden units, denoted by $h$. Collecting the training statistics that feed the sample network costs $O(|\mathcal{V}_L|(k+l)h)$, which is linear in the number of labeled nodes since $k$ and $l$ are usually small. For the sample network, the complexity of the LSTM layer and the fully connected layer is $O(|\mathcal{V}_L|\tau h^2)$, where $\tau$ is the length of the input sequence. Thus, the total extra time complexity is $O(|\mathcal{V}_L|(k+l)h + |\mathcal{V}_L|\tau h^2)$. For a 2-layer GCN, the complexity is $O(|\mathcal{E}|h)$, linear in the number of edges $|\mathcal{E}|$. The complexity of Soft-GNN is therefore $O(|\mathcal{E}|h + |\mathcal{V}_L|(k+l)h + |\mathcal{V}_L|\tau h^2)$. Because $|\mathcal{V}_L| \ll |\mathcal{E}|$ and $k$, $l$, and $\tau$ are small constants, the final time complexity is $O(|\mathcal{E}|h)$, the same as that of GCN. In summary, the additional time cost is tolerable compared with the time consumption of the GNN itself. We also present the comparison of average training time in Section 5.4.
5 Evaluations
For evaluations, we compare Soft-GNN with state-of-the-art baselines on five datasets. We also explore the effectiveness of the proposed method under different noise rates.
5.1 Experimental setups
5.1.1 Datasets
We use five benchmark datasets for evaluations: Cora, Citeseer [45], Pubmed [46], Wiki-cs [47], and Amazon computers (A-computers) [48]. The details are presented in Tab.1.
Tab.1 The statistics of five datasets
Dataset | Nodes | Edges | Class | train / val / test |
Cora | 2,810 | 15,926 | 7 | 140 / 500 / 1310 |
Citeseer | 2,110 | 7,336 | 6 | 120 / 500 / 610 |
Pubmed | 19,717 | 88,648 | 3 | 60 / 500 / 1000 |
Wiki-cs | 11,311 | 431,108 | 10 | 500 / 500 / 1000 |
A-Computers | 13,381 | 491,722 | 10 | 500 / 500 / 1000 |
5.1.2 Noise types
Since there is no public noisy dataset for node classification, we generate noisy datasets with a given noise rate. Generally, there are two types of noise: uniform noise and pair noise. Under the former, a label is corrupted to any other label with uniform probability; under the latter, a label flips to one pre-defined similar class with a certain probability. For example, if there are three classes and the corruption probability is 0.2, the class transition matrices $T_{\text{uniform}}$ and $T_{\text{pair}}$ are as follows:
$$T_{\text{uniform}} = \begin{bmatrix} 0.8 & 0.1 & 0.1 \\ 0.1 & 0.8 & 0.1 \\ 0.1 & 0.1 & 0.8 \end{bmatrix}, \qquad T_{\text{pair}} = \begin{bmatrix} 0.8 & 0.2 & 0 \\ 0 & 0.8 & 0.2 \\ 0.2 & 0 & 0.8 \end{bmatrix}.$$
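A sketch of the two noise models follows. Uniform noise spreads the corruption probability evenly over the other classes; for pair noise, the "similar" class is taken here to be the next class index, which is an arbitrary stand-in for the paper's pre-defined mapping.

```python
# Sketch of generating uniform and pair label noise at a given rate.
import numpy as np

def uniform_noise(y: np.ndarray, num_classes: int, rate: float, seed: int = 0) -> np.ndarray:
    rng = np.random.default_rng(seed)
    y_noisy = y.copy()
    flip = rng.random(len(y)) < rate
    for i in np.where(flip)[0]:
        others = [c for c in range(num_classes) if c != y[i]]
        y_noisy[i] = rng.choice(others)          # uniform over the remaining classes
    return y_noisy

def pair_noise(y: np.ndarray, num_classes: int, rate: float, seed: int = 0) -> np.ndarray:
    rng = np.random.default_rng(seed)
    y_noisy = y.copy()
    flip = rng.random(len(y)) < rate
    y_noisy[flip] = (y[flip] + 1) % num_classes  # flip to a fixed "similar" class
    return y_noisy
```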
5.1.3 Baselines
For comparison, we select several baselines, which fall into two categories: those designed for image data and those designed for graph data. The following four baselines are widely used in image classification to deal with training under noisy labels; in our experiments, we replace their prediction model with the GNN model.
● Decoupling [33]: The framework contains two independent prediction models, which are updated when their predictions disagree. We randomly choose one model to make predictions on the test set.
● Co-teaching [34]: Two prediction models are adopted and one model is updated with the samples screened by the other model. Similarly, we use one model to generate predictions in the test phase.
● Trunc [49]: The proposed loss function is a generalized cross entropy. We use the truncated version from the paper, in which samples whose loss is smaller than the pre-defined threshold are pruned.
● SACE [10]: It utilizes the intermediate output of the model to dynamically correct labels and re-weight samples during training.
The following four state-of-the-art baselines are designed for training GNNs with noisy labels.
● UnionNET [5]: The main idea is to re-weight samples and correct labels by aggregating the labels of a support node set.
● SuLD-GNN [6]: The framework focuses on label correction. It adds superclass nodes to the original graph and trains the GNN to correct labels with high prediction probability.
● NRGNN [8]: It adds edges between labeled and unlabeled nodes via an edge predictor and trains the GNN model together with the edge predictor.
● CLNode [37]: It introduces curriculum learning by sequentially training models on increasingly challenging subsets of the dataset.
In the experiments, we choose two fundamental and representative graph neural network models, GCN [9] and APPNP [36], as the basic models for investigating the performance of the proposed framework. According to [50], GCN and APPNP represent two types of GNN models with different propagation mechanisms: the former repeatedly conducts linear transformation and non-linear activation throughout multiple layers, while the latter utilizes a propagation mechanism derived from personalized PageRank and separates feature transformation from the aggregation process.
5.1.4 Experimental settings
For data splitting, we randomly choose 1500 nodes as the visible set, from which the training and validation sets are selected. For the test set, if the number of remaining nodes is less than the size of the visible set, we use all the remaining nodes; otherwise, we randomly select 1000 nodes. The number of layers is set to 2 and the number of hidden units to 64. The learning rate ranges from 0.1 to 0.01 and the dropout rate is chosen from 0.2 to 0.6. In addition, the ranges of $k$ and $l$ are [1,5] and [10,50], respectively. The epoch at which the sample network is introduced is set between 50 and 100, and the start epoch for other baselines, if needed, is set the same. For the baselines, the parameters reported in the original papers are adopted where available. For reliable results, we use five different data splits and conduct experiments 10 times for each split. All experiments are run on one Tesla P100 GPU.
5.2 Performance comparison
We choose two popular GNNs as the basic models and compare Soft-GNN with the baselines under the two types of noise. The comparison results on the node classification task under noise rate=0.2 are shown in Tab.2. It can be seen from the table that Soft-GNN performs better than the basic GNNs and achieves the highest prediction accuracy in most cases. The accuracy gain is around 2%, and exceeds 3% in some cases.
Tab.2 The Micro-F1 (%) under uniform and pair noise, noise rate=0.2. We present the results and 95% confidence interval
Dataset | Noise type | Backbone | Basic | Co-teaching | Decoupling | Trunc | SACE | UnionNET | SuLD-GNN | NRGNN | CLNode | ours |
Cora | Uniform | GCN | 73.02±0.53 | 61.87±1.05 | 72.66±0.44 | 72.62±0.49 | 69.98±0.43 | 71.32±0.72 | 73.10±0.54 | 74.43±0.52 | 75.49±0.68 | 76.25±0.57 |
APPNP | 77.64±0.77 | 64.24±1.28 | 77.69±0.76 | 74.13±1.04 | 75.02±0.95 | 73.33±1.30 | 78.16±0.77 | − | 77.85±0.71 | 78.70±0.71 |
Pair | GCN | 78.43±0.31 | 71.15±0.51 | 78.19±0.31 | 78.48±0.46 | 78.38±0.36 | 78.20±0.36 | 79.16±0.26 | 78.69±0.59 | 79.07±0.46 | 79.49±0.48 |
APPNP | 81.60±1.01 | 76.29±1.63 | 81.27±0.84 | 81.46±0.82 | 81.91±0.89 | 81.51±0.65 | 81.26±0.69 | − | 82.68±0.42 | 83.66±0.37 |
Citeseer | Uniform | GCN | 65.27±0.72 | 59.03±0.96 | 64.97±0.79 | 64.84±0.80 | 66.08±0.81 | 66.07±1.07 | 65.07±0.60 | 67.22±0.75 | 67.91±0.67 | 68.40±0.70 |
APPNP | 71.05±0.68 | 69.75±0.73 | 70.50±0.77 | 71.00±0.59 | 71.86±0.60 | 70.75±0.71 | 71.00±0.79 | − | 71.64±0.72 | 71.97±0.74 |
Pair | GCN | 65.62±0.63 | 61.31±1.16 | 65.58±0.56 | 67.00±0.75 | 66.25±0.66 | 65.80±1.15 | 65.36±0.67 | 66.56±0.82 | 67.94±0.52 | 68.11±0.56 |
APPNP | 69.46±0.87 | 67.35±0.80 | 69.12±0.80 | 69.95±0.88 | 70.07±0.62 | 69.23±0.65 | 67.44±0.77 | − | 70.78±0.44 | 71.98±0.40 |
Pubmed | Uniform | GCN | 72.36±1.16 | 71.76±1.06 | 72.29±1.11 | 72.30±1.17 | 72.84±1.17 | 72.28±1.12 | 70.48±1.52 | 68.46±1.54 | 72.96±0.99 | 73.08±1.11 |
APPNP | 74.67±2.02 | 73.88±1.75 | 74.37±2.06 | 75.07±2.05 | 74.89±1.97 | 74.35±1.95 | 72.42±2.73 | − | 74.90±1.02 | 75.65±2.31 |
Pair | GCN | 70.21±0.84 | 66.68±1.85 | 70.09±0.86 | 70.81±1.01 | 71.42±1.07 | 69.85±0.87 | 69.22±0.89 | 68.75±0.77 | 70.23±1.08 | 71.02±0.87 |
APPNP | 74.92±1.07 | 74.39±1.36 | 74.16±1.25 | 75.49±1.14 | 75.40±1.09 | 74.63±1.08 | 74.42±1.04 | − | 75.92±0.85 | 76.13±1.02 |
Wiki-cs | Uniform | GCN | 56.99±0.53 | 58.24±0.91 | 56.96±0.51 | 49.84±1.04 | 51.66±0.98 | 48.94±1.73 | 53.87±1.10 | 55.23±1.13 | 56.75±0.67 | 58.04±0.63 |
APPNP | 67.55±0.73 | 67.71±0.66 | 66.33±0.77 | 62.61±1.23 | 64.45±0.86 | 60.42±2.19 | 56.03±2.29 | − | 67.29±0.65 | 68.81±0.91 |
Pair | GCN | 55.36±0.47 | 56.39±0.45 | 55.63±0.49 | 51.34±0.83 | 52.42±0.79 | 49.68±1.37 | 53.53±1.21 | 54.37±0.74 | 55.06±0.49 | 56.87±1.04 |
APPNP | 65.72±0.41 | 66.54±0.51 | 64.56±0.49 | 61.91±0.72 | 62.94±0.62 | 58.74±1.94 | 54.79±1.69 | − | 65.11±0.50 | 66.93±0.70 |
A-computers | Uniform | GCN | 58.01±1.18 | 58.27±3.00 | 57.27±3.34 | 52.76±1.82 | 53.36±2.25 | 48.63±2.18 | 59.01±2.74 | − | 56.95±1.58 | 59.89±1.32 |
APPNP | 64.20±1.73 | 64.83±3.22 | 65.19±3.71 | 56.03±5.00 | 56.88±2.84 | 52.74±3.89 | 65.32±3.90 | − | 64.98±2.23 | 65.44±1.85 |
Pair | GCN | 69.25±1.54 | 68.52±1.79 | 68.92±1.61 | 59.91±2.83 | 60.44±2.32 | 54.01±6.77 | 68.88±1.83 | − | 68.34±1.59 | 70.43±2.21 |
APPNP | 73.98±0.75 | 71.36±1.79 | 73.30±0.88 | 62.28±2.32 | 63.71±1.88 | 63.20±3.36 | 74.51±0.74 | − | 73.80±0.99 | 75.11±0.73 |
Among the methods that train two GNN models, the performance of Co-teaching varies more across datasets than that of Decoupling: Co-teaching achieves stable improvements in prediction accuracy on Wiki-cs and A-computers, while it suffers severe performance degradation on the other three datasets. Meanwhile, some baselines are unsatisfactory on the relatively larger datasets Wiki-cs and A-computers. For example, Trunc and SACE perform worse on Wiki-cs and A-computers than on the remaining three datasets, and the same holds for UnionNET, SuLD-GNN, and CLNode. In contrast, our method continues to deliver considerable performance improvements on these more complex network structures. Wiki-cs and A-computers have more nodes, edges, and classes, and in this case Soft-GNN achieves the best results among all methods, with a performance improvement of approximately 2%. SACE corrects labels with soft labels during training, and UnionNET and SuLD-GNN include label correction modules; as the numbers of classes and training samples increase, such correction is more likely to introduce additional noise, and their performance decreases.
As for NRGNN, we find that it performs better on the relatively small datasets, Cora and Citeseer, but is not as good as the basic GNNs on Pubmed and Wiki-cs. During training, NRGNN changes the graph topology by adding edges between labeled and unlabeled nodes that are likely to be from the same class. It therefore becomes harder to make beneficial topology changes as the numbers of nodes or classes increase. Moreover, NRGNN is more susceptible to out-of-memory errors when trained on datasets with complicated graph structures, such as Amazon computers.
Comparing the two backbones, APPNP is more tolerant to label noise than GCN, with prediction accuracy around 3% to 10% higher. Accordingly, the improvements of Soft-GNN and the baselines on GCN are more pronounced than those on APPNP. Since APPNP decouples prediction from propagation, it also blocks the spread of label noise and reduces its impact during training; therefore, APPNP performs much better than GCN.
In summary, the results verify the effectiveness of Soft-GNN by finding optimal strategies for data utilization during training.
5.3 Performance analysis
5.3.1 The level of noise rate
In this part, we examine the performance of GCN under different noise rates. The results on Cora and Citeseer are presented in Fig.4. Overall, Soft-GNN performs better than the basic GNNs, and the improvements are clearer at higher noise rates. The proposed method outperforms all baselines and even improves the performance of GNNs when there is no label noise, since it realizes flexible data utilization through the sample network and avoids introducing extra noise.
Fig.4 The results under different noise types and rates on Cora and Citeseer. (a) Uniform; (b) Pair; (c) Uniform; (d) Pair
In general, the methods designed for GNNs perform better than those that ignore the dependency between nodes. Among the baselines from the image field, Co-teaching is the worst and falls below the basic GNNs in most cases, while SACE behaves similarly to GCN and performs better under heavier label noise on Citeseer. Among the frameworks designed for GNNs, SuLD-GNN is only minimally effective, and NRGNN performs better than SuLD-GNN in some cases. SuLD-GNN relies on label correction, whose effect depends on the correctness of the modified labels, so it does not always work. Compared with SuLD-GNN, SACE modifies labels with soft labels and can reduce the risk of mistakes in some cases; for example, SACE performs much better than SuLD-GNN under uniform noise on Citeseer in Fig.4(c). Similarly, NRGNN adjusts the topology during training. As Fig.4(a) shows, NRGNN helps under severe label noise but is counterproductive when the noise rate is small or equal to 0, and a similar trend can be seen in Fig.4(c). Therefore, the strategy of topology adjustment works only up to a point and introduces extra noise.
In addition, the two types of noise affect model performance differently even though the overall trends are similar: GCN is more sensitive to uniform noise than to pair noise, while it is relatively more difficult to improve the performance of GNNs under pair noise.
5.3.2 The performance of mislabeled nodes
In this part, we explore how the proposed method works by evaluating the performance on mislabeled nodes and their neighborhoods. We test Soft-GNN based on GCN under noise rate=0.4. Fig.5 illustrates the performance gains of Soft-GNN over GCN in terms of Micro-F1 and Macro-F1, evaluated against the true labels. The neighborhood of a mislabeled node consists of its top-5 structural neighboring nodes. The figure shows that Soft-GNN improves the overall prediction accuracy of GCN and also increases the performance on the mislabeled nodes and their neighborhoods. This indicates that the model trained with Soft-GNN is more robust to label noise and at the same time mitigates the propagation of noise to the corresponding neighborhoods. We also notice that the improvements on mislabeled nodes are smaller than those on their neighborhoods, which is more obvious under pair noise. A closer look at Fig.5(a) and Fig.5(b) shows that the increase in Macro-F1 is much higher than that in Micro-F1 for both mislabeled nodes and their neighborhoods. This confirms that the proposed method protects mislabeled nodes and their neighbors from the confusion caused by label noise and learns a more robust GNN model.
Fig.5 The performance gains over GCN on Citeseer under noise rate=0.4. (a) Micro-F1; (b) Macro-F1
5.4 Time and memory comparison
In this section, we compare the training time and memory efficiency of the baselines and our method. We measure the average running time for 200 epochs and the corresponding GPU memory usage. Tab.3 reports the ratios to the basic GCN with 64 hidden units. According to these ratios, the training time of Soft-GNN increases while its memory usage is almost the same as that of GCN. The time increase comes from the weight-generation step and is positively related to the number of training samples; for example, the training time on Wiki-cs is larger than on Citeseer and Pubmed.
Tab.3 The comparison of time and memory efficiency over GCN with 64 hidden units (values are ratios to GCN)
Method | Time (Citeseer) | Time (Pubmed) | Time (Wiki-cs) | GPU memory (Citeseer) | GPU memory (Pubmed) | GPU memory (Wiki-cs) |
Co-teaching | 1.38 | 1.35 | 1.31 | 1.03 | 1.05 | 1.02 |
Trunc | 0.50 | 0.44 | 0.52 | – | – | – |
UnionNET | 0.72 | 0.71 | 0.73 | – | – | – |
NRGNN | 16.18 | 70.79 | 61.09 | 1.93 | 1.91 | 9.35 |
Soft-GNN | 2.00 | 1.85 | 3.26 | 1.04 | 1.02 | – |
The other baselines, except for NRGNN, take roughly the same time and memory as GCN during training. Trunc and UnionNET even spend less time than GCN because part of the samples are truncated, which reduces the cost of backward propagation. NRGNN, by contrast, needs much more time for training, for example about 16 times as much on Citeseer, and the additional time consumption grows with the numbers of nodes and edges; training NRGNN therefore takes a significant amount of time on Pubmed and Wiki-cs, which have more nodes and edges. As for the memory usage of NRGNN, the extra usage mainly comes from the topology adjustment and depends on the number of edges. Adjusting the graph structure is thus heavy in both time and memory. In summary, Soft-GNN achieves a steady performance improvement with only a limited increase in time and memory consumption.
5.5 Hyper-parameter analysis
We explore the effect of two hyper-parameters: the number of structural neighboring nodes $k$ and the length of the node similarity sequence $l$. The range of $k$ is [1,5], and the value of $l$ is chosen from [5,10] with a step of 1 and from [10,50] with a step of 10. As shown in Fig.6, we test the prediction accuracy of Soft-GNN on Cora with uniform noise and a noise rate of 0.3. As $k$ increases, the performance drops in general, because more nodes are added to the structural neighborhood and the newly selected nodes are more likely to come from other classes. We measure the ratio of same-class node pairs between a node and its structural neighborhood, denoted as pair homophily; generally, a larger $k$ leads to lower pair homophily, so $k$ is set to a relatively small value. It is also found that a too small or too large $l$ is not a good choice, and when $k$ is smaller it is better to set $l$ below 20. Appropriate combinations take $k$ from {1,2,3} and $l$ from {5,20}. For the evaluations, we set $k=2$ and $l=10$, except for $l=20$ on the Wiki-cs dataset.
Fig.6 The performance comparison with different hyper-parameters $k$ and $l$ on Cora under noise rate=0.3 with GCN
6 Conclusion
In this paper, we analyze the impact of mislabeled nodes when training GNNs under label noise. We find that label noise causes prediction deviation and disrupts the original relationships with neighboring nodes. Based on these observations, we propose a simple yet effective framework for training robust GNNs, denoted Soft-GNN. The introduced sample network utilizes the observed training deviations and outputs sample weights to realize self-adaptive data utilization. The experimental results show that Soft-GNN learns robust GNN models under different noise types and rates, and it still brings improvements when the data is noise-free. In addition, Soft-GNN improves the performance on mislabeled nodes and their neighboring nodes. We focus on label noise as node-level data corruption; future work will pay close attention to other types of data noise.