Information Security

Fairness is essential for robustness: fair adversarial training by identifying and augmenting hard examples

  • Ningping MOU,
  • Xinli YUE,
  • Lingchen ZHAO,
  • Qian WANG
  • Key Laboratory of Aerospace Information Security and Trusted Computing (Ministry of Education), School of Cyber Science and Engineering, Wuhan University, Wuhan 430072, China
qianwang@whu.edu.cn

Received date: 22 Jul 2023

Accepted date: 16 Jan 2024

Copyright

2025 Higher Education Press

Abstract

Adversarial training is widely considered the most effective defense against adversarial attacks. However, recent studies have demonstrated that a large discrepancy exists in the class-wise robustness of adversarially trained models, leading to two potential issues: first, the overall robustness of a model is limited by its weakest class; and second, it raises ethical concerns about unequal protection and bias, since certain societal demographic groups receive weaker robustness from defense mechanisms. Despite these issues, solutions to address the discrepancy remain largely underexplored. In this paper, we advance beyond existing methods that focus on class-level solutions. Our investigation reveals that hard examples, identified by higher cross-entropy values, provide more fine-grained information about the discrepancy. Furthermore, we find that enhancing the diversity of hard examples can effectively reduce the robustness gap between classes. Motivated by these observations, we propose Fair Adversarial Training (FairAT) to mitigate the discrepancy in class-wise robustness. Extensive experiments on various benchmark datasets and adversarial attacks demonstrate that FairAT outperforms state-of-the-art methods in terms of both overall robustness and fairness. For a WRN-28-10 model trained on CIFAR10, FairAT improves the average and worst-class robustness by 2.13% and 4.50%, respectively.
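The identification step described above can be sketched in a few lines: score each training example by its per-example cross-entropy loss and treat the highest-loss examples as the "hard" ones to augment. This is a minimal illustration under our own assumptions (function names and the selection ratio are ours), not the paper's implementation:

```python
import math

def cross_entropy(probs, label):
    # Per-example cross-entropy: -log p(correct class); clamp to avoid log(0).
    return -math.log(max(probs[label], 1e-12))

def select_hard_examples(batch_probs, labels, ratio=0.25):
    """Return indices of the hardest examples (highest cross-entropy loss)."""
    losses = [cross_entropy(p, y) for p, y in zip(batch_probs, labels)]
    k = max(1, int(len(losses) * ratio))
    order = sorted(range(len(losses)), key=lambda i: losses[i], reverse=True)
    return order[:k]

# Toy softmax outputs for 4 examples over 3 classes.
probs = [
    [0.9, 0.05, 0.05],   # easy: confident and correct
    [0.4, 0.35, 0.25],   # hard: low confidence on the true class
    [0.1, 0.8, 0.1],     # easy for class 1
    [0.2, 0.2, 0.6],     # hard if the true label is 0
]
labels = [0, 0, 1, 0]
hard = select_hard_examples(probs, labels, ratio=0.5)
print(hard)  # → [3, 1]
```

The selected indices would then be fed to a diversity-enhancing augmentation (e.g., Cutout- or CutMix-style transforms) before the next adversarial training step; the exact augmentation schedule is specific to FairAT and not reproduced here.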

Cite this article

Ningping MOU, Xinli YUE, Lingchen ZHAO, Qian WANG. Fairness is essential for robustness: fair adversarial training by identifying and augmenting hard examples[J]. Frontiers of Computer Science, 2025, 19(3): 193803. DOI: 10.1007/s11704-024-3587-1

Acknowledgements

This work was partially supported by the National Natural Science Foundation of China (Grant Nos. U20B2049, U21B2018, and 62302344).

Competing interests

The authors declare that they have no competing interests or financial conflicts to disclose.
