CMAGAN: classifier-aided minority augmentation generative adversarial networks for industrial imbalanced data and its application to fault prediction

Wen-Jie Wang, Zhao Liu, Ping Zhu

Advances in Manufacturing, 2024, Vol. 12, Issue 3: 603-618. DOI: 10.1007/s40436-024-00496-y

Abstract

Class imbalance is a common characteristic of industrial data that adversely affects industrial data mining because it leads to the biased training of machine learning models. To address this issue, the augmentation of samples in minority classes based on generative adversarial networks (GANs) has been demonstrated as an effective approach. This study proposes a novel GAN-based minority class augmentation approach named classifier-aided minority augmentation generative adversarial network (CMAGAN). In the CMAGAN framework, an outlier elimination strategy is first applied to each class to minimize the negative impacts of outliers. Subsequently, a newly designed boundary-strengthening learning GAN (BSLGAN) is employed to generate additional samples for minority classes. By incorporating a supplementary classifier and innovative training mechanisms, the BSLGAN focuses on learning the distribution of samples near classification boundaries. Consequently, it can fully capture the characteristics of the target class and generate highly realistic samples with clear boundaries. Finally, the new samples are filtered based on the Mahalanobis distance to ensure that they are within the desired distribution. To evaluate the effectiveness of the proposed approach, CMAGAN was used to solve the class imbalance problem in eight real-world fault-prediction applications. The performance of CMAGAN was compared with that of seven other algorithms, including state-of-the-art GAN-based methods, and the results indicated that CMAGAN could provide higher-quality augmented results.
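
The final filtering step can be illustrated with a minimal sketch. The abstract states only that generated samples are filtered by Mahalanobis distance; the acceptance rule used here (keep a generated sample if its distance to the real minority class does not exceed a chosen quantile of the real samples' own distances) is an assumption, as are the function names `mahalanobis_distances` and `filter_generated`, the `quantile` parameter, and the synthetic data. It is not the authors' implementation.

```python
import numpy as np


def mahalanobis_distances(points, reference):
    """Mahalanobis distance of each row of `points` to the distribution of `reference`."""
    mean = reference.mean(axis=0)
    cov = np.cov(reference, rowvar=False)
    # Pseudo-inverse tolerates a singular covariance (e.g. collinear features).
    cov_inv = np.linalg.pinv(cov)
    diff = points - mean
    squared = np.einsum("ij,jk,ik->i", diff, cov_inv, diff)
    return np.sqrt(np.clip(squared, 0.0, None))


def filter_generated(generated, real_minority, quantile=0.95):
    """Keep generated samples that lie within the assumed distance threshold.

    Assumed rule (not from the paper): the threshold is the given quantile of
    the distances of the real minority samples to their own distribution.
    """
    real_dist = mahalanobis_distances(real_minority, real_minority)
    threshold = np.quantile(real_dist, quantile)
    keep = mahalanobis_distances(generated, real_minority) <= threshold
    return generated[keep]


# Synthetic stand-ins for real minority samples and GAN output (hypothetical data).
rng = np.random.default_rng(0)
real = rng.normal(0.0, 1.0, size=(200, 3))
candidates = np.vstack([
    rng.normal(0.0, 1.0, size=(50, 3)),  # plausible generated samples
    np.full((5, 3), 10.0),               # samples far outside the distribution
])
kept = filter_generated(candidates, real)
```

In this sketch the five candidates placed at (10, 10, 10) fall far beyond the threshold and are discarded, while most of the plausible candidates are retained. A stricter or looser `quantile` trades sample count against fidelity to the real minority distribution.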

Keywords

Class imbalance / Minority class augmentation / Generative adversarial network (GAN) / Boundary strengthening learning (BSL) / Fault prediction

Cite this article

Wen-Jie Wang, Zhao Liu, Ping Zhu. CMAGAN: classifier-aided minority augmentation generative adversarial networks for industrial imbalanced data and its application to fault prediction. Advances in Manufacturing, 2024, 12(3): 603-618. DOI: 10.1007/s40436-024-00496-y

Funding

National Natural Science Foundation of China (52375256), http://dx.doi.org/10.13039/501100001809

Natural Science Foundation of Shanghai Municipality (21ZR1431500), http://dx.doi.org/10.13039/100007219
