Cancer classification with data augmentation based on generative adversarial networks

Kaimin WEI; Tianqi LI; Feiran HUANG; Jinpeng CHEN; Zefan HE

doi:10.1007/s11704-020-0025-x

Front. Comput. Sci., 2022, Vol. 16, Issue 2: 162601. DOI: 10.1007/s11704-020-0025-x

Information Systems

RESEARCH ARTICLE


Kaimin WEI ¹,², Tianqi LI ¹,², Feiran HUANG ¹,²,†, Jinpeng CHEN ³, Zefan HE ¹,²


Abstract

Accurate diagnosis is a significant step in cancer treatment. Machine learning can support doctors in prognosis decision-making, but its performance is often weakened by the high dimensionality and small sample size of genetic data. Deep learning can process high-dimensional data effectively. However, the problem of inadequate data remains unsolved and lowers the performance of deep learning. To address this problem, we propose a generative adversarial model that uses non-target cancer data to help train the target generator. We use a reconstruction loss to further stabilize model training and improve the quality of the generated samples. We also present a cancer classification model to optimize classification performance. Experimental results show that the mean absolute error of the cancer gene data generated by our model is 19.3% lower than that of DC-GAN, and that the classification accuracy obtained with our generated data is higher than that obtained with data created by GAN. The classification accuracy of our classification model reaches 92.6%, which is 7.6% higher than that of the model trained without any generated data.
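The abstract compares generators by mean absolute error (MAE) and reports one model's error as a percentage lower than another's. As a minimal sketch of how such a comparison can be computed, the Python below defines MAE over paired values and a relative reduction between two errors. The function names and the toy expression values are assumptions made for illustration; they are not taken from the paper, and the paper's own evaluation protocol may differ.

```python
from typing import Sequence


def mean_absolute_error(real: Sequence[float], generated: Sequence[float]) -> float:
    """Average absolute difference between paired values."""
    if len(real) != len(generated):
        raise ValueError("inputs must have the same length")
    if not real:
        raise ValueError("inputs must be non-empty")
    return sum(abs(r - g) for r, g in zip(real, generated)) / len(real)


def relative_reduction(baseline: float, ours: float) -> float:
    """Fraction by which `ours` is lower than `baseline` (0.25 means 25% lower)."""
    if baseline == 0:
        raise ValueError("baseline must be non-zero")
    return (baseline - ours) / baseline


# Toy expression values, invented for illustration only (not from the paper).
real_profile = [0.0, 1.0, 2.0, 4.0]
baseline_generated = [1.0, 2.0, 3.0, 5.0]  # every gene is off by 1.0
improved_generated = [0.5, 1.5, 2.5, 4.5]  # every gene is off by 0.5

baseline_mae = mean_absolute_error(real_profile, baseline_generated)
improved_mae = mean_absolute_error(real_profile, improved_generated)
reduction = relative_reduction(baseline_mae, improved_mae)
print(baseline_mae, improved_mae, reduction)
```

A relative reduction is only one reading of "X% lower"; an absolute difference in percentage points is another, and the abstract does not say which convention its accuracy figure uses.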


Keywords

data mining / cancer data analysis / deep learning / generative adversarial networks

Cite this article


Kaimin WEI, Tianqi LI, Feiran HUANG, Jinpeng CHEN, Zefan HE. Cancer classification with data augmentation based on generative adversarial networks. Front. Comput. Sci., 2022, 16(2): 162601. DOI: 10.1007/s11704-020-0025-x

