Learning label-specific features for decomposition-based multi-class classification

Bin-Bin JIA; Jun-Ying LIU; Jun-Yi HANG; Min-Ling ZHANG

doi:10.1007/s11704-023-3076-y

Front. Comput. Sci., 2023, Vol. 17, Issue 6: 176348. DOI: 10.1007/s11704-023-3076-y

Artificial Intelligence

RESEARCH ARTICLE

Abstract

Multi-class classification can be solved by decomposing it into a set of binary classification problems according to some encoding rule, e.g., one-vs-one, one-vs-rest, or error-correcting output codes. Existing works solve these binary classification problems in the original feature space, which might be suboptimal because different binary classification problems correspond to different positive and negative examples. In this paper, we propose to learn label-specific features for each decomposed binary classification problem so as to account for the specific characteristics contained in its positive and negative examples. Specifically, to generate the label-specific features, clustering analysis is conducted separately on the positive and negative examples in each decomposed binary data set to discover their inherent information, and the label-specific features for an example are then obtained by measuring the similarity between it and all cluster centers. Experiments validate the effectiveness of learning label-specific features for decomposition-based multi-class classification.
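The feature-generation step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a one-vs-rest decomposition, plain k-means as the clustering method, Euclidean distance to each cluster center as the similarity measure, and a hypothetical `ratio` parameter that sets the number of clusters. All function names are illustrative.

```python
import math
import random


def kmeans(points, k, iters=20, seed=0):
    """Plain Lloyd's k-means; returns k cluster centers as lists of floats."""
    rng = random.Random(seed)
    centers = [list(p) for p in rng.sample(points, k)]
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda j: math.dist(p, centers[j]))
            groups[nearest].append(p)
        for j, g in enumerate(groups):
            if g:  # keep the previous center if a cluster became empty
                centers[j] = [sum(col) / len(g) for col in zip(*g)]
    return centers


def label_specific_mapping(X, y_binary, ratio=0.5, seed=0):
    """Build a feature mapping for one decomposed binary problem.

    Assumes labels in {+1, -1} and at least one example of each sign.
    The cluster count (ratio * size of the smaller side) is an assumption.
    """
    pos = [x for x, t in zip(X, y_binary) if t == 1]
    neg = [x for x, t in zip(X, y_binary) if t == -1]
    k = max(1, math.ceil(ratio * min(len(pos), len(neg))))
    # Cluster positives and negatives separately, then pool the centers.
    centers = kmeans(pos, k, seed=seed) + kmeans(neg, k, seed=seed)
    # New representation: distance from the example to every center.
    return lambda x: [math.dist(x, c) for c in centers]


def one_vs_rest_mappings(X, y, ratio=0.5):
    """One label-specific mapping per class under one-vs-rest decomposition."""
    mappings = {}
    for c in sorted(set(y)):
        y_bin = [1 if t == c else -1 for t in y]
        mappings[c] = label_specific_mapping(X, y_bin, ratio)
    return mappings


if __name__ == "__main__":
    X = [[0, 0], [0, 1], [5, 5], [5, 6], [10, 0], [10, 1]]
    y = [0, 0, 1, 1, 2, 2]
    mappings = one_vs_rest_mappings(X, y)
    # Each binary problem sees the same example through its own features.
    for c, phi in mappings.items():
        print(c, phi([0, 0.5]))
```

In this sketch, each binary classifier would be trained on the output of its own mapping rather than on the original features. Other decompositions (one-vs-one, error-correcting output codes) would change only how `y_bin` is built and which examples are included.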

Keywords

machine learning / multi-class classification / error-correcting output codes / label-specific features

Cite this article

Bin-Bin JIA, Jun-Ying LIU, Jun-Yi HANG, Min-Ling ZHANG. Learning label-specific features for decomposition-based multi-class classification. Front. Comput. Sci., 2023, 17(6): 176348 https://doi.org/10.1007/s11704-023-3076-y

Bin-Bin Jia received the bachelor’s degree from North China Electric Power University, China in 2010, and the master’s degree from Beihang University, China in 2013. He joined Lanzhou University of Technology, China in 2013, where he is currently an assistant professor. From September 2017 to March 2022, he studied at Southeast University, where he received the PhD degree. His main research interests include machine learning and data mining.

Jun-Ying Liu received the bachelor’s degree from North China Electric Power University, China in 2010, and the master’s degree from Beijing Jiaotong University, China in 2012. Currently, she is an assistant professor at the College of Electrical and Information Engineering, Lanzhou University of Technology, China. Her main research interests include machine learning and data mining.

Jun-Yi Hang received the BSc and MSc degrees from Beihang University, China in 2017 and 2020, respectively. Currently, he is a PhD student at the School of Computer Science and Engineering, Southeast University, China. His main research interests include machine learning and data mining, especially learning from multi-label data.

Min-Ling Zhang received the BSc, MSc, and PhD degrees in computer science from Nanjing University, China in 2001, 2004 and 2007, respectively. Currently, he is a Professor at the School of Computer Science and Engineering, Southeast University, China. His main research interests include machine learning and data mining. In recent years, Dr. Zhang has served as General Co-Chair of ACML’18, Program Co-Chair of PAKDD’19, CCF-ICAI’19, ACML’17, CCFAI’17, and PRICAI’16, and Senior PC member or Area Chair of AAAI 2022-2024, IJCAI 2017-2023, KDD 2021-2023, ICDM 2015-2022, etc. He is also on the editorial boards of IEEE Transactions on Pattern Analysis and Machine Intelligence, ACM Transactions on Intelligent Systems and Technology, Neural Networks, Science China Information Sciences, Frontiers of Computer Science, etc. Dr. Zhang is a Steering Committee Member of ACML and PAKDD, Vice Chair of the CAAI Machine Learning Society, and a standing committee member of the CCF Artificial Intelligence & Pattern Recognition Society. He is a Distinguished Member of CCF and CAAI, and a Senior Member of AAAI, ACM, and IEEE.

Acknowledgements

The authors wish to thank the associate editor and anonymous reviewers for their helpful comments and suggestions. This work was supported by the National Natural Science Foundation of China (Grant No. 62225602).