Robust Simplex-Based Multinomial Logistic Regression

Shunqin Zhang , Sanguo Zhang , Hang Li , Sheng Fu

Communications in Mathematics and Statistics: 1-37.

DOI: 10.1007/s40304-025-00459-0

Abstract

Multicategory logistic regression (MLR) is one of the most popular large-margin classifiers in machine learning. Although it has been applied successfully in various fields, some intrinsic problems remain. In particular, existing MLR models suffer from either over-specified decision functions or dependence on the choice of reference category, and they are very sensitive to potential outliers because the loss function is unbounded. In this article, using the widely used simplex-based framework, we propose three robust MLR models based on a truncated loss function, weighted learning, and label-adjusted learning, respectively. The first two approaches achieve robustness by removing potential outliers, whereas the third achieves it by adaptively relabeling them. Theoretical properties, including Fisher consistency, probability estimation, and the breakdown point, are established. Extensive numerical studies demonstrate that the proposed methods are very competitive on problems with potential outliers.
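The abstract does not state formulas, so the following is only an illustrative sketch of two ingredients it names, not the authors' definitions. It assumes the regular-simplex class coding commonly used in the angle-based literature (k classes mapped to k unit vectors in R^(k-1)) and a logistic loss capped at a truncation point. The function names and the truncation parameter `s` are hypothetical choices made for this example.

```python
import math


def simplex_vertices(k):
    """Return k unit vectors in R^(k-1) forming a regular simplex centered at 0.

    Assumed construction (standard in angle-based classification), not taken
    from the article itself.
    """
    if k < 2:
        raise ValueError("need at least two classes")
    d = k - 1
    shift = -(1.0 + math.sqrt(k)) / d ** 1.5
    scale = math.sqrt(k / d)
    vertices = [[1.0 / math.sqrt(d)] * d]  # vertex for the first class
    for j in range(d):  # vertices for the remaining k - 1 classes
        v = [shift] * d
        v[j] += scale
        vertices.append(v)
    return vertices


def logistic_loss(u):
    """Numerically stable log(1 + exp(-u)); unbounded as u -> -infinity."""
    if u >= 0:
        return math.log1p(math.exp(-u))
    return -u + math.log1p(math.exp(u))


def truncated_logistic_loss(u, s=-1.0):
    """Logistic loss capped at its value at s (hypothetical truncation point).

    Points with margin below s all incur the same bounded loss, which limits
    the influence of gross outliers.
    """
    return min(logistic_loss(u), logistic_loss(s))


def margin(f, y, vertices):
    """Inner product between a decision vector f in R^(k-1) and the vertex of class y."""
    return sum(a * b for a, b in zip(f, vertices[y]))
```

With this coding, a point is assigned to the class whose vertex has the largest inner product with the decision vector, so no reference category has to be chosen and no sum-to-zero constraint is needed. A badly misclassified point with margin -50 contributes the same loss as one with margin `s`, instead of a loss of roughly 50.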

Keywords

Fisher consistency / Multicategory classification / Probability estimation / Robustness / Simplex-based framework

Mathematics Subject Classification: 62H30 / 62J12 / 68Q32

Cite this article

Shunqin Zhang, Sanguo Zhang, Hang Li, Sheng Fu. Robust Simplex-Based Multinomial Logistic Regression. Communications in Mathematics and Statistics: 1-37. DOI: 10.1007/s40304-025-00459-0

Funding

Guangxi Key Research and Development Program (2020AB10023)

National Natural Science Foundation of China (12171454)

RIGHTS & PERMISSIONS

School of Mathematical Sciences, University of Science and Technology of China and Springer-Verlag GmbH Germany, part of Springer Nature
