Learning rates for multi-kernel linear programming classifiers

Feilong Cao , Xing Xing

Front. Math. China, 2011, 6(2): 203-219. DOI: 10.1007/s11464-011-0103-3

RESEARCH ARTICLE

Abstract

In this paper, we consider the learning rates of multi-kernel linear programming classifiers. Our analysis shows that the convergence behavior of multi-kernel linear programming classifiers is almost the same as that of multi-kernel quadratic programming classifiers. This is achieved by constructing a stepping stone between the linear programming and the quadratic programming schemes. Upper bounds on the learning rate are presented both for general probability distributions and for distributions satisfying a Tsybakov noise condition.
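For orientation, the display below is a minimal sketch of the two schemes being compared, written in the single-kernel case and with generic notation that is not taken from the paper (sample z, hinge loss, regularization parameter λ); it follows the standard coefficient-based formulations discussed in [16]. The linear programming classifier penalizes the ℓ¹-norm of the coefficients of a kernel expansion, whereas the quadratic programming classifier penalizes the squared RKHS norm.

\[
f^{\mathrm{LP}}_{z,\lambda} = \sum_{i=1}^{m}\alpha^{z}_{i}\,K(x_i,\cdot), \qquad
\alpha^{z} \in \arg\min_{\alpha\in\mathbb{R}^{m}}
\Bigl\{\frac{1}{m}\sum_{i=1}^{m}\Bigl(1-y_i\sum_{j=1}^{m}\alpha_j K(x_j,x_i)\Bigr)_{+}
+\lambda\sum_{i=1}^{m}|\alpha_i|\Bigr\},
\]
\[
f^{\mathrm{QP}}_{z,\lambda} \in \arg\min_{f\in\mathcal{H}_K}
\Bigl\{\frac{1}{m}\sum_{i=1}^{m}\bigl(1-y_i f(x_i)\bigr)_{+}
+\lambda\,\|f\|_{K}^{2}\Bigr\},
\]

where z = {(x_i, y_i)}_{i=1}^m ⊂ X × {−1, 1} is the sample, K is a Mercer kernel with reproducing kernel Hilbert space H_K, (t)_+ = max{t, 0} is the hinge loss, and λ > 0 is the regularization parameter. In the multi-kernel setting of the paper, the minimization additionally ranges over a family of kernels rather than a single fixed K.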

Keywords

Multi-kernel / linear programming / learning rate / classification

Cite this article

Feilong Cao, Xing Xing. Learning rates for multi-kernel linear programming classifiers. Front. Math. China, 2011, 6(2): 203-219. DOI: 10.1007/s11464-011-0103-3


References

[1] Anthony M., Bartlett P. L. Neural Network Learning: Theoretical Foundations, 1999, Cambridge: Cambridge University Press

[2] Aronszajn N. Theory of reproducing kernels. Trans Amer Math Soc, 1950, 68, 337-404

[3] Bartlett P. L. The sample complexity of pattern classification with neural networks: the size of the weights is more important than the size of the network. IEEE Trans Inform Theory, 1998, 44, 525-536

[4] Bartlett P. L., Jordan M. I., McAuliffe J. D. Convexity, classification, and risk bounds. J Amer Statist Assoc, 2006, 101, 138-156

[5] Chen D. R., Wu Q., Ying Y., Zhou D. X. Support vector machine soft margin classifiers: error analysis. J Machine Learning Research, 2004, 5, 1143-1175

[6] Cucker F., Smale S. On the mathematical foundations of learning. Bull Amer Math Soc, 2001, 39, 1-49

[7] Devroye L., Györfi L., Lugosi G. A Probabilistic Theory of Pattern Recognition, 1997, Berlin: Springer-Verlag

[8] Lugosi G., Vayatis N. On the Bayes-risk consistency of regularized boosting methods. Ann Statist, 2004, 32, 30-55

[9] Pollard D. Convergence of Stochastic Processes, 1984, Berlin: Springer-Verlag

[10] Scovel C., Steinwart I. Fast Rates for Support Vector Machines, 2005, Berlin/Heidelberg: Springer-Verlag

[11] Smale S., Zhou D. X. Shannon sampling and function reconstruction from point values. Bull Amer Math Soc, 2004, 41, 279-305

[12] Tsybakov A. B. Optimal aggregation of classifiers in statistical learning. Ann Statist, 2004, 32, 135-166

[13] Vapnik V. Statistical Learning Theory, 1998, New York: John Wiley & Sons

[14] Wahba G. Spline Models for Observational Data, 1990, Philadelphia: SIAM

[15] Wu Q., Ying Y., Zhou D. X. Multi-kernel regularized classifiers. J Complexity, 2007, 23, 108-134

[16] Wu Q., Zhou D. X. SVM soft margin classifiers: linear programming versus quadratic programming. Neural Computation, 2005, 17, 1160-1187

[17] Zhang T. Covering number bounds of certain regularized linear function classes. J Machine Learning Research, 2002, 2, 527-550

[18] Zhang T. Statistical behavior and consistency of classification methods based on convex risk minimization. Ann Statist, 2004, 32, 56-85
