Frontiers of Mathematics in China
An oracle inequality for regularized risk minimizers with strongly mixing observations
Received date: 06 Jun 2011
Accepted date: 29 Sep 2012
Published date: 01 Apr 2013
We establish a general oracle inequality for regularized risk minimizers with strongly mixing observations and apply it to support vector machine (SVM) type algorithms. The main results extend previously known results for independent and identically distributed samples to the case of exponentially strongly mixing observations.
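For context, a regularized risk minimizer over a reproducing kernel Hilbert space (RKHS) is commonly written as below. The notation (loss L, sample {(x_i, y_i)}, regularization parameter λ, RKHS H_K) is standard and assumed here rather than taken from the paper, and the second display is only a generic template for the kind of oracle inequality the abstract refers to, not the paper's actual bound.

$$ f_{z,\lambda} \;=\; \arg\min_{f \in \mathcal{H}_K} \left\{ \frac{1}{n} \sum_{i=1}^{n} L\bigl(y_i, f(x_i)\bigr) + \lambda \|f\|_{K}^{2} \right\}, $$

$$ \mathcal{R}(f_{z,\lambda}) - \mathcal{R}^{*} \;\le\; C \left( \inf_{f \in \mathcal{H}_K} \left[ \mathcal{R}(f) - \mathcal{R}^{*} + \lambda \|f\|_{K}^{2} \right] + \varepsilon(n,\lambda,\delta) \right) \quad \text{with probability at least } 1-\delta, $$

where R denotes the expected risk, R* the Bayes risk, and the stochastic term ε(n, λ, δ) typically depends on the sample size (for dependent data, on an effective sample size determined by the mixing rate) and on the capacity of H_K.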
Feilong CAO, Xing XING. An oracle inequality for regularized risk minimizers with strongly mixing observations[J]. Frontiers of Mathematics in China, 2013, 8(2): 301-315. DOI: 10.1007/s11464-013-0247-4