REVIEW ARTICLE

An oracle inequality for regularized risk minimizers with strongly mixing observations

  • Feilong CAO,
  • Xing XING
  • Institute of Metrology and Computational Science, China Jiliang University, Hangzhou 310018, China

Received date: 06 Jun 2011

Accepted date: 29 Sep 2012

Published date: 01 Apr 2013

Copyright

2014 Higher Education Press and Springer-Verlag Berlin Heidelberg

Abstract

We establish a general oracle inequality for regularized risk minimizers with strongly mixing observations and apply it to support vector machine (SVM) type algorithms. The main results extend previously known results for independent and identically distributed samples to the case of exponentially strongly mixing observations.

Cite this article

Feilong CAO, Xing XING. An oracle inequality for regularized risk minimizers with strongly mixing observations[J]. Frontiers of Mathematics in China, 2013, 8(2): 301-315. DOI: 10.1007/s11464-013-0247-4

