Frontiers of Mathematics in China
Law of iterated logarithm and model selection consistency for generalized linear models with independent and dependent responses
Received date: 23 Apr 2020
Accepted date: 09 Jan 2021
Published date: 15 Jun 2021
We study the law of the iterated logarithm (LIL) for the maximum likelihood estimator of the parameters (computed as the solution of a convex optimization problem) in generalized linear models with independent or weakly dependent (mixing) responses under mild conditions. The LIL is used to derive asymptotic bounds for the discrepancy between the empirical process of the log-likelihood function and the true log-likelihood. As an application of the LIL, the strong consistency of some penalized likelihood-based model selection criteria can be shown. Under some regularity conditions, such a criterion selects the simplest correct model almost surely when the penalty term increases with the model dimension and has an order higher than O(log log n) but lower than O(n). Simulation studies are carried out to verify the selection consistency of the Bayesian information criterion.
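The selection rule described in the abstract can be illustrated with a small simulation. The sketch below is not the authors' simulation design; it assumes a Poisson regression with canonical log link, an exhaustive search over covariate subsets, and the Bayesian information criterion penalty log n per parameter, which grows faster than log log n and slower than n. All function names, the true coefficient vector, and the sample sizes are hypothetical choices made for illustration.

```python
import itertools
import numpy as np


def poisson_loglik(X, y, beta):
    """Poisson log-likelihood (log link), dropping the constant log(y!) term."""
    eta = X @ beta
    return float(y @ eta - np.exp(eta).sum())


def fit_poisson(X, y, n_iter=100, tol=1e-8):
    """Maximum likelihood fit by Newton's method with step halving.

    Assumes column 0 of X is the intercept. Returns (beta_hat, loglik).
    """
    beta = np.zeros(X.shape[1])
    beta[0] = np.log(y.mean() + 1e-8)  # intercept-only MLE as starting point
    ll = poisson_loglik(X, y, beta)
    for _ in range(n_iter):
        mu = np.exp(X @ beta)
        # Newton direction: (X' diag(mu) X)^{-1} X'(y - mu)
        step = np.linalg.solve(X.T @ (X * mu[:, None]), X.T @ (y - mu))
        t = 1.0
        # Halve the step until the log-likelihood does not decrease.
        while poisson_loglik(X, y, beta + t * step) < ll and t > 1e-8:
            t *= 0.5
        beta = beta + t * step
        new_ll = poisson_loglik(X, y, beta)
        converged = new_ll - ll < tol
        ll = new_ll
        if converged:
            break
    return beta, ll


def select_model(X, y, penalty):
    """Return the covariate subset minimising -2*loglik + penalty * (#parameters).

    The intercept (column 0) is always included; subsets are tuples of the
    remaining column indices.
    """
    p = X.shape[1]
    best_score, best_subset = np.inf, None
    for k in range(p):
        for subset in itertools.combinations(range(1, p), k):
            cols = [0, *subset]
            _, ll = fit_poisson(X[:, cols], y)
            score = -2.0 * ll + penalty * len(cols)
            if score < best_score:
                best_score, best_subset = score, subset
    return best_subset


def bic_hit_rate(n=1000, n_rep=20, seed=0):
    """Fraction of replications in which BIC picks exactly the true covariates."""
    rng = np.random.default_rng(seed)
    beta_true = np.array([0.5, 0.8, -0.6, 0.0, 0.0])  # hypothetical truth
    true_subset = (1, 2)
    hits = 0
    for _ in range(n_rep):
        X = np.column_stack([np.ones(n), rng.standard_normal((n, 4))])
        y = rng.poisson(np.exp(X @ beta_true))
        hits += select_model(X, y, penalty=np.log(n)) == true_subset
    return hits / n_rep


if __name__ == "__main__":
    print("BIC hit rate:", bic_hit_rate())
```

Passing `penalty=2.0` to `select_model` gives an AIC-type criterion, whose constant penalty falls outside the range stated in the abstract; rerunning the experiment with that value is one way to compare the two regimes empirically.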
Xiaowei YANG, Shuang SONG, Huiming ZHANG. Law of iterated logarithm and model selection consistency for generalized linear models with independent and dependent responses[J]. Frontiers of Mathematics in China, 2021, 16(3): 825-856. DOI: 10.1007/s11464-021-0900-2