Bayesian Analysis of Two-Part Latent Variable Model with Mixed Data

Shuang-Can Xiong, Ye-Mao Xia, Bin Lu

Communications in Mathematics and Statistics: 1-37. DOI: 10.1007/s40304-023-00359-1


Abstract

The two-part model is a widely used tool for analyzing semi-continuous data. It comprises two components: one characterizes the mixing proportion of zeros, and the other the actual level of the positive values. The primary interest in such a model is usually the dependence of the semi-continuous variables on the observed covariates; as a result, unobserved heterogeneity is sometimes ignored. In this paper, we extend the conventional two-part regression model to a more general setting in which multiple latent factors account for the latent heterogeneity arising from absent covariates. A structural equation is constructed to describe the interrelationships among the latent factors. Moreover, a general statistical analysis procedure is developed to accommodate semi-continuous, ordered, and unordered data simultaneously. Parameter estimation and model assessment are carried out within a Bayesian framework. Empirical results, including a simulation study and a real example, illustrate the proposed methodology.
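As a rough illustration of the conventional two-part regression model that the abstract takes as its starting point (not the latent variable extension proposed in the paper), the sketch below simulates semi-continuous data and evaluates the corresponding log-likelihood. The logistic link for the zero/positive part, the log-normal distribution for the positive part, the single covariate, and all function and parameter names are assumptions made for this example only.

```python
import math
import random


def simulate_two_part(n, alpha, beta, sigma, seed=0):
    """Simulate n (x, y) pairs from a simple two-part model (illustrative only).

    Part 1 (occurrence): P(y > 0 | x) = logistic(alpha[0] + alpha[1] * x)
    Part 2 (level):      log(y) | y > 0, x ~ Normal(beta[0] + beta[1] * x, sigma^2)
    """
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        x = rng.gauss(0.0, 1.0)  # one observed covariate
        p_positive = 1.0 / (1.0 + math.exp(-(alpha[0] + alpha[1] * x)))
        if rng.random() < p_positive:
            # Positive value: log-normal level
            y = math.exp(rng.gauss(beta[0] + beta[1] * x, sigma))
        else:
            # Structural zero
            y = 0.0
        data.append((x, y))
    return data


def two_part_loglik(data, alpha, beta, sigma):
    """Log-likelihood of the two-part model above for a list of (x, y) pairs."""
    ll = 0.0
    for x, y in data:
        eta = alpha[0] + alpha[1] * x
        if y > 0.0:
            # log P(y > 0 | x) under the logistic link
            ll += -math.log1p(math.exp(-eta))
            # log-normal density of y, including the Jacobian term -log(y)
            z = (math.log(y) - (beta[0] + beta[1] * x)) / sigma
            ll += -0.5 * z * z - math.log(sigma) - 0.5 * math.log(2.0 * math.pi) - math.log(y)
        else:
            # log P(y = 0 | x)
            ll += -math.log1p(math.exp(eta))
    return ll


if __name__ == "__main__":
    sample = simulate_two_part(1000, alpha=(0.0, 1.0), beta=(1.0, 0.5), sigma=0.8)
    share_zero = sum(1 for _, y in sample if y == 0.0) / len(sample)
    print("share of zeros:", share_zero)
    print("log-likelihood at the generating values:",
          two_part_loglik(sample, (0.0, 1.0), (1.0, 0.5), 0.8))
```

Because the likelihood factorizes into an occurrence term and a level term, the two parts can be handled separately in this simple setting; that separation no longer holds once shared latent factors link the two parts, which is the setting the paper addresses.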

Keywords

Two-part latent variable model / Gibbs sampler / Model comparison / Household finance


