On the use of kernel machines for Mendelian randomization
Weiming Zhang, Debashis Ghosh
Background: Properly adjusting for unmeasured confounders is critical in health studies for valid testing and estimation of an exposure’s causal effect on outcomes. The instrumental variable (IV) method has long been used in econometrics to estimate causal effects in the presence of unmeasured confounders. Mendelian randomization (MR), which uses genetic variants as the instrumental variables, applies the IV method to biomedical research and has become popular in recent years. A commonly used estimator of causal effects in IV analysis and MR is the two-stage least squares (TSLS) estimator. The validity of TSLS relies on accurate prediction of the exposure from the IVs in its first stage.
Results: In this note, we propose to model the link between the exposure and the genetic IVs using the least-squares kernel machine (LSKM). Simulation studies are used to evaluate the feasibility of LSKM in the TSLS setting.
Conclusions: Our results show that LSKM based on either the genotype score or the genotypes can be used effectively in TSLS. It may provide higher power when the association between the exposure and the genetic IVs is nonlinear.
Keywords: Mendelian randomization / kernel machine / instrumental variable / unmeasured confounder / causal inference
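
As a rough illustration of the two-stage idea summarized in the abstract, the following Python sketch simulates an exposure that depends nonlinearly on a genotype score, then compares naive least squares, TSLS with a linear first stage, and TSLS with a kernel first stage. It is a minimal sketch under stated assumptions, not the authors' implementation: the LSKM fit is approximated here by Gaussian-kernel ridge regression with fixed, hypothetical tuning values (rho, lam), and the simulation design, function names, and parameter values are all invented for illustration.

```python
import numpy as np


def gaussian_kernel(A, B, rho):
    """Gaussian kernel matrix with K[i, j] = exp(-||A[i] - B[j]||^2 / rho)."""
    sq = (A ** 2).sum(axis=1)[:, None] + (B ** 2).sum(axis=1)[None, :] - 2.0 * A @ B.T
    return np.exp(-np.maximum(sq, 0.0) / rho)


def kernel_first_stage(G, x, rho, lam):
    """First stage: fitted exposure from a kernel ridge fit of x on genotypes G.

    rho (bandwidth) and lam (penalty) are fixed by hand here; this is an
    assumption of the sketch, not a recommended tuning rule.
    """
    K = gaussian_kernel(G, G, rho)
    x_bar = x.mean()
    alpha = np.linalg.solve(K + lam * np.eye(len(x)), x - x_bar)
    return x_bar + K @ alpha


def linear_first_stage(G, x):
    """First stage: fitted exposure from ordinary least squares of x on G."""
    D = np.column_stack([np.ones(len(x)), G])
    coef, *_ = np.linalg.lstsq(D, x, rcond=None)
    return D @ coef


def slope(z, y):
    """Slope from a least-squares regression of y on z with an intercept."""
    zc = z - z.mean()
    return float(zc @ (y - y.mean()) / (zc @ zc))


def simulate(n=2000, p=5, maf=0.3, beta=0.5, seed=0):
    """Hypothetical design: quadratic effect of the genotype score on exposure."""
    rng = np.random.default_rng(seed)
    G = rng.binomial(2, maf, size=(n, p)).astype(float)  # SNPs coded 0/1/2
    u = rng.normal(size=n)                               # unmeasured confounder
    score = G.sum(axis=1)
    x = (score - 2 * p * maf) ** 2 + u + rng.normal(size=n)  # exposure
    y = beta * x + 2.0 * u + rng.normal(size=n)              # outcome
    return G, x, y


G, x, y = simulate()

# Naive regression ignores the confounder u.
beta_ols = slope(x, y)

# Second stage: regress the outcome on the first-stage fitted exposure.
beta_tsls_linear = slope(linear_first_stage(G, x), y)
beta_tsls_kernel = slope(kernel_first_stage(G, x, rho=G.shape[1], lam=0.1), y)

print("true causal effect: 0.5")
print("naive OLS estimate:", round(beta_ols, 3))
print("TSLS, linear first stage:", round(beta_tsls_linear, 3))
print("TSLS, kernel first stage:", round(beta_tsls_kernel, 3))
```

A flexible first stage fitted in-sample can absorb part of the confounded noise and pull the estimate back toward the naive one, so in practice the bandwidth and penalty would need to be estimated from the data (for example by restricted maximum likelihood or cross-validation) rather than fixed as they are here.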