Transcriptome-wide association studies: a view from Mendelian randomization
Huanhuan Zhu, Xiang Zhou
Background: Genome-wide association studies (GWASs) have identified thousands of genetic variants associated with many complex traits. However, the biological mechanisms underlying these associations remain largely unknown. Transcriptome-wide association studies (TWAS) have recently been proposed as an invaluable tool for investigating the potential gene regulatory mechanisms underlying variant-trait associations. Specifically, TWAS integrate GWASs with expression mapping studies based on a common set of variants and aim to identify genes whose genetically regulated expression (GReX) is associated with the phenotype. Various methods have been developed for performing TWAS and/or similar integrative analyses. Each method makes different modeling assumptions, and many were initially developed to answer different biological questions. Consequently, it is not straightforward to understand their modeling properties from a theoretical perspective.
Results: We present a technical review of thirteen TWAS methods. Importantly, we show that these methods can all be viewed as two-sample Mendelian randomization (MR) analyses, a framework widely applied in GWASs for examining the causal effect of an exposure on an outcome. Viewing different TWAS methods from an MR perspective provides a unique angle for understanding their benefits and pitfalls. We systematically introduce the MR analysis framework, explain how features of the GWAS and expression data influence the adaptation of MR for TWAS, and re-interpret the modeling assumptions made in different TWAS methods from an MR angle. Finally, we describe future directions for TWAS methodology development.
Conclusions: We hope that this review will serve as a useful reference for both methodologists who develop TWAS methods and practitioners who perform TWAS analyses.
Transcriptome-wide association studies (TWAS) integrate expression mapping studies and GWASs and aim to identify candidate genes whose genetically regulated expression is associated with the trait of interest. We present a comprehensive review of a broad category of recently developed and commonly used TWAS methods. Our review covers different modeling assumptions, different inference procedures, modeling of horizontal pleiotropic effects, and extensions of TWAS towards multivariate MR analysis and summary statistics. Our review also aims to provide a unified view of various TWAS methods from the perspective of Mendelian randomization (MR).
transcriptome-wide association studies / genome-wide association studies / expression mapping studies
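As an illustration of the two-sample idea described above, the following is a minimal sketch on simulated data, not an implementation of any of the reviewed methods. It assumes a single gene, an expression study and a GWAS that share the same SNPs, and no horizontal pleiotropy. Ridge regression stands in for whichever expression prediction model a given method uses, and the function names, the penalty value, and the simulation settings are all hypothetical choices made for this example.

```python
import numpy as np


def two_stage_twas(Zx, x, Zy, y, ridge=1.0):
    """Two-stage sketch: SNP weights from the expression study, then a GWAS test."""
    # Stage 1 (expression study): ridge estimate of SNP effects on expression.
    # The penalty value is an arbitrary assumption for this illustration.
    p = Zx.shape[1]
    Zx_c = Zx - Zx.mean(axis=0)
    x_c = x - x.mean()
    beta_hat = np.linalg.solve(Zx_c.T @ Zx_c + ridge * np.eye(p), Zx_c.T @ x_c)

    # Stage 2 (GWAS): predict genetically regulated expression (GReX)
    # and regress the trait on it with simple least squares.
    grex = (Zy - Zy.mean(axis=0)) @ beta_hat
    y_c = y - y.mean()
    alpha_hat = (grex @ y_c) / (grex @ grex)
    resid = y_c - alpha_hat * grex
    se = np.sqrt((resid @ resid) / (len(y) - 2) / (grex @ grex))
    return alpha_hat, alpha_hat / se


def simulate(seed=0, n_expr=500, n_gwas=5000, p=20, alpha=0.5):
    """Simulate two independent samples that share the same p SNPs (toy settings)."""
    rng = np.random.default_rng(seed)
    maf = rng.uniform(0.1, 0.5, size=p)
    Zx = rng.binomial(2, maf, size=(n_expr, p)).astype(float)
    Zy = rng.binomial(2, maf, size=(n_gwas, p)).astype(float)
    beta = rng.normal(0.0, 0.3, size=p)  # SNP effects on expression
    x = Zx @ beta + rng.normal(size=n_expr)  # expression measured only in sample 1
    y = alpha * (Zy @ beta) + rng.normal(size=n_gwas)  # trait measured only in sample 2
    return Zx, x, Zy, y


if __name__ == "__main__":
    alpha_hat, z = two_stage_twas(*simulate())
    print(f"estimated effect of GReX on trait: {alpha_hat:.3f}, z-score: {z:.1f}")
```

In MR terms, the SNPs act as instruments, expression is the exposure, and the trait is the outcome. The stage-2 standard error here ignores the uncertainty in the estimated weights, which is one of the simplifications a full method would need to address.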