Abstract
Background: Genome-wide association studies (GWASs) have identified thousands of genetic variants that are associated with many complex traits. However, the biological mechanisms underlying these associations remain largely unknown. Transcriptome-wide association studies (TWAS) have recently been proposed as a tool for investigating the potential gene regulatory mechanisms underlying variant-trait associations. Specifically, TWAS integrate GWAS with expression mapping studies based on a common set of variants and aim to identify genes whose genetically regulated expression (GReX) is associated with the phenotype. Various methods have been developed for performing TWAS and/or similar integrative analyses. Each method makes different modeling assumptions, and many were initially developed to answer different biological questions. Consequently, it is not straightforward to understand their modeling properties from a theoretical perspective.
Results: We present a technical review of thirteen TWAS methods. Importantly, we show that these methods can all be viewed as forms of two-sample Mendelian randomization (MR) analysis, which has been widely applied in GWASs for examining the causal effect of an exposure on an outcome. Viewing different TWAS methods from an MR perspective provides a unique angle for understanding their benefits and pitfalls. We systematically introduce the MR analysis framework, explain how features of the GWAS and expression data influence the adaptation of MR for TWAS, and re-interpret the modeling assumptions made in different TWAS methods from an MR angle. Finally, we describe future directions for TWAS methodology development.
Conclusions: We hope that this review will serve as a useful reference for both methodologists who develop TWAS methods and practitioners who perform TWAS analyses.
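To make the two-sample MR view concrete, the sketch below computes an inverse-variance weighted (IVW) estimate of the effect of gene expression on a trait from summary statistics. It is an illustration only and is not taken from any of the reviewed methods: the function name `ivw_estimate`, the argument names, and the toy numbers are all hypothetical, and the formula assumes independent variants (no linkage disequilibrium), no horizontal pleiotropy, and negligible uncertainty in the variant-expression effects.

```python
import math


def ivw_estimate(beta_x, beta_y, se_y):
    """Inverse-variance weighted two-sample MR estimate (illustrative sketch).

    beta_x: per-variant effects on expression (from an expression mapping study)
    beta_y: per-variant effects on the trait (from a GWAS)
    se_y:   standard errors of beta_y
    Assumes independent variants and no horizontal pleiotropy.
    """
    if not beta_x or not (len(beta_x) == len(beta_y) == len(se_y)):
        raise ValueError("inputs must be non-empty and of equal length")

    # Weight each variant by the precision of its trait association.
    weights = [1.0 / (s * s) for s in se_y]
    numerator = sum(w * bx * by for w, bx, by in zip(weights, beta_x, beta_y))
    denominator = sum(w * bx * bx for w, bx in zip(weights, beta_x))
    if denominator == 0.0:
        raise ValueError("all variant-expression effects are zero")

    estimate = numerator / denominator          # causal effect estimate
    std_error = math.sqrt(1.0 / denominator)    # first-order standard error
    return estimate, std_error, estimate / std_error


# Hypothetical summary statistics for three variants near one gene.
# The trait effects are constructed as exactly half the expression effects.
beta_x = [0.30, -0.20, 0.15]
beta_y = [0.15, -0.10, 0.075]
se_y = [0.05, 0.04, 0.06]

effect, se, z = ivw_estimate(beta_x, beta_y, se_y)
print(f"effect={effect:.3f} se={se:.3f} z={z:.2f}")
```

In this toy input the trait effects are half the expression effects for every variant, so the estimate recovers a slope of about 0.5. Real TWAS data typically involve correlated cis-variants, which this simple form does not handle.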
Keywords
transcriptome-wide association studies / genome-wide association studies / expression mapping studies
Cite this article
Huanhuan Zhu, Xiang Zhou. Transcriptome-wide association studies: a view from Mendelian randomization. Quant. Biol., 2021, 9(2): 107-121. DOI: 10.1007/s40484-020-0207-4
RIGHTS & PERMISSIONS
Higher Education Press and Springer-Verlag GmbH Germany, part of Springer Nature