Transcriptome wide association studies: general framework and methods

Yuhan Xie, Nayang Shan, Hongyu Zhao, Lin Hou

Quant. Biol. 2021, Vol. 9, Issue (2): 141-150. DOI: 10.15302/J-QB-020-0228
REVIEW

Abstract

Background: Over the past decade, genome-wide association studies (GWAS) have identified tens of thousands of genetic variants associated with complex human traits. However, they are still hampered by limited statistical power and by difficulties in biological interpretation. With the recent progress in expression quantitative trait loci (eQTL) studies, transcriptome-wide association studies (TWAS) provide a framework to test for gene-trait associations by integrating information from GWAS and eQTL studies.

Results: In this review, we introduce the general framework of TWAS, the relevant resources, and the computational tools. We also discuss extensions of the original TWAS methods. Furthermore, we briefly introduce methods that are closely related to TWAS, including Mendelian randomization (MR)-based methods and colocalization approaches, and discuss the connections and differences between these approaches.

Conclusion: Finally, we summarize the strengths, limitations, and potential future directions of TWAS.

Keywords

TWAS / gene imputation / gene-trait association test / eQTL studies / GWAS

Cite this article

Yuhan Xie, Nayang Shan, Hongyu Zhao, Lin Hou. Transcriptome wide association studies: general framework and methods. Quant. Biol., 2021, 9(2): 141-150. DOI: 10.15302/J-QB-020-0228

RIGHTS & PERMISSIONS

Higher Education Press
