The statistical practice of the GTEx Project: from single to multiple tissues

Xu Liao, Xiaoran Chai, Xingjie Shi, Lin S. Chen, Jin Liu

Quant. Biol., 2021, Vol. 9, Issue 2: 151-167. DOI: 10.1007/s40484-020-0210-9
REVIEW

Abstract

Background: The Genotype-Tissue Expression (GTEx) Project has collected genetic and transcriptome profiles from a wide spectrum of tissues in nearly 1,000 deceased individuals, providing an opportunity to study the regulatory roles of genetic variants in transcriptome activities from both cross-tissue and tissue-specific perspectives. Moreover, transcriptome activities (e.g., transcript abundance and alternative splicing) can be treated as mediators through which genotype alters phenotype. Knowing the genotype-associated transcriptome status, researchers can better understand the biological and molecular mechanisms of genetic risk variants in complex traits.

Results: In this article, we first explore the genetic architecture of gene expression traits, and then review recent methods for quantitative trait locus (QTL) mapping and co-expression network analysis. To further illustrate the use of associations between genotype and transcriptome status, we briefly review methods that either directly or indirectly integrate expression/splicing QTL information in genome-wide association studies (GWASs).

Conclusions: The GTEx Project provides the largest resource, and a useful one, for investigating the associations between genotype and transcriptome status. The integration of results from the GTEx Project and existing GWASs further advances our understanding of the role of gene expression changes in bridging genetic variants and complex traits.

Author summary

In genetics, researchers have made extensive efforts to investigate the associations between genetic variants and disease traits. However, we still lack knowledge of the underlying biological mechanisms through which genetic factors affect phenotypic outcomes. The Genotype-Tissue Expression (GTEx) Project offers several angles from which to approach this question, including quantitative trait loci, alternative splicing patterns, and tissue-specific effects of genetic variants. In this article, we provide a comprehensive review of its methods and results, and also suggest several downstream analysis methods (e.g., TWAS and co-expression networks) with which we can look more deeply into the regulatory mechanisms triggered by genetic factors.

Keywords

the Genotype-Tissue Expression Project / quantitative trait loci (QTL) / transcriptome-wide association studies / genome-wide association studies

Cite this article

Xu Liao, Xiaoran Chai, Xingjie Shi, Lin S. Chen, Jin Liu. The statistical practice of the GTEx Project: from single to multiple tissues. Quant. Biol., 2021, 9(2): 151‒167 https://doi.org/10.1007/s40484-020-0210-9

ACKNOWLEDGEMENTS

We would like to thank the two anonymous reviewers whose constructive comments have greatly improved this manuscript. This work was supported by grant R-913-200-098-263 from the Duke-NUS Medical School, and AcRF Tier 2 (MOE2016-T2-2-029, MOE2018-T2-1-046 and MOE2018-T2-2-006) from the Ministry of Education, Singapore. The computational work for this article was partially performed using resources from the National Supercomputing Centre, Singapore (https://www.nscc.sg).

COMPLIANCE WITH ETHICS GUIDELINES

The authors Xu Liao, Xiaoran Chai, Xingjie Shi, Lin S. Chen and Jin Liu declare that they have no conflicts of interest.
This article is a review article and does not contain any studies with human or animal subjects performed by any of the authors.

RIGHTS & PERMISSIONS

2020 Higher Education Press and Springer-Verlag GmbH Germany, part of Springer Nature