The statistical practice of the GTEx Project: from single to multiple tissues

Xu Liao , Xiaoran Chai , Xingjie Shi , Lin S. Chen , Jin Liu

Quant. Biol., 2021, Vol. 9, Issue 2: 151-167. DOI: 10.1007/s40484-020-0210-9
REVIEW

Abstract

Background: The Genotype-Tissue Expression (GTEx) Project has collected genetic and transcriptome profiles from a wide spectrum of tissues in nearly 1,000 deceased individuals, providing an opportunity to study the regulatory roles of genetic variants in transcriptome activities from both cross-tissue and tissue-specific perspectives. Moreover, transcriptome activities (e.g., transcript abundance and alternative splicing) can be treated as mediators through which genotype alters phenotype. Knowing the genotype-associated transcriptome status, researchers can better understand the biological and molecular mechanisms of genetic risk variants in complex traits.

Results: In this article, we first explore the genetic architecture of gene expression traits, and then review recent methods for quantitative trait locus (QTL) analysis and co-expression network analysis. To further illustrate how associations between genotype and transcriptome status are used, we briefly review methods that either directly or indirectly integrate expression/splicing QTL information in genome-wide association studies (GWASs).

Conclusions: The GTEx Project provides the largest and most useful resource for investigating the associations between genotype and transcriptome status. The integration of results from the GTEx Project and existing GWASs further advances our understanding of the role of gene expression changes in bridging genetic variants and complex traits.

Keywords

the Genotype-Tissue Expression Project / quantitative trait loci (QTL) / transcriptome-wide association studies / genome-wide association studies

Cite this article

Xu Liao, Xiaoran Chai, Xingjie Shi, Lin S. Chen, Jin Liu. The statistical practice of the GTEx Project: from single to multiple tissues. Quant. Biol., 2021, 9(2): 151-167. DOI: 10.1007/s40484-020-0210-9

RIGHTS & PERMISSIONS

Higher Education Press and Springer-Verlag GmbH Germany, part of Springer Nature
