Polygenic risk scores: effect estimation and model optimization

Zijie Zhao, Jie Song, Tuo Wang, Qiongshi Lu

Quant. Biol. 2021, Vol. 9, Issue (2): 133-140. DOI: 10.15302/J-QB-021-0238
REVIEW

Abstract

Background: The polygenic risk score (PRS), derived from summary statistics of genome-wide association studies (GWAS), is a useful tool for inferring an individual’s genetic risk for health outcomes and has gained increasing popularity in human genetics research. In its simplest form, PRS is computationally efficient and easily accessible, yet its predictive performance remains moderate for diseases and traits.

Results: We provide an overview of recent advances in statistical methods to improve PRS’s performance by incorporating information from linkage disequilibrium, functional annotation, and pleiotropy. We also introduce model validation methods that fine-tune PRS using GWAS summary statistics.

Conclusion: In this review, we showcase methodological advances and current limitations of PRS, and discuss several emerging issues in risk prediction research.
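
As a point of reference for "PRS in its simplest form", the sketch below computes a score as a weighted sum of effect-allele counts, with per-SNP weights taken from GWAS summary statistics. This is the standard textbook form rather than any specific method covered in the review; the function name `polygenic_score`, the toy effect sizes, and the genotypes are all hypothetical and chosen only for illustration.

```python
from typing import Sequence


def polygenic_score(dosages: Sequence[float], weights: Sequence[float]) -> float:
    """Weighted sum of effect-allele counts for one individual.

    dosages: effect-allele counts (0, 1, or 2) at each SNP.
    weights: per-SNP effect size estimates, e.g. from GWAS summary statistics.
    """
    if len(dosages) != len(weights):
        raise ValueError("dosages and weights must have the same length")
    return sum(w * x for w, x in zip(weights, dosages))


# Hypothetical toy data: three SNPs and two individuals (not real estimates).
gwas_effects = [0.5, -0.25, 1.0]
genotypes = {
    "person_a": [0, 1, 2],
    "person_b": [2, 2, 0],
}

scores = {name: polygenic_score(g, gwas_effects) for name, g in genotypes.items()}
for name, score in scores.items():
    print(name, score)
```

The approaches surveyed in the review can be read as different ways of choosing or refining the weights (for example by accounting for linkage disequilibrium, functional annotation, or pleiotropy) while keeping this additive scoring step.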

Keywords

GWAS / polygenic risk score / summary statistics / model selection

Cite this article

Zijie Zhao, Jie Song, Tuo Wang, Qiongshi Lu. Polygenic risk scores: effect estimation and model optimization. Quant. Biol., 2021, 9(2): 133-140. DOI: 10.15302/J-QB-021-0238

RIGHTS & PERMISSIONS

Higher Education Press
