Cover illustration
GWAS have identified many genetic variants associated with increased risk of Alzheimer’s disease (AD). These susceptibility loci may affect AD indirectly through a combination of physiological brain changes, which are detectable via magnetic resonance imaging. In this issue, Knutson and Pan examined the effects of brain imaging derived phenotypes with genetic etiology on AD, comparing the following summary statistic based methods: two-sample Mendelian randomization, generaliz...
Background: Genome-wide association studies (GWASs) have identified thousands of genetic variants that are associated with many complex traits. However, their biological mechanisms remain largely unknown. Transcriptome-wide association studies (TWAS) have recently been proposed as an invaluable tool for investigating the potential gene regulatory mechanisms underlying variant-trait associations. Specifically, TWAS integrate GWAS with expression mapping studies based on a common set of variants and aim to identify genes whose genetically regulated expression (GReX) is associated with the phenotype. Various methods have been developed for performing TWAS and/or similar integrative analyses. Each such method makes different modeling assumptions, and many were initially developed to answer different biological questions. Consequently, it is not straightforward to understand their modeling properties from a theoretical perspective.
Results: We present a technical review of thirteen TWAS methods. Importantly, we show that these methods can all be viewed as two-sample Mendelian randomization (MR) analyses, which have been widely applied in GWASs for examining the causal effects of an exposure on an outcome. Viewing different TWAS methods from an MR perspective gives us a unique angle for understanding their benefits and pitfalls. We systematically introduce the MR analysis framework, explain how features of the GWAS and expression data influence the adaptation of MR for TWAS, and re-interpret the modeling assumptions made in different TWAS methods from an MR angle. Finally, we describe future directions for TWAS methodology development.
Conclusions: We hope that this review will serve as a useful reference both for methodologists who develop TWAS methods and for practitioners who perform TWAS analyses.
Background: Mendelian randomization (MR) analysis has become popular for inferring and estimating the causal effect of an exposure on an outcome, owing to the success of genome-wide association studies. Many statistical approaches have been developed, and each of these methods requires specific assumptions.
Results: In this article, we review the pros and cons of these methods. We use the example of the effect of high-density lipoprotein cholesterol on coronary artery disease to illustrate the challenges in Mendelian randomization investigations.
Conclusion: Currently available MR approaches allow us to study causality among risk factors and outcomes. However, novel approaches are needed to overcome confounding from multiple sources between risk factors and an outcome in MR analysis.
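As a minimal illustration of the kind of estimator that two-sample MR methods build on, the sketch below computes a fixed-effect inverse-variance weighted (IVW) estimate from per-variant summary statistics. It is not taken from the article: the function name `ivw_estimate` and the toy numbers are hypothetical, and the sketch assumes independent instruments with no pleiotropy, which are exactly the assumptions that more elaborate methods try to relax.

```python
import math


def ivw_estimate(beta_exposure, beta_outcome, se_outcome):
    """Fixed-effect IVW estimate of the causal effect of exposure on outcome.

    Equivalent to a weighted regression through the origin of the
    variant-outcome effects on the variant-exposure effects, with
    weights 1 / se_outcome**2. Hypothetical helper for illustration only.
    """
    if not (len(beta_exposure) == len(beta_outcome) == len(se_outcome)):
        raise ValueError("all inputs must have one entry per variant")
    # Numerator: sum of bx * by / se^2 over variants.
    numerator = sum(
        bx * by / se ** 2
        for bx, by, se in zip(beta_exposure, beta_outcome, se_outcome)
    )
    # Denominator: sum of bx^2 / se^2 over variants.
    denominator = sum(
        bx ** 2 / se ** 2 for bx, se in zip(beta_exposure, se_outcome)
    )
    if denominator == 0:
        raise ValueError("no variant has a non-zero exposure effect")
    estimate = numerator / denominator
    standard_error = math.sqrt(1.0 / denominator)
    return estimate, standard_error


# Toy summary statistics (made up): three independent variants.
toy_beta_exposure = [0.1, 0.2, 0.3]
toy_beta_outcome = [0.05, 0.10, 0.15]
toy_se_outcome = [0.01, 0.01, 0.01]

effect, effect_se = ivw_estimate(
    toy_beta_exposure, toy_beta_outcome, toy_se_outcome
)
print(f"IVW estimate: {effect:.3f} (SE {effect_se:.4f})")
```

In the toy data every variant has an outcome effect equal to half its exposure effect, so the estimate recovers a causal effect of about 0.5.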
Background: The polygenic risk score (PRS), derived from summary statistics of genome-wide association studies (GWAS), is a useful tool for inferring an individual’s genetic risk for health outcomes and has gained increasing popularity in human genetics research. PRS in its simplest form is both computationally efficient and easily accessible, yet its predictive performance remains moderate for diseases and traits.
Results: We provide an overview of recent advances in statistical methods to improve PRS’s performance by incorporating information from linkage disequilibrium, functional annotation, and pleiotropy. We also introduce model validation methods that fine-tune PRS using GWAS summary statistics.
Conclusion: In this review, we showcase methodological advances and current limitations of PRS, and discuss several emerging issues in risk prediction research.
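For readers unfamiliar with the "simplest form" of PRS mentioned above, the following sketch shows it as a weighted sum of risk-allele dosages, with GWAS effect sizes as weights. The function name and the toy values are hypothetical and are not drawn from the review; the refinements the review covers (linkage disequilibrium, functional annotation, pleiotropy) are deliberately left out.

```python
def polygenic_risk_score(dosages, effect_sizes):
    """Simplest PRS: sum over variants of (allele dosage * GWAS effect size).

    dosages: risk-allele counts for one individual (0, 1 or 2 per variant).
    effect_sizes: per-variant effect estimates from GWAS summary statistics.
    Hypothetical helper for illustration only.
    """
    if len(dosages) != len(effect_sizes):
        raise ValueError("dosages and effect_sizes must have equal length")
    return sum(d * b for d, b in zip(dosages, effect_sizes))


# Toy example (made-up numbers): one individual, three variants.
toy_dosages = [0, 1, 2]
toy_effects = [0.20, 0.10, 0.05]
toy_score = polygenic_risk_score(toy_dosages, toy_effects)
print(f"PRS: {toy_score:.2f}")
```

Because the score is a plain dot product, it needs only summary statistics and genotypes, which is what makes this form cheap to compute and easy to share.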
Background: Genome-wide association studies (GWAS) have succeeded in identifying tens of thousands of genetic variants associated with complex human traits during the past decade; however, they are still hampered by limited statistical power and difficulties in biological interpretation. With the recent progress in expression quantitative trait loci (eQTL) studies, transcriptome-wide association studies (TWAS) provide a framework to test for gene-trait associations by integrating information from GWAS and eQTL studies.
Results: In this review, we introduce the general framework of TWAS, the relevant resources, and the computational tools. Extensions of the original TWAS methods are also discussed. Furthermore, we briefly introduce methods that are closely related to TWAS, including MR-based methods and colocalization approaches, and discuss the connections and differences between these approaches.
Conclusion: Finally, we will summarize strengths, limitations, and potential directions for TWAS.
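To make the idea of integrating GWAS and eQTL information concrete, the sketch below computes a commonly used summary-statistic form of the TWAS gene-level test, Z = w'z / sqrt(w'Σw), where w are expression prediction weights learned from eQTL data, z are GWAS z-scores for the same variants, and Σ is their linkage disequilibrium (correlation) matrix from a reference panel. This is an illustrative assumption rather than a description of any specific tool covered in the review; the function name and all numbers are hypothetical.

```python
import math


def twas_z_score(weights, gwas_z, ld_matrix):
    """Gene-level TWAS z-score from summary statistics.

    weights: eQTL-derived expression prediction weights, one per variant.
    gwas_z: GWAS z-scores for the same variants, in the same order.
    ld_matrix: variant-by-variant correlation matrix (list of lists).
    Hypothetical helper for illustration only.
    """
    m = len(weights)
    if len(gwas_z) != m or len(ld_matrix) != m:
        raise ValueError("inputs must cover the same set of variants")
    if any(len(row) != m for row in ld_matrix):
        raise ValueError("ld_matrix must be square")
    # Weighted sum of GWAS z-scores: w' z.
    numerator = sum(w * z for w, z in zip(weights, gwas_z))
    # Variance of that sum under the null: w' Sigma w.
    variance = sum(
        weights[i] * ld_matrix[i][j] * weights[j]
        for i in range(m)
        for j in range(m)
    )
    if variance <= 0:
        raise ValueError("w' Sigma w must be positive")
    return numerator / math.sqrt(variance)


# Toy example (made-up numbers): two variants in moderate LD.
toy_weights = [0.5, 0.5]
toy_gwas_z = [3.0, 1.0]
toy_ld = [[1.0, 0.5], [0.5, 1.0]]
toy_twas_z = twas_z_score(toy_weights, toy_gwas_z, toy_ld)
print(f"TWAS z-score: {toy_twas_z:.3f}")
```

Dividing by sqrt(w'Σw) accounts for correlation among the variants, so the statistic is approximately standard normal when the gene has no association with the trait.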
Background: The Genotype-Tissue Expression (GTEx) Project has collected genetic and transcriptome profiles from a wide spectrum of tissues in nearly 1,000 deceased individuals, providing an opportunity to study the regulatory roles of genetic variants in transcriptome activities from both cross-tissue and tissue-specific perspectives. Moreover, transcriptome activities (e.g., transcript abundance and alternative splicing) can be treated as mediators through which genotype alters phenotype. Knowing the genotype-associated transcriptome status, researchers can better understand the biological and molecular mechanisms of genetic risk variants in complex traits.
Results: In this article, we first explore the genetic architecture of gene expression traits, and then review recent methods on quantitative trait locus (QTL) and co-expression network analysis. To further exemplify the usage of associations between genotype and transcriptome status, we briefly review methods that either directly or indirectly integrate expression/splicing QTL information in genome-wide association studies (GWASs).
Conclusions: The GTEx Project provides the largest resource, and a useful one, for investigating the associations between genotype and transcriptome status. The integration of results from the GTEx Project and existing GWASs further advances our understanding of the role of gene expression changes in bridging genetic variants and complex traits.
Background: Genome-wide association studies (GWAS) have been widely adopted in studies of human complex traits and diseases.
Results: This review surveys areas of active research: quantifying and partitioning trait heritability, fine mapping functional variants and integrative analysis, genetic risk prediction of phenotypes, and the analysis of sequencing studies that have identified millions of rare variants. Current challenges and opportunities are highlighted.
Conclusion: GWAS have fundamentally transformed the field of human complex trait genetics. Novel statistical and computational methods have expanded the scope of GWAS and have provided valuable insights on the genetic architecture underlying complex phenotypes.
Background: Genome-wide association studies (GWAS) have identified many genetic variants associated with increased risk of Alzheimer’s disease (AD). These susceptibility loci may affect AD indirectly through a combination of physiological brain changes. Many of these neuropathologic features are detectable via magnetic resonance imaging (MRI).
Methods: In this study, we examine the effects of such brain imaging derived phenotypes (IDPs) with genetic etiology on AD, using and comparing the following methods: two-sample Mendelian randomization (2SMR), generalized summary statistics based Mendelian randomization (GSMR), transcriptome wide association studies (TWAS) and the adaptive sum of powered score (aSPU) test. These methods do not require individual-level genotypic and phenotypic data but instead can rely only on an external reference panel and GWAS summary statistics.
Results: Using publicly available GWAS datasets from the International Genomics of Alzheimer’s Project (IGAP) and UK Biobank’s (UKBB) brain imaging initiatives, we identify 35 IDPs possibly associated with AD, many of which have well established or biologically plausible links to the characteristic cognitive impairments of this neurodegenerative disease.
Conclusions: Our results highlight the increased power for detecting genetic associations achieved by multiple correlated SNP-based methods, i.e., aSPU, GSMR and TWAS, over MR methods based on independent SNPs (as instrumental variables).
Availability: Example code is available at https://github.com/kathalexknuts/ADIDP.
Background: The direct-to-consumer genetic testing (DTC-GT) industry has exploded in recent years, initiated by market pioneers from the United States and quickly followed by companies from Europe and Asia. In addition to their primary objective of providing ancestry and health information to customers, DTC-GT services have emerged as a valuable data resource for large-scale population and genetics studies.
Methods: We assessed DTC-GT market leaders in the U.S. and China, user participation in research, and academic reports based on this information. We also investigated DTC-GT end-user value by tracing key updates of companies provided via health risk reports and evaluating their predictive power. We then assessed the replicability of several genome-wide association studies (GWAS) based on a Chinese DTC-GT biobank.
Results: As recent entrants to the market, Chinese DTC-GT service providers have published less academic research than their Western counterparts; however, a larger proportion of Chinese users consent to participate in research projects. Dramatic increases in user volume and resultant report updates led to reclassification of some users’ polygenic risk levels, but within a reasonable scale and with increased predictive power. Replicability among GWAS using the Chinese DTC-GT biobank varied by studied trait, population background, and sample size.
Conclusions: We speculate that the rapid growth in DTC-GT services, particularly in non-Caucasian populations, will yield an important and much-needed resource for biobanking, large-scale genetic studies, clinical trials, and post-clinical applications.
Background: Whole-exome sequencing (WES) studies have identified multiple genes enriched for de novo mutations (DNMs) in congenital heart disease (CHD) probands. However, risk gene identification based on DNMs alone remains statistically challenging due to the heterogeneous etiology of CHD and the low mutation rate in each gene.
Methods: In this manuscript, we introduce a hierarchical Bayesian framework for gene-level association testing that jointly analyzes de novo and rare transmitted variants. Through integrative modeling of multiple types of genetic variants, gene-level annotations, and reference data from large population cohorts, our method accurately characterizes the expected frequencies of both de novo and transmitted variants and shows improved statistical power compared to analyses based on DNMs only.
Results: Applied to WES data of 2,645 CHD proband-parent trios, our method identified 15 significant genes, half of which are novel, leading to new insights into the genetic bases of CHD.
Conclusion: These results showcase the power of integrative analysis of transmitted and de novo variants for disease gene discovery.