Finding susceptible and protective interaction patterns in large-scale genetic association study

Yuan LI, Yuhai ZHAO, Guoren WANG, Xiaofeng ZHU, Xiang ZHANG, Zhanghui WANG, Jun PANG

Front. Comput. Sci. ›› 2017, Vol. 11 ›› Issue (3) : 541-554. DOI: 10.1007/s11704-016-5300-5
RESEARCH ARTICLE

Abstract

Interaction detection in large-scale genetic association studies has attracted intensive research interest, since many diseases have complex traits. Various approaches have been developed for finding significant genetic interactions. In this article, we propose a novel framework, SRMiner, to detect interacting susceptible and protective genotype patterns. SRMiner can discover not only probable combinations of single nucleotide polymorphisms (SNPs) that cause diseases but also the corresponding SNPs that suppress their pathogenic functions, which provides a better perspective for uncovering the underlying relevance between genetic variants and complex diseases. We have performed extensive experiments on several real Wellcome Trust Case Control Consortium (WTCCC) datasets. We use pathway-based and protein-protein interaction (PPI) network-based evaluation methods to verify the discovered patterns. The results show that SRMiner successfully identifies many disease-related genes verified by existing work. Furthermore, SRMiner can also infer some unconfirmed but highly probable disease-related genes.

Keywords

genetic association studies / genotype pattern mining / data mining / bioinformatics

Cite this article

Yuan LI, Yuhai ZHAO, Guoren WANG, Xiaofeng ZHU, Xiang ZHANG, Zhanghui WANG, Jun PANG. Finding susceptible and protective interaction patterns in large-scale genetic association study. Front. Comput. Sci., 2017, 11(3): 541‒554 https://doi.org/10.1007/s11704-016-5300-5

RIGHTS & PERMISSIONS

2016 Higher Education Press and Springer-Verlag Berlin Heidelberg