Identifying Rarely Mutated Cancer Genes by Heterogeneous Network Embedding

Yurun Lu, Songmao Zhang, Yong Wang

CSIAM Trans. Life Sci., 2025, Vol. 1, Issue (1): 22-44. DOI: 10.4208/csiam-ls.SO-2024-0002
Research Articles

Abstract

Cancer is a multifaceted disease caused by dynamic interactions between genetic mutations and environmental factors. Understanding the genetic mutations underlying the development and progression of cancer is a stepping stone toward developing effective treatments and therapies. However, these mutations occur in only a small fraction of cancer patients, which makes them extremely difficult to associate with cancer. Here, we propose MutNet, a heterogeneous network embedding method that integrates biomolecular networks with cancer genomics data. Using pan-cancer genomic data from The Cancer Genome Atlas program together with public protein-protein interaction and pathway data, MutNet identifies rarely mutated cancer genes that are often overlooked by conventional genetic studies. In addition, the unified vector representation of biological entities allows us to reveal tumor-type-specific cancer genes, cancer gene modules, and potential relationships among different tumor types. Our heterogeneous network embedding method holds promise for elucidating the underlying mechanisms of cancer and identifying potential therapeutic targets.

Keywords

Cancer genomics / cancer gene / network embedding

Cite this article

Yurun Lu, Songmao Zhang, Yong Wang. Identifying Rarely Mutated Cancer Genes by Heterogeneous Network Embedding. CSIAM Trans. Life Sci., 2025, 1(1): 22-44. DOI: 10.4208/csiam-ls.SO-2024-0002

