A pan-cancer integrative pathway analysis of multi-omics data
Henry Linder, Yuping Zhang
Background: Multi-view -omics datasets offer rich opportunities for integrative analysis across genomic, transcriptomic, and epigenetic data platforms. Statistical methods are needed to rigorously implement current research on functional biology, matching the complex dynamics of systems genomic datasets.
Methods: We apply imputation for missing data and a structural, graph-theoretic pathway model to a dataset of 22 cancers across 173 signaling pathways. Our pathway model integrates multiple data platforms, and we test for differential activation between cancerous tumor and healthy tissue populations.
Results: Our pathway analysis reveals significant disturbance in signaling pathways that are known to relate to oncogenesis. We identify several pathways that suggest new research directions, including the Trk signaling and focal adhesion kinase activation pathways in sarcoma.
Conclusions: Our integrative analysis agrees with contemporary research findings, which supports the validity of our results. We implement an interactive data visualization for exploring the pathway analyses, which is publicly available online.
Keywords: multi-platform data integration / pathway analysis / imputation / cancer genomics / data visualization