Abstract
Background: Single-cell RNA sequencing (scRNA-seq) data provide a new view for studying disease and cell differentiation. With the rapid growth of scRNA-seq data, effective models are needed to mine the intrinsic biological information.
Methods: This paper proposes a novel non-negative matrix factorization (NMF) method for clustering and gene co-expression network analysis, termed Adaptive Total Variation Constraint Hypergraph Regularized NMF (ATV-HNMF). Based on gradient information, ATV-HNMF adaptively chooses between two schemes: denoising within a cluster or preserving the boundary information between clusters. In addition, ATV-HNMF incorporates hypergraph regularization, which considers high-order relationships between cells to preserve the intrinsic structure of the data space.
Results: Experiments show that ATV-HNMF outperforms the compared methods on clustering, and the network construction results are consistent with previous studies, which indicates that the model is effective and useful.
Conclusion: The clustering results show that ATV-HNMF outperforms other methods, which helps in understanding cellular heterogeneity. Many disease-related genes can be discovered from the constructed network, and some are worthy of further clinical exploration.
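To make the hypergraph regularization idea concrete, the following is a minimal sketch of a generic hypergraph regularized NMF with multiplicative updates. It is not the authors' ATV-HNMF: the adaptive total variation term is omitted, and the hyperedge construction (each cell plus its k nearest neighbours, with uniform hyperedge weights), the function names `knn_hypergraph` and `hnmf`, and all parameter defaults are assumptions made for illustration only.

```python
import numpy as np


def knn_hypergraph(X, k=5):
    """Build S = H W De^{-1} H^T and the vertex degree matrix Dv.

    X is genes x cells. Assumption: one hyperedge per cell, joining the
    cell with its k nearest neighbours, and uniform hyperedge weights.
    """
    n = X.shape[1]
    # Pairwise squared Euclidean distances between cells (columns of X).
    sq = (X ** 2).sum(axis=0)
    d2 = sq[:, None] + sq[None, :] - 2.0 * (X.T @ X)
    H = np.zeros((n, n))  # incidence matrix: vertices x hyperedges
    for e in range(n):
        nbrs = np.argsort(d2[e])[: k + 1]
        H[nbrs, e] = 1.0
        H[e, e] = 1.0  # make sure the centre cell belongs to its own hyperedge
    w = np.ones(n)            # hyperedge weights (assumed uniform)
    de = H.sum(axis=0)        # hyperedge degrees
    dv = H @ w                # vertex degrees
    S = (H * (w / de)) @ H.T  # scales column e of H by w_e / de_e
    return S, np.diag(dv)


def hnmf(X, rank, alpha=1.0, k=5, n_iter=200, eps=1e-10, seed=0):
    """Approximate X by U @ V.T with a hypergraph penalty alpha * Tr(V^T L V),
    where L = Dv - S is the unnormalized hypergraph Laplacian over cells."""
    rng = np.random.default_rng(seed)
    m, n = X.shape
    U = rng.random((m, rank))
    V = rng.random((n, rank))
    S, Dv = knn_hypergraph(X, k)
    for _ in range(n_iter):
        # Standard multiplicative update for the basis matrix.
        U *= (X @ V) / (U @ (V.T @ V) + eps)
        # Update for the cell representation, including the hypergraph term.
        V *= (X.T @ U + alpha * (S @ V)) / (V @ (U.T @ U) + alpha * (Dv @ V) + eps)
    return U, V
```

Given a non-negative genes-by-cells matrix `X`, calling `U, V = hnmf(X, rank=2)` returns a non-negative basis `U` and a cell representation `V`. A simple way to obtain cluster labels from this sketch is `V.argmax(axis=1)`, or a k-means step on the rows of `V`; which of these the paper uses is not stated in the abstract.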
Keywords
adaptive total variation / single-cell RNA sequencing / network analysis / nonnegative matrix factorization / hypergraph
Cite this article
Ya-Li Zhu, Xiao-Ning Zhang, Chuan-Yuan Wang, Jin-Xing Liu, Xiang-Zhen Kong.
Adaptive total variation constraint hypergraph regularized NMF for single-cell RNA-seq data analysis.
Quant. Biol., 2021, 9(4): 451-462 DOI:10.15302/J-QB-021-0261
RIGHTS & PERMISSIONS
The Author(s) 2021. Published by Higher Education Press