Single‐cell gene regulatory network analysis for mixed cell populations

Junjie Tang, Changhu Wang, Feiyi Xiao, Ruibin Xi

Quant. Biol. 2024, Vol. 12, Issue 4: 375-388. DOI: 10.1002/qub2.64
RESEARCH ARTICLE


Abstract

A gene regulatory network (GRN) is the complex network formed by regulatory interactions between genes in living cells. In this paper, we consider inferring GRNs in single cells from single‐cell RNA sequencing (scRNA‐seq) data. In scRNA‐seq, single cells are often profiled from mixed populations, and their cell identities are unknown. A common practice for single‐cell GRN analysis is to first cluster the cells and then infer a GRN for each cluster separately. However, this two‐step procedure ignores uncertainty in the clustering step and can therefore lead to inaccurate network estimates. Here, we consider the mixture Poisson log‐normal (MPLN) model for network inference from count data of mixed populations. The precision matrices of the MPLN model are the GRNs of the different cell types. To avoid the intractable optimization of the MPLN log‐likelihood, we develop an algorithm called variational mixture Poisson log‐normal (VMPLN), which jointly estimates the GRNs of different cell types using variational inference. We compare VMPLN with state‐of‐the‐art single‐cell regulatory network inference methods. Comprehensive simulations show that VMPLN achieves better performance, especially in scenarios where different cell types are highly mixed. Benchmarking on real scRNA‐seq data also demonstrates that VMPLN can provide more accurate network estimates in most cases. Finally, we apply VMPLN to a large scRNA‐seq dataset from patients infected with severe acute respiratory syndrome coronavirus 2 (SARS‐CoV‐2) and find that VMPLN identifies critical differences in the regulatory networks of immune cells between patients with moderate and severe symptoms. The source code is available on GitHub (github.com/XiDsLab/SCVMPLN).
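
To make the model structure concrete, the following is a minimal sketch of the generative side of a mixture Poisson log‐normal model, written under stated assumptions rather than taken from the paper or from the SCVMPLN code. It assumes that each cell draws a cell‐type label, then a latent Gaussian vector whose precision matrix encodes that cell type's network, then Poisson counts with rate equal to the exponential of the latent vector. The abstract does not give the exact parameterization (for example, any cell‐specific scaling of the Poisson rate), so those details are omitted here. All function names (`cholesky`, `sample_latent`, `sample_poisson`, `sample_mpln`, `edges_from_precision`) and all parameter values are hypothetical, and the sketch does not implement the VMPLN inference algorithm.

```python
import math
import random


def cholesky(matrix):
    """Return lower-triangular L with L L^T = matrix (positive definite input)."""
    n = len(matrix)
    low = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(low[i][k] * low[j][k] for k in range(j))
            if i == j:
                low[i][j] = math.sqrt(matrix[i][i] - s)
            else:
                low[i][j] = (matrix[i][j] - s) / low[j][j]
    return low


def sample_latent(mean, precision_chol, rng):
    """Draw x ~ N(mean, precision^-1) by solving L^T y = z with z ~ N(0, I)."""
    n = len(mean)
    z = [rng.gauss(0.0, 1.0) for _ in range(n)]
    y = [0.0] * n
    for i in reversed(range(n)):  # back substitution on the upper-triangular L^T
        s = sum(precision_chol[k][i] * y[k] for k in range(i + 1, n))
        y[i] = (z[i] - s) / precision_chol[i][i]
    return [m + v for m, v in zip(mean, y)]


def sample_poisson(rate, rng):
    """Knuth's multiplication method; adequate for the small rates used here."""
    threshold = math.exp(-rate)
    count, product = 0, rng.random()
    while product > threshold:
        count += 1
        product *= rng.random()
    return count


def sample_mpln(n_cells, weights, means, precisions, rng):
    """Draw (counts, labels) from the assumed mixture Poisson log-normal model."""
    chols = [cholesky(theta) for theta in precisions]
    counts, labels = [], []
    for _ in range(n_cells):
        k = rng.choices(range(len(weights)), weights=weights)[0]  # cell type
        latent = sample_latent(means[k], chols[k], rng)           # log-scale expression
        counts.append([sample_poisson(math.exp(x), rng) for x in latent])
        labels.append(k)
    return counts, labels


def edges_from_precision(precision, tol=1e-8):
    """Gene pairs (i, j), i < j, whose off-diagonal precision entry is nonzero."""
    p = len(precision)
    return [(i, j) for i in range(p) for j in range(i + 1, p)
            if abs(precision[i][j]) > tol]


# Hypothetical example: two cell types, three genes, one edge per cell type.
theta_a = [[1.0, 0.4, 0.0], [0.4, 1.0, 0.0], [0.0, 0.0, 1.0]]
theta_b = [[1.0, 0.0, 0.0], [0.0, 1.0, -0.4], [0.0, -0.4, 1.0]]
rng = random.Random(0)
counts, labels = sample_mpln(
    200, [0.6, 0.4], [[1.0] * 3, [1.5] * 3], [theta_a, theta_b], rng
)
```

In this sketch the labels returned by `sample_mpln` would be unobserved in real data, which is the situation the abstract describes: the networks (`edges_from_precision` applied to each precision matrix) have to be estimated jointly with the cell‐type assignments rather than after a separate clustering step.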

Keywords

gene regulatory network / graphical model / precision matrix / variational inference / single‐cell RNA sequencing

Cite this article

Junjie Tang, Changhu Wang, Feiyi Xiao, Ruibin Xi. Single‐cell gene regulatory network analysis for mixed cell populations. Quant. Biol., 2024, 12(4): 375-388. DOI: 10.1002/qub2.64



RIGHTS & PERMISSIONS

2024 The Author(s). Quantitative Biology published by John Wiley & Sons Australia, Ltd on behalf of Higher Education Press.
