Local causal pathway discovery for single-cell RNA sequencing count data: a benchmark study

Sisi Ma, Jinhua Wang, Cameron Bieganek, Roshan Tourani, Constantin Aliferis

Journal of Translational Genetics and Genomics, 2023, Vol. 7, Issue 1: 50-65. DOI: 10.20517/jtgg.2022.22
review-article


Abstract

Aim: Recent developments in single-cell RNA sequencing (scRNAseq) and its analysis have revealed regulatory behaviors not previously described using bulk analysis. scRNAseq offers resolution at the level of the individual cell and provides opportunities for identifying cell type-specific gene regulatory networks. The technology promises to enable the discovery of biomarkers and targeted treatments with enhanced effectiveness and reduced side effects. Pathway reverse engineering and causal algorithms have been successfully validated for gene regulatory network reconstruction in bulk transcriptomic sequencing data. In the current study, we evaluated the performance of local causal discovery algorithms, tailored to scRNAseq count data, for de novo reconstruction of local gene regulatory networks.

Method: We benchmarked the performance of a state-of-the-art local causal discovery algorithm, generalized local learning, combined with five conditional independence tests, both under controlled conditions (simulated count data) and on real-world single-cell RNA sequencing datasets.

Results: The simulation study showed that local causal discovery methods with appropriate conditional independence tests can achieve excellent discovery performance, given a sufficient sample size. As expected, the conditional independence tests differ in their power-sample size characteristics. Discovery performance on real-world data was relatively low for all tested conditional independence tests, potentially due to imperfect standards or to deviation of the simulated data distribution from that of the real-world data.

Conclusion: Our findings provide insights and practical guidance for applying causal discovery methods to single-cell RNAseq data for gene regulatory network reconstruction.

Keywords

scRNAseq / regulatory network reconstruction / multivariate count data / causal discovery

Cite this article

Sisi Ma, Jinhua Wang, Cameron Bieganek, Roshan Tourani, Constantin Aliferis. Local causal pathway discovery for single-cell RNA sequencing count data: a benchmark study. Journal of Translational Genetics and Genomics, 2023, 7(1): 50-65. DOI: 10.20517/jtgg.2022.22


