On the Mathematics of RNA Velocity I: Theoretical Analysis

Tiejun Li, Jifan Shi, Yichong Wu, Peijie Zhou

CSIAM Trans. Appl. Math., 2021, Vol. 2, Issue 1: 1-55. DOI: 10.4208/csiam-am.SO-2020-0001

Abstract

RNA velocity provides a new avenue for studying the stemness and lineage of cells during development in scRNA-seq data analysis. Several promising extensions have been proposed, and the field is developing rapidly. At this stage, it is of prime importance to revisit the whole process of RNA velocity analysis from a mathematical point of view, which helps to clarify the rationale and drawbacks of the different proposals. The current paper is devoted to this purpose. We present a thorough mathematical study of the RNA velocity model, from dynamics to downstream data analysis. We derive the analytical solution of the RNA velocity model from both the deterministic and the stochastic points of view. We present a parameter inference framework based on maximum likelihood estimation. We also derive the continuum limit of different downstream analysis methods, which provides insight into the construction of the transition probability matrix, the identification of root and ending cells, and the finding of development routes. The overall analysis aims to provide a mathematical basis for the more advanced design and development of RNA velocity type methods in the future.

Keywords

RNA velocity / stochastic model / continuum limit / kNN density estimate

Cite this article

Tiejun Li, Jifan Shi, Yichong Wu, Peijie Zhou. On the Mathematics of RNA Velocity I: Theoretical Analysis. CSIAM Trans. Appl. Math., 2021, 2(1): 1-55. DOI: 10.4208/csiam-am.SO-2020-0001
