Algorithmic approaches to clonal reconstruction in heterogeneous cell populations

Wazim Mohammed Ismail, Etienne Nzabarushimana, Haixu Tang

Quant. Biol. ›› 2019, Vol. 7 ›› Issue (4) : 255-265. DOI: 10.1007/s40484-019-0188-3
REVIEW

Abstract

Background: The reconstruction of clonal haplotypes and their evolutionary history in evolving populations is a common problem in both microbial evolutionary biology and cancer biology. The clonal theory of evolution provides a theoretical framework for modeling the evolution of clones.

Results: In this paper, we review the theoretical framework and the assumptions under which the clonal reconstruction problem is formulated. We formally define the problem and then discuss its complexity and solution space. Various methods have been proposed to find the phylogeny that best explains the observed data. We categorize these methods by the type of input data they use (space-resolved or time-resolved) and by their computational formulation (combinatorial or probabilistic). Understanding the different types of input data is crucial because each provides essential but distinct information for drastically reducing the solution space of the clonal reconstruction problem. Complementary information from single-cell sequencing or from whole-genome sequencing of randomly isolated clones can also improve the accuracy of clonal reconstruction. We briefly review the existing algorithms and their relationships. Finally, we summarize the tools that have been developed either to solve the clonal reconstruction problem directly or to solve a related computational problem.

Conclusions: In this review, we discuss the various formulations of the problem of inferring clonal evolutionary history from allele frequency data, review existing algorithms, and categorize them according to their problem formulation and solution approach. We note that most of the available clonal inference algorithms were developed for elucidating tumor evolution, whereas clonal reconstruction for unicellular genomes has received less attention. We conclude the review by discussing open problems such as the lack of benchmark datasets and of performance comparisons between available tools.

Keywords

clonal theory / infinite sites assumption / clonal reconstruction problem / bacteria evolution / tumor evolution / combinatorial algorithm / probabilistic algorithm

Cite this article

Wazim Mohammed Ismail, Etienne Nzabarushimana, Haixu Tang. Algorithmic approaches to clonal reconstruction in heterogeneous cell populations. Quant. Biol., 2019, 7(4): 255‒265 https://doi.org/10.1007/s40484-019-0188-3

ACKNOWLEDGEMENTS

This research was partially supported by a Multidisciplinary University Research Initiative Award W911NF-09-1-0444 from the US Army Research Office, the National Institutes of Health grant 1R01AI108888, and the Indiana University (IU) Precision Health Initiative (PHI). We thank Drs. Megan Behringer and Michael Lynch for very inspiring discussions.

COMPLIANCE WITH ETHICS GUIDELINES

The authors Wazim Mohammed Ismail, Etienne Nzabarushimana and Haixu Tang declare that they have no conflicts of interest.
This article is a review article and does not contain any studies with human or animal subjects performed by any of the authors.

RIGHTS & PERMISSIONS

2019 Higher Education Press and Springer-Verlag GmbH Germany, part of Springer Nature