Introduction
Phenotypic heterogeneity in cell populations is a widespread phenomenon observed across physiologic processes and pathological conditions [
1-
9]. Such heterogeneity stems from fluctuations in cellular components (e.g., RNA and protein abundance) and states (e.g., cell-cycle phase and stress), compounded by differences in the microenvironments experienced by the cell (e.g., local cell density and cell-cell contact)[
8-
18]. The fluctuation in internal cellular state fundamentally arises from the discrete nature of molecules and the randomness in their interactions, also known as “noise” [
19-
21]. Phenotypic heterogeneity often manifests as differential fate determination, including cell growth, senescence, and death in response to both natural (e.g., growth factors) and artificial (e.g., therapeutics) stimulus. Functional implications of cell-to-cell variability under various biologic contexts have been firmly established in the past decade [
3,
7,
22-
30].
Historically, biologic models (e.g., cellular networks and signal processing) have been conceived based on experimental measurements of population-average behavior (e.g., western blot). However, given the large degree of phenotypic cell-to-cell variability even between genetically homogeneous individuals and in response to identical stimulus, conclusions reached as such may be misleading [
9,
31,
32]. For example, p53 dynamics in response to DNA damage was previously thought of as damped oscillations, until later single-cell experiments showed that individual cells give rise to varying numbers of p53 pulses of fixed amplitude and duration [
33]. In another case, single-cell analysis of E2F responses to varying levels of Myc expression, which was enabled through viral gene delivery with drastically heterogeneous efficiency, revealed a biphasic dependence of the former on the latter, shedding new light on cell-cycle regulation [
34].
Most studies to date have been focusing on the explanation of cell-to-cell variability (the forward problem), such as the delineation of its source, mechanisms of generation, propagation along cellular networks, as well as the development of computational methods to predict variability given a particular mechanistic structure [
21,
35-
46]. The reverse problem, that is, if and how critical information can be gained from cell-to-cell variability, is only starting to be appreciated and will be the subject under review here. We focus on two specific aspects of the information capacity in heterogeneity observation: First, in constraining mathematical models of the biologic system (e.g., signaling pathways); second, in associating with and predicting functional population-level phenotypes (e.g., drug efficacy).
Using cell-to-cell variability to infer regulatory mechanisms
Efforts in addressing the aforementioned forward problem have offered evidence of regulatory mechanisms underlying cellular pathways dictating the characteristic cell-to-cell variability pattern in single-cell attributes, such as gene or protein expression levels (Figure 1A)[
35,
39,
47-
50]. Such regulatory mechanisms are often described in the form of mathematical models that quantitatively capture the production, degradation, and interaction between cellular components. Different mechanisms correspond to different model structures and parameterization. Given such a model, the variability in single-cell behaviors can be attributed to either intrinsic or extrinsic noise sources. Intrinsic noise can be explicitly accounted for by the mechanistic model itself and the assumed probability distribution dictating the random events involved in the model. Extrinsic noise, on the other hand, refers to the fluctuation in the cellular state or the microenvironment not included in the model, yet affecting system dynamics in a “predetermined manner,” usually by influencing model parameters. Great efforts have been dedicated to identifying such extrinsic factors, the success in which would eliminate the uncertainty in such sources and result in what’s called “regulated cell-to-cell variability.” We refer the readers elsewhere for extensive review on this topic [
18].
The inverse problem of inferring the regulatory mechanism is much more challenging than the forward problem [
51], largely due to the daunting complexity incurred by the inclusion of stochastic elements in the modeling framework and the requirement of more sophisticated statistical toolbox to handle distributional data.
Several studies have made a case for a general framework whereby
a priori construction of a mapping between the regulatory mechanism (e.g., network structure or parameter) and the distribution of some output measure (e.g., protein abundance) allows subsequent inference of the mechanism based on variability observation. In one case, analytically derived relationships between the moments of distribution from an expanded chemical master equation and the structure of the cis-regulatory input function are established to infer the cooperativity as well as the combinatorial logic (‘AND’ vs. ‘OR’) between transcription factors in promoter regulation [
52,
53]. Different sources of intrinsic noise, namely promoter fluctuation and mRNA birth/death fluctuation in a transcriptional system, are also distinguishable based on variability measures [
54]. It was shown in this case that the coefficient of variance in the protein product as a function of time after transcription blockade changes more slowly if promoter fluctuation dominates. It is interesting to note that the steady-state distribution of the protein alone is not sufficient in discriminating between these two noise sources. This is partly because the variability in protein expression mainly scales with the natural protein abundance, which can be treated as a constant extrinsic noise regardless of how the intrinsic noise is generated [
55].
Efforts have also been made in developing parameter estimation schemes based on variability measures to better constrain the mathematical model of a biologic system, as the majority of parameters are difficult to measure directly through experimentation. Although stochastic time-series data in single cells may be highly informative [
56], we focus on the analysis of snapshot distributional data over the population at fixed time points. Such distributional data can be easily made available using mature platforms such as flow cytometry and microscopy with low cost and high efficiency. Note that when time series data of high temporal resolution are of interest, especially when correlation in the temporal dynamics of single cell attributes within individual cells is under investigation, microscopy or other continuous-time measurement platforms are better suited. Various experimental methodologies for studying single cell dynamics and population heterogeneity are reviewed in great detail in the recent literature [
57].
In one of the pioneering studies, Munsky et al. developed a scheme that takes single-cell readout distributions at multiple time points before reaching steady-state and applies discrete state space probability matching [
58]. Importantly, they showed that stationary moments of any arbitrary order are insufficient in uniquely identifying model parameters. For simple systems, analytical expressions of parameters as functions of distribution statistics can be derived. For complex systems, a parameter search process, which identifies the parameters generating distributions most consistent with the experimental observation, proved to be equally effective (Figure 1B). Using IPTG-induced gene expression as a test model, the authors showed that the fitted model captures gene expression distribution at unlearnt induction levels despite drastically different overall shape of the distributions. The efficiency and accuracy of the scheme, however, do depend on the choice of the molecular species and the order of the statistics.
Similar to the earlier examples of regulatory mechanism inference, extrinsic noise, which has been shown to be dominating in most biologic systems, is ignored [
6,
42,
55]. Subsequent studies took on the task of incorporating extrinsic noise into the inference scheme [
6,
42,
55]. This can be accomplished by imposing a distribution over the model parameters and initial conditions [
59]. Instead of finding the best point estimation of each parameter, a distribution of parameter values is obtained [
60]. Despite the constraints on the extrinsic noise structure and the necessity to parameterize its density, results from these studies would allow better statistical modeling of experimentally observed variability.
One concern in performing such inference is the scalability of the scheme: Whether the algorithm can handle systems of much greater size than what is demonstrated as a proof of concept with reasonably high efficiency and low computational cost. To address this issue, moment-based approaches and convolution techniques have been proposed to combine intrinsic and extrinsic noise [
45,
61]. Conventional parameter search algorithms, such as the Metropolis-Hastings MCMC sampler, are used to maximize the likelihood of the parameter posterior. The power of such a framework has been demonstrated by fitting a stochastic model of osmo-stress induced transcriptional activation in budding yeast using only the mean and the variance of the measured distribution. The fitted model successfully predicted the transient bimodality in gene expression, and provided evidence for the role of chromatin remodeling in regulating population variability. Such moment-based frameworks have the potential of alleviating the curse of dimensionality and the prohibitive computational cost of the integer-valued state-space models [
34,
62]. However, their success may depend on the choice of moment closure method, the accuracy of which depends on both the network structure and the unknown parameters. Aside from such heuristic aspects, existing work seems to warrant certain level of confidence in moment-based parameter estimation, which has been argued to be inadequate to handle complicated distributions [
51].
Others have proposed various moment or density based frameworks to perform parameter estimation, as well as model selection, using cell-to-cell heterogeneity distributional data as input [
63-
65]. Alternative approaches include noise autocorrelation function [
66] and spectral analysis [
67]. Studies examining relationships between noise properties and network topology have been examined as well [
68]. These alternative approaches simply take advantage of different statistical features of the distribution of cellular readouts. Regardless of the specific implementation, inference based on variability measures has been proven advantageous over population mean measures, and even necessary under certain circumstances.
For the mechanistic model selection task, a ‘library’ of models has to be constructed a priori, which is limited to our current biologic knowledge base and the scope of our conception. It would be interesting to further investigate more efficient, exhaustive, and less biased ways of constructing such model candidate library.
The explicit modeling of extrinsic noise statistics enables the computational model to shed light on the roles certain modulators play in the regulation of cell-cell variability, whose mechanism of noise generation is not directly captured by the mechanistic model. We note that regulated cell-to-cell variability falls exactly under the category of extrinsic noise. The fundamental idea in regulated cell-to-cell heterogeneity is that there exists high degree of correlation between molecular or dynamical readouts, which may be hidden from the observer, and single cell behaviors, as well as between molecular readouts (e.g., different signaling molecules). As almost all studies in identifying hidden determinants of cell-to-cell variability take a single-cell readout to single-cell activity mapping approach, adequate and flexible modeling of extrinsic noise may be exactly what we need to bridge the gap between population level manifestation of regulated variability and underlying regulatory mechanisms. The regularity and determinism at the single-cell level may be reflected in the exact pattern of the extrinsic noise parameter distribution [
69].
Functional characterization of population through heterogeneity profiling
So far, we have discussed how cell-to-cell variability can provide useful information on the underlying regulatory mechanisms. However, the complexity of cellular heterogeneity often stretches beyond our knowledge base and cognitive boundaries. Nonetheless, as we will show in this section, such complexity does not keep us from exploiting cell-to-cell variability as an effective tool to capture population-level phenotypes (Figure 2A) that may be difficult to characterize or even ill-posed at the single-cell level.
The awareness of phenotypic heterogeneity at the population or tissue level being a critical issue in disease (e.g., cancer) diagnostics and treatment has been raised substantially [
17,
45,
57,
70-
75]. Subpopulations of cancer cells are believed to underlie the emergence of drug resistance [
57,
70,
72,
75-
77]. It has been suggested that the dynamic maintenance of a specific fractional composition of different subpopulation, each with different cellular characteristics, distinguishes different cancer cell populations [
78]. Most studies on cancer heterogeneity so far aim at associating subpopulations with differentially activated pathways, and consequently targeting them with rationally designed drugs [
57,
73,
74,
79-
83]. Though this strategy may seem promising in many cases, the limitation is still the confinement to the existing biologic knowledge base. As demonstrated in a recent study [
84], functional subpopulations may be masked by ‘blind’ adoption of previously established gating strategy and marker set. This is an example of how hidden subpopulations, and thus information critical to the functional output of the entire population, can be lost in naïve handling of distributional heterogeneity profiles. As discussed in the previous section, cell-to-cell variability patterns contain an enormous amount of information regarding the regulatory network underlying cellular behavior; heterogeneity profiling may therefore serve as a robust predictor of population phenotype, which must emerge from the aggregation of single cell behaviors. This is a largely under-appreciated field. However, we point to a couple of pioneering studies from the Altschuler and Wu group along this line of inquiry.
In one study, Slack et al. developed an imaging-assay based platform to extract measurements on the co-localization patterns of fluorescently labeled cellular components from a large number of cells under diverse conditions. For different cells in the same population under identical environmental condition, such co-localization patterns differ significantly and represent phenotypic variability in the signaling state of individual cells. Here, each cell is represented by a point in the pattern space and the phenotypically heterogeneous population hence corresponds to a distribution in such a space (Figure 2B). By applying an unsupervised learning algorithm to phenotypic distributions of a large number of cell populations under diverse conditions, unique phenotypic “stereotypes” are assigned to cells residing in different regions of the pattern space. Each population is subsequently represented by the fractional composition, or a heterogeneity profile, of each stereotype. They found that drugs of similar mechanism usually induce similar patterns in the population heterogeneity profile (Figure 2). Although the markers used to establish the subpopulation profile may not be directly involved in the pathways relevant to the perturbation of interest (such as drug action), the proximity in the heterogeneity profile does suggest a high likelihood of resemblance in the underlying biologic process between different perturbations. This correspondence can facilitate the generation of novel biologic insights (such as modes of drug action) by referencing to established biologic knowledge base (compared to the heterogeneity profile that is induced by drugs of known action mode). The authors also explored the potential benefit of increasing the number of phenotypic stereotypes (the number of subpopulations), and found that the performance of classification plateaus at a relative small number of subpopulations, around 5. This diminished marginal return may point to the drawback of introducing artificial non-biologic separation in phenotypes. That is, not all dimensions in the heterogeneity profile are informative. One can potentially incur unnecessary computational cost by non-selectively keeping all facets of such distributions. Note that the lack of clear mechanistic understanding of each subpopulation and their association with environmental perturbation does not jeopardize the predictive power of the framework. The operation of the computational platform does not depend on how the heterogeneity profile is generated, as long as the single-cell level measurements can be converted into distributional data.
The same group later used the same platform to demonstrate that basal signaling heterogeneity contains information predictive of population response to perturbation, specifically drug sensitivity of both cancer and noncancerous clonal populations. For multiple clones derived from the same population of parental clone of H460 cancer cells, the resemblance in drug sensitivity highly correlate with the similarity in the pattern of basal signaling heterogeneity (Figure 2). Moreover, the predictive information resides in multiple marker sets examined, implying the potential of using patterns of general nonspecific markers in reflecting “deeper” similarities of underlying regulatory networks. This may not be too surprising given the densely connected nature of cellular networks [
85]. Interestingly, a collection of noncancerous clone populations (human bronchial epithelial cell, HBEC) showed reduced ranges of overall heterogeneity and drug sensitivities in comparison with the cancer cell clones. This observation supports the notion that cancer cells possess much greater and clinically relevant phenotypic heterogeneity compared to normal cells. They also demonstrated the power of heterogeneity profiling in differentiating different cancer types. It was noted that the stereotypes of subpopulations are common between different cell populations, and the difference lies in the exact composition of different subpopulation. Critically, increase or decrease in the fraction of a certain subpopulation may be sufficient to account for functional differences between populations in some cases, resembling conventional gating practices in subpopulation identification [
74]; however, higher order characterization of the entire ensemble of subpopulations is generally required.
It is important to note that molecular states of each specific subpopulation and their relationship to population phenotype (drug response) is far from clear, highlighting the potential of unbiased and systematic heterogeneity profiling of general biomarkers in serving as a robust predictor of complex population phenotype as well as a tool to aid the generation of novel biologic insights. Such mapping may not be captured by characterization of single-cell behavior, as it is by definition a statistical property of the population. Several subsequent studies have applied similar imaging based heterogeneity profiling platforms in different biologic context to classify population phenotypes [
86,
87]. We note that imaging only represents one of many mature and emerging techniques of acquiring single-cell attributes. Others include flow cytometry and single-cell sequencing. In fact, distributional data collected through flow cytometry has been strongly suggested to provide diagnostically useful information in myelodysplastic syndrome and related marrow diseases, where differential mixtures of various cell types and maturation stages influence disease progression and treatment outcomes [
88-
91]. However, in such cases, the analysis of the distribution still focus on the extraction of fractional composition of various pre-defined cell types rather than through an unbiased and unsupervised learning algorithm. On the other hand, several studies have laid some computational ground work for carrying out such systematic unbiased analysis on flow cytometry data [
92-
95]. We speculate that by applying such computational platforms to various biologic and clinical contexts, where analysis of distributional data has been of paramount importance, such as immunology, stem cell biology and cancer, major advances will be on the horizon.
Combining the herein reviewed two aspects of heterogeneity-based inquiries, we think that the major gap in making sense of cell-to-cell variability and population-level phenotype remains in relating single-cell activity output to population-level phenotypes where single-cell phenotypic heterogeneity is functionally relevant. For example, in the aforementioned signal colocalization heterogeneity measure, answering questions like how different signaling pathways are differentially activated within individual cells when such cells reside in different regions of the colocalization feature distribution or whether the colocalization feature can be quantitatively related to the probability of cell death or proliferation is key to transforming the statistical correlation between heterogeneity measures and population phenotypes into a more mechanistically insightful relationship amenable to quantitative modeling at the single-cell level using first principles, as illustrated in the first section of the review.
If such a mapping can be defined, one can potentially predict population-level phenotype more accurately and efficiently by first predicting single-cell activity output using either deterministic or stochastic model simulations, and then applying the mapping function. This union may be confounded by spatial factors in cell-to-cell communication and cooperation in defining population context and phenotype, which dictate and depend on single-cell level activities. We envision that the integration of stochastic modeling, parameter inference, and functional mapping between heterogeneity profile and population phenotype will provide novel insights into how variable single-cell activities are determined by underlying molecular regulatory mechanisms and how they cooperate to generate a complex population.
Higher Education Press and Springer-Verlag Berlin Heidelberg