Background: In eukaryotic genomes, chromatin is not randomly distributed in the cell nucleus but is organized into higher-order structures. Emerging evidence indicates that these higher-order chromatin structures play important roles in regulating genome functions such as transcription and DNA replication. With advances in 3C (chromosome conformation capture)-based technologies, Hi-C has been widely used to investigate genome-wide long-range chromatin interactions during cellular differentiation and oncogenesis. Since the first publication of the Hi-C assay in 2009, many bioinformatic tools have been implemented for processing Hi-C data, from mapping raw reads to normalizing contact matrices and high-level interpretation, either providing a complete workflow pipeline or focusing on a particular step.
Results: This article reviews the general Hi-C data processing workflow and the currently popular Hi-C data processing tools. We highlight how these tools are used for a full interpretation of Hi-C results.
Conclusions: The Hi-C assay is a powerful tool for investigating higher-order chromatin structure. Continued development of novel methods for Hi-C data analysis will be necessary for a better understanding of the regulatory function of genome organization.
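As a concrete illustration of the contact-matrix normalization step mentioned above, the sketch below implements a minimal iterative-correction (ICE-style) balancing of a raw Hi-C contact matrix. It is a generic illustration rather than the implementation of any specific tool reviewed here; the toy matrix and tolerance are arbitrary choices.

```python
import numpy as np

def ice_balance(raw, n_iter=50, tol=1e-6):
    """Minimal ICE-style balancing of a symmetric Hi-C contact matrix.

    Iteratively divides rows/columns by their coverage until every bin
    has (approximately) equal total contact counts.
    """
    m = raw.astype(float).copy()
    bias = np.ones(m.shape[0])
    for _ in range(n_iter):
        coverage = m.sum(axis=1)
        coverage /= coverage[coverage > 0].mean()  # rescale to mean 1
        coverage[coverage == 0] = 1.0              # leave empty bins alone
        m /= np.outer(coverage, coverage)
        bias *= coverage
        if np.abs(coverage - 1.0).max() < tol:
            break
    return m, bias

# Toy 4x4 symmetric contact matrix with uneven coverage
raw = np.array([[10.0,  5.0,  2.0,  1.0],
                [ 5.0, 20.0,  8.0,  2.0],
                [ 2.0,  8.0, 15.0,  6.0],
                [ 1.0,  2.0,  6.0, 12.0]])
balanced, bias = ice_balance(raw)
print(balanced.sum(axis=1))  # near-equal row sums after balancing
```

After balancing, per-bin biases (e.g., from sequencing depth or mappability) are factored out of the matrix, which is the usual precondition for the downstream interpretation steps the reviewed tools perform.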
Background: Aging is a complex systems-level problem that needs a systems-level solution. However, systems models of aging and longevity, although urgently needed, are still lacking, largely due to the paucity of conceptual frameworks for modeling such a complex process.
Results: We propose that aging can be viewed as a decline in system capacity, defined as the maximum level of output that a system produces to fulfill demands. Classical aging hallmarks and anti-aging strategies align well with system capacity. Genetic variants responsible for lifespan variation across individuals or species can also be explained by their roles in system capacity. We further propose promising directions for developing systems approaches to modulate system capacity and thus extend both healthspan and lifespan.
Conclusions: The system capacity model of aging provides an opportunity to examine aging at the systems level. This model predicts that the extent to which aging can be modulated is normally limited by the upper bound of the system capacity of a species. Within such a boundary, aging can be delayed by moderately increasing an individual’s system capacity. Beyond such a boundary, increasing the upper bound is required, which is not unrealistic given the potential of regenerative medicine in the future, but doing so requires increasing the capacity of the whole system rather than only part of it.
Background: The shortage of organs available for transplantation is the major obstacle hindering the application of regenerative medicine, and it has become a pressing problem for a growing number of patients. The recent development and application of 3D printing techniques in biological research (bioprinting) have revolutionized tissue engineering methods and become a promising solution for tissue regeneration.
Results: In this review, we summarize current applications of bioprinting for producing tissues and organoids, and discuss the future directions and challenges of 3D bioprinting.
Conclusions: Currently, 3D bioprinting is capable of generating patient-specific bone, cartilage, vascular networks, hepatic units and other simple components/tissues, yet purely cell-based functional organs remain to be achieved.
Background: Restricted Boltzmann machines (RBMs) are endowed with the universal power of modeling (binary) joint distributions. Meanwhile, owing to their constrained network structure, training RBMs poses fewer difficulties with respect to approximation and inference. However, little work has been done to fully exploit the capacity of these models for analyzing cancer data, e.g., cancer genomic, transcriptomic, proteomic and epigenomic data. Moreover, in cancer data analysis the number of features/predictors is usually much larger than the sample size, a situation known as the “p ≫ N” problem that is also ubiquitous in other bioinformatics and computational biology fields. The “p ≫ N” problem makes the bias-variance trade-off even more crucial when designing statistical learning methods. Yet, to date, few RBM models have been specifically designed to address this issue.
Methods: We propose a novel RBM model, elastic restricted Boltzmann machines (eRBMs), which incorporates an elastic regularization term into the likelihood function to balance model complexity and sensitivity. Building on the classic contrastive divergence (CD) algorithm, we develop the elastic contrastive divergence (eCD) algorithm, which can train eRBMs efficiently.
Results: We establish several theoretical results on the soundness and properties of our model. We further evaluate its power on a challenging task: predicting dichotomized survival time from the molecular profiles of tumors. The test results show that the prediction performance of eRBMs is substantially superior to that of state-of-the-art methods.
Conclusions: The proposed eRBMs are capable of dealing with “p ≫ N” problems and offer superior modeling performance over traditional methods. Our novel model is a promising approach for future cancer data analysis.
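To make the regularization idea concrete, the block below sketches one plausible form of the penalized objective and the corresponding eCD weight update. The exact eRBM formulation is given in the paper; the penalty weights λ1 and λ2 here are illustrative.

```latex
% Illustrative elastic-net-penalized RBM objective (not necessarily the
% exact eRBM formulation): maximize the data log-likelihood minus an
% elastic penalty on the weight matrix W.
\mathcal{L}(\theta) = \sum_{n=1}^{N} \log p\!\left(\mathbf{v}^{(n)};\theta\right)
  - \lambda_1 \lVert W \rVert_1 - \lambda_2 \lVert W \rVert_2^2

% A CD-k-based update then combines the usual contrastive-divergence
% gradient estimate with the (sub)gradient of the penalty:
\Delta W \propto \langle \mathbf{v}\mathbf{h}^{\top} \rangle_{\mathrm{data}}
  - \langle \mathbf{v}\mathbf{h}^{\top} \rangle_{\mathrm{CD}\text{-}k}
  - \lambda_1 \,\mathrm{sign}(W) - 2\lambda_2 W
```

The L1 term drives many weights exactly to zero, controlling variance when p ≫ N, while the L2 term keeps correlated features grouped; this is the usual rationale for elastic-net penalties.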
Background: Synthetic microbial consortia are conglomerations of genetically engineered microbes programmed to cooperatively bring about population-level phenotypes. By coordinating their activity, the constituent strains can display emergent behaviors that are difficult to engineer into isogenic populations. To do so, strains are engineered to communicate with one another through intercellular signaling pathways that depend on cell density.
Methods: Here, we used computational modeling to examine how the behavior of synthetic microbial consortia results from the interplay between population dynamics governed by cell growth and internal transcriptional dynamics governed by cell-cell signaling. Specifically, we examined a synthetic microbial consortium in which two strains each produce signals that down-regulate transcription in the other. Within a single strain this regulatory topology is called a “co-repressive toggle switch” and can lead to bistability.
Results: We found that in co-repressive synthetic microbial consortia the existence and stability of different states depend on population-level dynamics. As the two strains passively compete for space within the colony, their relative fractions fluctuate and thus alter the strengths of the intercellular signals. These fluctuations drive the consortium to alternative equilibria. Moreover, if the growth rates of the strains depend on their transcriptional states, a further feedback loop is created that can generate oscillations.
Conclusions: Our findings demonstrate that the dynamics of microbial consortia cannot be predicted from their regulatory topologies alone, but are also determined by interactions between the strains. Therefore, when designing synthetic microbial consortia that use intercellular signaling, one must account for growth variations caused by protein production.
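To illustrate the interplay described above, the sketch below integrates a minimal ODE model of a two-strain co-repressive consortium: each strain's signal represses the other strain's transcription, signal strength scales with the producing strain's population fraction, and growth rates depend on transcriptional burden. All equations and parameter values are illustrative assumptions, not the model from the study.

```python
from scipy.integrate import solve_ivp

# Minimal illustrative model (all parameters hypothetical):
# x1, x2 : repressor levels in strains 1 and 2
# f      : population fraction of strain 1 (strain 2 has fraction 1 - f)
alpha, n, d = 10.0, 2.0, 1.0   # max production, Hill coefficient, dilution
g1, g2 = 1.0, 1.0              # baseline growth rates
eps = 0.3                      # growth penalty per unit repressor

def rhs(t, y):
    x1, x2, f = y
    # Intercellular signals scale with the producing strain's fraction
    s1, s2 = f * x1, (1.0 - f) * x2
    dx1 = alpha / (1.0 + s2**n) - d * x1   # strain 2's signal represses x1
    dx2 = alpha / (1.0 + s1**n) - d * x2   # strain 1's signal represses x2
    # Growth rates depend on transcriptional burden, closing a second loop
    mu1, mu2 = g1 - eps * x1, g2 - eps * x2
    df = f * (1.0 - f) * (mu1 - mu2)       # replicator dynamics for fractions
    return [dx1, dx2, df]

sol = solve_ivp(rhs, (0.0, 100.0), [1.0, 0.5, 0.5])
print(sol.y[:, -1])  # final repressor levels and strain-1 fraction
```

Coupling the fraction dynamics (df) to the transcriptional states is what closes the extra feedback loop; with the fraction held fixed, the repression-only dynamics would simply settle into one of the toggle's equilibria.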
Background: The CRISPR-Cas system is a widespread prokaryotic defense system that targets and cleaves invading nucleic acids, such as plasmids or viruses. So far, a great number of studies have focused on the components and mechanisms of this system; however, direct visualization of CRISPR-Cas degrading invading DNA in real time has not yet been achieved at the single-cell level.
Methods: In this study, we fluorescently label phage lambda DNA in vivo and track the labeled DNA over time to characterize DNA degradation at the single-cell level.
Results: At the bulk level, the lysogenization frequency of cells harboring CRISPR plasmids decreases significantly compared to cells with a non-CRISPR control. At the single-cell level, host cells with CRISPR activity are unperturbed by phage infection and maintain normal growth like uninfected cells; the efficiency of our anti-lambda CRISPR system is around 26%. Over the course of the time-lapse movies, the average fluorescence of invasive phage DNA decays more rapidly in cells with CRISPR activity than in cells without, and the phage DNA is fully degraded by around 44 minutes on average. Moreover, the degradation appears to be independent of cell size and of the phage DNA ejection site, suggesting that Cas proteins are dispersed in sufficient quantities throughout the cell.
Conclusions: With the CRISPR-Cas visualization system we developed, we are able to examine and characterize how a CRISPR system degrades invading phage DNA at the single-cell level. This work provides direct evidence and improves the current understanding of how CRISPR breaks down invading DNA.
Background: Single-cell RNA sequencing (scRNA-seq) is an emerging technology that enables high-resolution detection of heterogeneity between cells. One important application of scRNA-seq data is to detect differential expression (DE) of genes. Currently, some researchers still apply DE analysis methods developed for bulk RNA-seq data to single-cell data, while new methods tailored to scRNA-seq data have also been developed. Because bulk and single-cell RNA-seq data have different characteristics, a systematic evaluation of the two types of methods on scRNA-seq data is needed.
Results: In this study, we conducted a series of experiments on scRNA-seq data to quantitatively evaluate 14 popular DE analysis methods, including both traditional methods developed for bulk RNA-seq data and new methods specifically designed for scRNA-seq data. We derived observations and recommendations for applying the methods in different situations.
Conclusions: DE analysis methods for scRNA-seq data should be chosen with great caution, taking the characteristics of the data into account. Different strategies should be adopted for data with different sample sizes and/or different strengths of the expected signals. Several methods designed for scRNA-seq data show advantages in some respects, and DEGSeq tends to outperform other methods with respect to consistency, reproducibility and accuracy of predictions on scRNA-seq data.
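For orientation, the sketch below shows the shape of the DE problem on a simulated count matrix, using a simple per-gene Wilcoxon rank-sum test with Benjamini-Hochberg correction. This generic baseline is for illustration only and is not one of the 14 evaluated methods; the simulation parameters are arbitrary.

```python
import numpy as np
from scipy.stats import ranksums, false_discovery_control  # SciPy >= 1.11

rng = np.random.default_rng(0)
n_genes, n_cells = 200, 60
# Simulated counts: negative binomial mimics overdispersed scRNA-seq data
counts = rng.negative_binomial(2, 0.3, size=(n_genes, n_cells)).astype(float)
counts[:20, 30:] *= 3.0        # plant signal: first 20 genes up in group B
group_a, group_b = counts[:, :30], counts[:, 30:]

# Per-gene two-sample test between the two cell groups
pvals = np.array([ranksums(group_a[g], group_b[g]).pvalue
                  for g in range(n_genes)])
qvals = false_discovery_control(pvals)   # Benjamini-Hochberg FDR
print("genes called DE at FDR 0.05:", int((qvals < 0.05).sum()))
```

Dedicated scRNA-seq methods differ from such a baseline chiefly in how they model dropouts, overdispersion and small sample sizes, which is where the evaluated tools diverge on real data.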
Background: Symmetry of biological structures can be thought of as the repetition of their parts in different positions and orientations. Asymmetry analysis therefore focuses on identifying and measuring the location and extent of departures from symmetry in such structures. In the context of geometric morphometrics, a key step when studying morphological variation is the estimation of the symmetric shape. The standard procedure uses the least-squares Procrustes superimposition, which, by averaging shape differences, often underestimates departures from symmetry, thus leading to an inaccurate description of the asymmetry pattern. Moreover, the corresponding asymmetry values are neither geometrically intuitive nor visually perceivable.
Methods: In this work, a resistant method for landmark-based asymmetry analysis of individual bilaterally symmetric structures in 2D is introduced. A geometric derivation of this new approach is offered, and its advantages over the standard method are examined and discussed through a few illustrative examples.
Results: Experimental tests on both artificial and real data show that asymmetry is more effectively measured by the resistant method because the underlying symmetric shape is better estimated. Therefore, the most asymmetric (respectively, symmetric) landmarks are better identified through their large (respectively, small) residuals. The percentage of asymmetry accounted for by each landmark is an additional revealing measure the new method offers; it agrees with the displayed results and helps in their biological interpretation.
Conclusions: The resistant method is a useful exploratory tool for analyzing shape asymmetry in 2D, and it may be the preferable method whenever a non-homogeneous deformation of bilaterally symmetric structures is possible. By offering a more detailed and rather exhaustive explanation of the asymmetry pattern, this new approach will hopefully contribute to improving the quality of biological and developmental inferences.
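To clarify the standard procedure that the resistant method is compared against, the sketch below estimates the symmetric shape of a 2D landmark configuration by reflecting it, aligning the reflection to the original with a least-squares Procrustes (Kabsch) fit, and averaging; per-landmark residuals then quantify asymmetry. The landmark data and pairing convention are illustrative assumptions, not data from the study.

```python
import numpy as np

def procrustes_align(source, target):
    """Least-squares rigid alignment (Kabsch) of source onto target (2D)."""
    sc, tc = source - source.mean(0), target - target.mean(0)
    u, _, vt = np.linalg.svd(tc.T @ sc)
    r = u @ vt
    if np.linalg.det(r) < 0:           # avoid introducing an extra reflection
        u[:, -1] *= -1.0
        r = u @ vt
    return sc @ r.T + target.mean(0)

# Illustrative bilateral configuration (hypothetical coordinates)
x = np.array([[ 0.00,  2.0],    # 0: midline landmark
              [-1.00,  1.0],    # 1: left,  paired with 2
              [ 1.05,  0.9],    # 2: right, paired with 1
              [-1.50, -0.5],    # 3: left,  paired with 4
              [ 1.40, -0.6]])   # 4: right, paired with 3
pairing = [0, 2, 1, 4, 3]       # landmark relabeling after reflection

reflected = x * np.array([-1.0, 1.0])     # mirror across the y-axis
reflected = reflected[pairing]            # relabel the paired landmarks
aligned = procrustes_align(reflected, x)  # least-squares Procrustes fit
symmetric = (x + aligned) / 2.0           # estimated symmetric shape
residuals = np.linalg.norm(x - symmetric, axis=1)
print("per-landmark asymmetry:", residuals.round(3))
print("% of total asymmetry:", (100.0 * residuals / residuals.sum()).round(1))
```

Because the least-squares fit averages over all landmarks, a single strongly asymmetric landmark inflates every residual; the resistant method's robust estimation of the symmetric shape is designed to avoid exactly this effect.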
X-ray free-electron lasers (XFELs) have advanced research in structural biology by exploiting their ultra-short, bright X-ray pulses. The resulting “diffraction before destruction” experimental approach allows data collection to outrun radiation damage, a crucial factor that has often limited resolution in the structure determination of biological molecules. Since the first hard X-ray laser, the Linac Coherent Light Source (LCLS) at SLAC, commenced operation in 2009, serial femtosecond crystallography (SFX) has rapidly matured into a method for the structural analysis of nano- and micro-crystals. At the same time, single-particle structure determination by coherent diffractive imaging, with one particle (such as a virus) per shot, has been under intense development. In this review we describe these applications of X-ray lasers in structural biology, focusing particularly on aspects of data analysis relevant to the computational research community. We summarize the key problems in data analysis and model reconstruction, and provide perspectives on future research using computational methods.
Genome-wide chromatin interaction analysis has become important for understanding the 3D topological structure of a genome and for linking distal cis-regulatory elements to their target genes. Compared to the Hi-C method, chromatin interaction analysis by paired-end tag sequencing (ChIA-PET) is unique in that one can interrogate thousands of chromatin interactions in a genome mediated by a specific protein of interest, at high resolution and reasonable cost. However, because of the noisy nature of the data, efficient analytical tools have become necessary. Here, we review several computational methods we recently developed and compare them with other existing methods. Our intention is to help readers better understand ChIA-PET results and to guide users in selecting the most appropriate tools for their own projects.