Quantitative Biology

  • Cover Illustration

    2017, Vol.5  No.3

    Biological studies are like fishing in a sea of information. We draw samples from the boundless sea using technologies such as Hi-C, ChIP-seq and RNA-seq. Even with high-throughput technologies, we can see only a small piece of the whole system. Computational methods are needed to separate the signal from the noise and to obtain a complete picture of the system.

    Table of Contents

Current Issue, Volume 5 Issue 3
PREFACE
PICB: facing the new challenge and opportunity of big data
Zefeng Wang
Quant. Biol.. 2017, 5 (3): 203-204.   DOI: 10.1007/s40484-017-0118-6
MINI REVIEW
Multifaceted roles of complementary sequences on circRNA formation
Qin Yang, Ying Wang, Li Yang
Quant. Biol.. 2017, 5 (3): 205-209.   DOI: 10.1007/s40484-017-0112-7

Background: Circular RNAs (circRNAs) from back-spliced exon(s) are characterized by a covalently closed loop with neither 5′ to 3′ polarity nor a polyadenylated tail. Using specific computational approaches that identify reads mapped to back-splice junctions in a reversed genomic orientation, tens of thousands of circRNAs have recently been re-identified in various cell lines and tissues and across different species. Increasing lines of evidence suggest that back-splicing is catalyzed by the canonical spliceosomal machinery and modulated by cis-elements and trans-factors.

Results: In this mini-review, we discuss our current understanding of the regulation of circRNA biogenesis, focusing mainly on how complementary sequences, especially Alus in human, regulate circRNA formation.

Conclusions: Back-splicing can be significantly facilitated by RNA pairs formed by orientation-opposite complementary sequences that juxtapose the flanking introns of circularized exon(s). RNA pairs formed within individual introns compete with RNA pairs formed across flanking introns in the same gene locus, leading to distinct choices between canonical splicing and back-splicing. Multiple RNA pairs that bracket different circle-forming exons compete for alternative back-splicing selection, resulting in multiple circRNAs generated from a single gene locus.

A large number of circRNAs have recently been re-discovered from thousands of gene loci in various cell lines and tissues and across different species, and have been suggested to play important roles in gene expression regulation through different mechanisms of action. These results expand our understanding of the complexity and diversity of eukaryotic circular RNAs. Recent studies have shown that both cis-elements and trans-factors can promote back-splicing for circRNA biogenesis. We review recent progress on the regulation of circRNA biogenesis, focusing on how cis complementary sequences, especially Alus in human, regulate circRNA formation.
Decoding nervous system by single-cell RNA sequencing
Ganlu Hu, Guang-Zhong Wang
Quant. Biol.. 2017, 5 (3): 210-214.   DOI: 10.1007/s40484-017-0116-3

Background: The mammalian brain is composed of a large number of specialized cell types with diverse molecular compositions, functions and differentiation potentials. The application of recently developed single-cell RNA sequencing (scRNA-seq) technology in this field has provided new insights into this sophisticated system, deepened our understanding of cell type diversity and led to the discovery of novel cell types.

Results: Here we review recent progress in applying this technology to the study of brain cell heterogeneity, adult neurogenesis and brain tumors. We then discuss some current limitations and future directions of using scRNA-seq in the investigation of the nervous system.

Conclusions: We believe the application of single-cell RNA sequencing in neuroscience will accelerate the progress of big brain projects.

The development of single-cell RNA sequencing technology enables researchers to profile the expression of a large number of genes at the single-cell level. Based on this information, a comprehensive survey of the cell types in a given tissue becomes possible, and the marker genes of a specific cell state or developmental stage can be characterized. The application of single-cell RNA sequencing in neuroscience has led to very fruitful results. This review summarizes the recent applications and findings of single-cell RNA sequencing in this field, and highlights its potential challenges and future directions in brain-related research.
REVIEW
Computational tools for Hi-C data analysis
Zhijun Han, Gang Wei
Quant. Biol.. 2017, 5 (3): 215-225.   DOI: 10.1007/s40484-017-0113-6

Background: In eukaryotic genomes, chromatin is not randomly distributed in cell nuclei, but instead is organized into higher-order structures. Emerging evidence indicates that these higher-order chromatin structures play important roles in regulating genome functions such as transcription and DNA replication. With the advancement of 3C (chromosome conformation capture) based technologies, Hi-C has been widely used to investigate genome-wide long-range chromatin interactions during cellular differentiation and oncogenesis. Since the first publication of the Hi-C assay in 2009, many bioinformatic tools have been implemented for processing Hi-C data, from mapping raw reads to normalizing the contact matrix and higher-level interpretation, either providing a whole workflow pipeline or focusing on a particular step.

Results: This article reviews the general Hi-C data processing workflow and the currently popular Hi-C data processing tools. We highlight how these tools are used for a full interpretation of Hi-C results.

Conclusions: The Hi-C assay is a powerful tool for investigating higher-order chromatin structure. Continued development of novel methods for Hi-C data analysis will be necessary for a better understanding of the regulatory function of genome organization.

Hi-C, a derivative of the chromosome conformation capture (3C) technology, has been widely used to dissect chromatin architecture and has greatly contributed to our understanding of the relationship between genome organization and genome function. Computational methods for data analysis are essential for a full interpretation of Hi-C data. In this article, we review the general Hi-C data processing workflow and popular Hi-C data processing tools. We also discuss the challenges and future perspectives regarding the improvement of Hi-C data analysis.
An introduction to computational tools for differential binding analysis with ChIP-seq data
Shiqi Tu, Zhen Shao
Quant. Biol.. 2017, 5 (3): 226-235.   DOI: 10.1007/s40484-017-0111-8

Background: Gene transcription in eukaryotic cells is collectively controlled by a large panel of chromatin associated proteins and ChIP-seq is now widely used to locate their binding sites along the whole genome. Inferring the differential binding sites of these proteins between biological conditions by comparing the corresponding ChIP-seq samples is of general interest, yet it is still a computationally challenging task.

Results: Here, we briefly review the computational tools developed in recent years for differential binding analysis with ChIP-seq data. The methods are extensively classified by their strategy of statistical modeling and scope of application. Finally, a decision tree is presented for choosing proper tools based on the specific dataset.

Conclusions: Computational tools for differential binding analysis with ChIP-seq data vary significantly with respect to their applicability and performance. This review can serve as a practical guide for readers to select appropriate tools for their own datasets.

ChIP-seq experiments are now widely used to locate transcription factor binding sites and histone modification enrichments on a genome-wide scale. Identifying the genomic regions with differential ChIP-seq signals across conditions is of broad biological and biomedical interest, yet it is still a computationally challenging task. By briefly reviewing the computational tools developed for differential binding analysis with ChIP-seq data, we summarize their characteristics in terms of statistical modeling strategy and scope of application, providing a practical guide for readers to select appropriate tools for their own datasets.
Models, methods and tools for ancestry inference and admixture analysis
Kai Yuan, Ying Zhou, Xumin Ni, Yuchen Wang, Chang Liu, Shuhua Xu
Quant. Biol.. 2017, 5 (3): 236-250.   DOI: 10.1007/s40484-017-0117-2

Background: Genetic admixture refers to the process or consequence of interbreeding between two or more previously isolated populations within a species. Compared to many other evolutionary driving forces such as mutations, genetic drift, and natural selection, genetic admixture is a quick mechanism for shaping population genomic diversity. In particular, admixture results in “recombination” of genetic variants that have been fixed in different populations, which has many evolutionary and medical implications.

Results: However, it is challenging to accurately reconstruct population admixture history and to understand population admixture dynamics. In this review, we provide an overview of models, methods, and tools for ancestry inference and admixture analysis.

Conclusions: Many methods and tools used for admixture analysis were originally developed to analyze human data, but they can also be applied directly, or with slight modification, to study non-human species.

Recent advances in genotyping and sequencing technologies have facilitated genome-wide investigation of genetic variation in diverse populations, which has also unveiled prevalent genetic admixture among previously separated populations. Accordingly, many methods have been developed to reconstruct population admixture history and to understand population admixture dynamics. Here we provide an overview of the relevant methods for ancestry inference and admixture analysis that have been published to date.
PERSPECTIVE
The system capacity view of aging and longevity
Jing-Dong J. Han, Lei Hou, Na Sun, Chi Xu, Joseph McDermott, Dan Wang
Quant. Biol.. 2017, 5 (3): 251-259.   DOI: 10.1007/s40484-017-0115-4

Background: Aging is a complex systems level problem that needs a systems level solution. However, system models of aging and longevity, although urgently needed, are still lacking, largely due to the paucity of conceptual frameworks for modeling such a complex process.

Results: We propose that aging can be viewed as a decline in system capacity, defined as the maximum level of output that a system produces to fulfill demands. Classical aging hallmarks and anti-aging strategies can be well-aligned to system capacity. Genetic variants responsible for lifespan variation across individuals or species can also be explained by their roles in system capacity. We further propose promising directions to develop systems approaches to modulate system capacity and thus extend both healthspan and lifespan.

Conclusions: The system capacity model of aging provides an opportunity to examine aging at the systems level. This model predicts that the extent to which aging can be modulated is normally limited by the upper bound of the system capacity of a species. Within such a boundary, aging can be delayed by moderately increasing an individual’s system capacity. Beyond such a boundary, increasing the upper bound is required, which is not unrealistic given the unlimited potential of regenerative medicine in the future, but it requires increasing the capacity of the whole system instead of only part of it.

We draw an analogy between the concept of “system capacity” in aging and that in manufacturing systems. We define system capacity as the maximum level of output that a system produces to fulfill demands. The system capacity of a whole organism is maintained through the capacity of its different components, together with the interactions and regulation among these components. When an animal is young, its system capacity is enough to meet most if not all physiological demands; as it ages, decreased system capacity leads to overload and an increasing failure rate, which in turn undermines system capacity and further increases unfulfilled demand, ultimately leading to death.
ePlant for quantitative and predictive plant science research in the big data era —Lay the foundation for the future model guided crop breeding, engineering and agronomy
Yi Xiao, Tiangen Chang, Qingfeng Song, Shuyue Wang, Danny Tholen, Yu Wang, Changpeng Xin, Guangyong Zheng, Honglong Zhao, Xin-Guang Zhu
Quant. Biol.. 2017, 5 (3): 260-271.   DOI: 10.1007/s40484-017-0110-9

Background: The increase in global population, climate change and the stagnation of crop yield per unit land area in recent decades urgently call for a new approach to support contemporary crop improvement. ePlant is a mathematical model of plant growth and development with a high level of mechanistic detail, designed to meet this challenge.

Results: ePlant integrates modules developed for processes occurring at drastically different temporal (10^-8 to 10^6 seconds) and spatial (10^-10 to 10 meters) scales, incorporating diverse physical, biophysical and biochemical processes including gene regulation, metabolic reaction, substrate transport and diffusion, energy absorption, transfer and conversion, organ morphogenesis, plant environment interaction, etc. Individual modules are developed using a divide-and-conquer approach; modules at different temporal and spatial scales are integrated through transfer variables. We further propose a supervised learning procedure based on information geometry to combine model and data for both knowledge discovery and model extension or advancement. Finally, we discuss the recent formation of a global consortium, which includes experts in plant biology, computer science, statistics, agronomy, phenomics, etc., aiming to expedite the development and application of ePlant or its equivalents by promoting a new model development paradigm in which models are developed as a community effort instead of being driven mainly by the efforts of individual labs.

Conclusions: ePlant, as a major research tool to support quantitative and predictive plant science research, will play a crucial role in future model-guided crop engineering, breeding and agronomy.

ePlant is a mathematical model of plant growth and development with a high level of mechanistic detail. ePlant integrates modules developed for processes occurring at different temporal and spatial scales, incorporating diverse physical, biophysical and biochemical processes. We propose a supervised learning procedure based on information geometry to combine model and data for both knowledge discovery and model extension. A global consortium approach is needed to expedite ePlant development and application. ePlant, as a research tool to support quantitative and predictive plant science research, can play a crucial role in future model-guided crop engineering, breeding and agronomy.
NEWS
Strategic planning for national biomedical big data infrastructure in China
Zhen Wang, Zefeng Wang, Yixue Li
Quant. Biol.. 2017, 5 (3): 272-275.   DOI: 10.1007/s40484-017-0114-5

The promise that big data will revolutionize scientific discovery and technology innovation is now being widely recognized. With the explosive growth of biomedical data, life science is being transformed into a digital science in which novel insights are gained from in-depth data analysis and modeling. Extensive and innovative utilization of biomedical big data is a key to the success of precision medicine. Therefore, constructing a centralized national-level biomedical big data infrastructure becomes crucial and urgent for China. Such infrastructure should achieve superb capacity of safe data storage, standardized data processing and quality control, systematic data integration across multiple types, and in-depth data mining and effective data sharing. Full data chain service including information retrieval, knowledge discovery and technology support can be provided to data centers, research institutes and healthcare industries. Relying on Shanghai Institutes for Biological Sciences, agreements have been signed that a main node of the infrastructure will be located in Shanghai, and a backup node will be set up in Guizhou Province. After a construction period of five years, the infrastructure should greatly enhance China’s core competence in collection, interpretation and application of biomedical big data.

9 articles
