Quantitative Biology

Online First
The manuscripts published below will continue to be available from this page until they are assigned to an issue.
Identification of candidate disease genes in patients with common variable immunodeficiency
Guojun Liu, Mikhail A. Bolkov, Irina A. Tuzankina, Irina G. Danilova
Quant. Biol.    https://doi.org/10.1007/s40484-019-0174-9

Background: Common variable immunodeficiency (CVID), the most prevalent form of primary immunodeficiency (PID), is characterized by hypogammaglobulinemia and recurrent infections. Understanding the protein-protein interaction (PPI) networks of CVID genes and identifying candidate CVID genes are critical steps toward the early diagnosis of CVID. Here, we aimed to investigate the PPI networks of CVID genes and to identify candidate CVID genes using computational techniques.

Methods: Network density and biological distance were used to study PPI data for CVID and PID genes obtained from the STRING database. Gene expression data of patients with CVID were obtained from the Gene Expression Omnibus, and then Pearson’s correlation coefficient, a PPI database, and Kyoto Encyclopedia of Genes and Genomes were used to identify candidate CVID genes. We then evaluated our predictions and identified differentially expressed CVID genes.
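
The two quantities named in the Methods can be illustrated with a minimal sketch. It assumes network density is the usual fraction of possible undirected gene pairs that are connected, which the abstract does not spell out, and the gene labels and edges are hypothetical toy values, not STRING data.

```python
from itertools import combinations
from math import sqrt


def network_density(genes, edges):
    """Fraction of possible undirected gene pairs that are linked.

    Assumed definition: density = E / (n * (n - 1) / 2), counting only
    edges whose two endpoints both belong to `genes`.
    """
    genes = set(genes)
    n = len(genes)
    if n < 2:
        return 0.0
    edge_set = {frozenset(edge) for edge in edges}
    linked = sum(
        1 for a, b in combinations(sorted(genes), 2)
        if frozenset((a, b)) in edge_set
    )
    return linked / (n * (n - 1) / 2)


def pearson(x, y):
    """Pearson's correlation coefficient of two equal-length profiles."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("profiles must have the same length (>= 2)")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    if sd_x == 0 or sd_y == 0:
        raise ValueError("correlation is undefined for a constant profile")
    return cov / (sd_x * sd_y)


# Hypothetical toy network: 3 of the 6 possible pairs are linked.
toy_genes = ["A", "B", "C", "D"]
toy_edges = [("A", "B"), ("B", "C"), ("A", "C")]
print(network_density(toy_genes, toy_edges))  # 0.5
```

Under this definition, a gene set whose members mostly interact with one another scores close to 1, while a loosely connected set scores close to 0.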

Results: The majority of CVID genes are characterized by a high network density and a small biological distance, whereas most PID genes are characterized by a low network density and a large biological distance, indicating that CVID genes are more functionally similar to each other and interact more closely with one another than PID genes do. Subsequently, we identified 172 candidate CVID genes that have biological functions similar to those of known CVID genes, eight of which were recently reported as CVID-related genes. MYC, a candidate gene, was down-regulated in CVID duodenal biopsies but up-regulated in blood samples compared with levels in healthy controls.

Conclusion: Our findings will aid in better understanding the complex of CVID genes and may further facilitate the early diagnosis of CVID.

EpiFIT: functional interpretation of transcription factors based on combination of sequence and epigenetic information
Shaoming Song, Hongfei Cui, Shengquan Chen, Qiao Liu, Rui Jiang
Quant. Biol.    https://doi.org/10.1007/s40484-019-0175-8

Background: Transcription factors are among the most important regulators in the transcriptional process. Nevertheless, the functional interpretation of transcription factors remains a major challenge because methods that relate regulatory regions to genes perform poorly. Epigenetic information, such as chromatin accessibility, contains genome-wide knowledge about transcription regulation and thus may shed light on the functional interpretation of transcription factors.

Methods: We propose EpiFIT (Epigenetic based Functional Interpretation of Transcription factors), a tool to infer functions of transcription factors from ChIP-seq data. Briefly, we adopt a variable distance rule to establish associations between regulatory regions and nearby genes. The associations are then filtered to ensure that the remaining regions and their associated genes are co-open. Finally, GO enrichment is applied to all related genes, and a ranked list of GO terms is provided as the functional interpretation.
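
The final enrichment and ranking step can be sketched as follows. The abstract does not say which enrichment test EpiFIT uses; this sketch assumes a one-sided hypergeometric test, a common choice for GO enrichment, and the gene identifiers, term names, and function names are hypothetical.

```python
from math import comb


def hypergeom_pvalue(k, n, K, N):
    """P(X >= k) when n genes are drawn from N, of which K carry the term."""
    upper = min(n, K)
    total = comb(N, n)
    return sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1)) / total


def rank_terms(selected, annotations, background):
    """Rank terms by enrichment p-value among the selected genes.

    `annotations` maps a term name to the genes annotated with it.
    Genes outside `background` are ignored.
    """
    background = set(background)
    selected = set(selected) & background
    results = []
    for term, genes in annotations.items():
        annotated = set(genes) & background
        overlap = len(selected & annotated)
        p = hypergeom_pvalue(overlap, len(selected), len(annotated), len(background))
        results.append((term, overlap, p))
    return sorted(results, key=lambda row: row[2])


# Hypothetical toy input: 10 background genes, 2 genes kept after filtering.
background = [f"g{i}" for i in range(1, 11)]
annotations = {
    "term_a": ["g1", "g2", "g3"],
    "term_b": ["g4", "g5", "g6", "g7", "g8", "g9"],
}
for term, overlap, p in rank_terms(["g1", "g2"], annotations, background):
    print(term, overlap, round(p, 4))
```

In this toy case both selected genes fall in term_a, so it ranks ahead of term_b, which has no overlap and a p-value of 1.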

Results: We first examined the correlation in chromatin openness between regulatory regions and their associated genes. This correlation helps EpiFIT purify regulatory region-gene associations. By evaluating EpiFIT on a set of real data, we demonstrated that EpiFIT outperforms existing methods in precisely interpreting transcription factor functions. We further verified the efficiency of openness in interpretation and the ability of EpiFIT to build distal region-gene associations.

Conclusion: EpiFIT is a powerful tool for interpreting transcription factor functions. We believe EpiFIT will facilitate the functional interpretation of other regulatory elements and thus open a new door to understanding regulatory mechanisms.

Availability: The application is freely accessible at bioinfo.au.tsinghua.edu.cn/openness/EpiFIT/.

Understanding traditional Chinese medicine via statistical learning of expert-specific Electronic Medical Records
Yang Yang, Qi Li, Zhaoyang Liu, Fang Ye, Ke Deng
Quant. Biol.    https://doi.org/10.1007/s40484-019-0173-x

Background: Traditional Chinese medicine (TCM) has recently attracted considerable attention from various disciplines. However, TCM remains mysterious because of its unique philosophy and theoretical thinking, and the lack of high-quality data makes it difficult to understand TCM thoroughly. In this study, we introduce the Zhou Archive, a large-scale database of expert-specific Electronic Medical Records (EMRs) containing information about 73,000+ visits to one TCM doctor over more than 35 years. Covering the full spectrum of the diagnosis-treatment model behind TCM practice, the archive provides an opportunity to understand TCM from a data-driven perspective.

Methods: Through a series of data processing steps, we transformed the semi-structured EMRs in the archive into a well-structured feature table. Based on this feature table, we performed a series of statistical analyses to learn principles of TCM clinical practice from the archive, including correlation analysis, enrichment analysis, embedding analysis and association pattern discovery.
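
As a rough illustration of association pattern discovery on a binary feature table, the sketch below computes support, confidence and lift for a single rule. The abstract does not describe the authors' procedure, so these standard association-rule measures are an assumption, and the visit records and feature names are hypothetical.

```python
def rule_stats(records, antecedent, consequent):
    """Support, confidence and lift for the rule antecedent -> consequent.

    `records` is a list of sets, each holding the features present in
    one visit (one row of a binary feature table).
    """
    n = len(records)
    if n == 0:
        raise ValueError("at least one record is required")
    n_a = sum(1 for r in records if antecedent in r)
    n_c = sum(1 for r in records if consequent in r)
    n_ac = sum(1 for r in records if antecedent in r and consequent in r)
    support = n_ac / n
    confidence = n_ac / n_a if n_a else 0.0
    lift = confidence / (n_c / n) if n_c else 0.0
    return support, confidence, lift


# Hypothetical visits: each set lists a symptom and/or a prescribed herb.
visits = [
    {"cough", "herb_x"},
    {"cough", "herb_x"},
    {"cough"},
    {"herb_y"},
]
support, confidence, lift = rule_stats(visits, "cough", "herb_x")
print(support, round(confidence, 3), round(lift, 3))
```

A lift above 1 suggests the two features co-occur more often than expected if they were independent, which is one simple way to flag candidate diagnosis-treatment patterns for closer inspection.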

Results: A structured feature table of 14,000+ features is generated at the end of the proposed data processing procedure, with a feature codebook, a term dictionary and a term-feature map as byproducts. Statistical analysis of the feature table reveals underlying principles of the diagnosis-treatment model of TCM, helping us better understand TCM practice from a data-driven perspective.

Conclusion: Expert-specific EMRs provide opportunities to understand TCM from a data-driven perspective. Taking advantage of recent progress in natural language processing (NLP) for Chinese, we can process a large number of TCM EMRs efficiently to gain insights via statistical analysis.

Computational prediction and functional analysis of arsenic-binding proteins in human cells
Shichao Pang, Junchen Yang, Yilei Zhao, Yixue Li, Jingfang Wang
Quant. Biol.    https://doi.org/10.1007/s40484-019-0169-6

Background: Arsenic has broad anti-cancer activity against hematologic malignancies and solid tumors. To systematically understand the biological functions of arsenic, we need to identify arsenic-binding proteins in human cells. However, owing to the lack of effective theoretical tools and experimental methods, only a few arsenic-binding proteins have been identified.

Methods: Based on the crystal structure of ArsM, we generated a single mutation free energy profile for arsenic binding using free energy perturbation methods. Multiple validations indicate that our computational model can predict arsenic-binding proteins with desirable accuracy. We subsequently applied this computational model to scan the entire human genome and identify all potential arsenic-binding proteins.

Results: The computationally predicted arsenic-binding proteins show a wide range of biological functions, especially in signal transduction pathways. In these pathways, arsenic directly binds to key factors (e.g., Notch receptors, Notch ligands, Wnt family proteins, TGF-beta, and their interacting proteins) and significantly inhibits their enzymatic activities, with a crucial impact on the related signaling pathways.

Conclusions: Arsenic has a significant impact on signal transduction in cells. Arsenic binding can lead to dysfunction of the target proteins, with crucial impacts on both signaling pathways and gene transcription. We hope that the computationally predicted arsenic-binding proteins and the functional analysis can provide novel insight into the biological functions of arsenic, revealing a mechanism for its broad anti-cancer activity.
