Jun 2019, Volume 7 Issue 2

Cover illustration

  • Competition among different types of biological molecules for limited resources is ubiquitous in biological processes. Molecular competition causes intricate behaviors in resource allocation, and thus introduces a hidden layer of regulatory mechanism by connecting components without direct physical interactions. Wei et al. built a unified coarse-grained competition motif model to quantitatively understand and predict diverse phenomena mediated by molecular competition.

    Jiyu Fan, Ailing Fu, Le Zhang

    Background: Because molecular docking can greatly improve efficiency and reduce research cost, it has in recent years become a key tool in computer-assisted drug design for predicting binding affinity and analyzing interaction modes.

    Results: This study introduces the key principles, procedures and widely used applications of molecular docking. It also compares the commonly used docking applications and recommends the research areas each is suited to. Lastly, it briefly reviews the latest progress in molecular docking, such as integrated methods and deep learning.

    Conclusion: Limited by incomplete molecular structures and the shortcomings of scoring functions, current docking applications are not accurate enough to predict binding affinity. However, the current molecular docking technique could be improved by integrating big biological data into the scoring function.

    Xingyu Liao, Min Li, You Zou, Fang-Xiang Wu, Yi Pan, Jianxin Wang

    Background: Next-generation sequencing (NGS) technologies have fostered an unprecedented proliferation of high-throughput sequencing projects and a concomitant development of novel algorithms for the assembly of short reads. However, numerous technical or computational challenges in de novo assembly still remain, although many new ideas and solutions have been suggested to tackle the challenges in both experimental and computational settings.

    Results: In this review, we first briefly introduce some of the major challenges faced by NGS sequence assembly. Then, we analyze the characteristics of various sequencing platforms and their impact on assembly results. After that, we classify de novo assemblers according to their frameworks (overlap graph-based, de Bruijn graph-based and string graph-based), and introduce the characteristics of each assembly tool and the scenarios it is suited to. Next, we introduce in detail the solutions to the main challenges of de novo assembly of next-generation sequencing data, single-cell sequencing data and single-molecule sequencing (SMS) data. Finally, we discuss the application of SMS long reads in solving problems encountered in NGS assembly.

    Conclusions: This review not only gives an overview of the latest methods and developments in assembly algorithms, but also provides guidelines to determine the optimal assembly algorithm for a given input sequencing data type.
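
    To make the de Bruijn graph-based framework named in this abstract more concrete, the following is a minimal sketch of the general idea only, not the method of any particular assembler covered by the review. The reads, the value of k, and all function names are hypothetical, and the sketch assumes error-free reads with no repeats.

```python
from collections import defaultdict


def distinct_kmers(reads, k):
    """Collect every distinct k-mer that occurs in the reads."""
    kmers = set()
    for read in reads:
        for i in range(len(read) - k + 1):
            kmers.add(read[i:i + k])
    return kmers


def build_graph(kmers):
    """Each k-mer is an edge from its (k-1)-mer prefix to its (k-1)-mer suffix."""
    graph = defaultdict(set)
    for kmer in kmers:
        graph[kmer[:-1]].add(kmer[1:])
    return graph


def walk_unambiguous(graph, start):
    """Extend a contig from `start` while the current node has exactly one successor."""
    contig = start
    node = start
    seen = {start}
    while len(graph.get(node, ())) == 1:
        (nxt,) = graph[node]
        if nxt in seen:  # stop instead of looping forever on a cycle
            break
        contig += nxt[-1]
        seen.add(nxt)
        node = nxt
    return contig


# Hypothetical overlapping reads (illustrative data only).
reads = ["ATGGCG", "GGCGTG", "CGTGCA"]
graph = build_graph(distinct_kmers(reads, 4))
print(walk_unambiguous(graph, "ATG"))  # prints ATGGCGTGCA
```

    Real assemblers must additionally handle sequencing errors, repeats, and uneven coverage, which are among the challenges the review discusses.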

    Lei Wei, Ye Yuan, Tao Hu, Shuailin Li, Tianrun Cheng, Jinzhi Lei, Zhen Xie, Michael Q. Zhang, Xiaowo Wang

    Background: Molecular competition brings about trade-offs of shared limited resources among the cellular components, and thus introduces a hidden layer of regulatory mechanism by connecting components even without direct physical interactions. Several molecular competition scenarios have been observed recently, but there is still a lack of systematic quantitative understanding to reveal the essence of molecular competition.

    Methods: Here, by abstracting the analogous competition mechanism behind diverse molecular systems, we built a unified coarse-grained competition motif model to systematically integrate experimental evidence from these processes, and analyzed the general properties they share, from steady-state behavior to dynamic responses.

    Results: We could predict in what molecular environments competition would exhibit threshold behavior or display a negative linear dependence. We quantified how competition can shape the regulator-target dose-response curve, modulate dynamic response speed, control target expression noise, and introduce correlated fluctuations between targets.

    Conclusions: This work uncovered the complexity and generality of the molecular competition effect as a hidden layer of gene regulatory networks, and therefore provided a unified insight and a theoretical framework to understand and employ competition in both natural and synthetic systems.
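
    As a rough illustration of how competition for a shared regulator can produce threshold behavior and couple targets that never interact directly, here is a minimal sketch based on a textbook 1:1 equilibrium binding model. It is not the coarse-grained motif model of the paper. It assumes two targets that bind one regulator with the same dissociation constant, and every name and number in it is hypothetical.

```python
import math


def free_fraction(total_targets, regulator_total, k_d):
    """Fraction of target molecules left unbound at 1:1 binding equilibrium.

    Solves T_free^2 + (K_d + R_tot - T_tot) * T_free - K_d * T_tot = 0
    for the positive root, then divides by the total target amount.
    """
    if total_targets <= 0:
        raise ValueError("total_targets must be positive")
    b = k_d + regulator_total - total_targets
    free_total = (-b + math.sqrt(b * b + 4.0 * k_d * total_targets)) / 2.0
    return free_total / total_targets


def free_target(t1_total, t2_total, regulator_total, k_d):
    """Free amount of target 1 when target 2 competes for the same regulator.

    With equal affinities (an assumption of this sketch), both targets share
    the same unbound fraction, so the pool can be treated as one species.
    """
    return t1_total * free_fraction(t1_total + t2_total, regulator_total, k_d)


# Illustrative numbers only: strong binding (small K_d), regulator total of 10.
for competitor in (0.0, 5.0, 10.0, 20.0):
    print(competitor, round(free_target(5.0, competitor, 10.0, 1e-3), 4))
```

    In this sketch, target 1 stays almost fully bound until the combined target pool exceeds the regulator pool, after which adding more competitor releases target 1.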

    Shashank Singh, Yang Yang, Barnabás Póczos, Jian Ma

    Background: In the human genome, distal enhancers are involved in regulating target genes through proximal promoters by forming enhancer-promoter interactions. Although recently developed high-throughput experimental approaches have allowed us to recognize potential enhancer-promoter interactions genome-wide, it is still largely unclear to what extent the sequence-level information encoded in our genome helps guide such interactions.

    Methods: Here we report a new computational method (named “SPEID”) using deep learning models to predict enhancer-promoter interactions based on sequence-based features only, when the locations of putative enhancers and promoters in a particular cell type are given.

    Results: Our results across six different cell types demonstrate that SPEID is effective in predicting enhancer-promoter interactions as compared to state-of-the-art methods that only use information from a single cell type. As a proof-of-principle, we also applied SPEID to identify somatic non-coding mutations in melanoma samples that may have reduced enhancer-promoter interactions in tumor genomes.

    Conclusions: This work demonstrates that deep learning models can help reveal that sequence-based features alone are sufficient to reliably predict enhancer-promoter interactions genome-wide.

    Tao Ding, Jie Gao, Shanshan Zhu, Junhua Xu, Min Wu

    Background: Increasing evidence indicates that microRNAs (miRNAs) are functionally related to the development and progression of various human diseases. Inferring disease-related miRNAs can be helpful in promoting disease biomarker detection for the treatment, diagnosis, and prevention of complex diseases.

    Methods: To improve the prediction accuracy of miRNA-disease association and capture more potential disease-related miRNAs, we constructed a precise miRNA global similarity network (MSFSN) by calculating miRNA similarity based on secondary structures, families, and functions.

    Results: We tested the network with two classical algorithms, WBSMDA and RWRMDA, using leave-one-out cross-validation, and obtained AUCs of 0.8212 and 0.9657, respectively. The proposed MSFSN was also applied to three cancers: breast neoplasms, hepatocellular carcinoma, and prostate neoplasms. Of the top 50 potential miRNAs for these diseases, 82%, 76%, and 82%, respectively, were validated by the miRNA-disease association databases miR2Disease and oncomiRDB.

    Conclusion: Therefore, MSFSN provides a novel miRNA similarity network that combines a precise function network with a global structure network of miRNAs to predict the associations between miRNAs and diseases in various models.
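
    For readers unfamiliar with the AUC figures quoted in this abstract, the sketch below shows one standard way an AUC is computed from prediction scores: as the fraction of positive-negative pairs in which the positive item scores higher. It is not the WBSMDA or RWRMDA algorithm and does not perform the leave-one-out procedure itself. The scores and labels are made up for illustration.

```python
def auc(scores, labels):
    """Area under the ROC curve via pairwise comparison (ties count as half)."""
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical scores for five candidate miRNAs; label 1 marks a known association.
scores = [0.9, 0.8, 0.7, 0.4, 0.3]
labels = [1, 0, 1, 0, 0]
print(auc(scores, labels))
```

    In a leave-one-out setting, each known association would be hidden in turn, re-scored by the prediction algorithm, and ranked against the unlabeled candidates before the AUC is computed.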

    Chuang Han, Yu Wu

    Background: The tumor microenvironment plays an essential role in the growth of malignancy. Understanding how tumor cells co-evolve with tumor-associated immune cells and stromal cells is important for tumor treatment.

    Methods: In this paper, we propose a logistic population dynamics model for quantifying the intercellular signaling network in non-small-cell lung cancer (NSCLC). The model describes the evolutionary dynamics of cells and signaling proteins and was used to predict effective receptor targets through combination strategy analysis. Then, we optimized a multi-target strategy analysis algorithm that was verified by applying it to virtual patients with heterogeneous conditions. Furthermore, to deal with acquired resistance, which is commonly observed in patients with NSCLC, we proposed a novel targeting strategy, tracking targeted therapy, which optimizes treatment by periodically updating the therapeutic strategy.

    Results: The synergistic effect of inhibiting multiple signaling pathways may help significantly retard carcinogenic processes associated with disease progression, compared with suppression of a single signaling pathway. While traditional treatment (surgery, radiotherapy and chemotherapy) tends to attack tumor cells directly, the multi-target therapy we suggest here aims to inhibit tumor development by weakening the relative competitive advantage of tumor cells and promoting that of normal cells.

    Conclusion: The combination of traditional and targeted therapy, as an interesting experiment, was significantly more effective in treating virtual patients due to a clear complementary relationship between the two therapeutic schemes.
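
    The idea of shifting the relative competitive advantage between tumor and normal cells can be illustrated with a generic two-population competitive logistic model. The sketch below is not the NSCLC signaling-network model of the paper. The equations are the standard competitive Lotka-Volterra form, integrated with a simple Euler scheme, and all parameter names and values are hypothetical.

```python
def simulate(alpha_nt, alpha_tn, n0=0.9, t0=0.01, r_n=1.0, r_t=1.0,
             capacity=1.0, dt=0.01, steps=20000):
    """Euler integration of two logistic populations sharing one carrying capacity.

    alpha_nt: how strongly tumor cells suppress normal cells.
    alpha_tn: how strongly normal cells suppress tumor cells.
    Returns the final (normal, tumor) population sizes.
    """
    normal, tumor = n0, t0
    for _ in range(steps):
        d_normal = r_n * normal * (1.0 - (normal + alpha_nt * tumor) / capacity)
        d_tumor = r_t * tumor * (1.0 - (tumor + alpha_tn * normal) / capacity)
        normal += dt * d_normal
        tumor += dt * d_tumor
    return normal, tumor


# Tumor cells hold the competitive advantage (illustrative values).
print(simulate(alpha_nt=1.5, alpha_tn=0.5))
# Advantage reversed, as a stand-in for a therapy that favors normal cells.
print(simulate(alpha_nt=0.5, alpha_tn=1.5))
```

    In this toy setting, reversing the two competition coefficients is enough to change which population takes over, without killing tumor cells directly.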

    Kui Hua, Xuegong Zhang

    Background: Reproducibility is a defining feature of a scientific discovery. Reproducibility can be at different levels for different types of study. The purpose of the Human Cell Atlas (HCA) project is to build maps of molecular signatures of all human cell types and states to serve as references for future discoveries. Constructing such a complex reference atlas must involve the assembly and aggregation of data from multiple labs, probably generated with different technologies. It has much higher requirements on reproducibility than individual research projects. To add another layer of complexity, the bioinformatics procedures involved for single-cell data have high flexibility and diversity. There are many factors in the processing and analysis of single-cell RNA-seq data that can shape the final results in different ways.

    Methods: To study what levels of reproducibility can be reached in current practices, we conducted a detailed reproduction study for a well-documented recent publication on the atlas of human blood dendritic cells as an example to break down the bioinformatics steps and factors that are crucial for the reproducibility at different levels.

    Results: We found that the major scientific discovery can be well reproduced with some effort, but there are also differences in some details that may cause uncertainty in future reference. This study provides a detailed case observation for the ongoing discussions of the type of standards the HCA community should adopt when releasing data and publications to guarantee the reproducibility and reliability of the future atlas.

    Conclusion: Current practices of releasing data and publications may not be adequate to guarantee the reproducibility of HCA. We propose building more stringent guidelines and standards on the information that needs to be provided along with publications for projects involved in the HCA program.