Online first

The manuscripts published below will continue to be available from this page until they are assigned to an issue.
  • RESEARCH ARTICLE
    Ran Yi, Rui Zeng, Yang Weng, Minjing Yu, Yu-Kun Lai, Yong-Jin Liu

    Background: Image-based automatic diagnosis of field diseases can help increase crop yields and is of great importance. However, crop lesion regions tend to be scattered and of varying sizes; this, along with substantial intra-class variation and small inter-class variation, makes segmentation difficult.

    Methods: We propose a novel end-to-end system that requires only weak supervision, in the form of image-level labels, for lesion region segmentation. First, a two-branch network is designed for joint disease classification and seed region generation. The generated seed regions are then used as input to the segmentation stage, for which we design an encoder-decoder network. Unlike previous works that use only an encoder in the segmentation network, our encoder-decoder network is critical for successfully segmenting images with small and scattered regions, which is the major challenge in image-based diagnosis of field diseases. We further propose a novel weakly supervised training strategy for the encoder-decoder semantic segmentation network that makes use of the extracted seed regions.

    Results: Experimental results show that our system achieves better lesion region segmentation than state-of-the-art methods. In addition to crop images, our method is also applicable to general scattered object segmentation. We demonstrate this by extending our framework to the PASCAL VOC dataset, where it achieves performance comparable to the state-of-the-art DSRG (deep seeded region growing) method.

    Conclusion: Our method not only outperforms state-of-the-art semantic segmentation methods by a large margin for the lesion segmentation task, but also shows its capability to perform well on more general tasks.

  • RESEARCH ARTICLE
    Pavel Pronkin, Alexander Tatikolov

    Background: The outbreak and continued spread of coronavirus infection (COVID-19) calls for new tools and methods for developing analytical procedures and tests to detect and study the infection and to prevent morbidity.

    Methods: The noncovalent binding of cyanine and squarylium dyes of different classes (60 compounds in total) to the SARS-CoV-2 proteins NSP3, NSP5, and NSP12 was studied by molecular docking.

    Results: The interaction energies and spatial configurations of dye molecules in complexes with NSP3, NSP5, and NSP12 have been determined.

    Conclusion: A number of anionic dyes showing lower values of the total energy Etot could be recommended for practical research in the development of agents for the detection and inactivation of the coronavirus.

  • FEATURE
    Wei Dong, Boqing Qiang, Huanming Yang
  • REVIEW
    Taoyu Chen, Qi Lei, Minglei Shi, Tingting Li

    Background: The concept of the biomolecular condensate was put forward recently to emphasize the ability of certain cellular compartments to concentrate molecules and to comprise proteins and nucleic acids with specific biological functions, from ribosome genesis to RNA splicing. Because of the unique role of condensates in biological processes, it is crucial to investigate their composition, which is a primary determinant of condensate properties.

    Results: Because a wide range of macromolecules make up biomolecular condensates, researchers need high-throughput methodologies to investigate them; low-throughput experiments are not efficient enough. These high-throughput methods usually purify libraries of interacting proteins from condensates before analyzing them by mass spectrometry. For specific condensates, it is possible to extract organelles as a whole for further analysis; however, most condensates lack a distinguishable marker or are too sensitive to shear force to be extracted as a whole. Affinity tagging gives a comprehensive view of the proteins interacting with a target molecule, yet only strongly bound proteins may be pulled down. Proximity labeling serves as a complementary method that labels more dynamic proteins with weaker interactions, increasing sensitivity while decreasing specificity. Image-based fluorescent screening takes another path by automatically scanning images to reveal the condensation state of biomolecules within membraneless organelles, a feature not offered by the mass spectrometry-based methods.

    Conclusion: This review presents a rough glimpse into high-throughput methodologies for biomolecular condensate investigation to encourage usage of bioinformatic tools by researchers in relevant fields.

  • RESEARCH ARTICLE
    Rudra Banerjee, Srijit Bhattacharjee, Pritish Kumar Varadwaj

    Background: The coronavirus pandemic (COVID-19), caused by the newly discovered SARS-CoV-2 virus, is causing havoc globally. Due to its high population density, India was one of the countries most badly affected by the first wave of COVID-19. Therefore, it is essential to accurately predict the state-wise and overall dynamics of COVID-19 so that resources can be organized effectively and efficiently across India.

    Methods: In this study, the dynamics of COVID-19 in India and several of its states with different demographic structures were analyzed using the SEIRD epidemiological model. The basic reproductive ratio R0 was systematically estimated to predict the temporal progression of COVID-19 in India and eight of its states: Andhra Pradesh, Chhattisgarh, Delhi, Gujarat, Madhya Pradesh, Maharashtra, Tamil Nadu, and Uttar Pradesh.

    Results: For India, the SEIRD model calculations show that the peak of infection is expected to appear around the middle of October, 2020. Furthermore, we compared the model scenario to a Gaussian fit of the daily infected cases and obtained similar results. The early imposition of a nation-wide lockdown has reduced the number of infected cases but delayed the appearance of the infection peak significantly.

    Conclusion: After comparing our calculations using India’s data to the real-life dynamics observed in Italy and Russia, we conclude that the SEIRD model can predict the dynamics of COVID-19 with sufficient accuracy.
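
    As a rough illustration of the SEIRD compartmental model named in this abstract, the sketch below integrates the textbook SEIRD equations with a forward-Euler step. The exact equations, the parameter names (beta, sigma, gamma, mu), the relation R0 = beta / (gamma + mu), and every numeric value are assumptions chosen for illustration; none of them is taken from the article or fitted to Indian data.

```python
def simulate_seird(beta, sigma, gamma, mu, population, exposed0, days, dt=0.1):
    """Forward-Euler integration of a textbook SEIRD model (illustrative only).

    beta: transmission rate, sigma: incubation rate (E -> I),
    gamma: recovery rate (I -> R), mu: death rate (I -> D), all per day.
    Returns one (S, E, I, R, D) tuple per day, starting with day 0.
    """
    s, e, i, r, d = float(population - exposed0), float(exposed0), 0.0, 0.0, 0.0
    steps_per_day = round(1 / dt)  # dt is assumed to divide one day evenly
    history = [(s, e, i, r, d)]
    for _ in range(days):
        for _ in range(steps_per_day):
            new_exposed = beta * s * i / population * dt
            new_infectious = sigma * e * dt
            new_recovered = gamma * i * dt
            new_dead = mu * i * dt
            s -= new_exposed
            e += new_exposed - new_infectious
            i += new_infectious - new_recovered - new_dead
            r += new_recovered
            d += new_dead
        history.append((s, e, i, r, d))
    return history


def basic_reproduction_number(beta, gamma, mu):
    """R0 for this simple SEIRD variant: new infections per infectious period."""
    return beta / (gamma + mu)


def peak_day(history):
    """Index of the day on which the infectious compartment I is largest."""
    infectious = [row[2] for row in history]
    return max(range(len(infectious)), key=infectious.__getitem__)
```

    Calling simulate_seird(0.5, 0.2, 0.19, 0.01, 1_000_000, 10, 400) and passing the result to peak_day gives the day of the simulated infection peak for these made-up parameters; a real analysis would first estimate the rates from reported case data.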

  • FEATURE
    Runsheng Chen

    This article records the author’s experience of participating in early human genome and bioinformatics research in China, especially research on the non-coding sequences of the genome. It also introduces the beginning of human genome research in China, including the experts and teams involved in the International Human Genome Project. All the progress of bioinformatics originates from the inheritance of theoretical biology and the layout of philosophers in China.

  • REVIEW
    Shuyu Shi, Wen Si, Xiaoyi Ouyang, Ping Wei

    Background: The concept of phase separation has been used to describe and interpret physicochemical phenomena in biological systems for decades. Many intracellular macromolecules undergo phase separation, which, owing to its unique dynamic properties and biological effects, plays important roles in gene regulation, cellular signaling, metabolic reactions and so on. Given the importance of phase separation, pioneering researchers have explored the possibility of introducing synthetically engineered phase separation for practical cellular functions.

    Results: In this article, we illustrate the application value of phase separation in synthetic biology. We describe the main states of phase separation in detail, summarize some ways to implement synthetic condensates and several methods to regulate phase separation, and provide a substantial number of examples to illuminate the applications and perspectives of phase separation in synthetic biology.

    Conclusions: Multivalent interactions implement phase separation in synthetic biology. Small molecules, light control and spontaneous interactions induce and regulate phase separation. Synthetic condensates are widely used in signal amplification, designer orthogonal non-membrane-bound organelles, metabolic pathways, gene regulation, signal transduction and controllable platforms. Studies on quantitative analysis, more standardized modules and precise spatiotemporal control of synthetic phase separation may promote the further development of this field.

  • METHOD
    Ya-Li Zhu, Xiao-Ning Zhang, Chuan-Yuan Wang, Jin-Xing Liu, Xiang-Zhen Kong

    Background: Single-cell RNA sequencing (scRNA-seq) data provide a whole new view for studying disease and cell differentiation. With the explosive growth of scRNA-seq data, effective models are needed to mine the intrinsic biological information.

    Methods: This paper proposes a novel non-negative matrix factorization (NMF) method for clustering and gene co-expression network analysis, termed Adaptive Total Variation Constraint Hypergraph Regularized NMF (ATV-HNMF). Based on gradient information, ATV-HNMF adaptively chooses between schemes that denoise within a cluster and schemes that preserve the boundary information between clusters. In addition, ATV-HNMF incorporates hypergraph regularization, which considers high-order relationships between cells to preserve the intrinsic structure of the space.

    Results: Experiments show that ATV-HNMF outperforms the compared methods on clustering and that the network construction results are consistent with previous studies, which illustrates that our model is effective and useful.

    Conclusion: The clustering results show that ATV-HNMF outperforms other methods, which can help us understand cell heterogeneity. Many disease-related genes can be discovered from the constructed network, and some are worthy of further clinical exploration.
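
    For readers unfamiliar with the NMF baseline that ATV-HNMF builds on, the following sketch shows plain NMF with the classic multiplicative updates for the Frobenius objective, followed by a simple cluster assignment. It deliberately omits the adaptive total variation and hypergraph regularization terms described in the abstract; the genes-by-cells orientation of the matrix, the function names, and the argmax assignment rule are assumptions made for illustration.

```python
import random


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    cols_b = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols_b] for row in a]


def transpose(a):
    return [list(col) for col in zip(*a)]


def frobenius_error(x, w, h):
    """Frobenius norm of the reconstruction residual X - WH."""
    wh = matmul(w, h)
    return sum((xv - av) ** 2
               for xr, ar in zip(x, wh) for xv, av in zip(xr, ar)) ** 0.5


def nmf(x, rank, iterations=200, seed=0, eps=1e-9):
    """Plain NMF, X (genes x cells) ~ W (genes x rank) @ H (rank x cells)."""
    rng = random.Random(seed)
    n, m = len(x), len(x[0])
    w = [[rng.random() + 0.1 for _ in range(rank)] for _ in range(n)]
    h = [[rng.random() + 0.1 for _ in range(m)] for _ in range(rank)]
    for _ in range(iterations):
        # H <- H * (W^T X) / (W^T W H), element-wise
        wt = transpose(w)
        num = matmul(wt, x)
        den = matmul(matmul(wt, w), h)
        h = [[h[k][j] * num[k][j] / (den[k][j] + eps) for j in range(m)]
             for k in range(rank)]
        # W <- W * (X H^T) / (W H H^T), element-wise
        ht = transpose(h)
        num = matmul(x, ht)
        den = matmul(w, matmul(h, ht))
        w = [[w[i][k] * num[i][k] / (den[i][k] + eps) for k in range(rank)]
             for i in range(n)]
    return w, h


def cluster_labels(h):
    """Assign each cell (column of H) to the factor with the largest weight."""
    return [max(range(len(h)), key=lambda k: h[k][j]) for j in range(len(h[0]))]
```

    Given a small non-negative expression matrix x, calling w, h = nmf(x, rank=2) and then cluster_labels(h) yields one label per cell. Regularized variants such as the one in the abstract add penalty terms to the objective and therefore change these update rules.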

  • RESEARCH ARTICLE
    Jing Tang, Yongheng Wang, Jianbo Fu, Xianglu Wu, Zhijie Han, Chuan Wang, Maiyuan Guo, Yingxiong Wang, Yubin Ding, Bo Yang, Feng Zhu

    Background: Functional characterization of long noncoding RNAs (lncRNAs) in disease attracts great attention, yet only a limited number of lncRNAs have been experimentally characterized. The major problems underlying this lack of experimental verification are considered to be the many false-positive assignments and the extensive genetic heterogeneity of disease. These problems are even worse for functional characterization in comorbidity (the simultaneous or sequential presence of multiple diseases in a patient, which shows much wider prevalence, poorer treatment response and a longer illness course than a single disease).

    Methods: Herein, FCCLnc was developed to characterize lncRNA function by integrating (1) diverse SNPs associated with 193 diseases standardized by the International Classification of Diseases (ICD-11), (2) condition-specific expression of lncRNAs, and (3) a weighted correlation network of lncRNAs and neighboring protein-coding genes.

    Results: FCCLnc can characterize lncRNA function in both disease and comorbidity by controlling false discovery while tolerating disease heterogeneity. Moreover, FCCLnc provides interactive visualization and full download of lncRNA-centered co-expression networks.

    Conclusion: In summary, FCCLnc is unique in characterizing lncRNA function in diverse diseases and comorbidities and is expected to become an indispensable complement to other available tools. FCCLnc is accessible at https://idrblab.org/fcclnc/.

  • RESEARCH ARTICLE
    Xiao-Ying Yan, Shao-Wu Zhang, Siu-Ming Yiu, Jian-Yu Shi

    Background: One of the challenges in personalized medicine is to determine specific drugs and their dosages for individual patients suffering from the same disease. Cell lines provide a safe way to capture the drug responses of individual patients given specific drugs at varied dosages. However, it is still costly to determine dose-dependent drug responses in cells by biological assays. Computational methods offer a promising way to screen for possible drug responses in patients' cells on a large scale. Nevertheless, existing computational approaches are insufficient to interpret the underlying reasons for drug responses.

    Methods: In this work, we propose an interpretable model for analyzing and predicting drug responses across cell lines. The proposed model bridges drug features (e.g., chemical structure fingerprints), cell features (e.g., gene expression profiles), and drug responses across cells (measured by IC50) through a triple matrix factorization (TMF), so that the underlying reasons for drug responses in specific cells can be interpreted.

    Results: Comparison with state-of-the-art computational approaches demonstrates the superiority of our TMF. More importantly, a case study of drug responses in lung-related cell lines shows that it can identify frequently occurring drug substructures, crucial mutated genes, and significant pairs of substructures and mutated genes associated with drug sensitivity and resistance.

    Conclusion: TMF is an effective and interpretable approach for predicting cell line responses to drugs, and it can identify crucial pairs of chemical substructures and genes, which helps uncover the underlying reasons for drug responses in specific cells.
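
    One way to picture how a factorization can link drug features, cell features, and responses is the bilinear sketch below. It assumes a response matrix R (drugs x cells) is approximated as A W B^T, where A holds drug features, B holds cell features, and W is a learned weight matrix whose entry (p, q) scores the pairing of drug feature p with cell feature q. This particular form, the gradient-descent fit, the function name fit_tmf, and all parameters are assumptions for illustration and may differ from the authors' actual TMF formulation.

```python
def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    cols_b = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols_b] for row in a]


def transpose(a):
    return [list(col) for col in zip(*a)]


def fit_tmf(drug_feats, cell_feats, responses, lr=0.1, iterations=2000, l2=0.0):
    """Fit W in responses ~ drug_feats @ W @ cell_feats^T by gradient descent.

    drug_feats: n_drugs x p, cell_feats: n_cells x q, responses: n_drugs x n_cells.
    Minimizes 0.5 * ||A W B^T - R||_F^2 + 0.5 * l2 * ||W||_F^2.
    The learning rate must be small enough for the given feature scales.
    """
    p, q = len(drug_feats[0]), len(cell_feats[0])
    w = [[0.0] * q for _ in range(p)]
    a_t, b_t = transpose(drug_feats), transpose(cell_feats)
    for _ in range(iterations):
        pred = matmul(matmul(drug_feats, w), b_t)
        resid = [[pv - rv for pv, rv in zip(pr, rr)]
                 for pr, rr in zip(pred, responses)]
        # Gradient of the squared error with respect to W: A^T (A W B^T - R) B
        grad = matmul(matmul(a_t, resid), cell_feats)
        w = [[w[i][j] - lr * (grad[i][j] + l2 * w[i][j]) for j in range(q)]
             for i in range(p)]
    return w
```

    After fitting, large positive or negative entries of W point to feature pairs that drive the predicted response, which is the kind of substructure-gene pairing the abstract describes. Real response data usually have missing entries, which this sketch does not handle.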