
Latest articles

    Jiarui Ou, Le Zhang, Xiaoli Ru
    Quantitative Biology, 2024, 12(1): 117-127.

    Cardiovascular disease (CVD) is the major cause of death in many regions of the world, and several of its risk factors may be linked to diet. To improve public health and the understanding of this topic, we examine the recent Minnesota Coronary Experiment (MCE) analysis, which used the t-test and the Cox model to evaluate CVD risks. However, these parametric methods may suffer from three problems: small sample size, right-censoring bias, and lack of long-term evidence. To overcome the first of these challenges, we utilize a nonparametric permutation test to examine the relationship between dietary fats and serum total cholesterol. To address the second problem, we use a resampling-based rank test to examine whether the serum total cholesterol level affects CVD deaths. For the third issue, we use additional Framingham Heart Study (FHS) data with an A/B test to look for meta-relationships among diets, risk factors, and CVD risks. We show that, firstly, the link between low saturated fat diets and reduction in serum total cholesterol is strong. Secondly, reducing serum total cholesterol does not have a robust impact on CVD hazards in the diet group. Lastly, the A/B test result suggests a more complicated relationship regarding abnormal diastolic blood pressure ranges caused by diets and how these might affect the associative link between cholesterol levels and heart disease risks. This study not only allows a deeper analysis of the MCE data but also, in combination with the long-term FHS data, reveals possible complex relationships among diets, risk factors, and heart disease.
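As a toy illustration of the nonparametric permutation test described above, the sketch below compares serum total cholesterol means between two groups by shuffling group labels; all values and group sizes are invented for illustration, not MCE data:

```python
import random
from statistics import mean

def permutation_test(group_a, group_b, n_perm=10000, seed=0):
    """Two-sample permutation test on the difference in means.

    Returns an approximate two-sided p-value: the fraction of random
    label shufflings whose absolute mean difference is at least as
    extreme as the observed one."""
    rng = random.Random(seed)
    observed = abs(mean(group_a) - mean(group_b))
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        if abs(mean(pooled[:n_a]) - mean(pooled[n_a:])) >= observed:
            hits += 1
    return hits / n_perm

# Hypothetical serum total cholesterol values (mg/dL), low saturated
# fat diet group vs. control -- invented numbers for illustration.
diet = [182, 175, 190, 168, 177, 185]
control = [210, 205, 198, 220, 215, 202]
p = permutation_test(diet, control)
```

Because the test makes no distributional assumptions, it remains valid at small sample sizes where the t-test's normality assumption is doubtful.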

    Keran Sun, Jingyuan Ning, Keqi Jia, Xiaoqing Fan, Hongru Li, Jize Ma, Meiqi Meng, Cuiqing Ma, Lin Wei
    Quantitative Biology, 2024, 12(1): 100-116.

    To investigate the impact of hyperglycemia on the prognosis of patients with gastric cancer and to identify key molecules associated with high glucose levels in gastric cancer development, RNA sequencing data and clinical features of gastric cancer patients were obtained from The Cancer Genome Atlas (TCGA) database. High glucose-related genes strongly associated with gastric cancer were identified using weighted gene co-expression network analysis and differential expression analysis. A gastric cancer prognostic signature was constructed based on these genes, and patients were categorized into high- and low-risk groups. The immune statuses of the two patient groups were compared. ATP citrate lyase (ACLY), a gene significantly related to prognosis, was found to be upregulated upon high-glucose stimulation. Immunohistochemistry and molecular analyses confirmed high ACLY expression in gastric cancer tissues and cells. Gene Set Enrichment Analysis (GSEA) revealed the involvement of ACLY in cell cycle and DNA replication processes. Inhibition of ACLY affected the proliferation and migration of gastric cancer cells induced by high glucose levels. These findings suggest that ACLY, as a high glucose-related gene, plays a critical role in gastric cancer progression.
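A prognostic signature of the kind described above is typically a weighted sum of gene expression values (with weights from Cox regression), followed by a median split into high- and low-risk groups. A minimal sketch with entirely hypothetical coefficients, patients, and expression values; only ACLY is a gene named in the abstract:

```python
from statistics import median

# Hypothetical Cox regression coefficients for high glucose-related
# genes; GENE_B and GENE_C are invented placeholders.
coefficients = {"ACLY": 0.42, "GENE_B": -0.18, "GENE_C": 0.25}

def risk_score(expression):
    """Linear predictor: sum of coefficient * expression over genes."""
    return sum(w * expression[gene] for gene, w in coefficients.items())

patients = {  # invented normalized expression values
    "P1": {"ACLY": 2.1, "GENE_B": 0.5, "GENE_C": 1.8},
    "P2": {"ACLY": 0.4, "GENE_B": 1.9, "GENE_C": 0.3},
    "P3": {"ACLY": 1.5, "GENE_B": 0.8, "GENE_C": 1.1},
    "P4": {"ACLY": 0.7, "GENE_B": 1.2, "GENE_C": 0.6},
}

scores = {pid: risk_score(expr) for pid, expr in patients.items()}
cutoff = median(scores.values())  # median split into risk groups
groups = {pid: "high" if s > cutoff else "low" for pid, s in scores.items()}
```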

    Siyu Li, Songming Tang, Yunchang Wang, Sijie Li, Yuhang Jia, Shengquan Chen
    Quantitative Biology, 2024, 12(1): 85-99.

    Recent advances in single-cell chromatin accessibility sequencing (scCAS) technologies have provided new insights into the characterization of epigenomic heterogeneity and have increased the need for automatic cell type annotation. However, existing automatic annotation methods for scCAS data fail to incorporate reference data and neglect novel cell types that exist only in the test set. Here, we propose RAINBOW, a reference-guided automatic annotation method based on a contrastive learning framework, which can effectively identify novel cell types in a test set. By utilizing contrastive learning and incorporating reference data, RAINBOW can effectively characterize the heterogeneity of cell types, thereby facilitating more accurate annotation. With extensive experiments on multiple scCAS datasets, we show the advantages of RAINBOW over state-of-the-art methods in annotating both known and novel cell types. We also verify the effectiveness of incorporating reference data during the training process. In addition, we demonstrate the robustness of RAINBOW to data sparsity and to the number of cell types. Furthermore, RAINBOW performs well on newly sequenced data and can reveal biological implications in downstream analyses. All these results demonstrate the superior performance of RAINBOW in cell type annotation for scCAS data. We anticipate that RAINBOW will offer essential guidance and great assistance in scCAS data analysis. The source code is available on GitHub (BioX-NKU/RAINBOW).
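The core annotation idea — embed cells, compare them to reference cell-type embeddings, and flag cells that match no reference as novel — can be sketched as follows. This is not RAINBOW's actual model; the embeddings, cell types, and similarity threshold are all invented:

```python
import math

def cosine(u, v):
    """Cosine similarity between two embedding vectors."""
    dot = sum(x * y for x, y in zip(u, v))
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(y * y for y in v))
    return dot / (norm_u * norm_v)

# Hypothetical reference embeddings, one per known cell type.
reference = {
    "T cell": [0.9, 0.1, 0.0],
    "B cell": [0.1, 0.9, 0.1],
}

def annotate(cell_embedding, threshold=0.8):
    """Assign the most similar reference type, or flag the cell as a
    novel type when no reference exceeds the similarity threshold."""
    best_type, best_sim = max(
        ((t, cosine(cell_embedding, e)) for t, e in reference.items()),
        key=lambda pair: pair[1],
    )
    return best_type if best_sim >= threshold else "novel"

label_known = annotate([0.85, 0.15, 0.05])  # close to the T cell anchor
label_novel = annotate([0.0, 0.1, 0.95])    # unlike any reference type
```

Contrastive training would learn embeddings in which such similarity comparisons separate cell types; the threshold-based rejection is what allows novel types to surface.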

    Tianxing Ma, Zetong Zhao, Haochen Li, Lei Wei, Xuegong Zhang
    Quantitative Biology, 2024, 12(1): 70-84.

    Complicated molecular alterations in tumors generate various mutant peptides. Some of these mutant peptides can be presented on the cell surface and then elicit immune responses; such mutant peptides are called neoantigens. Accurate detection of neoantigens can help in designing personalized cancer vaccines. Although several computational frameworks for neoantigen detection have been proposed, most can detect only SNV- and indel-derived neoantigens. In addition, current frameworks adopt oversimplified neoantigen prioritization strategies. These factors hinder the comprehensive and effective detection of neoantigens. We developed NeoHunter, flexible software to systematically detect and prioritize neoantigens from sequencing data in different formats. NeoHunter can detect not only SNV- and indel-derived neoantigens but also gene fusion- and aberrant splicing-derived neoantigens. NeoHunter supports both direct and indirect immunogenicity evaluation strategies to prioritize candidate neoantigens. These strategies utilize binding characteristics, existing biological big data, and T-cell receptor specificity to ensure accurate detection and prioritization. We applied NeoHunter to the TESLA dataset and to cohorts of melanoma and non-small cell lung cancer patients. NeoHunter achieved high performance across the TESLA cancer patients and detected 79% (27 out of 34) of validated neoantigens in total. SNV- and indel-derived neoantigens accounted for 90% of the top 100 candidate neoantigens, while neoantigens from aberrant splicing accounted for 9%. Gene fusion-derived neoantigens were detected in one patient. NeoHunter is a powerful tool to ‘catch all’ neoantigens and is available for free academic use on GitHub (XuegongLab/NeoHunter).
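Neoantigen prioritization generally combines evidence such as predicted MHC binding affinity and tumor expression into a single ranking score. The toy scorer below illustrates that pattern only; the peptides, values, and scoring formula are invented and are not NeoHunter's actual strategy:

```python
import math

# Invented candidates: peptide sequence, predicted MHC binding affinity
# in nM (lower = stronger binding), and expression level (arbitrary units).
candidates = [
    {"peptide": "KLNEPVLLL", "affinity_nm": 35.0, "expression": 12.0},
    {"peptide": "SLYNTVATL", "affinity_nm": 500.0, "expression": 40.0},
    {"peptide": "GILGFVFTL", "affinity_nm": 12.0, "expression": 3.0},
]

def priority(candidate):
    """Toy score: reward high expression, penalize weak (high-nM) binding."""
    return (math.log(candidate["expression"] + 1)
            / math.log(candidate["affinity_nm"] + 1))

ranked = sorted(candidates, key=priority, reverse=True)
```

Real pipelines weigh many more signals (clonality, TCR specificity, proteasomal processing), but the ranking structure is the same.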

    Mahsa Babaei, Soheila Kashanian, Huang-Teck Lee, Frances Harding
    Quantitative Biology, 2024, 12(1): 53-69.

    Protein biomarkers reflect specific biological activities and processes, and so have played a critical role in cancer diagnosis and medical care for more than 50 years. With recent improvements in proteomics technologies, thousands of protein biomarker candidates have been developed for diverse disease states. Studies have used different types of samples for proteomics diagnosis. Samples are pretreated with appropriate techniques to increase the selectivity and sensitivity of the downstream analysis and purified to remove contaminants. The purified samples are then analyzed by several principal proteomics techniques to identify the specific proteins. In this study, recent improvements in protein biomarker discovery, verification, and validation are reviewed, and the advantages and disadvantages of conventional techniques are discussed. Studies have used mass spectrometry (MS) as a critical technique for the identification and quantification of candidate biomarkers. Nevertheless, after protein biomarker discovery, verification and validation are required to reduce the false-positive rate when larger numbers of samples are involved. Multiple reaction monitoring (MRM), parallel reaction monitoring (PRM), and selected reaction monitoring (SRM), in combination with stable isotope-labeled internal standards, have been examined as options for biomarker verification, with enzyme-linked immunosorbent assay (ELISA) for validation.
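In MRM/PRM/SRM with stable isotope-labeled internal standards, the analyte is quantified from the ratio of the "light" (endogenous) to "heavy" (spiked standard) peak areas. A minimal sketch of that arithmetic with invented peak areas and standard amount:

```python
def quantify(light_area, heavy_area, heavy_amount_fmol):
    """Stable isotope dilution: the analyte amount equals the
    light/heavy peak-area ratio times the known amount of spiked
    heavy-labeled internal standard."""
    return (light_area / heavy_area) * heavy_amount_fmol

# Invented peak areas: a 2:1 light/heavy area ratio with 50 fmol of
# heavy standard implies 100 fmol of endogenous analyte.
amount_fmol = quantify(light_area=2.4e6, heavy_area=1.2e6,
                       heavy_amount_fmol=50.0)
```

The heavy standard co-elutes and ionizes like the analyte, so the ratio cancels out much of the run-to-run instrument variability.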

    Xinyue Li, Zhankun Xiong, Wen Zhang, Shichao Liu
    Quantitative Biology, 2024, 12(1): 30-52.

    The prediction of drug-drug interactions (DDIs) is a crucial task in drug safety research, and identifying potential DDIs helps us to explore the mechanisms behind combinatorial therapy. Traditional wet-lab experiments for identifying DDIs are cumbersome, time-consuming, and limited in scale, restricting the efficiency of DDI prediction. It is therefore particularly important to develop improved computational methods for detecting drug interactions. With the development of deep learning, several computational models based on deep learning have been proposed for DDI prediction. In this review, we summarize the high-quality deep learning-based DDI prediction methods of recent years and divide them into four categories: neural network-based methods, graph neural network-based methods, knowledge graph-based methods, and multimodal-based methods. Furthermore, we discuss the challenges of existing methods and potential future directions. This review shows that deep learning can significantly improve DDI prediction performance compared to traditional machine learning. Deep learning models can scale to large datasets and accept multiple data types as input, making DDI prediction more efficient and accurate.

    Yuhang Xia, Yongkang Wang, Zhiwei Wang, Wen Zhang
    Quantitative Biology, 2024, 12(1): 15-29.

    Drug discovery aims to design novel molecules with specific chemical properties for the treatment of target diseases. Molecular optimization, which optimizes the physical and chemical properties of a molecule, is one important step in drug discovery. Artificial intelligence techniques have shown remarkable success in drug discovery and have emerged as a new strategy to address the challenges of drug design, including molecular optimization, drastically reducing the cost and time of drug discovery. We review the latest advances in molecular optimization for artificial intelligence-based drug discovery, including data resources, molecular properties, optimization methodologies, and assessment criteria for molecular optimization. Specifically, we classify the optimization methodologies into molecular mapping-based, molecular distribution matching-based, and guided search-based methods, and discuss the principles of these methods as well as their pros and cons. Moreover, we highlight the current challenges in molecular optimization and offer a variety of perspectives, including interpretability, multidimensional optimization, and model generalization, on potential new lines of research to pursue in the future. This study provides a comprehensive review of molecular optimization in artificial intelligence-based drug discovery, pointing out the challenges as well as the new prospects. This review will guide researchers interested in artificial intelligence-based molecular optimization.

    Yazhen Song, Chenxi Feng, Difei Zhou, Zengxin Ma, Lian He, Cong Zhang, Guihong Yu, Yan Zhao, Song Yang, Xinhui Xing
    Quantitative Biology, 2024, 12(1): 1-14.

    Developing methylotrophic cell factories that can efficiently convert organic one-carbon (C1) feedstocks derived from the electrocatalytic reduction of carbon dioxide into bio-based chemicals and biofuels is of strategic significance for building a carbon-neutral, sustainable economic and industrial system. With the rapid advancement of RNA sequencing technology and mass spectrometry analysis, researchers have used these quantitative microbiology methods extensively, especially isotope-based metabolic flux analysis, to study the metabolic processes starting from C1 feedstocks in natural C1-utilizing bacteria and synthetic C1-utilizing bacteria. This paper reviews the use of advanced quantitative analysis in recent years to understand the metabolic networks and basic principles of the metabolism of natural C1-utilizing bacteria grown on methane, methanol, or formate. The acquired knowledge serves as a guide for rewiring the central methylotrophic metabolism of natural C1-utilizing bacteria to improve carbon conversion efficiency, and for engineering non-C1-utilizing bacteria into synthetic strains that can use C1 feedstocks as the sole carbon and energy source. These advances ultimately enhance the design and construction of highly efficient C1-based cell factories for synthesizing diverse high value-added products. The integration of quantitative biology and synthetic biology will advance the iterative understand–design–build–test–learn cycle to enhance C1-based biomanufacturing in the future.

    Jianfeng Feng
    Quantitative Biology, 2023, 11(4): 471-473.

    Hongfei Cui
    Quantitative Biology, 2023, 11(4): 451-470.

    The information on host–microbe interactions contained in the operational taxonomic unit (OTU) abundance table can serve as a clue to understanding the biological traits of OTUs and samples. Some studies have inferred the taxonomies or functions of OTUs by constructing co-occurrence networks, but such networks can encompass only a small fraction of all OTUs because of the high sparsity of the OTU table. There is a lack of studies that intensively explore and use the information on sample-OTU interactions. This study constructed a sample-OTU heterogeneous information network and represented the nodes in the network through a heterogeneous graph embedding method to form an OTU space and a sample space. Taking advantage of the represented OTU and sample vectors combined with the original OTU abundance information, an Integrated Model of Embedded Taxonomies and Abundance (IMETA) was proposed for predicting sample attributes, such as phenotypes and individual dietary habits. Both the OTU space and the sample space contain reasonable biological or medical semantic information, and IMETA, using the embedded OTU and sample vectors, achieves stable and strong performance in sample classification tasks. This suggests that embedding representations based on the sample-OTU heterogeneous information network can provide more useful information for understanding microbiome samples. This study produced quantitative representations of the biological characteristics of OTUs and samples, which increases the utilization of the information in the OTU abundance table and promotes a deeper understanding of the human microbiome.
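One simple way to connect an OTU space and a sample space is to represent each sample as the abundance-weighted average of its OTUs' embedding vectors. This sketch is not IMETA's actual embedding method; the OTU vectors and abundances below are invented:

```python
# Invented 2-D OTU embedding vectors.
otu_embedding = {
    "OTU1": [1.0, 0.0],
    "OTU2": [0.0, 1.0],
    "OTU3": [0.5, 0.5],
}

# Invented relative abundances: rows are samples, columns are OTUs.
abundance = {
    "S1": {"OTU1": 0.8, "OTU2": 0.1, "OTU3": 0.1},
    "S2": {"OTU1": 0.1, "OTU2": 0.8, "OTU3": 0.1},
}

def sample_vector(sample_id):
    """Abundance-weighted average of the sample's OTU embeddings."""
    weights = abundance[sample_id]
    total = sum(weights.values())
    dim = len(next(iter(otu_embedding.values())))
    vec = [0.0] * dim
    for otu, w in weights.items():
        for i, value in enumerate(otu_embedding[otu]):
            vec[i] += (w / total) * value
    return vec

v1 = sample_vector("S1")  # dominated by OTU1, so it leans toward [1, 0]
```

Such sample vectors, possibly concatenated with the raw abundances, can then feed a standard classifier for phenotype prediction.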

    Ke Feng, Hongyang Jiang, Chaoyi Yin, Huiyan Sun
    Quantitative Biology, 2023, 11(4): 434-450.

    Gene regulatory network (GRN) inference from gene expression data is an important approach to understanding biological systems. Compared with generalized correlation-based methods, causality-inspired methods appear more principled for inferring regulatory relationships. We propose GRINCD, a novel GRN inference framework empowered by graph representation learning and causal asymmetric learning that considers both linear and non-linear regulatory relationships. First, a high-quality representation of each gene is generated using a graph neural network. Then, we apply the additive noise model to predict the causal regulation of each regulator-target pair. Additionally, we design two channels and assemble them for robust prediction. Through comprehensive comparisons of our framework with state-of-the-art methods based on different principles, on numerous datasets of diverse types and scales, the experimental results show that our framework achieves superior or comparable performance under various evaluation metrics. Our work provides a new clue for constructing GRNs, and the proposed framework GRINCD also shows potential in identifying key factors affecting cancer development.
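The additive noise model decides the causal direction of a pair by regressing each variable on the other and checking in which direction the residuals look independent of the input. A crude, self-contained sketch using binned means as the regressor (a stand-in for the nonparametric regressions used in practice; the data are synthetic with a known direction x → y):

```python
import random
from statistics import mean, pstdev

def residual_dependence(xs, ys, n_bins=10):
    """Regress ys on xs with a crude binned-mean estimator, then score
    how strongly the residual magnitude varies across bins. Under the
    additive noise model, the score should be lower in the causal
    direction, where residuals are (roughly) independent of the input."""
    lo, hi = min(xs), max(xs)
    width = (hi - lo) / n_bins
    which = lambda x: min(int((x - lo) / width), n_bins - 1)
    bins = {}
    for x, y in zip(xs, ys):
        bins.setdefault(which(x), []).append(y)
    fitted = {b: mean(v) for b, v in bins.items()}  # binned regression
    resid = {}
    for x, y in zip(xs, ys):
        b = which(x)
        resid.setdefault(b, []).append(abs(y - fitted[b]))
    # Spread of the per-bin residual magnitude: 0 means fully flat.
    return pstdev(mean(v) for v in resid.values())

rng = random.Random(1)
xs = [rng.uniform(-1.0, 1.0) for _ in range(2000)]
ys = [x ** 3 + rng.gauss(0.0, 0.05) for x in xs]  # true direction: x -> y

forward = residual_dependence(xs, ys)   # residuals ~ flat noise
backward = residual_dependence(ys, xs)  # residuals inherit structure
```

Real additive noise model implementations use stronger regressors and independence tests (e.g. HSIC), but the asymmetry exploited is the same.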

    Dali Wang, Jiaxuan Li, Lei Wang, Yipeng Cao, Bo Kang, Xiangfei Meng, Sai Li, Chen Song
    Quantitative Biology, 2023, 11(4): 421-433.

    The causative pathogen of coronavirus disease 2019 (COVID-19), severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), is an enveloped virus assembled from a lipid envelope and multiple structural proteins. In this study, by integrating experimental data, structural modeling, and coarse-grained and all-atom molecular dynamics (MD) simulations, we constructed multiscale models of SARS-CoV-2. Our 500-ns coarse-grained simulation of the intact virion allowed us to investigate the dynamic behavior of the membrane-embedded proteins and the surrounding lipid molecules in situ. Our results indicate that the membrane-embedded proteins are highly dynamic and that certain types of lipids exhibit distinct binding preferences for specific sites on these proteins. The equilibrated virion model was transformed into atomic resolution, which provides a 3D structure for scientific demonstration and can serve as a framework for future exascale all-atom MD simulations. A short all-atom MD simulation of 255 ps was conducted as a preliminary test for large-scale simulations of this complex system.
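At the heart of both coarse-grained and all-atom MD runs like those above is a symplectic integrator such as velocity Verlet. A minimal one-particle sketch in reduced units with a single harmonic "bond" (illustrative only, not the actual virion setup):

```python
def velocity_verlet(x, v, force, mass=1.0, dt=0.01, steps=1000):
    """Integrate one particle's motion with the velocity Verlet scheme:
    position update from current velocity and acceleration, then a
    velocity update from the average of old and new accelerations."""
    a = force(x) / mass
    for _ in range(steps):
        x += v * dt + 0.5 * a * dt * dt
        a_new = force(x) / mass
        v += 0.5 * (a + a_new) * dt
        a = a_new
    return x, v

K = 1.0  # harmonic bond spring constant (reduced units)

def energy(x, v, mass=1.0):
    """Total energy: kinetic plus harmonic potential."""
    return 0.5 * mass * v * v + 0.5 * K * x * x

x, v = velocity_verlet(x=1.0, v=0.0, force=lambda x: -K * x)
# Symplectic integration keeps the total energy near its initial 0.5.
```

The same scheme, applied to millions of interacting particles with far more elaborate force fields, drives the 500-ns and 255-ps simulations described in the abstract.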

    Junqi Zhang, Zixuan You, Dingyuan Liu, Rui Tang, Chao Zhao, Yingxiu Cao, Feng Li, Hao Song
    Quantitative Biology, 2023, 11(4): 405-420.

    Electroactive microorganisms (EAMs) utilize extracellular electron transfer (EET) pathways to exchange electrons and energy with their external surroundings. Conductive cytochrome proteins and nanowires play crucial roles in controlling the electron transfer rate from the cytosol to the extracellular electrode. Many previous studies have elucidated how c-type cytochrome proteins and conductive nanowires are synthesized, assembled, and engineered to manipulate the EET rate, and have quantified the kinetic processes of electron generation and EET. Here, we first overview the electron transfer pathways of EAMs and the kinetic parameters dictating intracellular electron production and EET. Second, we systematically review the structure and conductivity mechanisms of conductive cytochromes and nanowires in EAMs, and the engineering strategies for manipulating them. Lastly, we outline potential directions for future research on cytochromes and conductive nanowires for enhanced electron transfer. This article reviews the quantitative kinetics of intracellular electron production and EET, and the contribution of engineered c-type cytochromes and conductive nanowires to enhancing the EET rate, laying the foundation for enhancing the electron transfer capacity of EAMs.
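Electron tunneling rates between redox centers such as hemes are often approximated with an exponential distance decay, k = k0·exp(−βr), so the longest hop tends to set the chain's bottleneck. A sketch of that kinetic reasoning; the prefactor, decay constant, and heme spacings below are illustrative assumptions, not measured values:

```python
import math

def hop_rate(r_angstrom, k0=1e13, beta=1.4):
    """Toy tunneling rate k = k0 * exp(-beta * r), with r in Angstrom
    and beta in 1/Angstrom -- an order-of-magnitude approximation."""
    return k0 * math.exp(-beta * r_angstrom)

# Hypothetical edge-to-edge spacings (Angstrom) along a heme chain.
heme_spacings = [6.0, 9.5, 7.2, 11.0]
rates = [hop_rate(r) for r in heme_spacings]
bottleneck = min(rates)  # the longest (11.0 A) hop limits the chain
```

This is why engineering strategies that shorten or bridge the widest heme-heme gap can disproportionately raise the overall EET rate.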

    Chao Pang, Henry H. Y. Tong, Leyi Wei
    Quantitative Biology, 2023, 11(4): 395-404.

    The prediction of molecular properties is a crucial task in the field of drug discovery. Computational methods that can accurately predict molecular properties can significantly accelerate the drug discovery process and reduce its cost. In recent years, iterative updates in computing hardware and the rise of deep learning have created a new and effective path for molecular property prediction. Deep learning methods can leverage the vast amount of data accumulated over the years in drug discovery and do not require complex feature engineering. In this review, we summarize the molecular representations and commonly used datasets in molecular property prediction models and present advanced deep learning methods for molecular property prediction, including state-of-the-art networks such as graph neural networks and Transformer-based models, as well as state-of-the-art strategies such as 3D pre-training, contrastive learning, multi-task learning, transfer learning, and meta-learning. We also point out some critical issues, such as the lack of datasets, low information utilization, and the lack of disease specificity.
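Before any deep model, a molecule must be encoded numerically. As a deliberately crude baseline for the graph- and Transformer-based representations surveyed above, the sketch below turns a SMILES string into a bag-of-characters count vector (two-character element symbols such as Cl and Br are ignored for simplicity):

```python
# A fixed toy alphabet of common SMILES characters.
ALPHABET = "CNOSP=#()123456789"

def featurize(smiles):
    """Bag-of-characters vector: one count per alphabet symbol."""
    return [smiles.count(ch) for ch in ALPHABET]

aspirin = "CC(=O)OC1=CC=CC=C1C(=O)O"  # aspirin, kekulized SMILES
vec = featurize(aspirin)
```

Graph neural networks and Transformers replace this lossy encoding with representations that preserve connectivity and, with 3D pre-training, geometry.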

    Lu Peng, Zecheng Zhang, Xianyi Wang, Weiyi Qiu, Liqian Zhou, Hui Xiao, Chunxiuzi Liu, Shaohua Tang, Zhiwei Qin, Jiakun Jiang, Zengru Di, Yu Liu
    Quantitative Biology, 2023, 11(4): 376-394.

    Creating man-made life in the laboratory is one of life science’s most intriguing yet challenging problems. Advances in synthetic biology and related theories, particularly those concerning the origin of life, have laid the groundwork for further exploration and understanding in this field of artificial or man-made life. But there remains a wealth of quantitative mathematical models and tools that have yet to be applied to this area. In this paper, we review the two main approaches often employed in the field of man-made life: the top-down approach, which reduces the complexity of extant living systems, and the bottom-up approach, which integrates well-defined components, introducing their theoretical basis, recent advances, and limitations. We then argue for another possible approach, namely “bottom-up from the origin of life”: starting with the establishment of autocatalytic chemical reaction networks that employ physical boundaries as the initial compartments, then designing directed evolutionary systems, with the expectation that independent compartments will eventually emerge so that the system becomes free-living. This approach is analogous to the process by which life originated. With this paper, we aim to stimulate the interest of synthetic biologists and experimentalists in a more theoretical perspective, and to promote communication between the origin-of-life community and the synthetic man-made life community.
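The autocatalytic chemical reaction networks mentioned above can be explored with simple kinetic models. Below is a minimal sketch of a single autocatalytic step, A + X → 2X, integrated with Euler steps; the rate constant and concentrations are arbitrary illustrative values:

```python
def simulate(a0=1.0, x0=0.01, k=5.0, dt=0.001, steps=5000):
    """Euler integration of A + X -> 2X with mass-action rate k*[A]*[X]:
    the autocatalyst X converts the food molecule A into more of itself."""
    a, x = a0, x0
    for _ in range(steps):
        flux = k * a * x * dt
        a -= flux  # food A is consumed...
        x += flux  # ...and converted into the autocatalyst X
    return a, x

a, x = simulate()
# Mass is conserved (a + x stays at 1.01) while X consumes nearly all A.
```

The hallmark of autocatalysis is visible even in this toy: growth of X is slow while X is rare, accelerates as X accumulates, and saturates only when the food runs out — the sigmoidal dynamic that directed evolution in compartments would act upon.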