Most accessed

  • RESEARCH ARTICLE
    Qijin Yin, Rui Fan, Xusheng Cao, Qiao Liu, Rui Jiang, Wanwen Zeng
    Quantitative Biology, 2023, 11(3): 260-274. https://doi.org/10.15302/J-QB-022-0320

    Background: Computational approaches for accurate prediction of drug interactions, such as drug-drug interactions (DDIs) and drug-target interactions (DTIs), are in high demand among biochemical researchers. Although many methods have been proposed and developed to predict DDIs and DTIs respectively, their success is still limited due to a lack of systematic evaluation of the intrinsic properties embedded in the corresponding chemical structure.

    Methods: In this paper, we develop DeepDrug, a deep learning framework for overcoming the above limitation by using residual graph convolutional networks (Res-GCNs) and convolutional networks (CNNs) to learn the comprehensive structure- and sequence-based representations of drugs and proteins.

    Results: DeepDrug outperforms state-of-the-art methods in a series of systematic experiments, including binary-class DDIs, multi-class/multi-label DDIs, binary-class DTIs classification and DTIs regression tasks. Furthermore, we visualize the structural features learned by the DeepDrug Res-GCN module, which display compatible and accordant patterns in chemical properties and drug categories, providing additional evidence to support the strong predictive power of DeepDrug. Ultimately, we apply DeepDrug to perform drug repositioning on the whole DrugBank database to discover potential drug candidates against SARS-CoV-2, where 7 out of 10 top-ranked drugs are reported to be repurposed to potentially treat coronavirus disease 2019 (COVID-19).

    Conclusions: To sum up, we believe that DeepDrug is an efficient tool for accurate prediction of DDIs and DTIs and provides promising insight into the underlying mechanism of these biochemical relations.

  • PERSPECTIVE
    Xuegong Zhang, Lei Wei, Rui Jiang, Xiaowo Wang, Jin Gu, Zhen Xie, Hairong Lv
    Quantitative Biology, 2023, 11(3): 207-213. https://doi.org/10.15302/J-QB-023-0331

    The rapid development of biological technology (BT) and information technology (IT), especially of genomics and artificial intelligence (AI), is bringing great potential for revolutionizing future medicine. We propose the concept and framework of Digital Life Systems, or dLife, as a new paradigm to unleash this potential. It includes the multi-scale and multi-granule measurement and representation of life in the digital space, the mathematical and/or computational modeling of the biology behind physiological and pathological processes, and ultimately cyber twins of the healthy or diseased human body in the virtual space that can be used to simulate complex biological processes and deduce the effects of medical treatments. We advocate that dLife is the route toward future AI precision medicine and should be the new paradigm for future biological and medical research.

  • RESEARCH ARTICLE
    Yuanpeng Xiong, Xuan He, Dan Zhao, Tao Jiang, Jianyang Zeng
    Quantitative Biology, 2023, 11(3): 275-286. https://doi.org/10.15302/J-QB-022-0316

    Background: Chromatin-associated RNA (caRNA) acts as a ubiquitous epigenetic layer in eukaryotes, and has been reported to be essential in various biological processes, including gene transcription, chromatin remodeling and cellular differentiation. Recently, numerous experimental techniques have been developed to characterize genome-wide RNA-chromatin interactions to understand their underlying biological functions. However, these experimental methods are generally expensive, time-consuming, and limited in identifying all potential sites, while most of the existing computational methods are restricted to detecting only specific types of RNAs interacting with chromatin.

    Methods: Here, we propose a highly interpretable computational framework, named DeepRCI, to identify the interactions between various types of RNAs and chromatin. In this framework, we introduce a novel deep learning component called variformer and integrate multi-omics data to capture intrinsic genomic features at both RNA and DNA levels.

    Results: Extensive experiments demonstrate that DeepRCI can detect RNA-chromatin interactions more accurately when compared to the state-of-the-art baseline prediction methods. Furthermore, the sequence features extracted by DeepRCI can be well matched to known critical gene regulatory components, indicating that our model can provide useful biological insights into understanding the underlying mechanisms of RNA-chromatin interactions. In addition, based on the prediction results, we further delineate the relationships between RNA-chromatin interactions and cellular functions, including gene expression and the modulation of cell states.

    Conclusions: In summary, DeepRCI can serve as a useful tool for characterizing RNA-chromatin interactions and studying the underlying gene regulatory code.

  • RESEARCH ARTICLE
    Xabier Martinez-de-Morentin, Sumeer A. Khan, Robert Lehmann, Sisi Qu, Alberto Maillo, Narsis A. Kiani, Felipe Prosper, Jesper Tegner, David Gomez-Cabrero
    Quantitative Biology, 2023, 11(3): 246-259. https://doi.org/10.15302/J-QB-022-0318

    Background: Single-cell multi-omics technologies allow a profound system-level understanding of the biology of cells and tissues. However, an integrative and possibly systems-based analysis capturing the different modalities is challenging. In response, bioinformatics and machine learning methodologies are being developed for multi-omics single-cell analysis. It is unclear whether current tools can address the dual aspect of modality integration and prediction across modalities without requiring extensive parameter fine-tuning.

    Methods: We designed LIBRA, a neural network based framework, to learn translation between paired multi-omics profiles so that a shared latent space is constructed. Additionally, we implemented a variation, aLIBRA, that allows automatic fine-tuning by identifying parameter combinations that optimize both the integrative and predictive tasks. All model parameters and evaluation metrics are made available to users with minimal user iteration. Furthermore, aLIBRA allows experienced users to implement custom configurations. The LIBRA toolbox is freely available as R and Python libraries at GitHub (TranslationalBioinformaticsUnit/LIBRA).

    Results: LIBRA was evaluated in eight multi-omic single-cell data-sets, including three combinations of omics. We observed that LIBRA is a state-of-the-art tool when evaluating the ability to increase cell-type (clustering) resolution in the integrated latent space. Furthermore, when assessing the predictive power across data modalities, such as predicting chromatin accessibility from gene expression, LIBRA outperforms existing tools. As expected, adaptive parameter optimization (aLIBRA) significantly boosted the performance of learning predictive models from paired data-sets.

    Conclusion: LIBRA is a versatile tool that performs competitively in both “integration” and “prediction” tasks based on single-cell multi-omics data. LIBRA is a data-driven robust platform that includes an adaptive learning scheme.

  • REVIEW
    Yan Yan, Liheng Yang, Leyuan Meng, Haochen Su, Cheng Zhou, Le Yu, Zhengtu Li, Xu Zhang, Huihua Cai, Juntao Gao
    Quantitative Biology, 2023, 11(3): 231-245. https://doi.org/10.15302/J-QB-023-0332

    Background: Spatial multi-omics has been demonstrated to be a powerful approach for assisting researchers in genetic studies. In this review, bioimaging-based spatial multi-omics techniques such as seqFISH+, merFISH, integrated DNA seqFISH+, DNA merFISH, and MINA are introduced along with each technique’s probe design, development, and imaging processes.

    Results: seqFISH employed 4–5 fluorophores as barcodes and conducted multiple rounds of hybridization so that mRNAs can be identified through color-coding. seqFISH+ introduced 60 pseudo-colors, distributed equally across three channels, to enhance imaging power so that, for example, 24,000 genes can be imaged in total. merFISH used a 16-bit code with a Hamming distance of 4 to provide a robust error-detecting method. MINA, a methodology combining merFISH (multiplexed error-robust fluorescence in situ hybridization) and chromosomal tracing, enabled multiplexed imaging of genomic architecture in mammalian single cells. Optical reconstruction of chromatin architecture (ORCA), a FISH variant that traces the DNA path at the nanoscale with kilobase resolution, improved genomic resolution, enabled high-precision fiducial registration and sequential imaging, and used Oligopaint probes to hybridize short genomic regions ranging from 2 to 10 kilobases. ORCA then assigns individual barcodes to the primary probes of these short sections, to which fluorophores attach for imaging.

    Conclusion: This review provides a comprehensive overview of these spatial multi-omics techniques, with the intention of helping researchers select the appropriate technique for their research.
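
    The barcoding arithmetic and the Hamming-distance error detection summarized in this abstract can be illustrated with a short Python sketch. It is an illustration only, under stated assumptions: the three barcoding rounds per channel come from general knowledge of seqFISH+ rather than from this abstract, and the three-gene 16-bit codebook is a hypothetical toy example, not an actual merFISH codebook.

```python
from itertools import combinations

# seqFISH+ capacity: 60 pseudo-colors split equally over 3 channels.
N_PSEUDO_COLORS = 60
N_CHANNELS = 3
N_BARCODING_ROUNDS = 3  # assumption: not stated in the abstract

per_channel = N_PSEUDO_COLORS // N_CHANNELS                 # 20 per channel
capacity = N_CHANNELS * per_channel ** N_BARCODING_ROUNDS   # 3 * 20**3


def word(on_bits):
    """Build a 16-bit codeword from the positions of its 1-bits."""
    value = 0
    for bit in on_bits:
        value |= 1 << bit
    return value


def hamming(a, b):
    """Number of bit positions in which two codewords differ."""
    return bin(a ^ b).count("1")


# Hypothetical toy codebook: every pair of codewords differs in 4 bits.
CODEBOOK = {
    "geneA": word((0, 1, 2, 3)),
    "geneB": word((0, 1, 4, 5)),
    "geneC": word((2, 3, 4, 5)),
}


def min_distance(codebook):
    """Smallest pairwise Hamming distance in the codebook."""
    return min(hamming(a, b) for a, b in combinations(codebook.values(), 2))


def decode(read, codebook, max_errors=1):
    """Return the closest gene, or None if the read is too corrupted to trust."""
    distance, gene = min((hamming(read, cw), g) for g, cw in codebook.items())
    return gene if distance <= max_errors else None


one_error = CODEBOOK["geneA"] ^ (1 << 7)               # one flipped bit
two_errors = CODEBOOK["geneA"] ^ (1 << 7) ^ (1 << 8)   # two flipped bits

print(capacity)                       # 24000
print(min_distance(CODEBOOK))         # 4
print(decode(one_error, CODEBOOK))    # geneA (single error corrected)
print(decode(two_errors, CODEBOOK))   # None (double error detected)
```

    With a minimum distance of 4, a single-bit error still leaves the read closest to exactly one codeword, while a two-bit error is flagged instead of being silently assigned to the wrong gene.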

  • REVIEW
    Chenfei Tian, Jianhua Li, Yong Wang
    Quantitative Biology, 2023, 11(3): 214-230. https://doi.org/10.15302/J-QB-022-0326

    Background: As an increasing number of synthetic switches and circuits have been created for plant systems and of synthetic products produced in plant chassis, plant synthetic biology is taking a strong foothold in agriculture and medicine. The ever-exploding data has also promoted the expansion of toolkits in this field. Genetic parts libraries and quantitative characterization approaches have been developed. However, plant synthetic biology is still in its infancy. Considerations for selecting biological parts to design and construct genetic circuits with predictable functions are still needed.

    Results: In this article, we review the current biotechnological progress in the field of plant synthetic biology. Assembly standardization and quantitative approaches of genetic parts and genetic circuits are discussed. We also highlight the main challenges in the iterative cycles of design-build-test-learn for introducing novel traits into plants.

    Conclusion: Plant synthetic biology promises to provide important solutions to many issues in agricultural production, human health care, and environmental sustainability. However, tremendous challenges exist in this field. For example, the quantitative characterization of genetic parts is limited; the orthogonality and the transfer functions of circuits are unpredictable; and mathematical modeling-assisted circuit design still needs to improve in predictability and reliability. These challenges are expected to be resolved in the near future as interest in this field intensifies.

  • RESEARCH ARTICLE
    Xiuquan Wang, Mian Umair Ahsan, Yunyun Zhou, Kai Wang
    Quantitative Biology, 2023, 11(3): 287-296. https://doi.org/10.15302/J-QB-022-0323

    Background: Oxford Nanopore long-read sequencing technology addresses current limitations for DNA methylation detection that are inherent in short-read bisulfite sequencing or methylation microarrays. A number of analytical tools, such as Nanopolish, Guppy/Tombo and DeepMod, have been developed to detect DNA methylation on Nanopore data. However, additional improvements can be made in computational efficiency, prediction accuracy, and contextual interpretation in complex genomic regions (such as repetitive regions and low GC density regions).

    Method: In the current study, we apply the Transformer architecture to detect DNA methylation on ionic signals from Oxford Nanopore sequencing data. Transformer is an algorithm that adopts a self-attention architecture in neural networks and has been widely used in natural language processing.

    Results: Compared to traditional deep-learning methods such as convolutional neural networks (CNNs) and recurrent neural networks (RNNs), Transformer may have specific advantages in DNA methylation detection, because the self-attention mechanism can help detect relationships between bases that are far from each other and pay more attention to important bases that carry characteristic methylation-specific signals within a specific sequence context.

    Conclusion: We demonstrated the ability of Transformers to detect methylation on ionic signal data.
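
    To make the long-range argument concrete, here is a minimal pure-Python sketch of scaled dot-product self-attention. It is not the authors' model: the query, key, and value projections are replaced by the identity (an assumption made for brevity), and the six two-dimensional "signal" vectors are hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(x):
    """Scaled dot-product self-attention with identity Q/K/V projections.

    x is a list of equal-length feature vectors, one per sequence position.
    Returns (attention weights, attended outputs).
    """
    dim = len(x[0])
    weights = []
    for query in x:
        scores = [
            sum(q * k for q, k in zip(query, key)) / math.sqrt(dim)
            for key in x
        ]
        weights.append(softmax(scores))
    outputs = [
        [sum(w * value[j] for w, value in zip(row, x)) for j in range(dim)]
        for row in weights
    ]
    return weights, outputs


# Hypothetical toy input: positions 0 and 5 share a feature, 1-4 do not.
signal = [[2.0, 0.0], [0.0, 1.0], [0.0, 1.0], [0.0, 1.0], [0.0, 1.0], [2.0, 0.0]]
weights, outputs = self_attention(signal)

# Position 0 attends more to distant position 5 than to its neighbor 1.
print(weights[0][5] > weights[0][1])
```

    The weight between two positions depends only on how similar their features are, not on how far apart they sit in the sequence, which is the property the abstract appeals to.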

  • RESEARCH ARTICLE
    Qin Xie, Wei Ma, Jianhang Zhang, Shiliang Li, Xiaobing Deng, Youjun Xu, Weilin Zhang
    Quantitative Biology, 2023, 11(3): 320-331. https://doi.org/10.15302/J-QB-022-0321

    Background: Molecular docking-based virtual screening (VS) aims to choose ligands with potential pharmacological activities from millions or even billions of molecules. This process could significantly cut down the number of compounds that need to be experimentally tested. However, during the docking calculation, many molecules have low affinity for a particular protein target, which wastes a lot of computational resources.

    Methods: We implemented a fast and practical molecular screening approach called DL-DockVS (deep learning dock virtual screening) by using deep learning models (regression and classification models) to learn the outcomes of pipelined docking programs step-by-step.

    Results: In this study, we showed that this approach could successfully weed out compounds with poor docking scores while keeping compounds with potentially high docking scores against 10 DUD-E protein targets. A self-built dataset of about 1.9 million molecules was used to further verify DL-DockVS, yielding good results in terms of recall rate, active compounds enrichment factor and runtime speed.

    Conclusions: We comprehensively evaluate the practicality and effectiveness of DL-DockVS against 10 protein targets. Owing to its improved runtime and maintained success rate, it would be a useful and promising approach to screen ultra-large compound libraries in the age of big data. It is also convenient: researchers can train a model for one specific target and then use it to screen other chemical libraries for molecules with high docking scores without running the docking computation again.
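
    The recall rate and enrichment factor mentioned in the results can be written down in a few lines of Python. This sketch uses the commonly cited definitions, which may differ in detail from the ones used in the paper, and every count in it is hypothetical.

```python
def recall(n_actives_selected, n_actives_total):
    """Fraction of all active compounds that survive the filter."""
    return n_actives_selected / n_actives_total


def enrichment_factor(n_actives_selected, n_selected, n_actives_total, n_total):
    """Hit rate in the selected subset divided by hit rate in the whole library."""
    hit_rate_selected = n_actives_selected / n_selected
    hit_rate_library = n_actives_total / n_total
    return hit_rate_selected / hit_rate_library


# Hypothetical screen: 10,000 molecules with 100 actives; the model keeps
# 1,000 molecules, 80 of which are active.
r = recall(80, 100)
ef = enrichment_factor(80, 1_000, 100, 10_000)

print(f"recall = {r:.2f}")
print(f"enrichment factor = {ef:.1f}")
```

    An enrichment factor above 1 means the retained subset is richer in actives than the full library, so fewer molecules need to go through the expensive docking step.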

  • RESEARCH ARTICLE
    Elham Dalalbashi Esfahani, Esmaeil Ebrahimie, Ali Niazi, Manijeh Mohammadi Dehcheshmeh
    Quantitative Biology, 2023, 11(3): 343-358. https://doi.org/10.15302/J-QB-023-0333

    Background: Accumulating evidence shows that long non-coding RNAs (lncRNAs) play critical roles in cancer progression. The possible association between lncRNAs and herbal medicine is yet to be known. This study aims to identify medicinal herbs associated with lncRNAs by RNA-seq data for breast and prostate cancer.

    Methods: To develop the optimal approach for identifying cancer-related lncRNAs, we implemented two steps: (1) applying protein–protein interaction (PPI), Gene Ontology (GO), and pathway analyses, and (2) applying attribute weighting and finding the efficient classification model of the machine learning approach.

    Results: In the first step, GO terms and pathway analyses on differential co-expressed mRNAs revealed that lncRNAs were widely co-expressed with metabolic process genes. We identified two hub lncRNA-mRNA networks that implicate lncRNAs associated with breast and prostate cancer. In the second step, we implemented various machine learning-based prediction systems (Decision Tree, Random Forest, Deep Learning, and Gradient-Boosted Tree) on the non-transformed and Z-standardized differential co-expressed lncRNAs. Based on five-fold cross-validation, we obtained high accuracy (91.11%), high sensitivity (88.33%), and high specificity (93.33%) with Deep Learning, which reinforces the biomarker power of the lncRNAs identified in this study. As the data originally came from different cell lines at different durations of herbal treatment intervention, we applied seven attribute weighting algorithms to check the effects of variables on identifying lncRNAs. Attribute weighting results showed that the cell line and time had little or no effect on the selected lncRNA list. In addition, we identified one known lncRNA, downregulated RNA in cancer (DRAIC), as an essential feature.

    Conclusions: This study will provide further insights to investigate the potential therapeutic and prognostic targets for prostate cancer (PC) and breast cancer (BC) in common.
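
    For readers less familiar with the three reported metrics, the sketch below shows how accuracy, sensitivity, and specificity follow from a binary confusion matrix. The counts are hypothetical and are not taken from the study.

```python
def classification_metrics(tp, fn, tn, fp):
    """Return (accuracy, sensitivity, specificity) from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)   # true positive rate
    specificity = tn / (tn + fp)   # true negative rate
    accuracy = (tp + tn) / (tp + fn + tn + fp)
    return accuracy, sensitivity, specificity


# Hypothetical fold: 10 positive and 10 negative samples.
accuracy, sensitivity, specificity = classification_metrics(tp=8, fn=2, tn=9, fp=1)

print(f"accuracy    = {accuracy:.2%}")
print(f"sensitivity = {sensitivity:.2%}")
print(f"specificity = {specificity:.2%}")
```

    In a five-fold cross-validation, these values would typically be computed per fold and then averaged.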

  • RESEARCH ARTICLE
    Nan Miles Xi, Angelos Vasilopoulos
    Quantitative Biology, 2023, 11(3): 297-305. https://doi.org/10.15302/J-QB-022-0324

    Background: The existence of doublets in single-cell RNA sequencing (scRNA-seq) data poses a great challenge in downstream data analysis. Computational doublet-detection methods have been developed to remove doublets from scRNA-seq data. Yet, the default hyperparameter settings of those methods may not provide optimal performance.

    Methods: We propose a strategy to tune hyperparameters for a cutting-edge doublet-detection method. We utilize a full factorial design to explore the relationship between hyperparameters and detection accuracy on 16 real scRNA-seq datasets. The optimal hyperparameters are obtained by a response surface model and convex optimization.

    Results: We show that the optimal hyperparameters provide top performance across scRNA-seq datasets under various biological conditions. Our tuning strategy can be applied to other computational doublet-detection methods. It also offers insights into hyperparameter tuning for broader computational methods in scRNA-seq data analysis.

    Conclusions: The hyperparameter configuration significantly impacts the performance of computational doublet-detection methods. Our study is the first attempt to systematically explore the optimal hyperparameters under various biological conditions and optimization objectives. Our study provides much-needed guidance for hyperparameter tuning in computational doublet-detection methods.
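
    A full factorial design simply evaluates every combination of the chosen hyperparameter levels. The Python sketch below enumerates such a design. The hyperparameter names, their levels, and the `toy_accuracy` scoring function are all hypothetical stand-ins for running a real doublet detector, and the response surface fit and convex optimization described in the methods are not shown.

```python
from itertools import product

# Hypothetical hyperparameters and levels (not taken from the paper).
LEVELS = {
    "expected_doublet_rate": [0.05, 0.10, 0.15],
    "n_neighbors": [10, 30, 50],
    "n_components": [10, 30],
}


def full_factorial(levels):
    """Every combination of levels, one dict per experimental run."""
    names = list(levels)
    return [
        dict(zip(names, combo))
        for combo in product(*(levels[name] for name in names))
    ]


def toy_accuracy(setting):
    """Stand-in for detection accuracy; peaks at (0.10, 30, 30) by construction."""
    return (
        1.0
        - 10 * abs(setting["expected_doublet_rate"] - 0.10)
        - abs(setting["n_neighbors"] - 30) / 100
        - abs(setting["n_components"] - 30) / 100
    )


design = full_factorial(LEVELS)
best = max(design, key=toy_accuracy)

print(len(design))   # 3 * 3 * 2 runs
print(best)
```

    In practice each run would be repeated on every dataset, and the resulting accuracy table would be the input to the response surface model.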

  • RESEARCH ARTICLE
    Leandro R. Jones, Julieta M. Manrique
    Quantitative Biology, 2023, 11(3): 332-342. https://doi.org/10.15302/J-QB-023-0329

    Background: Massively parallel sequencing of environmental DNA allows microbiological studies to be performed in greater detail than was possible with first-generation sequencing. For example, it facilitates the use of approaches hitherto largely applied to flora and fauna, such as rank abundance distribution (RAD) analyses.

    Methods: Here, we set out to advance the knowledge on Ca. Pelagibacterales (SAR11) communities from southern South America using environmental sequences from the open ocean in the Argentine Sea, the uncharted Engaño Bay, as well as a river and an oligohaline shallow lake from the Patagonian Steppe ecoregion. The structures of the SAR11 assemblages present in these ecosystems were dissected by direct and rarefaction-based estimates of species richness, and evaluations of the corresponding abundance distributions (ADs), which were addressed by RAD analyses.

    Results: Microbial community composition analyses revealed that the studied SAR11 assemblages coexist with 27 bacterial phyla. SAR11 richness was in general very high, but ADs turned out to be highly uneven. The results were compatible with prior knowledge, and similar to those derived from point estimates of diversity. However, our comprehensive dissection allowed for more detailed quantitative comparisons to be made between the environments surveyed, and revealed differences regarding both richness and the underlying ADs.

    Conclusions: Despite SAR11 assemblages being extremely rich, their ADs are very uneven. Richness and ADs can vary, not only between fresh and salt water, but also between oceanic and coastal marine environments. The obtained results provide insights on general topics such as adaptation and the contrast between marine and freshwater radiations.
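
    The distinction between richness and evenness drawn here can be illustrated with a small Python sketch that builds a rank abundance distribution from taxon counts. The counts are hypothetical, and Pielou's evenness is used only as one convenient summary; the abstract does not say which statistics the authors computed.

```python
import math


def rank_abundance(counts):
    """Relative abundances of the observed taxa, sorted from most to least abundant."""
    present = sorted((c for c in counts if c > 0), reverse=True)
    total = sum(present)
    return [c / total for c in present]


def richness(counts):
    """Number of taxa observed at least once."""
    return sum(1 for c in counts if c > 0)


def pielou_evenness(counts):
    """Shannon entropy divided by its maximum, ln(richness); 1.0 is perfectly even."""
    rad = rank_abundance(counts)
    if len(rad) < 2:
        raise ValueError("evenness needs at least two observed taxa")
    shannon = -sum(p * math.log(p) for p in rad)
    return shannon / math.log(len(rad))


# Two hypothetical communities with the same richness but different evenness.
even_counts = [25, 25, 25, 25]
uneven_counts = [970, 10, 10, 10]

print(richness(even_counts), richness(uneven_counts))
print(round(pielou_evenness(even_counts), 3))
print(round(pielou_evenness(uneven_counts), 3))
```

    Both communities have four taxa, yet the second is dominated by one of them, which is the kind of pattern a rank abundance curve makes visible and a richness estimate alone does not.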

  • RESEARCH ARTICLE
    Doyoung Park
    Quantitative Biology, 2023, 11(3): 306-319. https://doi.org/10.15302/J-QB-022-0325

    Background: Living cells need to undergo subtle shape adaptations in response to the topography of their substrates. These shape changes are mainly determined by reorganization of their internal cytoskeleton, with a major contribution from filamentous (F) actin. Bundles of F-actin play a major role in determining cell shape and their interaction with substrates, either as “stress fibers,” or as our newly discovered “Concave Actin Bundles” (CABs), which mainly occur while endothelial cells wrap micro-fibers in culture.

    Methods: To better understand the morphology and functions of these CABs, it is necessary to recognize and analyze as many of them as possible in complex cellular ensembles, which is a demanding and time-consuming task. In this study, we present a novel algorithm to automatically recognize CABs without further human intervention. We developed and employed a multilayer perceptron artificial neural network (“the recognizer”), which was trained to identify CABs.

    Results: The recognizer demonstrated a high overall recognition rate and reliability both in randomized training and in subsequent testing experiments.

    Conclusion: The recognizer would be an effective replacement for validation by visual detection, which is both tedious and inherently prone to errors.

  • REVIEW ARTICLE
    Yazhen Song, Chenxi Feng, Difei Zhou, Zengxin Ma, Lian He, Cong Zhang, Guihong Yu, Yan Zhao, Song Yang, Xinhui Xing
    Quantitative Biology, 2024, 12(1): 1-14. https://doi.org/10.1002/qub2.31

    Developing methylotrophic cell factories that can efficiently catalyze organic one-carbon (C1) feedstocks derived from electrocatalytic reduction of carbon dioxide into bio-based chemicals and biofuels is of strategic significance for building a carbon-neutral, sustainable economic and industrial system. With the rapid advancement of RNA sequencing technology and mass spectrometry analysis, researchers have used these quantitative microbiology methods extensively, especially isotope-based metabolic flux analysis, to study the metabolic processes initiating from C1 feedstocks in natural C1-utilizing bacteria and synthetic C1 bacteria. This paper reviews the use of advanced quantitative analysis in recent years to understand the metabolic network and basic principles in the metabolism of natural C1-utilizing bacteria grown on methane, methanol, or formate. The acquired knowledge serves as a guide to rewire the central methylotrophic metabolism of natural C1-utilizing bacteria to improve the carbon conversion efficiency, and to engineer non-C1-utilizing bacteria into synthetic strains that can use C1 feedstocks as the sole carbon and energy source. These advances ultimately enhance the design and construction of highly efficient C1-based cell factories to synthesize diverse high value-added products. The integration of quantitative biology and synthetic biology will advance the iterative cycle of understand–design–build–test–learn to enhance C1-based biomanufacturing in the future.

  • REVIEW ARTICLE
    Xinyue Li, Zhankun Xiong, Wen Zhang, Shichao Liu
    Quantitative Biology, 2024, 12(1): 30-52. https://doi.org/10.1002/qub2.32

    The prediction of drug-drug interactions (DDIs) is a crucial task for drug safety research, and identifying potential DDIs helps us to explore the mechanism behind combinatorial therapy. Traditional wet chemical experiments for DDI are cumbersome and time-consuming, and are too small in scale, limiting the efficiency of DDI predictions. Therefore, it is particularly crucial to develop improved computational methods for detecting drug interactions. With the development of deep learning, several computational models based on deep learning have been proposed for DDI prediction. In this review, we summarized the high-quality DDI prediction methods based on deep learning in recent years, and divided them into four categories: neural network-based methods, graph neural network-based methods, knowledge graph-based methods, and multimodal-based methods. Furthermore, we discuss the challenges of existing methods and future potential perspectives. This review reveals that deep learning can significantly improve DDI prediction performance compared to traditional machine learning. Deep learning models can scale to large-scale datasets and accept multiple data types as input, thus making DDI predictions more efficient and accurate.

  • REVIEW ARTICLE
    Yuhang Xia, Yongkang Wang, Zhiwei Wang, Wen Zhang
    Quantitative Biology, 2024, 12(1): 15-29. https://doi.org/10.1002/qub2.30

    Drug discovery aims to design novel molecules with specific chemical properties for the treatment of targeted diseases. Generally, molecular optimization is one important step in drug discovery, which optimizes the physical and chemical properties of a molecule. Currently, artificial intelligence techniques have shown excellent success in drug discovery and have emerged as a new strategy to address the challenges of drug design, including molecular optimization, and to drastically reduce the costs and time for drug discovery. We review the latest advances of molecular optimization in artificial intelligence-based drug discovery, including data resources, molecular properties, optimization methodologies, and assessment criteria for molecular optimization. Specifically, we classify the optimization methodologies into molecular mapping-based, molecular distribution matching-based, and guided search-based methods, and discuss the principles of these methods as well as their pros and cons. Moreover, we highlight the current challenges in molecular optimization and offer a variety of perspectives, including interpretability, multidimensional optimization, and model generalization, on potential new lines of research to pursue in the future. This study provides a comprehensive review of molecular optimization in artificial intelligence-based drug discovery, which points out the challenges as well as the new prospects. This review will guide researchers who are interested in artificial intelligence molecular optimization.

  • RESEARCH ARTICLE
    Tianxing Ma, Zetong Zhao, Haochen Li, Lei Wei, Xuegong Zhang
    Quantitative Biology, 2024, 12(1): 70-84. https://doi.org/10.1002/qub2.28

    Complicated molecular alterations in tumors generate various mutant peptides. Some of these mutant peptides can be presented to the cell surface and then elicit immune responses, and such mutant peptides are called neoantigens. Accurate detection of neoantigens could help to design personalized cancer vaccines. Although some computational frameworks for neoantigen detection have been proposed, most of them can only detect SNV- and indel-derived neoantigens. In addition, current frameworks adopt oversimplified neoantigen prioritization strategies. These factors hinder the comprehensive and effective detection of neoantigens. We developed NeoHunter, flexible software to systematically detect and prioritize neoantigens from sequencing data in different formats. NeoHunter can detect not only SNV- and indel-derived neoantigens but also gene fusion- and aberrant splicing-derived neoantigens. NeoHunter supports both direct and indirect immunogenicity evaluation strategies to prioritize candidate neoantigens. These strategies utilize binding characteristics, existing biological big data, and T-cell receptor specificity to ensure accurate detection and prioritization. We applied NeoHunter to the TESLA dataset, cohorts of melanoma and non-small cell lung cancer patients. NeoHunter achieved high performance across the TESLA cancer patients and detected 79% (27 out of 34) of validated neoantigens in total. SNV- and indel-derived neoantigens accounted for 90% of the top 100 candidate neoantigens, while neoantigens from aberrant splicing accounted for 9%. Gene fusion-derived neoantigens were detected in one patient. NeoHunter is a powerful tool to ‘catch all’ neoantigens and is available for free academic use on GitHub (XuegongLab/NeoHunter).

  • RESEARCH ARTICLE
    Siyu Li, Songming Tang, Yunchang Wang, Sijie Li, Yuhang Jia, Shengquan Chen
    Quantitative Biology, 2024, 12(1): 85-99. https://doi.org/10.1002/qub2.33

    Recent advances in single-cell chromatin accessibility sequencing (scCAS) technologies have resulted in new insights into the characterization of epigenomic heterogeneity and have increased the need for automatic cell type annotation. However, existing automatic annotation methods for scCAS data fail to incorporate the reference data and neglect novel cell types, which only exist in a test set. Here, we propose RAINBOW, a reference-guided automatic annotation method based on the contrastive learning framework, which is capable of effectively identifying novel cell types in a test set. By utilizing contrastive learning and incorporating reference data, RAINBOW can effectively characterize the heterogeneity of cell types, thereby facilitating more accurate annotation. With extensive experiments on multiple scCAS datasets, we show the advantages of RAINBOW over state-of-the-art methods in known and novel cell type annotation. We also verify the effectiveness of incorporating reference data during the training process. In addition, we demonstrate the robustness of RAINBOW to data sparsity and the number of cell types. Furthermore, RAINBOW provides superior performance on newly sequenced data and can reveal biological implications in downstream analyses. All the results demonstrate the superior performance of RAINBOW in cell type annotation for scCAS data. We anticipate that RAINBOW will offer essential guidance and great assistance in scCAS data analysis. The source code is available on GitHub (BioX-NKU/RAINBOW).

  • REVIEW ARTICLE
    Mahsa Babaei, Soheila Kashanian, Huang-Teck Lee, Frances Harding
    Quantitative Biology, 2024, 12(1): 53-69. https://doi.org/10.1002/qub2.35

    Protein biomarkers represent specific biological activities and processes, so they have had a critical role in cancer diagnosis and medical care for more than 50 years. With the recent improvement in proteomics technologies, thousands of protein biomarker candidates have been developed for diverse disease states. Studies have used different types of samples for proteomics diagnosis. Samples were pretreated with appropriate techniques to increase the selectivity and sensitivity of the downstream analysis and purified to remove the contaminants. The purified samples were analyzed by several principal proteomics techniques to identify the specific protein. In this study, recent improvements in protein biomarker discovery, verification, and validation are investigated. Furthermore, the advantages and disadvantages of conventional techniques are discussed. Studies have used mass spectrometry (MS) as a critical technique in the identification and quantification of candidate biomarkers. Nevertheless, after protein biomarker discovery, verification and validation in larger numbers of samples have been required to reduce the false-positive rate. Multiple reaction monitoring (MRM), parallel reaction monitoring (PRM), and selected reaction monitoring (SRM), in combination with stable isotope-labeled internal standards, have been examined as options for biomarker verification, and enzyme-linked immunosorbent assay (ELISA) for validation.

  • RESEARCH ARTICLE
    Keran Sun, Jingyuan Ning, Keqi Jia, Xiaoqing Fan, Hongru Li, Jize Ma, Meiqi Meng, Cuiqing Ma, Lin Wei
    Quantitative Biology, 2024, 12(1): 100-116. https://doi.org/10.1002/qub2.36

    To investigate the impact of hyperglycemia on the prognosis of patients with gastric cancer and identify key molecules associated with high glucose levels in gastric cancer development, RNA sequencing data and clinical features of gastric cancer patients were obtained from The Cancer Genome Atlas (TCGA) database. High glucose-related genes strongly associated with gastric cancer were identified using weighted gene co-expression network and differential analyses. A gastric cancer prognosis signature was constructed based on these genes and patients were categorized into high- and low-risk groups. The immune statuses of the two patient groups were compared. ATP citrate lyase (ACLY), a gene significantly related to the prognosis, was found to be upregulated upon high-glucose stimulation. Immunohistochemistry and molecular analyses confirmed high ACLY expression in gastric cancer tissues and cells. Gene Set Enrichment Analysis (GSEA) revealed the involvement of ACLY in cell cycle and DNA replication processes. Inhibition of ACLY affected the proliferation and migration of gastric cancer cells induced by high glucose levels. These findings suggest that ACLY, as a high glucose-related gene, plays a critical role in gastric cancer progression.

  • RESEARCH ARTICLE
    Jiarui Ou, Le Zhang, Xiaoli Ru
    Quantitative Biology, 2024, 12(1): 117-127. https://doi.org/10.1002/qub2.29

    Cardiovascular disease (CVD) is the major cause of death in many regions around the world, and several of its risk factors might be linked to diets. To improve public health and the understanding of this topic, we look at the recent Minnesota Coronary Experiment (MCE) analysis that used a t-test and a Cox model to evaluate CVD risks. However, these parametric methods might suffer from three problems: small sample size, right-censored bias, and lack of long-term evidence. To overcome the first of these challenges, we utilize a nonparametric permutation test to examine the relationship between dietary fats and serum total cholesterol. To address the second problem, we use a resampling-based rank test to examine whether the serum total cholesterol level affects CVD deaths. For the third issue, we use some extra-Framingham Heart Study (FHS) data with an A/B test to look for a meta-relationship between diets, risk factors, and CVD risks. We show that, firstly, the link between low saturated fat diets and reduction in serum total cholesterol is strong. Secondly, reducing serum total cholesterol does not robustly have an impact on CVD hazards in the diet group. Lastly, the A/B test result suggests a more complicated relationship regarding abnormal diastolic blood pressure ranges caused by diets and how these might affect the associative link between the cholesterol level and heart disease risks. This study not only helps us to deeply analyze the MCE data but also, in combination with the long-term FHS data, reveals possible complex relationships behind diets, risk factors, and heart disease.
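
    The nonparametric permutation test mentioned in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `permutation_test`, the difference-in-means statistic, the add-one correction, and the numbers in the usage example are all assumptions chosen for illustration, and none of the values come from the MCE or FHS data.

```python
import random


def permutation_test(group_a, group_b, n_permutations=10000, seed=0):
    """Two-sided permutation test on the difference in group means.

    Hypothetical helper for illustration only; the statistic and the
    add-one correction are assumptions, not taken from the article.
    """
    rng = random.Random(seed)
    n_a = len(group_a)
    n_b = len(group_b)
    observed = sum(group_a) / n_a - sum(group_b) / n_b

    pooled = list(group_a) + list(group_b)
    extreme = 0
    for _ in range(n_permutations):
        # Reassign group labels at random by shuffling the pooled values.
        rng.shuffle(pooled)
        diff = sum(pooled[:n_a]) / n_a - sum(pooled[n_a:]) / n_b
        # Small tolerance guards against floating-point ties.
        if abs(diff) >= abs(observed) - 1e-12:
            extreme += 1

    # Add-one correction keeps the estimated p-value strictly positive.
    p_value = (extreme + 1) / (n_permutations + 1)
    return observed, p_value


if __name__ == "__main__":
    # Invented cholesterol changes (mg/dL) for two hypothetical diet groups.
    diet = [-30.0, -25.0, -28.0, -35.0, -22.0, -31.0]
    control = [-5.0, 2.0, -8.0, 1.0, -3.0, -6.0]
    print(permutation_test(diet, control))
```

    Because the test only reshuffles group labels, it makes no distributional assumption about the measurements, which is the usual reason for preferring it to a t-test when the sample is small.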

  • EDITORIAL
    Michael Q. Zhang
    Quantitative Biology, 2023, 11(4): 359-362. https://doi.org/10.1002/qub2.5
  • REVIEW ARTICLE
    Chao Pang, Henry H. Y. Tong, Leyi Wei
    Quantitative Biology, 2023, 11(4): 395-404. https://doi.org/10.1002/qub2.23

    The prediction of molecular properties is a crucial task in the field of drug discovery. Computational methods that can accurately predict molecular properties can significantly accelerate the drug discovery process and reduce the cost of drug discovery. In recent years, iterative updates in computing hardware and the rise of deep learning have created a new and effective path for molecular property prediction. Deep learning methods can leverage the vast amount of data accumulated over the years in drug discovery and do not require complex feature engineering. In this review, we summarize molecular representations and commonly used datasets in molecular property prediction models and present advanced deep learning methods for molecular property prediction, including state-of-the-art deep learning networks such as graph neural networks and Transformer-based models, as well as state-of-the-art deep learning strategies such as 3D pre-train, contrastive learning, multi-task learning, transfer learning, and meta-learning. We also point out some critical issues such as lack of datasets, low information utilization, and lack of specificity for diseases.

  • REVIEW ARTICLE
    Lu Peng, Zecheng Zhang, Xianyi Wang, Weiyi Qiu, Liqian Zhou, Hui Xiao, Chunxiuzi Liu, Shaohua Tang, Zhiwei Qin, Jiakun Jiang, Zengru Di, Yu Liu
    Quantitative Biology, 2023, 11(4): 376-394. https://doi.org/10.1002/qub2.22

    Creating a man-made life in the laboratory is one of life science’s most intriguing yet challenging problems. Advances in synthetic biology and related theories, particularly those related to the origin of life, have laid the groundwork for further exploration and understanding in this field of artificial life or man-made life. But there remains a wealth of quantitative mathematical models and tools that have yet to be applied to this area. In this paper, we review the two main approaches often employed in the field of man-made life: the top-down approach that reduces the complexity of extant and existing living systems and the bottom-up approach that integrates well-defined components, by introducing the theoretical basis, recent advances, and their limitations. We then argue for another possible approach, namely “bottom-up from the origin of life”: Starting with the establishment of auto-catalytic chemical reaction networks that employ physical boundaries as the initial compartments, then designing directed evolutionary systems, with the expectation that independent compartments will eventually emerge so that the system becomes free-living. This approach is actually analogous to the process of how life originated. With this paper, we aim to stimulate the interest of synthetic biologists and experimentalists to consider a more theoretical perspective, and to promote the communication between the origin of life community and the synthetic man-made life community.

  • RESEARCH ARTICLE
    Ke Feng, Hongyang Jiang, Chaoyi Yin, Huiyan Sun
    Quantitative Biology, 2023, 11(4): 434-450. https://doi.org/10.1002/qub2.26

    Gene regulatory network (GRN) inference from gene expression data is a significant approach to understanding aspects of the biological system. Compared with generalized correlation-based methods, causality-inspired ones seem more rational to infer regulatory relationships. We propose GRINCD, a novel GRN inference framework empowered by graph representation learning and causal asymmetric learning, considering both linear and non-linear regulatory relationships. First, a high-quality representation of each gene is generated using a graph neural network. Then, we apply the additive noise model to predict the causal regulation of each regulator-target pair. Additionally, we design two channels and finally assemble them for robust prediction. Through comprehensive comparisons of our framework with state-of-the-art methods based on different principles on numerous datasets of diverse types and scales, the experimental results show that our framework achieves superior or comparable performance under various evaluation metrics. Our work provides a new clue for constructing GRNs, and our proposed framework GRINCD also shows potential in identifying key factors affecting cancer development.

  • REVIEW ARTICLE
    Feiran Li, Yu Chen, Johan Gustafsson, Hao Wang, Yi Wang, Chong Zhang, Xinhui Xing
    Quantitative Biology, 2023, 11(4): 363-375. https://doi.org/10.1002/qub2.21

    Over the last 15 years, genome-scale metabolic models (GEMs) have been reconstructed for humans and model animals, such as mouse and rat, to systematically understand metabolism, simulate multicellular or multi-tissue interplay, understand human diseases, and guide cell factory design for biopharmaceutical protein production. Here, we describe how metabolic networks can be represented using stoichiometric matrices and well-defined constraints for flux simulation. Then, we review the history of GEM development for quantitative understanding of Homo sapiens and other relevant animals, together with their applications. We describe how models have developed from H. sapiens to other animals and from generic purpose to precise context-specific simulation. The progress of GEMs for animals greatly expands our systematic understanding of metabolism in humans and related animals. We discuss the difficulties and present perspectives on GEM development and the quest to integrate more biological processes and omics data for future research and translation. We truly hope that this review can inspire new models developed for other mammalian organisms and generate new algorithms for integrating big data to conduct more in-depth analysis to further make progress on human health and biopharmaceutical engineering.
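
    As a reminder of what "stoichiometric matrices and well-defined constraints" usually refers to, the standard flux balance analysis problem can be written as below. This is the textbook formulation, not one quoted from the review; the symbols $S$ (stoichiometric matrix, metabolites by reactions), $v$ (flux vector), $c$ (objective weights), and $v^{\min}$, $v^{\max}$ (flux bounds) are notation assumed here.

```latex
\begin{aligned}
\max_{v} \quad & c^{\top} v \\
\text{subject to} \quad & S\,v = 0, \\
& v^{\min} \le v \le v^{\max}.
\end{aligned}
```

    The equality $S\,v = 0$ is the steady-state mass balance for every metabolite, and the bounds encode reaction reversibility and uptake limits.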

  • RESEARCH ARTICLE
    Hongfei Cui
    Quantitative Biology, 2023, 11(4): 451-470. https://doi.org/10.1002/qub2.25

    The information on host–microbe interactions contained in the operational taxonomic unit (OTU) abundance table can serve as a clue to understanding the biological traits of OTUs and samples. Some studies have inferred the taxonomies or functions of OTUs by constructing co-occurrence networks, but co-occurrence networks can only encompass a small fraction of all OTUs due to the high sparsity of the OTU table. There is a lack of studies that intensively explore and use the information on sample-OTU interactions. This study constructed a sample-OTU heterogeneous information network and represented the nodes in the network through the heterogeneous graph embedding method to form the OTU space and sample space. Taking advantage of the represented OTU and sample vectors combined with the original OTU abundance information, an Integrated Model of Embedded Taxonomies and Abundance (IMETA) was proposed for predicting sample attributes, such as phenotypes and individual diet habits. Both the OTU space and sample space contain reasonable biological or medical semantic information, and the IMETA using embedded OTU and sample vectors can have stable and good performance in the sample classification tasks. This suggests that the embedding representation based on the sample-OTU heterogeneous information network can provide more useful information for understanding microbiome samples. This study produced quantified representations of the biological characteristics within the OTUs and samples, which is a good attempt to increase the utilization rate of information in the OTU abundance table, and it promotes a deeper understanding of the underlying knowledge of the human microbiome.

  • COMMENTARY
    Jianfeng Feng
    Quantitative Biology, 2023, 11(4): 471-473. https://doi.org/10.1002/qub2.6
  • RESEARCH ARTICLE
    Dali Wang, Jiaxuan Li, Lei Wang, Yipeng Cao, Bo Kang, Xiangfei Meng, Sai Li, Chen Song
    Quantitative Biology, 2023, 11(4): 421-433. https://doi.org/10.1002/qub2.20

    The causative pathogen of coronavirus disease 2019 (COVID-19), severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), is an enveloped virus assembled by a lipid envelope and multiple structural proteins. In this study, by integrating experimental data, structural modeling, as well as coarse-grained and all-atom molecular dynamics simulations, we constructed multiscale models of SARS-CoV-2. Our 500-ns coarse-grained simulation of the intact virion allowed us to investigate the dynamic behavior of the membrane-embedded proteins and the surrounding lipid molecules in situ. Our results indicated that the membrane-embedded proteins are highly dynamic, and certain types of lipids exhibit various binding preferences to specific sites of the membrane-embedded proteins. The equilibrated virion model was transformed into atomic resolution, which provided a 3D structure for scientific demonstration and can serve as a framework for future exascale all-atom molecular dynamics (MD) simulations. A short all-atom molecular dynamics simulation of 255 ps was conducted as a preliminary test for large-scale simulations of this complex system.

  • REVIEW ARTICLE
    Junqi Zhang, Zixuan You, Dingyuan Liu, Rui Tang, Chao Zhao, Yingxiu Cao, Feng Li, Hao Song
    Quantitative Biology, 2023, 11(4): 405-420. https://doi.org/10.1002/qub2.24

    Electroactive microorganisms (EAMs) can utilize extracellular electron transfer (EET) pathways to exchange electrons and energy with their external surroundings. Conductive cytochrome proteins and nanowires play crucial roles in controlling the electron transfer rate from the cytosol to the extracellular electrode. Many previous studies elucidated how the c-type cytochrome proteins and conductive nanowires are synthesized, assembled, and engineered to manipulate the EET rate, and quantified the kinetic processes of electron generation and EET. Here, we first give an overview of the electron transfer pathways of EAMs and quantify the kinetic parameters that dictate intracellular electron production and EET. Secondly, we systematically review the structure, conductivity mechanisms, and engineering strategies to manipulate conductive cytochromes and nanowires in EAMs. Lastly, we outline potential directions for future research in cytochromes and conductive nanowires for enhanced electron transfer. This article reviews the quantitative kinetics of intracellular electron production and EET, and the contribution of engineered c-type cytochromes and conductive nanowires in enhancing the EET rate, which lays the foundation for enhancing the electron transfer capacity of EAMs.