Latest Articles

  • COMMENTARY
    Minsheng Hao, Lei Wei, Fan Yang, Jianhua Yao, Christina V. Theodoris, Bo Wang, Xin Li, Ge Yang, Xuegong Zhang
    Quantitative Biology, 2024, 12(4): 433-443. https://doi.org/10.1002/qub2.65
  • RESEARCH ARTICLE
    Sarawoot Somin, Don Kulasiri, Sandhya Samarasinghe
    Quantitative Biology, 2024, 12(4): 414-432. https://doi.org/10.1002/qub2.61

    The insulin‐degrading enzyme (IDE) plays a significant role in the degradation of amyloid beta (Aβ), a peptide found in the brain regions of patients with early Alzheimer’s disease. Adenosine triphosphate (ATP) allosterically regulates the Aβ‐degrading activity of IDE. The present study investigates the electrostatic interactions between ATP and IDE at the allosteric site of IDE, together with the thermostabilities and flexibilities of IDE residues, which have not yet been explored systematically. The study applies quantum mechanics/molecular mechanics (QM/MM) calculations within a proposed computational model to explore these electrostatic interactions. Molecular dynamics (MD) simulations are performed at different temperatures to identify flexible and thermostable residues of IDE. The proposed computational model predicts QM/MM energy‐minimised structures that identify the IDE residues Lys530 and Asp385 as having high binding affinities. Root mean square fluctuation values from the MD simulations at 300.00 K and at heat‐shock temperatures (321.15 K and 315.15 K) indicate that Lys530 and Asp385 are also thermostable residues of IDE, whereas Ser576 and Lys858 have high flexibilities with compromised thermostabilities. The present study sheds light on biological recognition and interactions at the ATP‐binding domain, which may have important implications for pharmacological drug design. The proposed computational model may facilitate the development of allosteric IDE activators/inhibitors that mimic ATP interactions.
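
    The root mean square fluctuation (RMSF) mentioned in this abstract is a standard per‐atom flexibility measure: the square root of the time‐averaged squared deviation of an atom from its mean position. The sketch below is only an illustration of that definition, not the authors' pipeline. The function name `rmsf` and the toy coordinates are hypothetical, and the frames are assumed to be already superimposed on a common reference, which a real MD analysis would have to do first.

```python
import math


def rmsf(trajectory):
    """Per-atom RMSF from a list of frames.

    Each frame is a list of (x, y, z) tuples, one per atom.
    Assumes the frames are already aligned to a common reference.
    """
    n_frames = len(trajectory)
    n_atoms = len(trajectory[0])
    result = []
    for i in range(n_atoms):
        # Time-averaged position of atom i.
        mean = [sum(frame[i][k] for frame in trajectory) / n_frames
                for k in range(3)]
        # Mean squared deviation from that average position.
        msd = sum(
            sum((frame[i][k] - mean[k]) ** 2 for k in range(3))
            for frame in trajectory
        ) / n_frames
        result.append(math.sqrt(msd))
    return result


# Toy trajectory (hypothetical numbers): atom 0 is rigid,
# atom 1 moves back and forth along x.
toy = [
    [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)],
    [(0.0, 0.0, 0.0), (3.0, 0.0, 0.0)],
]
values = rmsf(toy)
```

    In this toy case the rigid atom has an RMSF of zero and the moving atom a larger one, which is the sense in which low‐RMSF residues are read as stable and high‐RMSF residues as flexible.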

  • RESEARCH ARTICLE
    Gabor Kiss, Salissou Moutari, Cara Mctaggart, Lynsey Patterson, Frank Kee, Felicity Lamrock
    Quantitative Biology, 2024, 12(4): 400-413. https://doi.org/10.1002/qub2.50

    This study introduces a deterministic formulation for modelling the asymptotic spread of a vaccine‐preventable disease as well as the different stages of disease progression. We derive the formula for the associated basic reproduction number. To illustrate the proposed model, we use data from the 2017–2018 diphtheria outbreak in Yemen and fit the parameters of the model. A sensitivity analysis of the basic reproduction number with respect to the model parameters shows that this number increases with the transmission rate and decreases as the vaccination rate increases.
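
    A sensitivity analysis of this kind is often reported as a normalized forward sensitivity index, (∂R/∂p)·(p/R), for each parameter p. The sketch below illustrates that index on a deliberately simple toy reproduction number. It is an assumption made for illustration and is not the formula derived in the paper: `r0_toy`, its parameters (`beta`, `gamma`, `vacc`, `efficacy`) and the numeric values are all hypothetical, and `vacc` is treated as a vaccinated fraction.

```python
def r0_toy(beta, gamma, vacc, efficacy=1.0):
    """Toy reproduction number (an assumption, not the paper's formula):
    transmission over recovery, scaled by the unprotected fraction."""
    return beta * (1.0 - efficacy * vacc) / gamma


def sensitivity_index(func, params, name, rel_step=1e-6):
    """Normalized forward sensitivity index (dR/dp) * (p / R),
    estimated with a central finite difference."""
    p = params[name]
    h = abs(p) * rel_step
    up = dict(params, **{name: p + h})
    down = dict(params, **{name: p - h})
    derivative = (func(**up) - func(**down)) / (2 * h)
    return derivative * p / func(**params)


# Hypothetical parameter values, chosen only to exercise the code.
params = {"beta": 0.5, "gamma": 0.2, "vacc": 0.4}
idx_beta = sensitivity_index(r0_toy, params, "beta")
idx_vacc = sensitivity_index(r0_toy, params, "vacc")
```

    A positive index (as for `beta` here) means the reproduction number rises with the parameter, and a negative one (as for `vacc`) means it falls, which matches the qualitative pattern the abstract reports.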

  • RESEARCH ARTICLE
    Yunjie Shi, Yun Cheng, Peiyu Chen, Lexiang Zhang, Fangfu Ye
    Quantitative Biology, 2024, 12(4): 389-399. https://doi.org/10.1002/qub2.58

    Breast cancer constitutes a significant global health burden, while conventional diagnostic approaches may lack precision and can be uncomfortable for patients. Exosomes have emerged as promising biomarkers for breast cancer due to their participation in diverse pathological processes, and a convenient analysis platform is expected to greatly promote their application. In this study, we propose a novel digital PCR approach utilizing near‐infrared (NIR) photo‐responsive thermosensitive microcarriers integrated with black phosphorus for quantifying microRNA (miRNA) biomarkers within exosomes. Petal‐like biomimetic nanomaterials were first assembled for non‐specific exosome capture based on the affinity between avidin and biotin. Photothermal‐responsive microcarriers, fabricated using gelatin‐based substrates blended with a photothermal nanocomposite, exhibited NIR‐induced heating and reversible phase transition properties. We optimized the synthesis parameters affecting the thermal response and established a programmable and controllable NIR light source module. The results indicated a significant elevation in the levels of the biomarkers miRNA‐1246 and miRNA‐122, with fold increases ranging from 6.2 to 23.6 and 5.9 to 13.0, respectively, in the breast cancer cell lines MCF‐7 and MDA‐MB‐231 compared to healthy control HUVEC cells. This study offers broad prospects for utilizing exosomes to resolve predictive biomarkers.

  • RESEARCH ARTICLE
    Junjie Tang, Changhu Wang, Feiyi Xiao, Ruibin Xi
    Quantitative Biology, 2024, 12(4): 375-388. https://doi.org/10.1002/qub2.64

    A gene regulatory network (GRN) is the complex network formed by regulatory interactions between genes in living cells. In this paper, we consider inferring GRNs in single cells based on single‐cell RNA sequencing (scRNA‐seq) data. In scRNA‐seq, single cells are often profiled from mixed populations, and their cell identities are unknown. A common practice for single‐cell GRN analysis is to first cluster the cells and then infer GRNs for every cluster separately. However, this two‐step procedure ignores uncertainty in the clustering step and thus could lead to inaccurate estimation of the networks. Here, we consider the mixture Poisson log‐normal model (MPLN) for network inference of count data from mixed populations. The precision matrices of the MPLN are the GRNs of different cell types. To avoid the intractable optimization of the MPLN’s log‐likelihood, we develop an algorithm called variational mixture Poisson log‐normal (VMPLN) to jointly estimate the GRNs of different cell types based on the variational inference method. We compare VMPLN with state‐of‐the‐art single‐cell regulatory network inference methods. Comprehensive simulations show that VMPLN achieves better performance, especially in scenarios where different cell types have a high mixing degree. Benchmarking on real scRNA‐seq data also demonstrates that VMPLN can provide more accurate network estimation in most cases. Finally, we apply VMPLN to a large scRNA‐seq dataset from patients infected with severe acute respiratory syndrome coronavirus 2 (SARS‐CoV‐2) and find that VMPLN identifies critical differences in regulatory networks in immune cells between patients with moderate and severe symptoms. The source code is available on GitHub (github.com/XiDsLab/SCVMPLN).

  • RESEARCH ARTICLE
    Muhammad Azam, Yibo Chen, Micheal Olaolu Arowolo, Haowang Liu, Mihail Popescu, Dong Xu
    Quantitative Biology, 2024, 12(4): 360-374. https://doi.org/10.1002/qub2.57

    Understanding complex biological pathways, including gene–gene interactions and gene regulatory networks, is critical for exploring disease mechanisms and drug development. Manual literature curation of biological pathways cannot keep up with the exponential growth of new discoveries in the literature. Large‐scale language models (LLMs) trained on extensive text corpora contain rich biological information, and they can be mined as a biological knowledge graph. This study assesses 21 LLMs, including both application programming interface (API)‐based models and open‐source models, in their capacity to retrieve biological knowledge. The evaluation focuses on predicting gene regulatory relations (activation, inhibition, and phosphorylation) and the Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway components. Results indicated a significant disparity in model performance. The API‐based models GPT‐4 and Claude‐Pro showed superior performance, with F1 scores of 0.4448 and 0.4386 for the gene regulatory relation prediction, and Jaccard similarity indices of 0.2778 and 0.2657 for the KEGG pathway prediction, respectively. Open‐source models lagged behind their API‐based counterparts; among them, Falcon‐180b and llama2‐7b had the highest F1 scores in gene regulatory relations, 0.2787 and 0.1923, respectively. For KEGG pathway recognition, the Jaccard similarity index was 0.2237 for Falcon‐180b and 0.2207 for llama2‐7b. Our study suggests that LLMs are informative in gene network analysis and pathway mapping, but their effectiveness varies, necessitating careful model selection. This work also provides a case study and insight into using LLMs as knowledge graphs. Our code is publicly available on GitHub (Muh‐aza).
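
    The two metrics quoted in this abstract have standard set‐based definitions, sketched below for readers who want to interpret the reported values. This is not the authors' evaluation code, and how they matched or aggregated predictions is not stated here. The function names, the placeholder gene labels (`GENE_A` and so on) and the toy sets are hypothetical.

```python
def f1_score(predicted, truth):
    """F1 over two sets of items: harmonic mean of precision and recall."""
    true_positives = len(predicted & truth)
    if true_positives == 0:
        return 0.0
    precision = true_positives / len(predicted)
    recall = true_positives / len(truth)
    return 2 * precision * recall / (precision + recall)


def jaccard(a, b):
    """Jaccard similarity: size of the intersection over size of the union."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0


# Hypothetical regulatory relations as (source, target, relation) triples.
truth_relations = {
    ("GENE_A", "GENE_B", "activation"),
    ("GENE_B", "GENE_C", "inhibition"),
    ("GENE_C", "GENE_D", "phosphorylation"),
    ("GENE_A", "GENE_D", "activation"),
}
predicted_relations = {
    ("GENE_A", "GENE_B", "activation"),
    ("GENE_B", "GENE_C", "inhibition"),
    ("GENE_A", "GENE_C", "activation"),
}
relation_f1 = f1_score(predicted_relations, truth_relations)

# Hypothetical pathway membership sets.
truth_pathway = {"GENE_B", "GENE_C", "GENE_D", "GENE_E"}
predicted_pathway = {"GENE_A", "GENE_B", "GENE_C"}
pathway_jaccard = jaccard(predicted_pathway, truth_pathway)
```

    Both scores lie between 0 and 1, so an F1 near 0.44 or a Jaccard index near 0.28 indicates that a substantial share of predicted relations or pathway members still disagree with the reference.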

  • REVIEW ARTICLE
    Jinge Wang, Zien Cheng, Qiuming Yao, Li Liu, Dong Xu, Gangqing Hu
    Quantitative Biology, 2024, 12(4): 345-359. https://doi.org/10.1002/qub2.67

    The year 2023 marked a significant surge in the exploration of applying large language model chatbots, notably Chat Generative Pre‐trained Transformer (ChatGPT), across various disciplines. We surveyed the application of ChatGPT in bioinformatics and biomedical informatics throughout the year, covering omics, genetics, biomedical text mining, drug discovery, biomedical image understanding, bioinformatics programming, and bioinformatics education. Our survey delineates the current strengths and limitations of this chatbot in bioinformatics and offers insights into potential avenues for future developments.

  • PERSPECTIVE
    Ziyu Chen, Lin Wei, Ge Gao
    Quantitative Biology, 2024, 12(4): 339-344. https://doi.org/10.1002/qub2.69

    Transformer‐based foundation models such as ChatGPT have revolutionized our daily life and affected many fields, including bioinformatics. In this perspective, we first discuss the direct application of textual foundation models to bioinformatics tasks, focusing on how to make the most of canonical large language models and mitigate their inherent flaws. We then go through the transformer‐based, bioinformatics‐tailored foundation models for both sequence and non‐sequence data. In particular, we envision further development directions as well as challenges for bioinformatics foundation models.

  • PERSPECTIVE
    Christina V. Theodoris
    Quantitative Biology, 2024, 12(4): 335-338. https://doi.org/10.1002/qub2.68

    Transfer learning has revolutionized fields including natural language understanding and computer vision by leveraging large‐scale general datasets to pretrain models with foundational knowledge that can then be transferred to improve predictions in a vast range of downstream tasks. More recently, there has been a growth in the adoption of transfer learning approaches in biological fields, where models have been pretrained on massive amounts of biological data and employed to make predictions in a broad range of biological applications. However, unlike in natural language where humans are best suited to evaluate models given a clear understanding of the ground truth, biology presents the unique challenge of being in a setting where there are a plethora of unknowns while at the same time needing to abide by real‐world physical constraints. This perspective provides a discussion of some key points we should consider as a field in designing benchmarks for foundation models in network biology.

  • TECHNICAL NOTE
    Jianfeng Cao, Lihan Hu, Guoye Guan, Zelin Li, Zhongying Zhao, Chao Tang, Hong Yan
    Quantitative Biology, 2024, 12(3): 329-334. https://doi.org/10.1002/qub2.47

    Caenorhabditis elegans has been widely used as a model organism in developmental biology due to its invariant development. In this study, we developed a desktop software tool, CShaperApp, to segment fluorescence‐labeled images of cell membranes and analyze cellular morphologies interactively during C. elegans embryogenesis. Based on the previously proposed framework CShaper, CShaperApp empowers biologists to automatically and efficiently extract quantitative cellular morphological data with either an existing deep learning model or a fine‐tuned one adapted to their in‐house dataset. Experimental results show that it takes about 30 min to process a three‐dimensional time‐lapse (4D) dataset, which consists of 150 image stacks at a ~1.5‐min interval and covers C. elegans embryogenesis from the 4‐cell to 350‐cell stages. The robustness of CShaperApp is also validated with datasets from different laboratories. Furthermore, the modularized implementation increases flexibility in multi‐task applications and facilitates future enhancements. As cell morphology over development has emerged as a focus of interest in developmental biology, CShaperApp is anticipated to pave the way for those studies by accelerating the high‐throughput generation of systems‐level quantitative data. The software can be freely downloaded from GitHub (cao13jf/CShaperApp) and is executable on Windows, macOS, and Linux operating systems.

  • COMMUNICATION
    Wenlong Zhao, Cecylia S. Lupala, Shifeng Hou, Shuxin Yang, Ziqi Yan, Shujie Liao, Xuefei Li, Nan Li
    Quantitative Biology, 2024, 12(3): 324-328. https://doi.org/10.1002/qub2.60
  • METHOD
    Yang Li, Xiaonan Ren, Haochen Yu, Tao Sun, Shuangge Ma
    Quantitative Biology, 2024, 12(3): 313-323. https://doi.org/10.1002/qub2.51

    Deep learning has become increasingly popular in omics data analysis. Recent works incorporating variable selection into deep learning have greatly enhanced model interpretability. However, because deep learning requires a large sample size, the existing methods may yield uncertain findings when the dataset is small, as is common in omics data analysis. With the explosion and availability of omics data from multiple populations/studies, the existing methods naively pool them into one dataset to increase the sample size while ignoring that variable structures can differ across datasets, which might lead to inaccurate variable selection results. We propose a penalized integrative deep neural network (PIN) to simultaneously select important variables from multiple datasets. PIN directly aggregates multiple datasets as input and considers both homogeneity and heterogeneity among the datasets in an integrative analysis framework. Results from extensive simulation studies and applications of PIN to gene expression datasets from elders with different cognitive statuses or ovarian cancer patients at different stages demonstrate that PIN considerably outperforms existing methods across multiple datasets. The source code is freely available on GitHub (rucliyang/PINFunc). We speculate that the proposed PIN method will promote the identification of disease‐related important variables based on multiple studies/datasets from diverse origins.

  • RESEARCH ARTICLE
    Razieh Rezaei Adriani, Seyed Latif Mousavi Gargari, Hamid Bakherad, Jafar Amani
    Quantitative Biology, 2024, 12(3): 301-312. https://doi.org/10.1002/qub2.63

    Monoclonal antibodies, which bind specifically to their targets through their complementarity‐determining regions (CDRs), are attractive therapeutic agents for a wide range of human disorders. Small proteins with structurally preserved CDRs are promising antibody mimetics. In this in silico study, we present new antibody mimetics against the cancer marker epidermal growth factor receptor (EGFR) created by the CDR grafting technique. Ten potential graft acceptor sites that efficiently immobilize the grafted CDR loops were selected computationally from three small protein scaffolds. The three CDR loops most involved in antibody‐receptor interactions, extracted from the crystal structure of the panitumumab antibody against EGFR domain III, were then grafted to the selected scaffolds through the loop randomization technique. Binding energy calculations on the combinations of three CDR loops and 10 grafting sites revealed that three of the 36 combinations showed specific binding to EGFR DIII. Thus, the present strategy and selected small protein scaffolds are promising tools in the design of new binders against EGFR with high binding energy.

  • RESEARCH ARTICLE
    Ivan Krasnyakov
    Quantitative Biology, 2024, 12(3): 286-300. https://doi.org/10.1002/qub2.62

    Epithelial cell networks imply a packing geometry characterized by various cell shapes and distributions in terms of number of cell neighbors and areas. Despite such simple characteristics describing cell sheets, the formation of bubble‐like cells during the morphogenesis of epithelial tissues remains poorly understood. This study proposes a topological mathematical model of morphogenesis in a squamous epithelium. We introduce a new potential that takes into account not only the elasticity of cell perimeter and area but also the elasticity of their internal angles. Additionally, we incorporate an integral equation for chemical signaling, allowing us to consider chemo‐mechanical cell interactions. The model also takes into account essential processes in real epithelia, such as cell proliferation and intercalation. The presented mathematical model has yielded novel insights into the packing of epithelial sheets. We find two main states: one consists of cells of the same size, and the other consists of “bubble” cells. An example shows how chemo‐mechanical interactions can be accounted for in a multicellular environment. The introduction of a parameter determining the flexibility of cell shapes enables the modeling of more complex cell behaviors, such as changes in cell phenotype. The developed mathematical model of morphogenesis of squamous epithelium allows progress in understanding the processes of formation of cell networks. The results obtained from mathematical modeling are of significant importance for understanding the mechanisms of morphogenesis and development of epithelial tissues. Additionally, the obtained results can be applied in developing methods to influence morphogenetic processes in medical applications.

  • RESEARCH ARTICLE
    Madhuri Gollapalli, Harsh Anand, Satish Mahadevan Srinivasan
    Quantitative Biology, 2024, 12(3): 271-285. https://doi.org/10.1002/qub2.46

    Predictive analytics is crucial in precision medicine for personalized patient care. To aid in precision medicine, this study identifies a subset of genetic and clinical variables that can serve as predictors for classifying diseased tissues/disease types. To achieve this, experiments were performed on diseased tissues obtained from the L1000 dataset to assess differences in the functionality and predictive capabilities of genetic and clinical variables. The k‐means technique was used for clustering the diseased tissue types, and the multinomial logistic regression (MLR) technique was applied for classifying them. Dimensionality reduction techniques, including principal component analysis and Boruta, were used extensively to reduce the dimensionality of genetic and clinical variables. The results showed that landmark genes performed slightly better in clustering diseased tissue types than any random set of 978 non‐landmark genes, and the difference was statistically significant. Furthermore, both clinical and genetic variables were important in predicting the diseased tissue types. The top three clinical predictors of diseased tissue types were morphology, gender, and age at diagnosis. Additionally, this study explored the possibility of using the latent representations of the clusters of landmark and non‐landmark genes as predictors for an MLR classifier. The classification models built using MLR revealed that landmark genes can serve as a subset of genetic variables and/or as a proxy for clinical variables. This study concludes that combining predictive analytics with dimensionality reduction effectively identifies key predictors in precision medicine, enhancing diagnostic accuracy.