Most accessed

  • PERSPECTIVE
    Foundation models for bioinformatics
    Ziyu Chen, Lin Wei, Ge Gao
    Quantitative Biology, 2024, 12(4): 339-344. https://doi.org/10.1002/qub2.69

    Transformer‐based foundation models such as ChatGPT have revolutionized our daily lives and affected many fields, including bioinformatics. In this perspective, we first discuss the direct application of textual foundation models to bioinformatics tasks, focusing on how to make the most of canonical large language models and mitigate their inherent flaws. We also go through the transformer‐based, bioinformatics‐tailored foundation models for both sequence and non‐sequence data. In particular, we envision further development directions as well as challenges for bioinformatics foundation models.

  • RESEARCH ARTICLE
    Functional predictability of universal gene circuits in diverse microbial hosts
    Chenrui Qin, Tong Xu, Xuejin Zhao, Yeqing Zong, Haoqian M. Zhang, Chunbo Lou, Qi Ouyang, Long Qian
    Quantitative Biology, 2024, 12(2): 129-140. https://doi.org/10.1002/qub2.41

    Although the principles of synthetic biology were initially established in model bacteria, microbial producers, extremophiles and gut microbes have now emerged as valuable prokaryotic chassis for biological engineering. Extending the host range in which designed circuits can function reliably and predictably presents a major challenge for the concept of synthetic biology to materialize. In this work, we systematically characterized the cross-species universality of two transcriptional regulatory modules—the T7 RNA polymerase activator module and the repressors module—in three non-model microbes. We found striking linear relationships in circuit activities among different organisms for both modules. Parametrized model fitting revealed host non-specific parameters defining the universality of both modules. Lastly, a genetic NOT gate and a band-pass filter circuit were constructed from these modules and tested in non-model organisms. Combined models employing host non-specific parameters were successful in quantitatively predicting circuit behaviors, underscoring the potential of universal biological parts and predictive modeling in synthetic bioengineering.

  • COMMENTARY
    Current opinions on large cellular models
    Minsheng Hao, Lei Wei, Fan Yang, Jianhua Yao, Christina V. Theodoris, Bo Wang, Xin Li, Ge Yang, Xuegong Zhang
    Quantitative Biology, 2024, 12(4): 433-443. https://doi.org/10.1002/qub2.65
  • PERSPECTIVE
    Perspectives on benchmarking foundation models for network biology
    Christina V. Theodoris
    Quantitative Biology, 2024, 12(4): 335-338. https://doi.org/10.1002/qub2.68

    Transfer learning has revolutionized fields including natural language understanding and computer vision by leveraging large‐scale general datasets to pretrain models with foundational knowledge that can then be transferred to improve predictions in a vast range of downstream tasks. More recently, there has been a growth in the adoption of transfer learning approaches in biological fields, where models have been pretrained on massive amounts of biological data and employed to make predictions in a broad range of biological applications. However, unlike in natural language where humans are best suited to evaluate models given a clear understanding of the ground truth, biology presents the unique challenge of being in a setting where there are a plethora of unknowns while at the same time needing to abide by real‐world physical constraints. This perspective provides a discussion of some key points we should consider as a field in designing benchmarks for foundation models in network biology.

  • RESEARCH ARTICLE
    DDI-Transform: A neural network for predicting drug-drug interaction events
    Jiaming Su, Ying Qian
    Quantitative Biology, 2024, 12(2): 155-163. https://doi.org/10.1002/qub2.44

    Drug-drug interaction (DDI) event prediction is a challenging problem, and accurate prediction of DDI events is critical to patient health and new drug development. Recently, many machine learning-based techniques have been proposed for predicting DDI events. However, most of the existing methods do not effectively integrate the multidimensional features of drugs and provide poor mitigation of noise to get effective feature information. To address these limitations, we propose a DDI-Transform neural network framework for DDI event prediction. In DDI-Transform, we design a drug structure information feature extraction module and a drug bind-protein feature extraction module to obtain multidimensional feature information. A stack of DDI-Transform layers in the DDI-Transform network module are then used for adaptive learning, thus adaptively selecting the effective feature information for prediction. The results show that DDI-Transform can accurately predict DDI events and outperform the state-of-the-art models. Results on different scale datasets confirm the robustness of the method.

  • RESEARCH ARTICLE
    GCARDTI: Drug–target interaction prediction based on a hybrid mechanism in drug SELFIES
    Yinfei Feng, Yuanyuan Zhang, Zengqian Deng, Mimi Xiong
    Quantitative Biology, 2024, 12(2): 141-154. https://doi.org/10.1002/qub2.39

    The prediction of the interaction between a drug and a target is the most critical issue in the fields of drug development and repurposing. However, there are still two challenges in current deep learning research: (i) the structural information of drug molecules is not fully explored in most drug target studies, and the previous drug SMILES does not correspond well to effective drug molecules and (ii) exploration of the potential relationship between drugs and targets is in need of improvement. In this work, we use a new and better representation of the effective molecular graph structure, SELFIES. We propose a hybrid mechanism framework based on convolutional neural network and graph attention network to capture multi-view feature information of drug and target molecular structures, and we aim to enhance the ability to capture interaction sites between a drug and a target. In this study, our experiments using two different datasets show that the GCARDTI model outperforms a variety of different model algorithms on different metrics. We also demonstrate the accuracy of our model through two case studies.

  • RESEARCH ARTICLE
    Measuring drug similarity using drug–drug interactions
    Ji Lv, Guixia Liu, Yuan Ju, Houhou Huang, Ying Sun
    Quantitative Biology, 2024, 12(2): 164-172. https://doi.org/10.1002/qub2.38

    Combination therapy is a promising approach to address the challenge of antimicrobial resistance, and computational models have been proposed for predicting drug–drug interactions. Most existing models rely on drug similarity measures based on characteristics such as chemical structure and the mechanism of action. In this study, we focus on the network structure itself and propose a drug similarity measure based on drug–drug interaction networks. We explore the potential applications of this measure by combining it with unsupervised learning and semi-supervised learning approaches. In unsupervised learning, drugs can be grouped based on their interactions, leading to almost monochromatic group–group interactions. In addition, drugs within the same group tend to have similar mechanisms of action (MoA). In semi-supervised learning, the similarity measure can be utilized to construct affinity matrices, enabling the prediction of unknown drug–drug interactions. Our method exceeds existing approaches in terms of performance. Overall, our experiments demonstrate the effectiveness and practicability of the proposed similarity measure. On the one hand, when combined with clustering algorithms, it can be used for functional annotation of compounds with unknown MoA. On the other hand, when combined with semi-supervised graph learning, it enables the prediction of unknown drug–drug interactions.

  • RESEARCH ARTICLE
    A feature extraction framework for discovering pan-cancer driver genes based on multi-omics data
    Xiaomeng Xue, Feng Li, Junliang Shang, Lingyun Dai, Daohui Ge, Qianqian Ren
    Quantitative Biology, 2024, 12(2): 173-181. https://doi.org/10.1002/qub2.40

    The identification of tumor driver genes facilitates accurate cancer diagnosis and treatment, playing a key role in precision oncology, along with gene signaling, regulation, and their interaction with protein complexes. To tackle the challenge of distinguishing driver genes from a large volume of genomic data, we construct a feature extraction framework for discovering pan-cancer driver genes based on multi-omics data (mutations, gene expression, copy number variants, and DNA methylation) combined with protein–protein interaction (PPI) networks. Using a network propagation algorithm, we mine functional information among nodes in the PPI network, focusing on genes with weak node information to represent specific cancer information. From these functional features, we extract distribution features of pan-cancer data, pan-cancer TOPSIS features of functional features using the ideal solution method, and SetExpan features of pan-cancer data from the gene functional features, a method to rank pan-cancer data based on the average inverse rank. These features represent the common message of pan-cancer. Finally, we use the LightGBM classification algorithm for gene prediction. Experimental results show that our method outperforms existing methods in terms of the area under the precision-recall curve (AUPRC) and demonstrates better performance across different PPI networks. This indicates our framework’s effectiveness in predicting potential cancer genes, offering valuable insights for the diagnosis and treatment of tumors.

  • RESEARCH ARTICLE
    Plasma proteome profiling reveals biomarkers of chemotherapy resistance in patients with advanced colorectal cancer
    Jingxin Yang, Jin Chen, Luobin Zhang, Fangming Zhou, Xiaozhen Cui, Ruijun Tian, Ruilian Xu
    Quantitative Biology, 2024, 12(2): 215-224. https://doi.org/10.1002/qub2.34

    Colorectal cancer (CRC) is one of the most common cancers. Patients with advanced CRC can only rely on chemotherapy to improve outcomes. However, primary drug resistance frequently occurs and is difficult to predict. Changes in plasma protein composition have shown potential in clinical diagnosis. Thus, it is urgent to identify potential protein biomarkers for primary resistance to chemotherapy for patients with CRC. Automatic sample preparation and high-throughput analysis were used to explore potential plasma protein biomarkers. Drug susceptibility testing of circulating tumor cells (CTCs) was investigated, and the relationship between the test values and protein expression was examined. In addition, the differential proteins across chemotherapy outcomes were analyzed. Finally, the potential biomarkers were measured via enzyme-linked immunosorbent assay (ELISA). The plasma proteomes of 60 CRC patients were profiled. Correlation analysis between plasma protein levels and the results of drug susceptibility testing of CTCs was performed, and 85 proteins showed a significant positive or negative correlation with chemotherapy resistance. Forty-four CRC patients were then divided into three groups according to their chemotherapy outcomes (objective response, stable disease, and progressive disease), and 37 differential proteins were found to be related to chemotherapy resistance. The overlapping proteins were further investigated in an additional group of 79 patients using ELISA. Protein levels of F5 and PROZ significantly increased in the progressive disease group compared to other outcome groups. Our study indicated that F5 and PROZ proteins could represent potential biomarkers of resistance to chemotherapy in advanced CRC patients.

  • RESEARCH ARTICLE
    Hierarchical learning of gastric cancer molecular subtypes by integrating multi-modal DNA-level omics data and clinical stratification
    Binyu Yang, Siying Liu, Jiemin Xie, Xi Tang, Pan Guan, Yifan Zhu, Xuemei Liu, Yunhui Xiong, Zuli Yang, Weiyao Li, Yonghua Wang, Wen Chen, Qingjiao Li, Li C. Xia
    Quantitative Biology, 2024, 12(2): 182-196. https://doi.org/10.1002/qub2.45

    Molecular subtyping of gastric cancer (GC) aims to comprehend its genetic landscape. However, the efficacy of current subtyping methods is hampered by their mixed use of molecular features, a lack of strategy optimization, and the limited availability of public GC datasets. There is a pressing need for a precise and easily adoptable subtyping approach for early DNA-based screening and treatment. Based on TCGA subtypes, we developed a novel DNA-based hierarchical classifier for gastric cancer molecular subtyping (HCG), which employs gene mutations, copy number aberrations, and methylation patterns as predictors. By incorporating the closely related esophageal adenocarcinomas dataset, we expanded the TCGA GC dataset for the training and testing of HCG (n = 453). The optimization of HCG was achieved through three hierarchical strategies using Lasso-Logistic regression, evaluated by their overall area under the receiver operating characteristic curve (auROC), accuracy, F1 score, and area under the precision-recall curve (auPRC), and by their capability for clinical stratification using multivariate survival analysis. Subtype-specific DNA alteration biomarkers were discerned through difference tests based on HCG-defined subtypes. Our HCG classifier demonstrated superior performance in terms of overall auROC (0.95), accuracy (0.88), F1 score (0.87) and auPRC (0.86), significantly improving the clinical stratification of patients (overall p-value = 0.032). Difference tests identified 25 subtype-specific DNA alterations, including a high mutation rate in the SYNE1, ITGB4, and COL22A1 genes for the MSI subtype, and hypermethylation of ALS2CL, KIAA0406, and RPRD1B genes for the EBV subtype. HCG is an accurate and robust classifier for DNA-based GC molecular subtyping with highly predictive clinical stratification performance. The training and test datasets, along with the analysis programs of HCG, are accessible on the GitHub website (github.com/LabxSCUT).

  • COMMENTARY
    Toward predictable universal genetic circuit design
    Yuanli Gao, Baojun Wang
    Quantitative Biology, 2024, 12(2): 225-229. https://doi.org/10.1002/qub2.48
  • TECHNICAL NOTE
    CShaperApp: Segmenting and analyzing cellular morphologies of the developing Caenorhabditis elegans embryo
    Jianfeng Cao, Lihan Hu, Guoye Guan, Zelin Li, Zhongying Zhao, Chao Tang, Hong Yan
    Quantitative Biology, 2024, 12(3): 329-334. https://doi.org/10.1002/qub2.47

    Caenorhabditis elegans has been widely used as a model organism in developmental biology due to its invariant development. In this study, we developed a desktop software CShaperApp to segment fluorescence‐labeled images of cell membranes and analyze cellular morphologies interactively during C. elegans embryogenesis. Based on the previously proposed framework CShaper, CShaperApp empowers biologists to automatically and efficiently extract quantitative cellular morphological data with either an existing deep learning model or a fine‐tuned one adapted to their in‐house dataset. Experimental results show that it takes about 30 min to process a three‐dimensional time‐lapse (4D) dataset, which consists of 150 image stacks at a ~1.5‐min interval and covers C. elegans embryogenesis from the 4‐cell to 350‐cell stages. The robustness of CShaperApp is also validated with the datasets from different laboratories. Furthermore, modularized implementation increases the flexibility in multi‐task applications and promotes its flexibility for future enhancements. As cell morphology over development has emerged as a focus of interest in developmental biology, CShaperApp is anticipated to pave the way for those studies by accelerating the high‐throughput generation of systems‐level quantitative data collection. The software can be freely downloaded from the website of Github (cao13jf/CShaperApp) and is executable on Windows, macOS, and Linux operating systems.

  • RESEARCH ARTICLE
    A clinical trial termination prediction model based on denoising autoencoder and deep survival regression
    Huamei Qi, Wenhui Yang, Wenqin Zou, Yuxuan Hu
    Quantitative Biology, 2024, 12(2): 205-214. https://doi.org/10.1002/qub2.43

    Effective clinical trials are necessary for understanding medical advances but early termination of trials can result in unnecessary waste of resources. Survival models can be used to predict survival probabilities in such trials. However, survival data from clinical trials are sparse, and DeepSurv cannot accurately capture their effective features, making the models weak in generalization and decreasing their prediction accuracy. In this paper, we propose a survival prediction model for clinical trial completion based on the combination of denoising autoencoder (DAE) and DeepSurv models. The DAE is used to obtain a robust representation of features by breaking the loop of raw features after autoencoder training, and then the robust features are provided to DeepSurv as input for training. The clinical trial dataset for training the model was obtained from the ClinicalTrials.gov dataset. A study of clinical trial completion in pregnant women was conducted in response to the fact that many current clinical trials exclude pregnant women. The experimental results showed that the denoising autoencoder and deep survival regression (DAE-DSR) model was able to extract meaningful and robust features for survival analysis; the C-index of the training and test datasets were 0.74 and 0.75 respectively. Compared with the Cox proportional hazards model and DeepSurv model, the survival analysis curves obtained by using DAE-DSR model had more prominent features, and the model was more robust and performed better in actual prediction.
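
    The C-index quoted in the abstract above is a rank-based measure of how well predicted risks order the observed event times. As a minimal sketch, assuming Harrell's pairwise definition (the abstract does not state which variant the authors used), it can be computed as follows. The function name and the toy data are hypothetical and are not taken from the paper.

```python
def concordance_index(times, events, risk_scores):
    """Harrell's C-index (assumed variant): fraction of comparable pairs
    in which the subject with the earlier observed event has the higher
    predicted risk. Ties in risk count as half-concordant."""
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        for j in range(n):
            # A pair is comparable when subject i had an observed event
            # strictly before subject j's (event or censoring) time.
            if events[i] and times[i] < times[j]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Hypothetical toy data: follow-up times, event flags (1 = event observed,
# 0 = censored), and model risk scores.
times = [5, 8, 12, 20]
events = [1, 0, 1, 1]
risk_scores = [0.9, 0.4, 0.1, 0.6]
c_index = concordance_index(times, events, risk_scores)
print(c_index)
```

    A value of 0.5 corresponds to random ordering and 1.0 to perfect ordering, which is the scale on which the reported 0.74 and 0.75 should be read.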

  • RESEARCH ARTICLE
    Effectiveness of machine learning at modeling the relationship between Hi‐C data and copy number variation
    Yuyang Wang, Yu Sun, Zeyu Liu, Bijia Chen, Hebing Chen, Chao Ren, Xuanwei Lin, Pengzhen Hu, Peiheng Jia, Xiang Xu, Kang Xu, Ximeng Liu, Hao Li, Xiaochen Bo
    Quantitative Biology, 2024, 12(3): 231-244. https://doi.org/10.1002/qub2.52

    Copy number variation (CNV) refers to the number of copies of a specific sequence in a genome and is a type of chromatin structural variation. The development of the Hi‐C technique has empowered research on the spatial structure of chromatins by capturing interactions between DNA fragments. We utilized machine‐learning methods including the linear transformation model and graph convolutional network (GCN) to detect CNV events from Hi‐C data and reveal how CNV is related to three‐dimensional interactions between genomic fragments in terms of the one‐dimensional read count signal and features of the chromatin structure. The experimental results demonstrated a specific linear relation between the Hi‐C read count and CNV for each chromosome that can be well qualified by the linear transformation model. In addition, the GCN‐based model could accurately extract features of the spatial structure from Hi‐C data and infer the corresponding CNV across different chromosomes in a cancer cell line. We performed a series of experiments including dimension reduction, transfer learning, and Hi‐C data perturbation to comprehensively evaluate the utility and robustness of the GCN‐based model. This work can provide a benchmark for using machine learning to infer CNV from Hi‐C data and serves as a necessary foundation for deeper understanding of the relationship between Hi‐C data and CNV.

  • REVIEW ARTICLE
    Bioinformatics and biomedical informatics with ChatGPT: Year one review
    Jinge Wang, Zien Cheng, Qiuming Yao, Li Liu, Dong Xu, Gangqing Hu
    Quantitative Biology, 2024, 12(4): 345-359. https://doi.org/10.1002/qub2.67

    The year 2023 marked a significant surge in the exploration of applying large language model chatbots, notably Chat Generative Pre‐trained Transformer (ChatGPT), across various disciplines. We surveyed the application of ChatGPT in bioinformatics and biomedical informatics throughout the year, covering omics, genetics, biomedical text mining, drug discovery, biomedical image understanding, bioinformatics programming, and bioinformatics education. Our survey delineates the current strengths and limitations of this chatbot in bioinformatics and offers insights into potential avenues for future developments.

  • METHOD
    A penalized integrative deep neural network for variable selection among multiple omics datasets
    Yang Li, Xiaonan Ren, Haochen Yu, Tao Sun, Shuangge Ma
    Quantitative Biology, 2024, 12(3): 313-323. https://doi.org/10.1002/qub2.51

    Deep learning has been increasingly popular in omics data analysis. Recent works incorporating variable selection into deep learning have greatly enhanced the model’s interpretability. However, because deep learning desires a large sample size, the existing methods may result in uncertain findings when the dataset has a small sample size, commonly seen in omics data analysis. With the explosion and availability of omics data from multiple populations/studies, the existing methods naively pool them into one dataset to enhance the sample size while ignoring that variable structures can differ across datasets, which might lead to inaccurate variable selection results. We propose a penalized integrative deep neural network (PIN) to simultaneously select important variables from multiple datasets. PIN directly aggregates multiple datasets as input and considers both homogeneity and heterogeneity situations among multiple datasets in an integrative analysis framework. Results from extensive simulation studies and applications of PIN to gene expression datasets from elders with different cognitive statuses or ovarian cancer patients at different stages demonstrate that PIN outperforms existing methods with considerably improved performance among multiple datasets. The source code is freely available on Github (rucliyang/PINFunc). We speculate that the proposed PIN method will promote the identification of disease‐related important variables based on multiple studies/datasets from diverse origins.

  • RESEARCH ARTICLE
    Comprehensive cross cancer analyses reveal mutational signature cancer specificity
    Rui Xin, Limin Jiang, Hui Yu, Fengyao Yan, Jijun Tang, Yan Guo
    Quantitative Biology, 2024, 12(3): 245-254. https://doi.org/10.1002/qub2.49

    Mutational signatures are distinct patterns of DNA mutations that occur in a specific context or under certain conditions, and they are a powerful tool for describing cancer etiology. We conducted a study to show cancer heterogeneity and cancer specificity from the aspect of mutational signatures through collinearity analysis and machine learning techniques. Through thorough training and independent validation, our results show that while the majority of the mutational signatures are distinct, similarities between certain mutational signature pairs can be observed through both mutation patterns and mutational signature abundance. This observation can potentially help determine the etiology of as yet elusive mutational signatures. Further analysis using machine learning approaches demonstrated moderate mutational signature cancer specificity. Among all cancer types, skin cancer demonstrated the strongest mutational signature specificity.

  • RESEARCH ARTICLE
    A comprehensive evaluation of large language models in mining gene relations and pathway knowledge
    Muhammad Azam, Yibo Chen, Micheal Olaolu Arowolo, Haowang Liu, Mihail Popescu, Dong Xu
    Quantitative Biology, 2024, 12(4): 360-374. https://doi.org/10.1002/qub2.57

    Understanding complex biological pathways, including gene–gene interactions and gene regulatory networks, is critical for exploring disease mechanisms and drug development. Manual literature curation of biological pathways cannot keep up with the exponential growth of new discoveries in the literature. Large‐scale language models (LLMs) trained on extensive text corpora contain rich biological information, and they can be mined as a biological knowledge graph. This study assesses 21 LLMs, including both application programming interface (API)‐based models and open‐source models, in their capacity to retrieve biological knowledge. The evaluation focuses on predicting gene regulatory relations (activation, inhibition, and phosphorylation) and the Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway components. Results indicated a significant disparity in model performance. API‐based models GPT‐4 and Claude‐Pro showed superior performance, with F1 scores of 0.4448 and 0.4386 for the gene regulatory relation prediction, and Jaccard similarity indices of 0.2778 and 0.2657 for the KEGG pathway prediction, respectively. Open‐source models lagged behind their API‐based counterparts; among them, Falcon‐180b and llama2‐7b had the highest F1 scores in gene regulatory relations, 0.2787 and 0.1923, respectively. The KEGG pathway recognition had a Jaccard similarity index of 0.2237 for Falcon‐180b and 0.2207 for llama2‐7b. Our study suggests that LLMs are informative in gene network analysis and pathway mapping, but their effectiveness varies, necessitating careful model selection. This work also provides a case study and insight into using LLMs as knowledge graphs. Our code is publicly available at the website of GitHub (Muh‐aza).
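
    The Jaccard similarity index reported for the KEGG pathway task compares a predicted gene set with a reference gene set. A minimal sketch of the metric, assuming the standard set definition |A ∩ B| / |A ∪ B|, is given below. The function name and the gene lists are hypothetical illustrations, not data or code from the study.

```python
def jaccard_index(predicted, reference):
    """Jaccard similarity between two collections of gene symbols:
    size of the intersection divided by size of the union."""
    a, b = set(predicted), set(reference)
    if not a and not b:
        # Convention assumed here: two empty sets count as identical.
        return 1.0
    return len(a & b) / len(a | b)


# Hypothetical example: genes a model lists for a pathway versus
# the genes in a curated reference list.
predicted = ["TP53", "MDM2", "CDKN1A", "BAX"]
reference = ["TP53", "MDM2", "ATM", "CHEK2", "BAX"]
score = jaccard_index(predicted, reference)
print(score)
```

    Because the union grows with every wrong or missing gene, the index penalizes both over-prediction and under-prediction, which helps explain why scores in the 0.2 to 0.3 range can still reflect partly correct pathway membership.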

  • COMMUNICATION
    Assessing the inhibition efficacy of clinical drugs against the main proteases of SARS‐CoV‐2 variants and other coronaviruses
    Wenlong Zhao, Cecylia S. Lupala, Shifeng Hou, Shuxin Yang, Ziqi Yan, Shujie Liao, Xuefei Li, Nan Li
    Quantitative Biology, 2024, 12(3): 324-328. https://doi.org/10.1002/qub2.60
  • RESEARCH ARTICLE
    A substructure‐aware graph neural network incorporating relation features for drug–drug interaction prediction
    Liangcheng Dong, Baoming Feng, Zengqian Deng, Jinlong Wang, Peihao Ni, Yuanyuan Zhang
    Quantitative Biology, 2024, 12(3): 255-270. https://doi.org/10.1002/qub2.66

    Identifying drug–drug interactions (DDIs) is an important aspect of drug design research, and predicting DDIs serves as a crucial guarantee for avoiding potential adverse effects. Current substructure‐based prediction methods still have some limitations: (ⅰ) The process of substructure extraction does not fully exploit the graph structure information of drugs, as it only evaluates the importance of different radius substructures from a single perspective. (ⅱ) The process of constructing drug representations has overlooked the significant impact of relation embedding on optimizing drug representations. In this work, we propose a substructure‐aware graph neural network incorporating relation features (RFSA‐DDI) for DDI prediction, which introduces a directed message passing neural network with substructure attention mechanism based on graph self‐adaptive pooling (GSP‐DMPNN) and a substructure‐aware interaction module incorporating relation features (RSAM). GSP‐DMPNN utilizes graph self‐adaptive pooling to comprehensively consider node features and local drug information for adaptive extraction of substructures. RSAM interacts drug features with relation representations to enhance their respective features individually, highlighting substructures that significantly impact predictions. RFSA‐DDI is evaluated on two real‐world datasets. Compared to existing methods, RFSA‐DDI demonstrates certain advantages in both transductive and inductive settings, effectively handling the task of predicting DDIs for unseen drugs and exhibiting good generalization capability. The experimental results show that RFSA‐DDI can effectively capture valuable structural information of drugs more accurately for DDI prediction, and provide more reliable assistance for potential DDIs detection in drug development and treatment stages.

  • RESEARCH ARTICLE
    Single‐cell gene regulatory network analysis for mixed cell populations
    Junjie Tang, Changhu Wang, Feiyi Xiao, Ruibin Xi
    Quantitative Biology, 2024, 12(4): 375-388. https://doi.org/10.1002/qub2.64

    Gene regulatory network (GRN) refers to the complex network formed by regulatory interactions between genes in living cells. In this paper, we consider inferring GRNs in single cells based on single‐cell RNA sequencing (scRNA‐seq) data. In scRNA‐seq, single cells are often profiled from mixed populations, and their cell identities are unknown. A common practice for single‐cell GRN analysis is to first cluster the cells and infer GRNs for every cluster separately. However, this two‐step procedure ignores uncertainty in the clustering step and thus could lead to inaccurate estimation of the networks. Here, we consider the mixture Poisson log‐normal model (MPLN) for network inference of count data from mixed populations. The precision matrices of the MPLN are the GRNs of different cell types. To avoid the intractable optimization of the MPLN’s log‐likelihood, we develop an algorithm called variational mixture Poisson log‐normal (VMPLN) to jointly estimate the GRNs of different cell types based on the variational inference method. We compare VMPLN with state‐of‐the‐art single‐cell regulatory network inference methods. Comprehensive simulation shows that VMPLN achieves better performance, especially in scenarios where different cell types have a high mixing degree. Benchmarking on real scRNA‐seq data also demonstrates that VMPLN can provide more accurate network estimation in most cases. Finally, we apply VMPLN to a large scRNA‐seq dataset from patients infected with severe acute respiratory syndrome coronavirus 2 (SARS‐CoV‐2) and find that VMPLN identifies critical differences in regulatory networks in immune cells between patients with moderate and severe symptoms. The source codes are available on the GitHub website (github.com/XiDsLab/SCVMPLN).

  • RESEARCH ARTICLE
    SimHOEPI: A resampling simulator for generating single nucleotide polymorphism data with a high-order epistasis model
    Yahan Li, Xinrui Cai, Junliang Shang, Yuanyuan Zhang, Jin-Xing Liu
    Quantitative Biology, 2024, 12(2): 197-204. https://doi.org/10.1002/qub2.42

    Epistasis is a ubiquitous phenomenon in genetics, and is considered to be one of the main factors in current efforts to unveil the missing heritability of complex diseases. Simulation data is crucial for evaluating epistasis detection tools in genome-wide association studies (GWAS). Existing simulators normally suffer from two limitations: absence of support for high-order epistasis models containing multiple single nucleotide polymorphisms (SNPs), and inability to generate simulation SNP data independently. In this study, we propose a simulator, SimHOEPI, which is capable of calculating penetrance tables of high-order epistasis models depending on either prevalence or heritability, and uses a resampling strategy to generate simulation data independently. Highlights of SimHOEPI are the preservation of realistic minor allele frequencies in sampling data, the accurate calculation and embedding of high-order epistasis models, and acceptable simulation time. A series of experiments were carried out to verify these properties from different aspects. Experimental results show that SimHOEPI can generate simulation SNP data independently with high-order epistasis models, implying that it might be an alternative simulator for GWAS.

  • RESEARCH ARTICLE
    Mathematical modeling of evolution of cell networks in epithelial tissues
    Ivan Krasnyakov
    Quantitative Biology, 2024, 12(3): 286-300. https://doi.org/10.1002/qub2.62

    Epithelial cell networks imply a packing geometry characterized by various cell shapes and distributions in terms of number of cell neighbors and areas. Despite such simple characteristics describing cell sheets, the formation of bubble‐like cells during the morphogenesis of epithelial tissues remains poorly understood. This study proposes a topological mathematical model of morphogenesis in a squamous epithelium. We introduce a new potential that takes into account not only the elasticity of cell perimeter and area but also the elasticity of their internal angles. Additionally, we incorporate an integral equation for chemical signaling, allowing us to consider chemo‐mechanical cell interactions. The model also takes into account essential processes in real epithelia, such as cell proliferation and intercalation. The presented mathematical model has yielded novel insights into the packing of epithelial sheets. It has been found that there are two main states: one consists of cells of the same size, and the other consists of “bubble” cells. An example is provided of how chemo‐mechanical interactions in a multicellular environment can be accounted for. The introduction of a parameter determining the flexibility of cell shapes enables the modeling of more complex cell behaviors, such as changes in cell phenotype. The developed mathematical model of morphogenesis of squamous epithelium allows progress in understanding the processes of formation of cell networks. The results obtained from mathematical modeling are of significant importance for understanding the mechanisms of morphogenesis and development of epithelial tissues. Additionally, the obtained results can be applied in developing methods to influence morphogenetic processes in medical applications.

  • RESEARCH ARTICLE
    Integrated photothermal microcarriers for precise exosome‐secreted microRNA profiling in breast cancer diagnosis
    Yunjie Shi, Yun Cheng, Peiyu Chen, Lexiang Zhang, Fangfu Ye
    Quantitative Biology, 2024, 12(4): 389-399. https://doi.org/10.1002/qub2.58

    Breast cancer constitutes a significant global health burden, while conventional diagnostic approaches may lack precision and can be uncomfortable for patients. Exosomes have emerged as promising biomarkers for breast cancer due to their participation in diverse pathological processes, and a convenient analysis platform is believed to greatly promote their application. In this study, we propose a novel digital PCR approach utilizing near‐infrared (NIR) photo‐responsive thermosensitive microcarriers integrated with black phosphorus for quantifying microRNA (miRNA) biomarkers within exosomes. Petal‐like biomimetic nanomaterials were first assembled for non‐specific exosome capture based on the affinity effect of avidin and biotin. Photothermal‐responsive microcarriers, fabricated using gelatin‐based substrates blended with photothermal nanocomposite, exhibited NIR‐induced heating and reversible phase transition properties. We optimized synthesis parameters on thermal response and established a programmable and controllable NIR light source module. The results indicated a significant elevation in the levels of biomarkers miRNA‐1246 and miRNA‐122, with fold increases ranging from 6.2 to 23.6 and 5.9 to 13.0, respectively, in breast cancer cell lines MCF‐7 and MDA‐MB‐231 compared to healthy control cells HUVEC. This study offers broad prospects for utilizing exosomes to resolve predictive biomarkers.

  • RESEARCH ARTICLE
    On electrostatic interactions of adenosine triphosphate–insulin‐degrading enzyme revealed by quantum mechanics/molecular mechanics and molecular dynamics
    Sarawoot Somin, Don Kulasiri, Sandhya Samarasinghe
    Quantitative Biology, 2024, 12(4): 414-432. https://doi.org/10.1002/qub2.61

    The insulin‐degrading enzyme (IDE) plays a significant role in the degradation of amyloid beta (Aβ), a peptide found in the brain regions of patients with early Alzheimer’s disease. Adenosine triphosphate (ATP) allosterically regulates the Aβ‐degrading activity of IDE. The present study investigates the electrostatic interactions between ATP and IDE at the allosteric site of IDE, including thermostabilities/flexibilities of IDE residues, which have not yet been explored systematically. This study applies quantum mechanics/molecular mechanics (QM/MM) to the proposed computational model for exploring electrostatic interactions between ATP and IDE. Molecular dynamics (MD) simulations are performed at different temperatures for identifying flexible and thermostable residues of IDE. The proposed computational model predicts QM/MM energy‐minimised structures providing the IDE residues (Lys530 and Asp385) with high binding affinities. Root mean square fluctuation values during the MD simulations at 300.00 K and at heat‐shock temperatures (321.15 K and 315.15 K) indicate that Lys530 and Asp385 are also the thermostable residues of IDE, whereas Ser576 and Lys858 have high flexibilities with compromised thermostabilities. The present study sheds light on the phenomenon of biological recognition and interactions at the ATP‐binding domain, which may have important implications for pharmacological drug design. The proposed computational model may facilitate the development of allosteric IDE activators/inhibitors, which mimic ATP interactions.

  • RESEARCH ARTICLE
    Deterministic modelling of asymptomatic spread and disease stage progression in vaccine preventable infectious diseases
    Gabor Kiss, Salissou Moutari, Cara Mctaggart, Lynsey Patterson, Frank Kee, Felicity Lamrock
    Quantitative Biology, 2024, 12(4): 400-413. https://doi.org/10.1002/qub2.50

    This study introduces a deterministic formulation for modelling the asymptomatic spread of a vaccine preventable disease as well as the different stages of disease progression. We derive the formula for the associated basic reproduction number. To illustrate the proposed model, we use data from the 2017–2018 diphtheria outbreak in Yemen and fit the parameters of the model. A sensitivity analysis of the basic reproduction number with respect to the model parameters shows that this number increases as the transmission rate increases and decreases as the vaccination rate increases.
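
    To make the kind of sensitivity analysis described above concrete, the sketch below uses a textbook SIR model with births and deaths and vaccination of susceptibles, not the staged model of the paper. For that toy model the reproduction number under vaccination is beta * mu / ((mu + v) * (gamma + mu)). The parameter values and function names are hypothetical, and the normalized sensitivity index is approximated by a central finite difference.

```python
def reproduction_number(beta, gamma, mu, v):
    """Reproduction number of a toy SIR model with birth/death rate mu,
    recovery rate gamma, transmission rate beta, and vaccination rate v."""
    s_star = mu / (mu + v)  # susceptible fraction at the disease-free equilibrium
    return beta * s_star / (gamma + mu)

def sensitivity_index(f, params, name, rel_step=1e-6):
    """Normalized sensitivity (dR/dp) * (p / R), by central finite difference."""
    p = params[name]
    h = p * rel_step
    up = dict(params, **{name: p + h})
    down = dict(params, **{name: p - h})
    slope = (f(**up) - f(**down)) / (2 * h)
    return slope * p / f(**params)

# Hypothetical rates (per day), chosen only for illustration
params = {"beta": 0.5, "gamma": 0.1, "mu": 0.01, "v": 0.02}
index_beta = sensitivity_index(reproduction_number, params, "beta")
index_v = sensitivity_index(reproduction_number, params, "v")
print(f"index for beta: {index_beta:+.3f}, index for v: {index_v:+.3f}")
```

    A positive index means the reproduction number rises with that parameter and a negative index means it falls, which matches the qualitative pattern reported for the transmission and vaccination rates.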

  • RESEARCH ARTICLE
    In silico designing and optimization of anti‐epidermal growth factor receptor scaffolds by complementary‐determining regions‐grafting technique
    Razieh Rezaei Adriani, Seyed Latif Mousavi Gargari, Hamid Bakherad, Jafar Amani
    Quantitative Biology, 2024, 12(3): 301-312. https://doi.org/10.1002/qub2.63

    Monoclonal antibodies, which bind specifically to their targets through their complementary‐determining regions (CDRs), are attractive therapeutic agents for a wide range of human disorders. Small proteins with structurally preserved CDRs are promising antibody mimetics. In this in silico study, we present new antibody mimetics against the cancer marker epidermal growth factor receptor (EGFR) created by the CDR grafting technique. Ten potential graft acceptor sites that efficiently immobilize the grafted CDR loops were selected computationally from three small protein scaffolds. The three CDR loops most involved in antibody‐receptor interactions, extracted from the crystal structure of the panitumumab antibody in complex with EGFR domain III, were then grafted to the selected scaffolds through the loop randomization technique. The combination of three CDR loops and 10 grafting sites revealed that three of the 36 combinations showed specific binding to EGFR DIII by binding energy calculations. Thus, the present strategy and selected small protein scaffolds are promising tools in the design of new binders against EGFR with high binding energy.

  • RESEARCH ARTICLE
    Characterizing diseases using genetic and clinical variables: A data analytics approach
    Madhuri Gollapalli, Harsh Anand, Satish Mahadevan Srinivasan
    Quantitative Biology, 2024, 12(3): 271-285. https://doi.org/10.1002/qub2.46

    Predictive analytics is crucial in precision medicine for personalized patient care. To aid in precision medicine, this study identifies a subset of genetic and clinical variables that can serve as predictors for classifying diseased tissues/disease types. To achieve this, experiments were performed on diseased tissues obtained from the L1000 dataset to assess differences in the functionality and predictive capabilities of genetic and clinical variables. In this study, the k‐means technique was used for clustering the diseased tissue types, and the multinomial logistic regression (MLR) technique was applied for classifying the diseased tissue types. Dimensionality reduction techniques including principal component analysis and Boruta were used extensively to reduce the dimensionality of genetic and clinical variables. The results showed that landmark genes performed slightly better in clustering diseased tissue types compared to any random set of 978 non‐landmark genes, and the difference was statistically significant. Furthermore, it was evident that both clinical and genetic variables were important in predicting the diseased tissue types. The top three clinical predictors for predicting diseased tissue types were identified as morphology, gender, and age of diagnosis. Additionally, this study explored the possibility of using the latent representations of the clusters of landmark and non‐landmark genes as predictors for an MLR classifier. The classification models built using MLR revealed that landmark genes can serve as a subset of genetic variables and/or as a proxy for clinical variables. This study concludes that combining predictive analytics with dimensionality reduction effectively identifies key predictors in precision medicine, enhancing diagnostic accuracy.