Jun 2024, Volume 12 Issue 2
    

Cover illustration

  • This cover art exemplifies the potential and complexity of designing genetic circuits that operate reliably across various biological systems. Central to the visual is a radiant DNA helix, emblematic of the genetic foundation of this scientific domain. Encircling it are varied non-model microbes (extremophiles, gut microbes, and other microbial producers), each uniquely depicted to emphasize their diversity and significance in biological engineering. In the background, the silhouette ...

  • RESEARCH ARTICLE
    Chenrui Qin, Tong Xu, Xuejin Zhao, Yeqing Zong, Haoqian M. Zhang, Chunbo Lou, Qi Ouyang, Long Qian
    2024, 12(2): 129-140. https://doi.org/10.1002/qub2.41

    Although the principles of synthetic biology were initially established in model bacteria, microbial producers, extremophiles and gut microbes have now emerged as valuable prokaryotic chassis for biological engineering. Extending the host range in which designed circuits can function reliably and predictably presents a major challenge for the concept of synthetic biology to materialize. In this work, we systematically characterized the cross-species universality of two transcriptional regulatory modules—the T7 RNA polymerase activator module and the repressors module—in three non-model microbes. We found striking linear relationships in circuit activities among different organisms for both modules. Parametrized model fitting revealed host non-specific parameters defining the universality of both modules. Lastly, a genetic NOT gate and a band-pass filter circuit were constructed from these modules and tested in non-model organisms. Combined models employing host non-specific parameters were successful in quantitatively predicting circuit behaviors, underscoring the potential of universal biological parts and predictive modeling in synthetic bioengineering.

  • RESEARCH ARTICLE
    Yinfei Feng, Yuanyuan Zhang, Zengqian Deng, Mimi Xiong
    2024, 12(2): 141-154. https://doi.org/10.1002/qub2.39

    The prediction of the interaction between a drug and a target is the most critical issue in the fields of drug development and repurposing. However, there are still two challenges in current deep learning research: (i) the structural information of drug molecules is not fully explored in most drug–target studies, and the SMILES representation used previously does not correspond well to effective drug molecules; and (ii) the exploration of the potential relationship between drugs and targets needs improvement. In this work, we use SELFIES, a new and better representation of the effective molecular graph structure. We propose a hybrid mechanism framework based on a convolutional neural network and a graph attention network to capture multi-view feature information of drug and target molecular structures, and we aim to enhance the ability to capture interaction sites between a drug and a target. Our experiments on two different datasets show that the GCARDTI model outperforms a variety of other models across multiple metrics. We also demonstrate the accuracy of our model through two case studies.

  • RESEARCH ARTICLE
    Jiaming Su, Ying Qian
    2024, 12(2): 155-163. https://doi.org/10.1002/qub2.44

    Drug-drug interaction (DDI) event prediction is a challenging problem, and accurate prediction of DDI events is critical to patient health and new drug development. Recently, many machine learning-based techniques have been proposed for predicting DDI events. However, most existing methods neither integrate the multidimensional features of drugs effectively nor mitigate noise well enough to extract effective feature information. To address these limitations, we propose a DDI-Transform neural network framework for DDI event prediction. In DDI-Transform, we design a drug structure information feature extraction module and a drug bind-protein feature extraction module to obtain multidimensional feature information. A stack of DDI-Transform layers in the DDI-Transform network module is then used for adaptive learning, selecting the effective feature information for prediction. The results show that DDI-Transform can accurately predict DDI events and outperforms the state-of-the-art models. Results on datasets of different scales confirm the robustness of the method.

  • RESEARCH ARTICLE
    Ji Lv, Guixia Liu, Yuan Ju, Houhou Huang, Ying Sun
    2024, 12(2): 164-172. https://doi.org/10.1002/qub2.38

    Combination therapy is a promising approach to address the challenge of antimicrobial resistance, and computational models have been proposed for predicting drug–drug interactions. Most existing models rely on drug similarity measures based on characteristics such as chemical structure and the mechanism of action. In this study, we focus on the network structure itself and propose a drug similarity measure based on drug–drug interaction networks. We explore the potential applications of this measure by combining it with unsupervised learning and semi-supervised learning approaches. In unsupervised learning, drugs can be grouped based on their interactions, leading to almost monochromatic group–group interactions. In addition, drugs within the same group tend to have similar mechanisms of action (MoA). In semi-supervised learning, the similarity measure can be utilized to construct affinity matrices, enabling the prediction of unknown drug–drug interactions. Our method exceeds existing approaches in terms of performance. Overall, our experiments demonstrate the effectiveness and practicability of the proposed similarity measure. On the one hand, when combined with clustering algorithms, it can be used for functional annotation of compounds with unknown MoA. On the other hand, when combined with semi-supervised graph learning, it enables the prediction of unknown drug–drug interactions.

  • RESEARCH ARTICLE
    Xiaomeng Xue, Feng Li, Junliang Shang, Lingyun Dai, Daohui Ge, Qianqian Ren
    2024, 12(2): 173-181. https://doi.org/10.1002/qub2.40

    The identification of tumor driver genes facilitates accurate cancer diagnosis and treatment, playing a key role in precision oncology, along with gene signaling, regulation, and their interaction with protein complexes. To tackle the challenge of distinguishing driver genes within a large volume of genomic data, we construct a feature extraction framework for discovering pan-cancer driver genes based on multi-omics data (mutations, gene expression, copy number variants, and DNA methylation) combined with protein–protein interaction (PPI) networks. Using a network propagation algorithm, we mine functional information among nodes in the PPI network, focusing on genes with weak node information to represent specific cancer information. From these functional features, we extract three kinds of pan-cancer features: distribution features, TOPSIS features obtained with the ideal solution method, and SetExpan features, which rank pan-cancer data by average inverse rank. These features represent the information common to pan-cancer data. Finally, we use the LightGBM classification algorithm for gene prediction. Experimental results show that our method outperforms existing methods in terms of the area under the precision-recall curve (AUPRC) and demonstrates better performance across different PPI networks. This indicates our framework’s effectiveness in predicting potential cancer genes, offering valuable insights for the diagnosis and treatment of tumors.
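
The abstract above does not say which network propagation algorithm is used. As a rough illustration of the general idea only, the sketch below implements random walk with restart, one common propagation scheme, on a toy undirected graph. The function name, the `restart` value, and the toy graph are assumptions for illustration and are not taken from the article.

```python
def random_walk_with_restart(adjacency, seed_scores, restart=0.3,
                             tol=1e-9, max_iter=1000):
    """Propagate seed scores over an undirected graph.

    adjacency: dict mapping each node to a list of its neighbours
               (every neighbour must also be a key; edges listed both ways)
    seed_scores: dict of initial node scores (missing nodes count as 0)
    restart: probability of jumping back to the seed distribution
    """
    nodes = list(adjacency)
    total = sum(seed_scores.get(n, 0.0) for n in nodes)
    if total <= 0:
        raise ValueError("seed scores must sum to a positive value")
    # Normalise the seeds so the scores form a probability distribution.
    p0 = {n: seed_scores.get(n, 0.0) / total for n in nodes}
    p = dict(p0)
    for _ in range(max_iter):
        # Restart term: a fixed share of mass returns to the seeds.
        nxt = {n: restart * p0[n] for n in nodes}
        for n in nodes:
            neighbours = adjacency[n]
            if not neighbours:
                # Isolated node: its remaining mass stays in place.
                nxt[n] += (1.0 - restart) * p[n]
                continue
            # Each node spreads its remaining mass equally to neighbours.
            share = (1.0 - restart) * p[n] / len(neighbours)
            for m in neighbours:
                nxt[m] += share
        delta = sum(abs(nxt[n] - p[n]) for n in nodes)
        p = nxt
        if delta < tol:
            break
    return p


# Toy path graph A - B - C with all seed mass on A (hypothetical data).
toy_graph = {"A": ["B"], "B": ["A", "C"], "C": ["B"]}
scores = random_walk_with_restart(toy_graph, {"A": 1.0})
```

In this toy case the propagated score decreases with distance from the seed node, which is the behaviour that lets weakly annotated genes inherit signal from well-characterised neighbours.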

  • RESEARCH ARTICLE
    Binyu Yang, Siying Liu, Jiemin Xie, Xi Tang, Pan Guan, Yifan Zhu, Xuemei Liu, Yunhui Xiong, Zuli Yang, Weiyao Li, Yonghua Wang, Wen Chen, Qingjiao Li, Li C. Xia
    2024, 12(2): 182-196. https://doi.org/10.1002/qub2.45

    Molecular subtyping of gastric cancer (GC) aims to comprehend its genetic landscape. However, the efficacy of current subtyping methods is hampered by their mixed use of molecular features, a lack of strategy optimization, and the limited availability of public GC datasets. There is a pressing need for a precise and easily adoptable subtyping approach for early DNA-based screening and treatment. Based on TCGA subtypes, we developed a novel DNA-based hierarchical classifier for gastric cancer molecular subtyping (HCG), which employs gene mutations, copy number aberrations, and methylation patterns as predictors. By incorporating the closely related esophageal adenocarcinoma dataset, we expanded the TCGA GC dataset for the training and testing of HCG (n = 453). The optimization of HCG was achieved through three hierarchical strategies using Lasso-Logistic regression, evaluated by their overall area under the receiver operating characteristic curve (auROC), accuracy, F1 score, and area under the precision-recall curve (auPRC), and by their capability for clinical stratification using multivariate survival analysis. Subtype-specific DNA alteration biomarkers were discerned through difference tests based on HCG-defined subtypes. Our HCG classifier demonstrated superior performance in terms of overall auROC (0.95), accuracy (0.88), F1 score (0.87) and auPRC (0.86), significantly improving the clinical stratification of patients (overall p-value = 0.032). Difference tests identified 25 subtype-specific DNA alterations, including a high mutation rate in the SYNE1, ITGB4, and COL22A1 genes for the MSI subtype, and hypermethylation of ALS2CL, KIAA0406, and RPRD1B genes for the EBV subtype. HCG is an accurate and robust classifier for DNA-based GC molecular subtyping with highly predictive clinical stratification performance. The training and test datasets, along with the analysis programs of HCG, are accessible on the GitHub website (github.com/LabxSCUT).

  • RESEARCH ARTICLE
    Yahan Li, Xinrui Cai, Junliang Shang, Yuanyuan Zhang, Jin-Xing Liu
    2024, 12(2): 197-204. https://doi.org/10.1002/qub2.42

    Epistasis is a ubiquitous phenomenon in genetics, and is considered to be one of the main factors in current efforts to unveil the missing heritability of complex diseases. Simulation data is crucial for evaluating epistasis detection tools in genome-wide association studies (GWAS). Existing simulators normally suffer from two limitations: absence of support for high-order epistasis models containing multiple single nucleotide polymorphisms (SNPs), and inability to generate simulation SNP data independently. In this study, we proposed a simulator, SimHOEPI, which is capable of calculating penetrance tables of high-order epistasis models depending on either prevalence or heritability, and uses a resampling strategy to generate simulation data independently. Highlights of SimHOEPI are the preservation of realistic minor allele frequencies in sampling data, the accurate calculation and embedding of high-order epistasis models, and acceptable simulation time. A series of experiments were carried out to verify these properties from different aspects. Experimental results show that SimHOEPI can generate simulation SNP data independently with high-order epistasis models, implying that it might be an alternative simulator for GWAS.
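
To make the link between a penetrance table, prevalence, and heritability concrete, here is a minimal sketch of the forward calculation. It assumes Hardy-Weinberg genotype frequencies, independent SNPs, and a commonly used observed-scale definition of heritability (variance of penetrance divided by K(1 - K), where K is prevalence). These modelling choices and all function names are assumptions for illustration; the abstract does not state how SimHOEPI itself performs the calculation.

```python
from itertools import product


def genotype_freqs(maf):
    """Hardy-Weinberg frequencies for 0, 1, 2 copies of the minor allele."""
    q = maf
    p = 1.0 - q
    return (p * p, 2.0 * p * q, q * q)


def prevalence_and_heritability(penetrance, mafs):
    """Summarise a k-locus penetrance table.

    penetrance: dict mapping a genotype tuple (one value in {0, 1, 2}
                per SNP) to P(disease | genotype)
    mafs: minor allele frequency of each SNP, in the same order
    Returns (prevalence, heritability); prevalence must lie strictly
    between 0 and 1 for the heritability ratio to be defined.
    """
    per_snp = [genotype_freqs(m) for m in mafs]
    weighted = []
    for genos in product(range(3), repeat=len(mafs)):
        # Joint genotype probability under independence between SNPs.
        prob = 1.0
        for snp, g in enumerate(genos):
            prob *= per_snp[snp][g]
        weighted.append((prob, penetrance[genos]))
    prevalence = sum(prob * f for prob, f in weighted)
    variance = sum(prob * (f - prevalence) ** 2 for prob, f in weighted)
    heritability = variance / (prevalence * (1.0 - prevalence))
    return prevalence, heritability


# Hypothetical two-SNP threshold model: disease only for genotype (2, 2).
threshold = {g: 1.0 if g == (2, 2) else 0.0
             for g in product(range(3), repeat=2)}
k, h2 = prevalence_and_heritability(threshold, [0.5, 0.5])
```

A simulator that builds tables from a target prevalence or heritability would solve the inverse of this calculation, adjusting the table entries until the summaries match the requested values.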

  • RESEARCH ARTICLE
    Huamei Qi, Wenhui Yang, Wenqin Zou, Yuxuan Hu
    2024, 12(2): 205-214. https://doi.org/10.1002/qub2.43

    Effective clinical trials are necessary for understanding medical advances, but early termination of trials can result in unnecessary waste of resources. Survival models can be used to predict survival probabilities in such trials. However, survival data from clinical trials are sparse, and DeepSurv cannot accurately capture their effective features, making the models weak in generalization and decreasing their prediction accuracy. In this paper, we propose a survival prediction model for clinical trial completion based on the combination of denoising autoencoder (DAE) and DeepSurv models. The DAE is used to obtain a robust representation of features by breaking the loop of raw features after autoencoder training, and then the robust features are provided to DeepSurv as input for training. The clinical trial dataset for training the model was obtained from the ClinicalTrials.gov dataset. A study of clinical trial completion in pregnant women was conducted in response to the fact that many current clinical trials exclude pregnant women. The experimental results showed that the denoising autoencoder and deep survival regression (DAE-DSR) model was able to extract meaningful and robust features for survival analysis; the C-index values on the training and test datasets were 0.74 and 0.75, respectively. Compared with the Cox proportional hazards model and the DeepSurv model, the survival analysis curves obtained using the DAE-DSR model had more prominent features, and the model was more robust and performed better in actual prediction.
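
For readers unfamiliar with the C-index reported above, the following is a minimal sketch of Harrell's concordance index for right-censored data. It is a plain pairwise implementation written for illustration, not the evaluation code used in the article, and it simplifies by skipping pairs with tied observed times. The function name and the toy data are hypothetical.

```python
from itertools import combinations


def concordance_index(times, events, risk_scores):
    """Fraction of comparable pairs that the model orders correctly.

    times: observed follow-up times
    events: 1 if the event was observed, 0 if the subject was censored
    risk_scores: higher score means higher predicted risk (earlier event)
    """
    concordant = 0.0
    comparable = 0
    for i, j in combinations(range(len(times)), 2):
        # Order the pair so that `a` has the shorter observed time.
        a, b = (i, j) if times[i] < times[j] else (j, i)
        # Comparable only if times differ and the earlier subject
        # actually experienced the event (tied times are skipped here).
        if times[a] == times[b] or not events[a]:
            continue
        comparable += 1
        if risk_scores[a] > risk_scores[b]:
            concordant += 1.0
        elif risk_scores[a] == risk_scores[b]:
            concordant += 0.5  # ties in predicted risk count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Hypothetical toy data: the third subject is censored at time 15.
toy_times = [5, 10, 15, 20]
toy_events = [1, 1, 0, 1]
toy_risk = [0.9, 0.7, 0.5, 0.1]
c_index = concordance_index(toy_times, toy_events, toy_risk)
```

A value of 0.5 corresponds to random ordering and 1.0 to perfect ordering, which gives context for the reported values of 0.74 and 0.75.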

  • RESEARCH ARTICLE
    Jingxin Yang, Jin Chen, Luobin Zhang, Fangming Zhou, Xiaozhen Cui, Ruijun Tian, Ruilian Xu
    2024, 12(2): 215-224. https://doi.org/10.1002/qub2.34

    Colorectal cancer (CRC) is one of the most common cancers. Patients with advanced CRC can only rely on chemotherapy to improve outcomes. However, primary drug resistance frequently occurs and is difficult to predict. Changes in plasma protein composition have shown potential in clinical diagnosis. Thus, there is an urgent need to identify potential protein biomarkers for primary resistance to chemotherapy in patients with CRC. Automatic sample preparation and high-throughput analysis were used to explore potential plasma protein biomarkers. Drug susceptibility testing of circulating tumor cells (CTCs) was investigated, and the relationship between its results and protein expression was discussed. In addition, the differential proteins across different chemotherapy outcomes were analyzed. Finally, the potential biomarkers were detected via enzyme-linked immunosorbent assay (ELISA). The plasma proteomes of 60 CRC patients were profiled. The correlation between plasma protein levels and the results of drug susceptibility testing of CTCs was analyzed, and 85 proteins showed a significant positive or negative correlation with chemotherapy resistance. Forty-four CRC patients were then divided into three groups according to their chemotherapy outcomes (objective response, stable disease, and progressive disease), and 37 differential proteins were found to be related to chemotherapy resistance. The overlapping proteins were further investigated in an additional group of 79 patients using ELISA. Protein levels of F5 and PROZ significantly increased in the progressive disease group compared to other outcome groups. Our study indicated that F5 and PROZ proteins could represent potential biomarkers of resistance to chemotherapy in advanced CRC patients.

  • COMMENTARY
    Yuanli Gao, Baojun Wang
    2024, 12(2): 225-229. https://doi.org/10.1002/qub2.48