Cover illustration
Large language model-based chatbots present novel possibilities and challenges for bioinformatics. The vivid imagery depicted on the cover portrays students engaged in an exciting conversation with a chatbot at sunrise, symbolizing the dawn of new prospects in bioinformatics. However, the mist enveloping the mountain valleys serves as a metaphor for the challenges that accompany these opportunities. Positioned alongside the students, an experienced educator devises strategies to ...
The impressive conversational and programming abilities of ChatGPT make it an attractive tool for teaching bioinformatics data analysis to beginners. In this study, we proposed an iterative model for fine-tuning the instructions that guide a chatbot in generating code for bioinformatics data analysis tasks. We demonstrated the feasibility of the model by applying it to various bioinformatics topics. Additionally, we discussed practical considerations and limitations regarding the use of the model in chatbot-aided bioinformatics education.
Background: The hierarchical three-dimensional (3D) architectures of chromatin play an important role in fundamental biological processes, such as cell differentiation, cellular senescence, and transcriptional regulation. Aberrant alterations in 3D chromatin structure are often present in human diseases, including cancer, but their underlying mechanisms remain unclear.
Results: 3D chromatin structures (chromatin compartment A/B, topologically associated domains, and enhancer-promoter interactions) play key roles in cancer development, metastasis, and drug resistance. Bioinformatics techniques based on machine learning and deep learning have shown great potential in the study of the 3D cancer genome.
Conclusion: Current advances in the study of the 3D cancer genome have expanded our understanding of the mechanisms underlying tumorigenesis and tumor development, and will provide new insights into precise diagnosis and personalized treatment of cancers.
Background: As part of the cis-regulatory mechanism of the human genome, interactions between distal enhancers and proximal promoters play a crucial role in gene regulation. Enhancers, promoters, and enhancer-promoter interactions (EPIs) can be detected using many sequencing technologies and computational models. However, a systematic review that summarizes these EPI identification methods and helps researchers apply and optimize them is still needed.
Results: In this review, we first emphasize the role of EPIs in regulating gene expression and describe a generic framework for predicting enhancer-promoter interaction. Next, we review prediction methods for enhancers, promoters, loops, and enhancer-promoter interactions using different data features that have emerged since 2010, and we summarize the websites available for obtaining enhancers, promoters, and enhancer-promoter interaction datasets. Finally, we review the application of the methods for identifying EPIs in diseases such as cancer.
Conclusions: The advance of computer technology has allowed traditional machine learning and deep learning methods to be used to predict enhancers, promoters, and EPIs from genetic, genomic, and epigenomic features. In the past decade, models based on deep learning, especially transfer learning, have been proposed for directly predicting enhancer-promoter interactions from DNA sequences, and these models can reduce the parameter training time required of bioinformatics researchers. We believe this review can provide detailed research frameworks for researchers who are beginning to study enhancers, promoters, and their interactions.
Background: Light-driven synthetic microbial consortia are composed of photoautotrophs and heterotrophs. They exhibit better stability, robustness, and capacity for handling complex tasks than axenic cultures. Unlike general microbial consortia, light-driven synthetic microbial consortia have an intrinsic property of photosynthetic oxygen evolution, which is an important factor affecting the functions of the consortia.
Results: In light-driven microbial consortia, the oxygen liberated by photoautotrophs results in an aerobic environment, which exerts dual effects on different species and processes. On one hand, oxygen is favorable to the synthetic microbial consortia when they are used for wastewater treatment and aerobic chemical production, in which biomass accumulation and oxidized product formation benefit from the high energy yield of aerobic respiration. On the other hand, oxygen is harmful to the synthetic microbial consortia when they are used for anaerobic processes, including biohydrogen production and bioelectricity generation, in which the presence of oxygen deactivates some biological components and competes for electrons.
Conclusions: Developing anaerobic processes using light-driven synthetic microbial consortia represents a cost-effective alternative for producing chemicals from carbon dioxide and light. Thus, a versatile approach to addressing the oxygen dilemma is essential to bring light-driven synthetic microbial consortia closer to practical application.
Background: With the development of rapid and cheap sequencing techniques, the cost of whole-genome sequencing (WGS) has dropped significantly. However, the complexity of the human genome is not limited to the sequence alone, and additional experiments are required to understand the genome's influence on complex traits. One of the most exciting aspects for scientists nowadays is the spatial organisation of the genome, which can be discovered using spatial experiments (e.g., Hi-C, ChIA-PET). Information about spatial contacts aids the analysis and brings new insights into our understanding of disease development.
Methods: We used an ensemble of deep learning and classical machine learning algorithms. The deep learning network we used was DNABERT, which applies the BERT language model (based on transformers) to genomic sequences. The classical machine learning models included support vector machines (SVMs), random forests (RFs), and k-nearest neighbors (KNN). The whole approach was wrapped together as deep hybrid learning (DHL).
Results: We found that DNABERT can predict the ChIA-PET interactions with high precision. Additionally, the DHL approach improved the metrics on the CTCF and RNAPII sets.
Conclusions: The DHL approach should be considered for models utilising the power of deep learning. While conceptually straightforward, it can improve results significantly.
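As a rough illustration of the deep hybrid learning idea, the sketch below feeds fixed-length sequence embeddings into a classical learner. Everything here is a stand-in: normalised k-mer frequencies replace DNABERT embeddings, a tiny KNN replaces the full SVM/RF/KNN ensemble, and the toy sequences and labels are invented for illustration.

```python
# Sketch of the DHL pattern: deep embeddings feeding a classical classifier.
# Assumptions: the embedding is a stand-in (k-mer frequencies, not DNABERT),
# and the labels (1 = interacting anchor, 0 = background) are hypothetical.
from collections import Counter
from itertools import product
import math

KMER = 3
VOCAB = ["".join(p) for p in product("ACGT", repeat=KMER)]

def embed(seq):
    """Stand-in for a DNABERT embedding: normalised k-mer frequency vector."""
    counts = Counter(seq[i:i + KMER] for i in range(len(seq) - KMER + 1))
    total = sum(counts.values()) or 1
    return [counts[k] / total for k in VOCAB]

def knn_predict(train, query, k=3):
    """Classical learner (KNN) applied on top of the deep embeddings."""
    dists = sorted((math.dist(vec, query), label) for vec, label in train)
    votes = Counter(label for _, label in dists[:k])
    return votes.most_common(1)[0][0]

# Toy labelled loci (hypothetical data).
train_seqs = [("ACGTACGTACGT", 1), ("ACGGACGGACGG", 1),
              ("TTTTAAAATTTT", 0), ("TTAATTAATTAA", 0)]
train = [(embed(s), y) for s, y in train_seqs]

print(knn_predict(train, embed("ACGTACGGACGT")))  # → 1
```

In the full DHL setting, several classical models would vote on the same embeddings; swapping the embedding function for real DNABERT output leaves the surrounding pipeline unchanged.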
Background: The precise and efficient analysis of single-cell transcriptome data provides powerful support for studying the diversity of cell functions at the single-cell level. The most important and challenging steps are cell clustering and recognition of cell populations. Although clustering precision and annotation are considered separately in most current studies, it is worth developing an extensible and flexible strategy that comprehensively balances clustering accuracy and biological interpretability.
Methods: The cell marker-based clustering strategy (cmCluster), a modified Louvain clustering method, searches for the optimal clusters through a genetic algorithm (GA) and grid search based on the cell type annotation results.
Results: Applying cmCluster to a set of single-cell transcriptome data showed that it benefits the recognition of cell populations and the explanation of biological function, even with incomplete cell type information or multiple data sources. In addition, cmCluster produced clear cluster boundaries and appropriate subtypes with potential marker genes. The relevant code is available on GitHub (huangyuwei301/cmCluster).
Conclusions: We speculate that cmCluster provides researchers with effective screening strategies that improve the accuracy of subsequent biological analyses, reduce artificial bias, and facilitate the comparison and analysis of multiple studies.
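The search strategy behind cmCluster, as we read the abstract, can be sketched as a genetic algorithm over a clustering parameter scored by annotation agreement. The sketch below is a minimal stand-in: the fitness function is a hypothetical stand-in for the marker-based annotation score, and the single resolution parameter stands in for the real Louvain clustering configuration.

```python
# Minimal GA sketch of the cmCluster search loop (all specifics are
# stand-ins: marker_score is a toy fitness peaking at resolution 1.2,
# not the paper's annotation-based score).
import random

random.seed(0)

def marker_score(resolution):
    """Stand-in fitness: agreement with known markers, peaking near 1.2."""
    return -(resolution - 1.2) ** 2

def ga_search(lo=0.1, hi=3.0, pop=20, gens=30, mut=0.1):
    """Tiny genetic algorithm over a single clustering-resolution parameter."""
    population = [random.uniform(lo, hi) for _ in range(pop)]
    for _ in range(gens):
        population.sort(key=marker_score, reverse=True)
        parents = population[:pop // 2]           # selection (elitist top half)
        children = []
        for _ in range(pop - len(parents)):       # crossover + mutation
            a, b = random.sample(parents, 2)
            child = (a + b) / 2 + random.gauss(0, mut)
            children.append(min(max(child, lo), hi))
        population = parents + children
    return max(population, key=marker_score)

best = ga_search()
print(round(best, 2))  # converges near the toy optimum of 1.2
```

In the real method the fitness would run Louvain clustering at the candidate setting, annotate the clusters with cell markers, and score the result; a grid search over the same parameter is the non-evolutionary alternative mentioned in the abstract.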
Background: Machine learning has enabled the automatic detection of facial expressions, which is particularly beneficial in smart monitoring and understanding the mental state of medical and psychological patients. Most algorithms that attain high emotion classification accuracy require extensive computational resources, which either require bulky and inefficient devices or require the sensor data to be processed on cloud servers. However, there is always the risk of privacy invasion, data misuse, and data manipulation when raw images are transferred to cloud servers for processing facial emotion recognition (FER) data. One possible solution to this problem is to minimize the movement of such private data.
Methods: In this research, we propose an efficient implementation of a convolutional neural network (CNN) based algorithm for on-device FER on a low-power field programmable gate array (FPGA) platform. This is done by encoding the CNN weights to approximated signed digits, which reduces the number of partial sums to be computed for multiply-accumulate (MAC) operations. This is advantageous for portable devices that lack full-fledged resource-intensive multipliers.
Results: We applied our approximation method to MobileNet-v2 and ResNet18 models, which were pretrained with the FER2013 dataset. Our implementations and simulations reduce the FPGA resource requirement by at least 22% compared to models with integer weights, with negligible loss in classification accuracy.
Conclusions: The outcome of this research will help in the development of secure and low-power systems for FER and other biomedical applications. The approximation methods used in this research can also be extended to other image-based biomedical research fields.
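The weight-encoding step described above can be illustrated with the canonical signed-digit (CSD) representation, the standard signed-digit encoding in which no two adjacent digits are nonzero; whether the paper uses exactly CSD rather than another signed-digit approximation is our assumption. Fewer nonzero digits means fewer partial sums in a shift-add MAC unit.

```python
# Canonical signed-digit (CSD) encoding sketch. Assumption: CSD is used
# as the concrete signed-digit scheme; the paper says only "approximated
# signed digits". Digits are -1/0/+1, least significant first.
def to_csd(n):
    """Encode integer n so that no two adjacent digits are nonzero."""
    digits = []
    while n != 0:
        if n % 2:
            d = 2 - (n % 4)   # +1 if n ≡ 1 (mod 4), -1 if n ≡ 3 (mod 4)
            n -= d            # (n - d) is divisible by 4, so next digit is 0
        else:
            d = 0
        digits.append(d)
        n //= 2
    return digits or [0]

def from_csd(digits):
    """Decode a CSD digit list back to the integer it represents."""
    return sum(d << i for i, d in enumerate(digits))

w = 119                       # a quantised 8-bit weight (hypothetical value)
csd = to_csd(w)
print(bin(w).count("1"), sum(1 for d in csd if d))  # → 6 3
```

Here the weight 119 needs six additions in plain binary but only three shift-add/subtract terms in CSD, which is the resource saving the FPGA implementation exploits; the encoding also handles negative weights directly.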
Background: Morphogenesis is a complex process in a developing animal at the organ, cellular and molecular levels. In this investigation, allometry at the cellular level was evaluated.
Methods: Geometric information, including the time-lapse Cartesian coordinates of each cell's center, was used to calculate the allometric coefficients. A zero-centroaxial skew-symmetrical matrix (CSSM) was generated and used to construct another square matrix (the basic square matrix, BSM), and the determinant of the BSM (d) was calculated. The logarithm of the absolute value of d (Lad) was plotted for all cells across a range of developmental stages, and the slope of the regression line was used as the allometric coefficient. Moreover, the lineage growth rate (LGR) was calculated by plotting Lad against the logarithm of time, and the complexity index at each stage was computed. The method was tested on a developing Caenorhabditis elegans embryo.
Results: We explored two of the four first-generation blastomeres in the C. elegans embryo, the ABp and EMS lineages. The allometric coefficient of ABp was higher than that of EMS, which was consistent with both the complexity index and the LGR.
Conclusion: The complexity of differentiating cells in a developing embryo can be evaluated by allometric scaling based on the Cartesian coordinates of the cells at different stages of development.
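The downstream arithmetic of this pipeline, taking a determinant per stage, forming Lad = log|d|, and regressing to obtain a slope, can be sketched as below. The matrix construction is a stand-in (a signed pairwise-distance skew-symmetric matrix), not the paper's actual CSSM/BSM construction, which the abstract does not fully specify.

```python
# Sketch of the Lad-regression step. Assumptions: skew_from_points is a
# hypothetical stand-in for the CSSM/BSM construction; the scaling of a
# toy 4-cell configuration stands in for developmental stages.
import math

def skew_from_points(points):
    """Stand-in skew-symmetric matrix: signed pairwise distances."""
    n = len(points)
    return [[(math.dist(points[i], points[j]) if i < j
              else -math.dist(points[i], points[j]) if i > j else 0.0)
             for j in range(n)] for i in range(n)]

def det(m):
    """Determinant via Gaussian elimination with partial pivoting."""
    a = [row[:] for row in m]
    n, d = len(a), 1.0
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(a[r][c]))
        if abs(a[p][c]) < 1e-12:
            return 0.0
        if p != c:
            a[c], a[p] = a[p], a[c]
            d = -d
        d *= a[c][c]
        for r in range(c + 1, n):
            f = a[r][c] / a[c][c]
            for j in range(c, n):
                a[r][j] -= f * a[c][j]
    return d

def slope(xs, ys):
    """Least-squares slope of ys on xs (the allometric coefficient)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    return (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
            / sum((x - mx) ** 2 for x in xs))

# Toy "stages": the same 4-cell configuration scaled by a growth factor t.
base = [(0, 0, 0), (1, 0, 0), (0, 1, 0), (0, 0, 1)]
lads, logt = [], []
for t in (1.0, 2.0, 3.0):
    pts = [(t * x, t * y, t * z) for x, y, z in base]
    lads.append(math.log(abs(det(skew_from_points(pts)))))
    logt.append(math.log(t))

print(round(slope(logt, lads), 3))  # → 4.0
```

The slope comes out as exactly 4 here because uniformly scaling the coordinates by t scales every matrix entry by t, so the 4×4 determinant scales as t⁴; in the real analysis the slope of Lad against log time gives the LGR, and against stage it gives the allometric coefficient.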