Online first

The manuscripts published below will continue to be available from this page until they are assigned to an issue.
    Mateusz Chiliński, Anup Kumar Halder, Dariusz Plewczynski

    Background: With the development of rapid and cheap sequencing techniques, the cost of whole-genome sequencing (WGS) has dropped significantly. However, the complexity of the human genome is not limited to its sequence alone, and additional experiments are required to learn how the genome influences complex traits. One of the most exciting aspects for scientists nowadays is the spatial organisation of the genome, which can be discovered using spatial experiments (e.g., Hi-C, ChIA-PET). Information about spatial contacts helps in the analysis and brings new insights into our understanding of disease development.

    Methods: We used an ensemble of deep learning and classical machine learning algorithms. The deep learning network was DNABERT, which applies the transformer-based BERT language model to genomic sequences. The classical machine learning models included support vector machines (SVMs), random forests (RFs), and k-nearest neighbours (KNN). We refer to the combined approach as deep hybrid learning (DHL).
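
    The abstract does not say how the deep and classical models are wired together. The sketch below assumes one possible arrangement: a feature extractor turns each sequence into a vector, and a classical KNN classifier votes on those vectors. The `embed` function is a stand-in based on k-mer frequencies, not DNABERT itself, and the sequences, labels, and the values of `K` and `k` are all made up for illustration.

```python
import math
from collections import Counter
from itertools import product

K = 3  # k-mer length; hypothetical choice for this sketch
KMERS = ["".join(p) for p in product("ACGT", repeat=K)]

def embed(sequence):
    """Stand-in feature extractor: normalised k-mer frequencies.

    ASSUMPTION: in a real hybrid pipeline this would be replaced by
    feature vectors taken from the deep network; that step is not shown.
    """
    counts = Counter(sequence[i:i + K] for i in range(len(sequence) - K + 1))
    total = sum(counts.values()) or 1
    return [counts[kmer] / total for kmer in KMERS]

def knn_predict(train, query, k=3):
    """Majority vote among the k training vectors closest to `query`."""
    nearest = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

# Toy, made-up sequences and labels purely for illustration.
training = [
    (embed("GCGCGCGCGCGC"), 1),
    (embed("GCGCGGCGCGCC"), 1),
    (embed("CGCGCGCCGCGC"), 1),
    (embed("ATATATATATAT"), 0),
    (embed("ATATTATATAAT"), 0),
    (embed("TATATATTATAT"), 0),
]
prediction = knn_predict(training, embed("GCGCGCGGCGCG"))
```

    Swapping `knn_predict` for an SVM or random forest would leave the rest of the sketch unchanged, which is the appeal of separating feature extraction from the final classifier.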

    Results: We found that DNABERT can be used to predict ChIA-PET experiments with high precision. Additionally, the DHL approach improved the evaluation metrics on the CTCF and RNAPII sets.

    Conclusions: The DHL approach should be considered for models that utilise the power of deep learning. While conceptually straightforward, it can improve results significantly.

    Yuwei Huang, Huidan Chang, Xiaoyi Chen, Jiayue Meng, Mengyao Han, Tao Huang, Liyun Yuan, Guoqing Zhang

    Background: The precise and efficient analysis of single-cell transcriptome data provides powerful support for studying the diversity of cell functions at the single-cell level. The most important and challenging steps are cell clustering and recognition of cell populations. While clustering precision and annotation precision are considered separately in most current studies, it is worth attempting to develop an extensive and flexible strategy that balances clustering accuracy and biological explanation comprehensively.

    Methods: The cell marker-based clustering strategy (cmCluster), a modified Louvain clustering method, searches for the optimal clusters using a genetic algorithm (GA) and grid search based on the cell type annotation results.
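
    As a rough illustration of a grid search guided by annotation results, the sketch below scores precomputed candidate clusterings against marker-based cell type labels and keeps the best one. Everything here is an assumption made for illustration: the objective (cluster purity minus a per-cluster penalty), the `penalty` value, and the hand-written candidate labelings, which in practice would come from running a clustering routine at each parameter value. It is not the cmCluster objective, and the GA part is not shown.

```python
from collections import Counter

def annotation_score(clusters, annotations, penalty=0.05):
    """Hypothetical objective: cluster purity minus a per-cluster penalty.

    clusters:    cluster id per cell
    annotations: marker-based cell type per cell (None if unannotated)
    """
    by_cluster = {}
    for cluster_id, cell_type in zip(clusters, annotations):
        if cell_type is not None:
            by_cluster.setdefault(cluster_id, Counter())[cell_type] += 1
    annotated = sum(sum(c.values()) for c in by_cluster.values())
    if annotated == 0:
        return float("-inf")
    # Cells that agree with the majority cell type of their cluster.
    majority = sum(c.most_common(1)[0][1] for c in by_cluster.values())
    return majority / annotated - penalty * len(set(clusters))

def grid_search(candidates, annotations):
    """Return the (parameter, score) pair with the highest score."""
    scored = {param: annotation_score(labels, annotations)
              for param, labels in candidates.items()}
    best = max(scored, key=scored.get)
    return best, scored[best]

# Made-up example: clusterings of six cells at three resolution values.
annotations = ["T", "T", "B", "B", None, "NK"]
candidates = {
    0.5: [0, 0, 0, 0, 0, 0],
    1.0: [0, 0, 1, 1, 1, 2],
    2.0: [0, 1, 2, 3, 4, 5],
}
best_resolution, best_score = grid_search(candidates, annotations)
```

    The penalty term keeps the search from rewarding over-splitting, since a clustering with one cell per cluster is trivially pure.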

    Results: Applied to a set of single-cell transcriptome data, cmCluster aided the recognition of cell populations and the explanation of biological function, even with incomplete cell type information or multiple data resources. In addition, cmCluster produced clear boundaries and appropriate subtypes with potential marker genes. The relevant code is available on GitHub (huangyuwei301/cmCluster).

    Conclusions: We speculate that cmCluster provides researchers with effective screening strategies to improve the accuracy of subsequent biological analysis, reduce artificial bias, and facilitate the comparison and analysis of multiple studies.

    Kh Shahriya Zaman, Md Mamun Bin Ibne Reaz

    Background: Machine learning has enabled the automatic detection of facial expressions, which is particularly beneficial for smart monitoring and for understanding the mental state of medical and psychological patients. Most algorithms that attain high emotion classification accuracy demand extensive computational resources, so they either need bulky and inefficient devices or require the sensor data to be processed on cloud servers. However, transferring raw images to cloud servers for facial emotion recognition (FER) always carries the risk of privacy invasion, data misuse, and data manipulation. One possible solution to this problem is to minimize the movement of such private data.

    Methods: In this research, we propose an efficient implementation of a convolutional neural network (CNN) based algorithm for on-device FER on a low-power field programmable gate array (FPGA) platform. This is done by encoding the CNN weights to approximated signed digits, which reduces the number of partial sums to be computed for multiply-accumulate (MAC) operations. This is advantageous for portable devices that lack full-fledged resource-intensive multipliers.
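
    To illustrate why a signed-digit encoding reduces the number of partial sums, the sketch below converts an integer weight to the non-adjacent form (a standard signed-digit representation with digits in {-1, 0, 1}) and then keeps only the most significant nonzero digits. Each nonzero digit corresponds to one shifted add or subtract in a shift-and-add multiplier. The abstract does not specify the encoding or the approximation rule, so the choice of non-adjacent form, the truncation rule, and the function names are assumptions.

```python
def to_signed_digits(n):
    """Non-adjacent form of integer n: digits in {-1, 0, 1}, least significant first."""
    digits = []
    while n != 0:
        if n & 1:
            digit = 2 - (n % 4)   # +1 if n % 4 == 1, -1 if n % 4 == 3
            n -= digit
        else:
            digit = 0
        digits.append(digit)
        n //= 2
    return digits

def from_signed_digits(digits):
    """Recover the integer value from least-significant-first signed digits."""
    return sum(d * (1 << i) for i, d in enumerate(digits))

def approximate(digits, max_nonzero):
    """ASSUMED rule: keep only the `max_nonzero` most significant nonzero digits."""
    kept, out = 0, [0] * len(digits)
    for i in range(len(digits) - 1, -1, -1):
        if digits[i] != 0 and kept < max_nonzero:
            out[i] = digits[i]
            kept += 1
    return out

weight = 23                       # binary 10111: four nonzero bits
exact = to_signed_digits(weight)  # 32 - 8 - 1: three nonzero digits
approx = approximate(exact, 2)    # 32 - 8 = 24: two partial sums
```

    In this example the plain binary weight needs four partial sums, the exact signed-digit form needs three, and the approximation needs two at the cost of representing 23 as 24.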

    Results: We applied our approximation method to MobileNet-v2 and ResNet18 models, which were pretrained with the FER2013 dataset. Our implementations and simulations reduce the FPGA resource requirement by at least 22% compared to models with integer weights, with negligible loss in classification accuracy.

    Conclusions: The outcome of this research will help in the development of secure and low-power systems for FER and other biomedical applications. The approximation methods used in this research can also be extended to other image-based biomedical research fields.

    Huawei Zhu, Yin Li

    Background: Light-driven synthetic microbial consortia are composed of photoautotrophs and heterotrophs. They exhibit better stability, robustness, and capacity for handling complex tasks than axenic cultures. In contrast to general microbial consortia, photosynthetic oxygen evolution is an intrinsic property of light-driven synthetic microbial consortia and an important factor affecting their functions.

    Results: In light-driven microbial consortia, the oxygen liberated by photoautotrophs creates an aerobic environment, which exerts dual effects on different species and processes. On the one hand, oxygen is favorable to the synthetic microbial consortia when they are used for wastewater treatment and aerobic chemical production, in which biomass accumulation and oxidized product formation benefit from the high energy yield of aerobic respiration. On the other hand, oxygen is harmful to the synthetic microbial consortia when they are used for anaerobic processes, including biohydrogen production and bioelectricity generation, in which the presence of oxygen deactivates some biological components and competes for electrons.

    Conclusions: Developing anaerobic processes using light-driven synthetic microbial consortia represents a cost-effective alternative for producing chemicals from carbon dioxide and light. Thus, exploring a versatile approach to the oxygen dilemma is essential to bring light-driven synthetic microbial consortia closer to practical application.