2025, Volume 10 Issue 6 (2025-11-18)

  • research-article
    You He, Shulan Ruan, Dong Wang, Huchuan Lu, Zhi Li, Yang Liu, Xu Chen, Shaohui Li, Jie Zhao, Jiaxuan Liang
    2025, 10(6): 1573-1592. https://doi.org/10.1049/cit2.70084

    With the rapid development of large AI models, large decision models have further broken through the limits of human cognition and promoted the innovation of decision-making paradigms in fields such as medicine and transportation. In this paper, we systematically expound the intelligent decision-making technology and prospects driven by large AI models. Specifically, we first review the development of large AI models in recent years. Then, from the perspective of methods, we introduce important theories and technologies of large decision models, such as model architecture and model adaptation. Next, from the perspective of applications, we introduce cutting-edge applications of large decision models in various fields, such as autonomous driving and knowledge decision-making. Finally, we discuss existing challenges, such as security issues, decision bias and hallucination phenomena, as well as future prospects, from both technology development and domain applications. We hope this review can help researchers understand the important progress of intelligent decision-making driven by large AI models.

  • research-article
    Yixing Lan, Xin Xu, Jiahang Liu, Xinglong Zhang, Yang Lu, Long Cheng
    2025, 10(6): 1593-1615. https://doi.org/10.1049/cit2.70073

    Reinforcement learning (RL) has been widely studied as an efficient class of machine learning methods for adaptive optimal control under uncertainties. In recent years, the applications of RL in optimised decision-making and motion control of intelligent vehicles have received increasing attention. Due to the complex and dynamic operating environments of intelligent vehicles, it is necessary to improve the learning efficiency and generalisation ability of RL-based decision and control algorithms under different conditions. This survey systematically examines the theoretical foundations, algorithmic advancements and practical challenges of applying RL to intelligent vehicle systems operating in complex and dynamic environments. The major algorithm frameworks of RL are first introduced, and the recent advances in RL-based decision-making and control of intelligent vehicles are overviewed. In addition to self-learning decision and control approaches using state measurements, the developments of deep reinforcement learning (DRL) methods for end-to-end driving control of intelligent vehicles are summarised. The open problems and directions for further research are also discussed.

  • research-article
    Imran Khan, Javed Rashid, Anwar Ghani, Muhammad Shoaib Saleem, Muhammad Faheem, Humera Khan
    2025, 10(6): 1616-1632. https://doi.org/10.1049/cit2.70077

    Secure and automated sharing of medical information among different medical entities/stakeholders, such as patients, hospitals, doctors, law enforcement agencies and health insurance companies, in a standard format has always been a challenging problem. Current methods for ensuring compliance with medical privacy laws require specialists who are deeply familiar with these laws' complex requirements to verify the lawful exchange of medical information. This article introduces a Smart Medical Data Exchange Engine (SMDEE) designed to automate the extraction of logical rules from medical privacy legislation using advanced techniques. These rules facilitate the secure extraction of information, safeguarding patient privacy and confidentiality. In addition, SMDEE can generate standardised clinical documents according to Health Level 7 (HL7) standards and standardise the nomenclature of requested medical data, enabling accurate decision-making when accessing patient data. All access requests to patient information are processed through SMDEE to ensure authorised access. The proposed system's efficacy is evaluated using the Health Insurance Portability and Accountability Act (HIPAA), a fundamental privacy law in the United States. However, SMDEE's flexibility allows its application worldwide, accommodating various medical privacy laws. Beyond facilitating global information exchange, SMDEE aims to enhance international patients' timely and appropriate treatment.

  • research-article
    Wenxin Chen, Xingguang Duan, Ye Yuan, Pu Chen, Tengfei Cui, Changsheng Li
    2025, 10(6): 1633-1645. https://doi.org/10.1049/cit2.70043

    Segmentation tasks require extensive annotation, which is time-consuming and labour-intensive. How to make full use of unlabelled data to assist in training deep learning models has been a research hotspot in recent years. This paper takes instrument segmentation in endoscopic surgery as the background to explore how to use unlabelled data for semi-supervised learning more reasonably and effectively. An adaptive gradient correction method based on the degree of perturbation is proposed to improve segmentation accuracy. The paper also integrates the recently popular segment anything model (SAM) with semi-supervised learning, taking full advantage of the large model to enhance zero-shot ability. Experimental results demonstrate the superior performance of the proposed segmentation strategy compared to traditional semi-supervised segmentation methods, achieving a 2.56% improvement in mean intersection over union (mIoU). The visual segmentation results show that incorporating SAM significantly enhances our method, resulting in more accurate segmentation boundaries.

  • research-article
    Jinfu Liu, Zhongzien Jiang, Xinhua Xu, Wenhao Li, Mengyuan Liu, Hong Liu
    2025, 10(6): 1646-1660. https://doi.org/10.1049/cit2.70066

    Indoor scene semantic segmentation is essential for enabling robots to understand and interact with their environments effectively. However, numerous challenges remain unresolved, particularly in single-robot systems, which often struggle with the complexity and variability of indoor scenes. To address these limitations, we introduce a novel multi-robot collaborative framework based on multiplex interactive learning (MPIL) in which each robot specialises in a distinct visual task within a unified multitask architecture. During training, the framework employs task-specific decoders and cross-task feature sharing to enhance collaborative optimisation. At inference time, robots operate independently with optimised models, enabling scalable, asynchronous and efficient deployment in real-world scenarios. Specifically, MPIL employs specially designed modules that integrate RGB and depth data, refine feature representations and facilitate the simultaneous execution of multiple tasks, such as instance segmentation, scene classification and semantic segmentation. By leveraging these modules, distinct agents within multi-robot systems can effectively handle specialised tasks, thereby enhancing the overall system's flexibility and adaptability. This collaborative effort maximises the strengths of each robot, resulting in a more comprehensive understanding of environments. Extensive experiments on two public benchmark datasets demonstrate MPIL's competitive performance compared to state-of-the-art approaches, highlighting the effectiveness and robustness of our multi-robot system in complex indoor environments.

  • research-article
    Zhang Jiahui, Meng Zhijun, He Jiazheng
    2025, 10(6): 1661-1674. https://doi.org/10.1049/cit2.70068

    The unmanned aerial vehicle (UAV) air combat trajectory prediction algorithm facilitates strategic pre-planning by predicting UAV flight trajectories with high accuracy, thus mitigating risks and securing advantages in intricate aerial scenarios. This study tackles the prevalent limitations of existing datasets, which are often restricted in scale and scenario diversity, by introducing a novel UAV air combat trajectory prediction methodology based on QCNet. Firstly, a robust UAV air combat dynamics model is developed to synthesise air combat trajectories, forming the basis for a comprehensive trajectory prediction dataset. Subsequently, a specialised trajectory prediction framework utilising QCNet is devised, followed by rigorous algorithm training. A parameter impact analysis is conducted to assess the influence of critical algorithm parameters on efficiency. The results indicate that augmenting the number of encoder layers and the decoder's recurrent steps generally enhances performance, although an excessive increase in recurrent steps may inversely affect efficiency. Finally, the proposed algorithm is evaluated against other traditional time-series prediction algorithms and shows better performance. The effectiveness experiment indicates that the proposed algorithm can predict the flight trajectories of UAVs and provide corresponding probabilities under different manoeuvres.

  • research-article
    Junyang Chen, Jingcai Guo, Huan Wang, Zhihui Lai, Qin Zhang, Kaishun Wu, Liang-Jie Zhang
    2025, 10(6): 1675-1687. https://doi.org/10.1049/cit2.70054

    Point of interest (POI) recommendation analyses user preferences through historical check-in data. However, existing POI recommendation methods often overlook the influence of weather information and face the challenge of sparse historical data for individual users. To address these issues, this paper proposes a new paradigm, namely the temporal-weather-aware transition pattern for POI recommendation (TWTransNet), designed to capture user transition patterns under different times and weather conditions. Additionally, we construct a user-POI interaction graph to alleviate the sparsity of historical data for individual users. Furthermore, when predicting user interests by aggregating graph information, some POIs may not be suitable for visitation under the current weather conditions. To account for this, we propose an attention mechanism that filters POI neighbours when aggregating information from the graph, considering the impact of weather and time. Empirical results on two real-world datasets demonstrate the superior performance of our proposed method, showing a substantial improvement of 6.91%-23.31% in prediction accuracy.
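    As an illustration of this kind of context-conditioned neighbour filtering (a generic sketch, not the authors' TWTransNet module; the embedding names and the additive query construction are assumptions):

    ```python
    import numpy as np

    def attention_aggregate(user_vec, neigh_vecs, context_vec):
        """Aggregate POI neighbour embeddings with attention weights.

        The query combines the user embedding with a time/weather context
        embedding, so neighbours ill-suited to the current conditions score
        low and are effectively filtered out of the aggregation.
        """
        query = user_vec + context_vec
        scores = neigh_vecs @ query / np.sqrt(len(query))  # scaled dot product
        scores -= scores.max()                             # numerical stability
        weights = np.exp(scores) / np.exp(scores).sum()    # softmax over neighbours
        return weights @ neigh_vecs, weights

    rng = np.random.default_rng(0)
    user = rng.normal(size=8)               # hypothetical user embedding
    weather_time = rng.normal(size=8)       # hypothetical context embedding
    neighbours = rng.normal(size=(5, 8))    # 5 candidate POI neighbours
    agg, w = attention_aggregate(user, neighbours, weather_time)
    ```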

  • research-article
    Zilin Guo, Dongyue Wu, Changxin Gao, Nong Sang
    2025, 10(6): 1688-1702. https://doi.org/10.1049/cit2.70042

    Existing weakly supervised semantic segmentation (WSSS) methods based on image-level labels rely on class activation maps (CAMs), which measure the relationships between features and classifiers. However, CAMs focus only on the most discriminative regions of images, resulting in poor coverage. We attribute this to the deficient recognition ability of a single classifier and the negative impact of feature magnitudes during CAM normalisation. To address these issues, we propose to construct selective multiple classifiers (SMC). During training, we extract multiple prototypes for each class and store them in the corresponding memory bank. These prototypes are divided into foreground and background prototypes, with the former used to identify foreground objects and the latter aimed at preventing the false activation of background pixels. At the inference stage, multiple prototypes are adaptively selected from the memory bank for each image as SMC. Subsequently, CAMs are generated by measuring the angle between SMC and features. Adaptively constructing multiple classifiers for each image enhances the recognition ability of the classifiers, while relying only on angle measurement to generate CAMs alleviates the suppression caused by magnitudes. Furthermore, SMC can be integrated into other WSSS approaches to help generate better CAMs. Extensive experiments on standard WSSS benchmarks such as PASCAL VOC 2012 and MS COCO 2014 demonstrate the superiority of our proposed method.
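    The angle-only CAM generation can be sketched in a few lines (a simplified illustration, not the authors' code; the toy shapes and the max-over-prototypes reduction are assumptions):

    ```python
    import numpy as np

    def angle_cam(features, prototypes):
        """Class activation map from angles alone: the activation at each
        location is the largest cosine similarity to any selected prototype,
        so feature magnitudes cannot suppress or inflate the map."""
        f = features / (np.linalg.norm(features, axis=-1, keepdims=True) + 1e-8)
        p = prototypes / (np.linalg.norm(prototypes, axis=-1, keepdims=True) + 1e-8)
        return (f @ p.T).max(axis=-1)        # (H, W) map over K prototypes

    rng = np.random.default_rng(0)
    feat = rng.normal(size=(4, 4, 16))       # toy (H, W, C) feature map
    protos = rng.normal(size=(3, 16))        # 3 prototypes acting as SMC
    cam = angle_cam(feat, protos)
    ```

    Because both operands are normalised, rescaling the features leaves the map unchanged, which is exactly the magnitude-suppression point made above.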

  • research-article
    Walid Emam, Ubaid ur Rehman, Tahir Mahmood, Faisal Mehmood
    2025, 10(6): 1703-1716. https://doi.org/10.1049/cit2.70048

    The evaluation and assessment of network security is a decision-making (DM) problem that occurs in an environment with multiple criteria, which involve uncertainty, bipolarity and extra-related information. Traditional approaches fail to acquire the wide range of information needed for the assessment, especially when the criteria have both positive and negative aspects and contain extra fuzzy information. Therefore, in this manuscript, we introduce a DM approach based on bipolar complex fuzzy (BCF) Yager aggregation operators (AOs). The related properties of these AOs are also discussed. Moreover, we derive the Yager operations in the BCF setting. The basic idea of the presented operators and DM approach is to tackle the network security problem, that is, to evaluate and select the best network security control and network security protocols for protecting and safeguarding the network of any organisation or home (case studies). Finally, to exhibit the supremacy and success of the described theory, we compare it with the prevailing theories.
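    For readers unfamiliar with Yager operations, the classical Yager t-norm and t-conorm on ordinary membership grades look as follows; the paper's BCF operators extend these to bipolar complex values, whereas this sketch covers only the scalar case:

    ```python
    def yager_t_norm(a, b, lam=2.0):
        """Yager t-norm (conjunction) on membership grades in [0, 1]."""
        return max(0.0, 1.0 - ((1.0 - a) ** lam + (1.0 - b) ** lam) ** (1.0 / lam))

    def yager_t_conorm(a, b, lam=2.0):
        """Yager t-conorm (disjunction) on membership grades in [0, 1]."""
        return min(1.0, (a ** lam + b ** lam) ** (1.0 / lam))
    ```

    The parameter lam interpolates between the drastic (lam → 0) and the min/max (lam → ∞) operators, which is what gives Yager AOs their flexibility.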

  • research-article
    Youneng Bao, Wen Tan, Mu Li, Jiacong Chen, Qingyu Mao, Yongsheng Liang
    2025, 10(6): 1717-1730. https://doi.org/10.1049/cit2.70034

    Neural image compression (NIC) has shown remarkable rate-distortion (R-D) efficiency. However, the considerable computational and spatial complexity of most NIC methods presents deployment challenges on resource-constrained devices. We introduce a lightweight neural image compression framework designed to efficiently process both local and global information. In this framework, a convolutional branch extracts local information, whereas a frequency-domain branch extracts global information. To capture global information without the high computational cost of dense pixel operations such as attention mechanisms, the Fourier transform is employed, allowing global information to be manipulated in the frequency domain. Additionally, we employ feature shift operations to acquire large receptive fields without any computational cost, circumventing the need for large-kernel convolutions. Our framework achieves a superior balance between R-D performance and complexity. On sets of varying resolution, our method not only achieves R-D performance on par with versatile video coding (VVC) intra and other state-of-the-art (SOTA) NIC methods but also exhibits the lowest computational requirements, at approximately 200 KMACs/pixel. The code will be available at https://github.com/baoyu2020/SFNIC.
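    Both ingredients, frequency-domain global mixing and zero-cost feature shifts, can be sketched in NumPy (a toy illustration under assumed shapes, not the SFNIC implementation):

    ```python
    import numpy as np

    def spectral_mix(x, scale):
        """Global mixing in the frequency domain: FFT, per-frequency scaling,
        inverse FFT. Every output pixel then depends on every input pixel at
        the cost of two FFTs instead of dense attention."""
        return np.fft.irfft2(np.fft.rfft2(x) * scale, s=x.shape)

    def feature_shift(x, step=1):
        """Zero-cost receptive-field growth: roll half the channels vertically
        and half horizontally so each position sees its neighbours without any
        convolution parameters. x has shape (H, W, C)."""
        y = x.copy()
        c = x.shape[-1] // 2
        y[..., :c] = np.roll(y[..., :c], step, axis=0)
        y[..., c:] = np.roll(y[..., c:], step, axis=1)
        return y

    img = np.random.default_rng(0).normal(size=(8, 8))
    feat = np.random.default_rng(1).normal(size=(4, 4, 6))
    ```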

  • research-article
    Xubin Wu, Yan Niu, Xia Li, Jie Xiang, Yidi Li
    2025, 10(6): 1731-1744. https://doi.org/10.1049/cit2.70046

    Functional brain networks have been used to diagnose brain disorders such as autism spectrum disorder (ASD) and attention-deficit/hyperactivity disorder (ADHD). However, existing methods not only fail to fully consider the various levels of interaction information between brain regions but also limit the transmission of information among unconnected regions, resulting in node information loss and bias. To address these issues, we propose a causality-guided multi-view diffusion (CG-MVD) network, which can more comprehensively capture node information that is difficult to observe when aggregating direct neighbours alone. Specifically, our approach designs multi-view brain graphs and multi-hop causality graphs to represent multi-level node interactions and guide the diffusion of interaction information. Building on this, a multi-view diffusion graph attention module is put forward to learn multi-dimensional node embedding features by broadening the interaction range and extending the receptive field. Additionally, we propose a bilinear adaptive fusion module to generate and fuse connectivity-based features, addressing the challenge of high-dimensional node-level features and integrating richer feature information to enhance classification. Experimental results on the ADHD-200 and ABIDE-I datasets demonstrate the effectiveness of the CG-MVD network, achieving average accuracies of 79.47% and 80.90%, respectively, and surpassing state-of-the-art methods.

  • research-article
    Mang Ye, Wenke Huang, Zekun Shi, Zhiwei Ye, Bo Du
    2025, 10(6): 1745-1758. https://doi.org/10.1049/cit2.70064

    Class-incremental learning studies the problem of continually learning new classes from data streams. However, networks suffer from catastrophic forgetting, losing past knowledge when acquiring new knowledge. Among the different approaches, replay methods have shown exceptional promise for this challenge, but their performance still suffers from two problems: (i) data in imbalanced distribution and (ii) networks with semantic inconsistency. First, due to the limited memory buffer, there exists an imbalance between old and new classes. Direct optimisation would skew the feature space towards new classes, degrading performance on old classes. Second, existing methods normally leverage the previous network to regularise the present network. However, the previous network is not trained on new classes, so the two networks are semantically inconsistent, leading to misleading guidance. To address these two problems, we propose BCSD (biased-mixup contrastive learning and memory similarity distillation). For the imbalanced distribution, we design Biased MixUp, in which mixed samples weight old classes highly and new classes lightly, so the network learns to push decision boundaries towards new classes. We further leverage label information to construct contrastive learning to ensure discriminability. Meanwhile, for the semantic inconsistency, we distil knowledge from the previous network by capturing the similarity of new classes in current tasks to old classes from the memory buffer and transferring that knowledge to the present network. Empirical results on various datasets demonstrate its effectiveness and efficiency.
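    The biased mixing idea can be illustrated with a mixing coefficient drawn from a skewed Beta distribution (a sketch of the general idea; the distribution choice is an assumption, not the paper's exact formulation):

    ```python
    import numpy as np

    def biased_mixup(x_old, x_new, alpha=2.0, beta=1.0, rng=None):
        """Mix an old-class sample with a new-class sample. Drawing the mixing
        coefficient from Beta(alpha, beta) with alpha > beta biases it towards
        the old sample, compensating for old classes being rare in the buffer."""
        if rng is None:
            rng = np.random.default_rng()
        lam = rng.beta(alpha, beta)          # E[lam] = 2/3 for Beta(2, 1)
        return lam * x_old + (1 - lam) * x_new, lam

    rng = np.random.default_rng(0)
    x_old = rng.normal(size=16)
    x_new = rng.normal(size=16)
    lams = [biased_mixup(x_old, x_new, rng=rng)[1] for _ in range(2000)]
    ```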

  • research-article
    Jia-Qi Chen, Yu-Lin He, Ying-Chao Cheng, Philippe Fournier-Viger, Ponnuthurai Nagaratnam Suganthan, Joshua Zhexue Huang
    2025, 10(6): 1759-1782. https://doi.org/10.1049/cit2.70063

    Estimating probability density functions (PDFs) is critical in data analysis, particularly for complex multimodal distributions. Traditional kernel density estimation (KDE) methods often struggle to accurately capture multimodal structures due to their uniform weighting scheme, leading to mode loss and degraded estimation accuracy. This paper presents the flexible kernel density estimator (F-KDE), a novel nonparametric approach designed to address these limitations. F-KDE introduces the concept of kernel unit inequivalence, assigning adaptive weights to each kernel unit, which better models local density variations in multimodal data. The method optimises an objective function that integrates estimation error and log-likelihood, using a particle swarm optimisation (PSO) algorithm that automatically determines optimal weights and bandwidths. Through extensive experiments on synthetic and real-world datasets, we demonstrate that (1) the weights and bandwidths in F-KDE stabilise as the optimisation algorithm iterates, (2) F-KDE effectively captures multimodal characteristics and (3) F-KDE outperforms state-of-the-art density estimation methods in accuracy and robustness. The results confirm that F-KDE provides a valuable solution for accurately estimating multimodal PDFs.
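    The core estimator, a Gaussian mixture with per-kernel weights and bandwidths, can be sketched directly (the PSO search over weights and bandwidths is omitted; the toy values below are illustrative, not from the paper):

    ```python
    import numpy as np

    def weighted_kde(x_eval, samples, weights, bandwidths):
        """Flexible KDE: every kernel unit carries its own weight and bandwidth
        instead of the classical uniform 1/n weighting. weights must be
        non-negative and sum to 1 so the estimate is a valid density."""
        d = (np.asarray(x_eval)[:, None] - samples[None, :]) / bandwidths
        k = np.exp(-0.5 * d**2) / (np.sqrt(2 * np.pi) * bandwidths)
        return k @ weights                              # (m,) density values

    # A small multimodal toy; in F-KDE these weights/bandwidths come from PSO.
    grid = np.linspace(-10.0, 10.0, 2001)
    samples = np.array([-2.0, 0.0, 3.0])
    weights = np.array([0.5, 0.3, 0.2])
    bandwidths = np.array([0.5, 1.0, 0.8])
    density = weighted_kde(grid, samples, weights, bandwidths)
    ```

    Because each Gaussian kernel integrates to 1 and the weights sum to 1, the estimate integrates to 1 regardless of how unequally the kernel units are weighted.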

  • research-article
    Changxu Dong, Yanqing Liu, Dengdi Sun
    2025, 10(6): 1783-1798. https://doi.org/10.1049/cit2.70059

    Epilepsy is a neurological disorder characterised by recurrent seizures due to abnormal neuronal discharges. Seizure detection via EEG signals has progressed, but two main challenges remain. First, EEG data can be distorted by physiological factors and external variables, resulting in noisy brain networks; current mainstream methods typically use static adjacency matrices, neglecting the need for dynamic updates and feature refinement. The second challenge stems from the strong reliance on long-range dependencies through self-attention in current methods, which can introduce redundant noise and increase computational complexity, especially for long-duration data. To address these challenges, the Attention-based Adaptive Graph ProbSparse Hybrid Network (AA-GPHN) is proposed. Brain network structures are dynamically optimised using variational inference and the information bottleneck principle, refining the adjacency matrix for improved epilepsy classification. A Linear Graph Convolutional Network (LGCN) is incorporated to focus on first-order neighbours, minimising the aggregation of distant information. Furthermore, a ProbSparse attention-based Informer (PAT) is introduced to adaptively filter long-range dependencies, enhancing efficiency. A joint optimisation loss function is applied to improve robustness in noisy environments. Experimental results on both patient-specific and cross-subject datasets demonstrate that AA-GPHN outperforms existing methods in seizure detection, showing superior effectiveness and generalisation.

  • research-article
    Qilong Yuan, Enze Shi, Di Zhu, Xiaoshan Zhang, Kui Zhao, Dingwen Zhang, Tianming Liu, Shu Zhang
    2025, 10(6): 1799-1812. https://doi.org/10.1049/cit2.70056

    Electroencephalography (EEG) is a widely used neuroimaging technique for decoding brain states. The transformer is gaining attention in EEG signal decoding due to its powerful ability to capture global features. However, relying solely on a single feature extracted by the traditional transformer model is insufficient to address the domain shift caused by the time variability and complexity of EEG signals. In this paper, we propose a novel Transferable Fusion Multi-band EEG Transformer (TF-MEET) to enhance the performance of cross-session decoding of EEG signals. TF-MEET has three main parts: (1) the EEG signals are transformed into spatial images and band images; (2) an encoder extracts spatial and band features from the two types of images, and comprehensive fusion features are obtained through a weight adaptive fusion module; (3) cross-session EEG signal decoding is achieved by aligning the joint distribution of features and categories across domains through multi-loss domain adversarial training. Experimental results demonstrate that (1) TF-MEET outperforms other advanced transfer learning methods on two public EEG emotion recognition datasets, SEED and SEED_IV, achieving accuracies of 91.68% on SEED and 76.21% on SEED_IV; (2) the transferable fusion module is effective; and (3) TF-MEET can identify explainable activation areas in the brain. We demonstrate that TF-MEET captures comprehensive, transferable and interpretable features in EEG signals and performs well in cross-session EEG signal decoding, which can promote the development of brain-computer interface systems.
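    The weight-adaptive fusion of the two branches can be sketched as a softmax-weighted sum (a minimal illustration; in TF-MEET the weights are learnt from the inputs, and the feature values here are stand-ins):

    ```python
    import numpy as np

    def adaptive_fusion(feat_a, feat_b, fusion_logits):
        """Fuse two branch features with softmax weights so the model can
        shift emphasis between branches per input. In a trained model the
        logits would come from a small network; here they are free values."""
        e = np.exp(fusion_logits - np.max(fusion_logits))
        w = e / e.sum()                       # two weights summing to 1
        return w[0] * feat_a + w[1] * feat_b, w

    spatial = np.array([1.0, 2.0, 3.0])   # stand-in for spatial-image features
    band = np.array([3.0, 2.0, 1.0])      # stand-in for band-image features
    fused, w = adaptive_fusion(spatial, band, np.array([0.0, 0.0]))
    ```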

  • research-article
    Xuan Yu, Yaping Wu, Yan Bai, Nan Meng, Shuting Jin, Qingxia Wu, Lijuan Chen, Ningli Wang, Xiaosheng Song, Guofeng Shen, Meiyun Wang
    2025, 10(6): 1813-1828. https://doi.org/10.1049/cit2.70044

    Accurate genotyping and prognosis of glioma patients present significant clinical challenges, often depending on subjective judgement and insufficient scientific evidence. This study aims to develop a robust, noninvasive, preoperative multi-modal MRI-based transformer learning model to predict IDH genotype and glioma prognosis. This multi-centre study included 563 glioma patients to develop an interpretable classification model utilising various preoperative imaging sequences, including T1-weighted, T2-weighted, fluid-attenuated inversion recovery, contrast-enhanced T1-weighted and diffusion-weighted imaging. The model employs a multi-task learning framework to extract and fuse radiomic, deep learning and clinical features for IDH genotyping and glioma prognosis. Additionally, a multi-modal transformer strategy is integrated to analyse structural and functional MRI, thereby enhancing model performance. Experimental results indicate that the model demonstrates superior performance, surpassing previous research and other state-of-the-art methods, achieving an AUC of 91.40% on the IDH genotyping task and 93.37% on the glioma prognosis task. Group analysis reveals that the model is more sensitive to IDH-mutant cases and identifies low-risk groups more accurately than medium- or high-risk groups. By achieving accurate IDH genotyping and glioma prognosis through an effective classification method, this study offers valuable diagnostic insights for clinical practice and expedites treatment decisions.

  • research-article
    Xianhang Luo, Kai Zhang, Enqiang Zhu, Jin Xu
    2025, 10(6): 1829-1843. https://doi.org/10.1049/cit2.70055

    Due to their exceptional programmability, DNA molecules are widely employed in the design of molecular circuits for applications such as DNA computing, DNA storage and cancer diagnosis and treatment. The quality of DNA sequences directly determines the reliability of these molecular circuits. However, existing DNA encoding algorithms suffer from limitations such as reliance on Hamming distance and conflicts among multiple objectives, resulting in insufficient stability of the generated sequences. To address these issues, this paper proposes a thermodynamics-based multi-objective evolutionary optimisation algorithm (TEMOA). Its core innovations are as follows. First, a thermodynamics-based DNA encoding modelling strategy (TDEMS) is introduced, which simplifies the encoding process and significantly improves sequence quality by incorporating thermodynamic stability constraints. Second, two diversity optimisation strategies, the diversity assessment strategy (DAS) and the front equalisation nondominated sorting (FENS) strategy, are designed to enhance the algorithm's global search capability. Finally, a flexible fitness function design is incorporated to accommodate diverse user requirements. Experimental results demonstrate that TEMOA is more effective than state-of-the-art methods on challenging multi-objective optimisation problems, while the DNA sequences it generates are more reliable than those produced by traditional DNA encoding algorithms.
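    As background for the nondominated-sorting machinery that FENS builds on, Pareto dominance and first-front extraction can be sketched as follows (a generic sketch; the objective values are illustrative placeholders, not thermodynamic quantities from the paper):

    ```python
    def dominates(a, b):
        """True if objective vector a Pareto-dominates b (minimisation):
        no worse in every objective and strictly better in at least one."""
        return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))

    def nondominated_front(points):
        """Indices of the first (non-dominated) front, the starting point of
        any nondominated-sorting procedure such as FENS."""
        return [i for i, p in enumerate(points)
                if not any(dominates(q, p) for j, q in enumerate(points) if j != i)]

    objs = [(1, 4), (2, 2), (4, 1), (3, 3), (4, 4)]   # two objectives per sequence
    front = nondominated_front(objs)
    ```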

  • research-article
    Ying Chen, Peng Min, Huiling Chen, Cheng Tao, Zeye Long, Ali Asghar Heidari, Shuihua Wang, Yudong Zhang
    2025, 10(6): 1844-1866. https://doi.org/10.1049/cit2.70065

    Photovoltaic (PV) systems are electrical systems designed to convert solar energy into electrical energy. The power output of photovoltaic cells, the crucial components of PV systems, is influenced by harsh weather conditions, panel temperature and solar irradiance. Therefore, accurately identifying the parameters of PV models is essential for simulating, controlling and evaluating PV systems. In this study, we propose an enhanced weighted-mean-of-vectors optimisation (EINFO) for efficiently determining the unknown parameters in PV systems. EINFO introduces a Lambert W-based explicit objective function for the PV model, enhancing the computational accuracy of the algorithm's population fitness and addressing the difficulty metaheuristic algorithms face in accurately identifying unknown PV model parameters. We experimentally apply EINFO to three types of PV models (single-diode, double-diode and PV-module models) to validate its accuracy and stability in parameter identification. The results demonstrate that EINFO achieves root mean square errors (RMSEs) of 7.7301E-04, 6.8553E-04 and 2.0608E-03 for the single-diode, double-diode and PV-module models, respectively, surpassing the original INFO algorithm as well as other methods in convergence speed, accuracy and stability. Furthermore, comprehensive experiments on three commercial PV modules (ST40, SM55 and KC200GT) indicate that EINFO consistently maintains high accuracy across varying temperatures and irradiation levels. In conclusion, EINFO emerges as a highly competitive and practical approach for parameter identification in diverse types of PV models.
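    The Lambert W route to an explicit objective can be illustrated on the single-diode model (a standard closed form, sketched here with illustrative parameter values; this is not the paper's exact objective function):

    ```python
    import numpy as np
    from scipy.special import lambertw

    def single_diode_current(V, Iph, I0, Rs, Rsh, a):
        """Closed-form terminal current of the single-diode PV model.

        Solves  I = Iph - I0*(exp((V + I*Rs)/a) - 1) - (V + I*Rs)/Rsh
        explicitly with the Lambert W function, so a fitness function can
        evaluate candidate parameters without an inner iterative solver
        (a = n*Ns*Vt is the modified ideality factor).
        """
        k = 1.0 + Rs / Rsh
        A = Iph + I0 - V / Rsh
        t = (Rs * I0 / (a * k)) * np.exp(V / a + Rs * A / (a * k))
        return A / k - (a / Rs) * lambertw(t).real

    # Illustrative parameter values for a single cell (not from the paper)
    V, Iph, I0, Rs, Rsh, a = 0.4, 0.76, 3.2e-7, 0.036, 53.7, 0.0383
    I = single_diode_current(V, Iph, I0, Rs, Rsh, a)
    ```

    Substituting the returned current back into the implicit single-diode equation gives a residual at machine precision, which is what makes the explicit objective accurate for population-fitness evaluation.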

  • research-article
    Shiqi Yu, Luojun Lin, Yuanlong Yu
    2025, 10(6): 1867-1879. https://doi.org/10.1049/cit2.70060

    Continual learning aims to empower a model to learn new tasks continuously while reducing forgetting of previously learnt knowledge. When receiving streaming data that are not constrained by the independent and identically distributed (IID) assumption, continual learning transforms and leverages previously learnt knowledge through various methodologies to complete the learning of new tasks, enhancing the model's generalisation performance and learning efficiency across a sequence of tasks. However, class imbalance in continual learning scenarios critically undermines model performance. In particular, in the class-incremental scenario, class imbalance biases the model towards new task classes while degrading performance on previously learnt classes, leading to catastrophic forgetting. In this paper, a novel method based on balanced contrast is proposed for class-incremental continual learning. It uses gradient balancing to mitigate the impact of class imbalance in the class-incremental scenario, and leverages contrastive learning and gradient modifications to process data from different classes in a balanced way. The proposed method surpasses existing baseline approaches in the class-incremental learning scenario on standard image datasets such as CIFAR-100, CIFAR-10 and mini-ImageNet. The results reveal that the proposed method effectively mitigates catastrophic forgetting of previously learnt classes, markedly improving the efficacy of continual learning and offering a powerful solution for further advancing continual learning performance.
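    The effect of gradient balancing can be approximated with inverse-frequency class weights in the loss (a simplified stand-in for the paper's gradient-modification scheme; the class counts are illustrative):

    ```python
    import numpy as np

    def balanced_class_weights(counts):
        """Inverse-frequency class weights, scaled so the average weight over
        all samples is 1; each class then contributes the same total weight
        (and hence comparable gradient mass) to the loss."""
        counts = np.asarray(counts, dtype=float)
        return counts.sum() / (len(counts) * counts)

    def balanced_ce(logits, labels, weights):
        """Cross-entropy where each sample's loss is scaled by its class weight."""
        z = logits - logits.max(axis=1, keepdims=True)
        logp = z - np.log(np.exp(z).sum(axis=1, keepdims=True))
        per_sample = -logp[np.arange(len(labels)), labels]
        return float((weights[labels] * per_sample).mean())

    # e.g. 200 new-class samples vs. 20 replayed old-class samples
    counts = np.array([200, 20])
    w = balanced_class_weights(counts)
    ```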

  • research-article
    Li An, Pengbo Zhou, Mingquan Zhou, Yong Wang, Guohua Geng, Yangyang Liu
    2025, 10(6): 1880-1892. https://doi.org/10.1049/cit2.70062

    Point cloud processing plays a crucial role in tasks such as point cloud classification, part segmentation and semantic segmentation. However, existing processing frameworks face several challenges, such as recognising features in irregular and complex spatial structures, large attention parameter volumes and limited generalisation across different scenes. We propose a geometry transformer (PointGeo) method that addresses these concerns through point cloud analysis. The method utilises a geometry transformation network to process point cloud data, effectively capturing both local and global features and enhancing the modelling capability for irregular structures. We extensively test the method on multiple datasets, including ModelNet and ScanObjectNN for point cloud classification, ShapeNet for part segmentation and S3DIS and SemanticKITTI for semantic segmentation. Experimental results show that our approach delivers outstanding performance across all tasks, validating its effectiveness and generalisation capability in handling point cloud data.

  • research-article
    Shijie Liu, Chenqi Luo, Kang Yan, Feiwei Qin, Ruiquan Ge, Yong Peng, Jie Huang, Nenggan Zheng, Yongquan Zhang, Changmiao Wang
    2025, 10(6): 1893-1903. https://doi.org/10.1049/cit2.70070

    In the field of infrared small target detection (ISTD), the ability to detect targets in dim environments is critical, as it improves target recognition at night and in harsh weather. The blurry contours, small size and sparse distribution of infrared small targets make them difficult to identify against cluttered backgrounds. Existing methodologies fall short of the requisites for detecting and categorising infrared small targets. To address these challenges and enhance the precision of small-object detection and classification, this paper introduces an innovative approach called location refinement and adjacent feature fusion YOLO (LA-YOLO), which enhances feature extraction by integrating a multi-head self-attention (MSA) mechanism. We improve the feature fusion method to merge adjacent features and enhance information utilisation in the path aggregation network (PAN). Lastly, we introduce supervision on the target centre points in the detection network. Empirical results on publicly available datasets demonstrate that LA-YOLO achieves an impressive average precision (AP) of 92.46% on IST-A and a mean average precision (mAP) of 84.82% on FLIR. These results surpass those of contemporary state-of-the-art detectors, striking a balance between precision and speed. LA-YOLO emerges as a viable and efficacious solution for ISTD, making a substantial contribution to the progression of infrared imagery analysis. The code is available at https://github.com/liusjo/LA-YOLO.
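    Centre-point supervision is often implemented by regressing a Gaussian heatmap whose peaks sit at object centres; whether LA-YOLO uses exactly this form is an assumption, but the sketch below shows the usual target construction for small objects.

    ```python
    import numpy as np

    def centre_heatmap(h, w, centres, sigma=2.0):
        """Gaussian heatmap target for centre-point supervision.
        `centres` is a list of (row, col) object centres; each contributes
        a Gaussian bump, and overlapping bumps are merged with max."""
        ys, xs = np.mgrid[0:h, 0:w]
        hm = np.zeros((h, w))
        for cy, cx in centres:
            g = np.exp(-((ys - cy) ** 2 + (xs - cx) ** 2) / (2 * sigma ** 2))
            hm = np.maximum(hm, g)
        return hm
    ```

    A small `sigma` keeps the supervision signal tightly localised, which suits targets that occupy only a few pixels in infrared imagery.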

  • research-article
    Shumeng He, Jie Shen, Houqun Yang, Gaodi Xu, Laurence T. Yang
    2025, 10(6): 1904-1918. https://doi.org/10.1049/cit2.70080

    Change detection identifies dynamic changes in surface cover and feature status by comparing remote sensing images acquired at different times, and has wide application value in disaster early warning, urban management and ecological monitoring. Mainstream datasets are dominated by long-term data; to support short-term change detection, we collected a new dataset, HNU-CD (Hainan University change detection), which contains small and hard-to-identify change regions. A time correlation network (TCNet) is also proposed to address these challenges. First, foreground information is enhanced by interactively modelling foreground relations, while background noise is smoothed. Secondly, the temporal correlation between bi-temporal images is exploited to refine the feature representation and minimise false alarms caused by irrelevant changes. Finally, a U-Net-inspired architecture is adapted for dense upsampling to preserve details. TCNet demonstrates excellent performance on both HNU-CD and three widely used public datasets, indicating enhanced generalisation capability. The ablation experiments clearly demonstrate that temporal correlation modelling reduces the impact of pseudo-changes.
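    A minimal illustration of temporal correlation between bi-temporal feature maps — a generic cosine-similarity change score, not TCNet's actual module: where features from the two dates stay similar, the change score is suppressed, which is the mechanism for filtering pseudo-changes.

    ```python
    import numpy as np

    def change_map(feat_t1, feat_t2, eps=1e-8):
        """Per-pixel change score from bi-temporal feature maps of shape
        (H, W, C): 1 - cosine similarity, so 0 = unchanged, 1 = changed.
        A toy stand-in for temporal correlation modelling."""
        dot = (feat_t1 * feat_t2).sum(axis=-1)
        norm = np.linalg.norm(feat_t1, axis=-1) * np.linalg.norm(feat_t2, axis=-1)
        sim = dot / (norm + eps)
        return 1.0 - sim
    ```

    Illumination or seasonal shifts that preserve feature direction yield high similarity and are suppressed, while genuine surface changes rotate the feature vector and score near 1.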

  • research-article
    Xinran Zhou, Qinghua Zhang, Chengying Wu, Qin Xie, Guoyin Wang
    2025, 10(6): 1919-1933. https://doi.org/10.1049/cit2.70050

    Density peaks clustering (DPC) is a density-based clustering algorithm that identifies cluster centres by constructing a decision graph and then allocates the remaining non-centre points to clusters. Although DPC does not require the number of clusters to be specified in advance, its local density estimate is sensitive to the distribution of data points, and allocating non-centre points is prone to chained errors, where one misallocation propagates to subsequent points. Hence, a novel DPC based on weighted density estimation and multicluster merging (DPC-WDMM) is proposed. Firstly, a novel local density is defined via the nearest-neighbour relationship. Then, to avoid incorrect selection of cluster centres, all data points with relatively high local density within a local range are marked. Finally, these points represent microclusters that are merged to obtain the final clustering results. The performance of this novel algorithm has been demonstrated through experimental results on several datasets.
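    The two quantities behind DPC's decision graph can be sketched generically: a nearest-neighbour-based local density ρ (the paper defines its own weighted variant; the exponential kNN formula below is ours) and δ, each point's distance to the nearest point of higher density. Cluster centres are the points where both ρ and δ are large.

    ```python
    import numpy as np

    def knn_local_density(points, k=3):
        """Local density from k-nearest-neighbour distances: points whose
        neighbours are close receive high density. A generic kNN density,
        in the spirit of (but not identical to) the paper's definition."""
        d = np.linalg.norm(points[:, None] - points[None, :], axis=-1)
        knn = np.sort(d, axis=1)[:, 1:k + 1]  # drop the zero self-distance
        return np.exp(-knn.mean(axis=1))

    def delta_distance(points, rho):
        """Distance to the nearest point of strictly higher density; the
        global density peak gets its maximum distance instead."""
        d = np.linalg.norm(points[:, None] - points[None, :], axis=-1)
        delta = np.zeros(len(points))
        for i in range(len(points)):
            higher = np.where(rho > rho[i])[0]
            delta[i] = d[i, higher].min() if len(higher) else d[i].max()
        return delta
    ```

    On the decision graph, ranking points by γ = ρ · δ surfaces the centres: for two well-separated blobs, the two densest points dominate γ while everything else collapses towards zero.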