Jan 2026, Volume 20 Issue 1
    

  • Code & Data
  • LETTER
    Guosen HOU, Yuxiao FEI, Hao GAO, Xiaoquan SU
  • Excellent Young Computer Scientists Vision
  • LETTER
    Haonan LUO, Sijia LI, Yijie ZENG, Zihang WANG, Botao JIANG, Xiruo JIANG
  • Artificial Intelligence
  • LETTER
    Qihuang ZHONG, Kang WANG, Ziyang XU, Liang DING, Juhua LIU, Bo DU
  • RESEARCH ARTICLE
    Xiu-Lin ZHENG, Pei-Pei LI, Zan ZHANG, Xin-Dong WU

    Knowledge graphs (KGs) often suffer from incompleteness, which limits their performance in practice, where a vast number of entities may co-exist. To address this, knowledge graph completion (KGC) has been proposed to infer the missing links between entities. Within KGC, reasoning over relation paths in incomplete KGs is a popular research topic. However, several issues remain to be solved, such as path noise, the path sparsity of KGs, the ambiguity of the inferred relation, and the lack of explainability in path representation. To address these challenges simultaneously, we propose a novel rule-guided link prediction model with path noise avoidance and disambiguation of the inferred relation, termed RPND. Specifically, we use a path selection strategy to filter noisy paths and reduce the interference of path noise. To alleviate the path sparsity of KGs, we leverage the path-overlapping feature of similar relations and combine them based on semantic similarity. To resolve the ambiguity of the inferred relation, we draw on language models such as the Transformer and introduce position embeddings that reflect the order of relations along a path when learning its representation. Meanwhile, we employ logic rules to compose paths at the semantic level to enhance the explainability of path representation. Extensive experiments on benchmark datasets demonstrate the superiority of the proposed RPND model over state-of-the-art methods.

  • RESEARCH ARTICLE
    Jun-Hao FENG, Xia-Bing ZHOU, Wen-Liang CHEN, Min ZHANG

    Fine-grained emotion cause extraction, known as Causal Span Extraction (CSE), is challenging because it requires a more detailed analysis of contextual information. Previous research on CSE has focused on the powerful semantic representation capability of PLMs while overlooking the semantic coherence of the speaker and content when emotions arise in conversation. In this paper, we introduce a novel method by learning contextual information and enhancing the consistency of cross-task alignment. Specifically, we integrate coreference resolution into the attention mechanism to capture coreference-aware semantic correlations and employ a position relation strategy at both the utterance and token levels to understand the contextual information. Furthermore, by incorporating auxiliary tasks and a novel cross-task alignment approach, we reduce inconsistent predictions across tasks, thereby enabling a comprehensive, multi-dimensional comprehension of conversations. Our method demonstrates a marked improvement over current state-of-the-art models, evidenced by superior performance on two benchmark datasets.

  • RESEARCH ARTICLE
    Ang LI, Yawen LI, Zhe XUE

    Previous federated learning methods primarily addressed challenges involving Euclidean data, such as images and text, where relationships between data points are linear. However, information networks, as non-Euclidean data, inherently exhibit data heterogeneity. This heterogeneity is further amplified in federated learning environments, where data from multi-party information networks introduces even greater variability. Moreover, the contributions of multi-party information networks to the federated learning process are dynamic. To address these challenges, we propose an Information Network representation method based on Federated Self-adaptive learning (FedSIN), which leverages the importance of neighboring nodes in the network to learn node representations and performs adaptive federated model aggregation. Specifically, FedSIN utilizes the self-attention mechanism of the graph attention network to capture the significance of neighbor nodes’ influence on each node, enabling effective aggregation of neighbor node information for improved node representation. Additionally, FedSIN designs an adaptive federated model aggregation mechanism to evaluate and incorporate the contributions of different clients based on their performance in each communication round. Experimental results on three public datasets demonstrate the superiority of our proposed FedSIN over state-of-the-art information network representation methods.
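
    The idea of weighting clients by their per-round performance can be illustrated with a minimal Python sketch. It assumes, purely for illustration, that each client reports a validation score per communication round and that the server converts these scores into aggregation weights with a softmax. The function names, the temperature parameter, and the softmax choice are hypothetical and are not taken from FedSIN.

```python
import math


def contribution_weights(scores, temperature=1.0):
    """Turn per-client performance scores into weights that sum to 1.

    Assumption: a softmax over scores; the abstract does not specify
    the actual weighting rule.
    """
    top = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp((s - top) / temperature) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def aggregate(client_params, scores, temperature=1.0):
    """Weighted average of client parameter vectors (lists of floats)."""
    weights = contribution_weights(scores, temperature)
    dim = len(client_params[0])
    return [
        sum(w * params[i] for w, params in zip(weights, client_params))
        for i in range(dim)
    ]


if __name__ == "__main__":
    # Three hypothetical clients with two parameters each.
    params = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
    round_scores = [0.60, 0.80, 0.70]
    print(contribution_weights(round_scores))
    print(aggregate(params, round_scores))
```

    With equal scores the sketch reduces to plain parameter averaging; a lower temperature concentrates weight on the best-performing clients.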

  • RESEARCH ARTICLE
    Biao ZHU, Jun ZHANG, Sirui ZHAO, Zhengye ZHANG, Enhong CHEN

    With the increasing frequency of natural disasters and health emergencies, wearable infrared thermal imaging devices are becoming more prevalent in fire protection and medical fields. However, these devices often face imaging performance challenges such as insufficient contrast, dark areas and blurred edges, which significantly limit their practical effectiveness. To tackle these challenges, we propose a novel unsupervised lightweight 3D convolutional network (UL3DCN) specifically designed for enhancing infrared images on wearable devices. In this framework, the task of infrared image enhancement is conceptualized as generating high dynamic range infrared images from the corresponding temperature sequences during thermal equilibrium. To achieve this, we first design a learnable dynamic filtering module tailored for simulating a series of infrared image sequences under varying temperature differences. This module extends a single image from the spatial domain into the spatio-temporal domain. Subsequently, we employ a lightweight 3D convolution module to effectively extract spatio-temporal information from the image sequence. Finally, inspired by Zero-DCE, we utilize the extracted information to estimate pixel values and high-order curves, thereby enhancing the infrared images. Comprehensive experimental results demonstrate that our method achieves outstanding performance and real-time capabilities. Additionally, the proposed UL3DCN model has been successfully integrated into a wearable infrared firefighting mask.
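
    For readers unfamiliar with Zero-DCE, the high-order curve it introduced can be sketched in a few lines of Python. The sketch applies the iterative quadratic curve LE_n(x) = LE_{n-1}(x) + a_n * LE_{n-1}(x) * (1 - LE_{n-1}(x)) to pixel intensities normalized to [0, 1]. How UL3DCN estimates the curve parameters and combines them with its estimated pixel values is not described in the abstract, so the parameter values and function names below are illustrative assumptions only.

```python
def enhance_pixel(x, alphas):
    """Apply a Zero-DCE-style high-order curve to one pixel.

    x      : intensity in [0, 1]
    alphas : one curve parameter per iteration, each in [-1, 1]
    """
    for a in alphas:
        if not -1.0 <= a <= 1.0:
            raise ValueError("curve parameters must lie in [-1, 1]")
        # One quadratic curve step; keeps x inside [0, 1].
        x = x + a * x * (1.0 - x)
    return x


def enhance_image(image, alpha_maps):
    """Apply pixel-wise curves to a 2D image (list of rows).

    alpha_maps holds one parameter map per iteration, each with the
    same shape as the image (hypothetical layout for this sketch).
    """
    return [
        [
            enhance_pixel(pixel, [amap[r][c] for amap in alpha_maps])
            for c, pixel in enumerate(row)
        ]
        for r, row in enumerate(image)
    ]


if __name__ == "__main__":
    dark = [[0.10, 0.20], [0.30, 0.40]]
    maps = [[[0.8, 0.8], [0.8, 0.8]]] * 4  # four identical iterations
    print(enhance_image(dark, maps))
```

    Positive parameters brighten dark regions, negative parameters darken bright ones, and the endpoints 0 and 1 are left unchanged.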

  • RESEARCH ARTICLE
    Wenyu MAO, Jiancan WU, Haoyang LIU, Yongduo SUI, Xiang WANG

    Graph out-of-distribution (OOD) generalization remains a major challenge in graph learning since graph neural networks (GNNs) often suffer from severe performance degradation under distribution shifts. Invariant learning, aiming to extract invariant features across varied distributions, has recently emerged as a promising approach for OOD generalization. Despite the great success of invariant learning in OOD problems for Euclidean data (e.g., images), the exploration within graph data remains constrained by the complex nature of graphs. The invariant features at both the attribute and structural levels, combined with the absence of prior knowledge regarding environmental factors, make the invariance and sufficiency conditions of invariant learning hard to satisfy on graph data. Existing studies, such as data augmentation or causal intervention, either suffer from disruptions to invariance during the graph manipulation process or face reliability issues due to a lack of supervised signals for causal parts. In this work, we propose a novel framework, called Invariant Graph Learning based on Information bottleneck theory (InfoIGL), to extract the invariant features of graphs and enhance models’ generalization ability to unseen distributions. Specifically, InfoIGL introduces a redundancy filter to compress task-irrelevant information related to environmental factors. In cooperation with our designed multi-level contrastive learning, we maximize the mutual information among graphs of the same class in the downstream classification tasks, preserving invariant features for prediction to a great extent. An appealing feature of InfoIGL is its strong generalization ability without depending on supervised signals of invariance. Experiments on both synthetic and real-world datasets demonstrate that our method achieves state-of-the-art performance under OOD generalization for graph classification tasks. The source code is available at github.com/maowenyu-11/InfoIGL.
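
    Contrastive objectives of this kind are commonly built on the InfoNCE loss, which is a standard lower-bound estimator of mutual information. The Python sketch below shows that generic loss for a single anchor embedding, with a same-class graph embedding as the positive and other-class embeddings as negatives. It is not InfoIGL's multi-level loss, whose exact form the abstract does not give; the function names, the cosine similarity, and the temperature value are assumptions made for illustration.

```python
import math


def cosine(u, v):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def info_nce(anchor, positive, negatives, temperature=0.5):
    """Generic InfoNCE loss for one anchor.

    The loss is small when the anchor is closer to the positive
    (same class) than to the negatives (other classes).
    """
    pos = math.exp(cosine(anchor, positive) / temperature)
    neg = sum(math.exp(cosine(anchor, n) / temperature) for n in negatives)
    return -math.log(pos / (pos + neg))


if __name__ == "__main__":
    anchor = [1.0, 0.0]
    same_class = [0.9, 0.1]
    other_class = [[0.0, 1.0], [-1.0, 0.2]]
    print(info_nce(anchor, same_class, other_class))
```

    Minimizing this loss over many anchors pulls same-class embeddings together and pushes different-class embeddings apart.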

  • RESEARCH ARTICLE
    Jiasheng SI, Yingjie ZHU, Rui WANG, Wenpeng LU, Yulan HE, Deyu ZHOU

    Given a controversial target, such as “nuclear energy”, information-seeking argument mining aims to identify argumentative text from diverse sources. The main challenge in this task is threefold: the insufficiency of contextual information on targets, cross-domain adaptation across varying targets, and implicit argumentative information within the argument. Current approaches primarily address the first two challenges by improving the integration of target-related semantic information with arguments, while there has been little work on modeling all three aspects. To address these challenges, inspired by the potential capability of the neural topic model for mining the local and global topic information contained in the dataset, we propose a novel topic-enhanced information-seeking argument mining approach by leveraging the mutual interaction between the neural topic model and the language model. Specifically, (i) the global topic information is extracted from the corpora to encapsulate the common knowledge across different targets for solving the cross-domain adaptation; (ii) to capture the contextual information on targets, the target is augmented by target-aware subtopics derived from the global topic-word distribution; (iii) to capture the implicit argumentative information within the argument, the local topic information is captured by minimizing the similarity between its local topic distribution and its semantic representation through mutual learning. Experimental results show the superiority of the proposed model compared to the state-of-the-art baselines in both in-domain and cross-domain scenarios.

  • RESEARCH ARTICLE
    Zhi ZHENG, Zhao-Peng QIU, Chen ZHU, Xiao HU, Li-Kang WU, Yang SONG, Heng-Shu ZHU, Hui XIONG

    With the rapid development of Large Language Models (LLMs), an increasing number of researchers are turning their attention to Generative Recommender Systems (GRSs), which are not constrained by strict candidate sets and are more conducive to exploring user interests. Existing LLM-based GRSs mainly utilize Supervised Fine-Tuning (SFT) to endow LLMs with the capability to generate candidate items, and further employ similarity-based grounding methods to map the generated results to real-world items. However, SFT-based training methods are insufficient for LLMs to adequately grasp the knowledge embedded in complex interactive behaviors, and similarity-based grounding methods also face challenges for long text matching. Therefore, in this paper, we propose generative job recommendation based on large language models (GIRL). Specifically, we propose to train a model which can evaluate the matching degree between curriculum vitae (CV) and job description (JD) as a reward model, and we use a proximal policy optimization (PPO)-based reinforcement learning (RL) method to fine-tune the LLM-based recommender. Moreover, we propose a model-based grounding method for JD grounding. Extensive experiments on two real-world datasets demonstrate the superiority of the proposed model compared to seven baseline methods.
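
    The PPO-based fine-tuning step relies on the standard clipped surrogate objective, sketched below in Python. In this setting the advantages would be derived from the CV-JD matching reward model; the log-probabilities and advantage values in the example are made up, and the function name and the clipping range of 0.2 are illustrative assumptions rather than details from the paper.

```python
import math


def ppo_clipped_objective(logp_new, logp_old, advantages, clip_eps=0.2):
    """Mean PPO clipped surrogate objective (to be maximized).

    logp_new / logp_old : log-probabilities of the sampled outputs under
                          the current and the data-collecting policy
    advantages          : advantage estimates for the same samples
    """
    total = 0.0
    for new, old, adv in zip(logp_new, logp_old, advantages):
        ratio = math.exp(new - old)  # probability ratio
        clipped = max(1.0 - clip_eps, min(1.0 + clip_eps, ratio))
        # Take the more pessimistic of the two estimates.
        total += min(ratio * adv, clipped * adv)
    return total / len(advantages)


if __name__ == "__main__":
    # Hypothetical values for three generated job descriptions.
    new_lp = [-1.0, -2.0, -0.5]
    old_lp = [-1.2, -1.5, -0.5]
    advs = [0.8, -0.3, 0.1]
    print(ppo_clipped_objective(new_lp, old_lp, advs))
```

    Clipping removes the incentive to move the policy far from the one that generated the samples, which helps keep reinforcement-learning fine-tuning stable.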

  • RESEARCH ARTICLE
    Wei ZOU, Ziyuan ZHUANG, Xiang GENG, Shujian HUANG, Jia LIU, Jiajun CHEN

    Paraphrase generation strives to generate high-quality and diverse expressions of a given text, a domain where diffusion models excel. Though SOTA diffusion generation reconciles generation quality and diversity, textual diffusion suffers from a truncation issue that hinders efficiency and quality control. In this work, we propose Latent Diffusion Paraphraser (LDP), a novel paraphrase generation method that models a controllable diffusion process given a learned latent space. LDP achieves superior generation efficiency compared to its diffusion counterparts. It can facilitate only input segments to ensure paraphrase semantics, improving the results without external features. Experiments show that LDP better reconciles paraphrase generation quality and diversity than baselines. Further analysis shows that our method is also helpful for other similar text generation tasks and for domain adaptation.

  • REVIEW ARTICLE
    Yanan ZHANG, Jinqing ZHANG, Zengran WANG, Junhao XU, Di HUANG

    In recent years, autonomous driving has garnered escalating attention for its potential to relieve drivers’ burdens and improve driving safety. Vision-based 3D occupancy prediction, which predicts the spatial occupancy status and semantics of 3D voxel grids around the autonomous vehicle from image inputs, is an emerging perception task suitable for cost-effective perception systems in autonomous driving. Although numerous studies have demonstrated the advantages of 3D occupancy prediction over object-centric perception tasks, there is still a lack of a dedicated review focusing on this rapidly developing field. In this paper, we first introduce the background of vision-based 3D occupancy prediction and discuss the challenges in this task. Second, we conduct a comprehensive survey of the progress in vision-based 3D occupancy prediction from three aspects: feature enhancement, deployment friendliness, and label efficiency, and provide an in-depth analysis of the potential and challenges of each category of methods. Finally, we present a summary of prevailing research trends and propose some inspiring future outlooks. To provide a valuable reference for researchers, a regularly updated collection of related papers, datasets, and code is organized at github.com/zya3d/Awesome-3D-Occupancy-Prediction.

  • Theoretical Computer Science
  • LETTER
    Jing CAO, Yicheng PAN, Pan PENG