2025-12-15 2025, Volume 19 Issue 12

  • LETTER
    Zhiqing CUI, Fan MENG, Jingjia LUO
  • LETTER
    Qiang WANG, Kele XU, Dawei FENG, Bo DING, Huaimin WANG
  • LETTER
    Cong WANG, Zhilong MI, Ziqiao YIN, Binghui GUO
  • LETTER
    Bin-Bin JIA, Jun-Ying LIU, Min-Ling ZHANG
  • LETTER
    Yan ZHUANG, Huiwen WANG, Shuai MA, Yang LIU
  • RESEARCH ARTICLE
    Zhiwei SUN, Jun BAI, Zhuofan CHEN, Chen LI, Wenge RONG, Zhang XIONG

    Text classification is a pivotal task in natural language understanding, and its performance has seen remarkable advancements with the rise of Pre-trained Language Models (PLMs). Recently, the proliferation of PLMs has made it increasingly challenging to choose the most suitable model for a given dataset. Since fine-tuning every candidate model is impractical, Transferability Estimation (TE) has become a promising solution for efficient model selection. Unlike current TE methods that focus solely on fixed, hard class assignments to evaluate the quality of model-encoded features, our approach further takes into account the inter-sample and inter-model variations represented by soft class assignments. We achieve this by utilizing class embeddings to predict posterior class assignments, with the logarithm of the maximum posterior evidence serving as the transferability score. Moreover, we find that restricting the computation to an informative sub-space of the dataset leads to more accurate soft class assignments, and we annotate informative samples efficiently by eliciting the judging ability of large language models. The resulting posterior evidence over the informative sub-space, LogIPE, enables us to capture subtle differences between models and enhances the accuracy of model selection, as validated by extensive experiments on a wide range of text classification datasets and candidate PLMs.

  • RESEARCH ARTICLE
    Shikai CHEN, Jin YUAN, Yang ZHANG, Zhongchao SHI, Jianping FAN, Xin GENG, Yong RUI

    Recent works in Unsupervised Domain Adaptation mainly focus on either divergence-based or adversarial methods. Divergence-based approaches minimize domain discrepancy by selecting an appropriate divergence measure, although the optimal choice can be task-specific in practice. Adversarial methods, on the other hand, aim to extract domain-invariant features by enforcing indistinguishability between domains in a min-max adversarial framework, but they neglect correlations among samples. To overcome this limitation, we propose a novel adversarial domain adaptation framework that leverages the collective assumption to model and exploit higher-order interactions among samples. By capturing these collective domain features, our method achieves a more robust domain alignment, demonstrating enhanced resilience to noise and domain ambiguity. Furthermore, experimental results demonstrate that our approach achieves consistent improvements over conventional adversarial training techniques and can be integrated with existing domain adaptation strategies in a plug-and-play manner.

  • REVIEW ARTICLE
    Xiangfu MENG, Shuonan SUN, Xiaoyan ZHANG, Qiangkui LENG, Jinfeng FANG

    With the rapid development of the Global Positioning System (GPS) and the Global System for Mobile Communications (GSM), and the widespread use of mobile devices, a massive amount of trajectory data has been generated. Current trajectory data processing methods typically require input in the form of fixed-length vectors, making it crucial to convert variable-length trajectory data into fixed-length, low-dimensional embedding vectors. Trajectory representation learning aims to transform trajectory data into more expressive and interpretable representations. This paper provides a comprehensive review of the research progress, methodologies, and applications of trajectory representation learning. First, it categorizes and introduces the key techniques of trajectory representation learning and summarizes the available public trajectory datasets. Then, it classifies trajectory representation learning methods by downstream task, with a focus on their principles, advantages, limitations, and application scenarios in trajectory similarity computation, similar trajectory search, trajectory clustering, and trajectory prediction. Additionally, representative model structures and principles for each task are analyzed, along with the characteristics and advantages of the different methods. Finally, the challenges faced by current trajectory representation learning methods are analyzed, including data sparsity, multimodality, model optimization, and privacy protection, and potential research directions and methodologies to address these challenges are explored.

  • RESEARCH ARTICLE
    Lili ZHAO, Qi LIU, Wei CHEN, Liyi CHEN, Ruijun SUN, Min HOU, Yang WANG, Shijin WANG, Pingping REN, Jiafeng ZHOU

    Empirical Risk Minimization (ERM) models often rely on spurious correlations between features and labels during the learning process, leading to shortcut learning behavior that undermines robust generalization. Current research mainly targets identifying or mitigating a single shortcut; however, in real-world scenarios, cues within the data are diverse and unknown. In empirical studies, we reveal that models rely more on strong shortcuts than on weak ones, with their performance under multiple shortcuts typically falling between the performance levels observed under each individual shortcut. To address these challenges, we propose MiMu, a novel method integrated with Transformer-based ERMs and designed to Mitigate Multiple shortcut learning behavior, which incorporates a self-calibration strategy and a self-improvement strategy. In the source model, we first propose the self-calibration strategy to prevent the model from relying on shortcuts and making overconfident predictions. Then, we design the self-improvement strategy in the target model to further reduce the reliance on multiple shortcuts. The random mask strategy randomly masks a portion of the attention positions to diversify the focus of the target model, avoiding fixation on a fixed region. Meanwhile, the adaptive attention alignment module aligns the attention weights with those of the calibrated source model, without the need for post-hoc attention maps or supervision. Finally, extensive experiments on Natural Language Processing (NLP) and Computer Vision (CV) tasks demonstrate the effectiveness of MiMu in improving robust generalization.
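
    The random mask strategy described in the abstract above can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `random_attention_mask`, the `mask_ratio` parameter, the renormalization step, and the rule that every query keeps at least one key are all assumptions made for illustration, and the input is assumed to be a 2-D matrix of strictly positive softmax attention weights.

```python
import numpy as np

def random_attention_mask(attn, mask_ratio=0.2, rng=None):
    """Randomly zero out attention positions and renormalize each row.

    attn: 2-D array (queries x keys) of strictly positive attention weights.
    mask_ratio: probability of masking each position (hypothetical parameter).
    """
    rng = np.random.default_rng() if rng is None else rng
    attn = np.asarray(attn, dtype=float)

    # Keep each position independently with probability 1 - mask_ratio.
    keep = rng.random(attn.shape) >= mask_ratio

    # Assumption: force one random key per query to stay unmasked,
    # so that no row becomes all zeros.
    rows = np.arange(attn.shape[0])
    keep[rows, rng.integers(0, attn.shape[1], size=attn.shape[0])] = True

    masked = attn * keep
    # Renormalize so each query's remaining weights sum to 1.
    return masked / masked.sum(axis=-1, keepdims=True)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    scores = rng.normal(size=(4, 6))
    attn = np.exp(scores) / np.exp(scores).sum(axis=-1, keepdims=True)
    print(random_attention_mask(attn, mask_ratio=0.3, rng=rng).round(3))
```

    Because a different subset of positions is dropped on each call, repeated calls spread the remaining attention mass over different regions, which is the diversification effect the abstract refers to.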

  • RESEARCH ARTICLE
    Baoxin WANG, Yumeng LUO, Yixuan WANG, Dayong WU, Wanxiang CHE, Shijin WANG

    The primary objective of Chinese grammatical error correction (CGEC) is to detect and correct errors in Chinese sentences. Recent research shows that large language models (LLMs) have been applied to CGEC with significant results. For LLMs, selecting appropriate reference examples can help improve their performance. However, existing methods predominantly rely on text similarity for example retrieval, a strategy that frequently mismatches actual error patterns and retrieves lexically similar yet grammatically irrelevant sentences. To address this problem, we propose a method named RE2, which retrieves appropriate examples with explanations of grammatical errors. Instead of using text similarity of the input sentence, we use explanations of grammatical errors to select reference examples, which are used by LLMs to improve the performance of CGEC. We conduct experiments on two CGEC datasets and create a high-quality grammatical error explanation (GEE) dataset, which is not only used in our research but also serves as a valuable resource for future studies in both CGEC and GEE. The experimental results on the two datasets indicate that our proposed method effectively improves the performance of CGEC.
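
    The idea of selecting reference examples by the similarity of error explanations, rather than by the similarity of the input text, can be sketched as follows. This is only an illustration under stated assumptions and not the RE2 method itself: the bag-of-words cosine similarity is a stand-in for whatever retriever the authors use, the function names and the toy English example pool are hypothetical, and the query explanation is assumed to have been produced beforehand (for instance by an LLM).

```python
import math
from collections import Counter

def cosine(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0

def retrieve_by_explanation(query_explanation, pool, k=2):
    """Return the k pool examples whose explanations best match the query.

    pool: list of dicts with an "explanation" field (hypothetical schema).
    """
    q = Counter(query_explanation.lower().split())
    scored = [
        (cosine(q, Counter(ex["explanation"].lower().split())), ex) for ex in pool
    ]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [ex for _, ex in scored[:k]]

# Toy pool: each reference example is stored with an error explanation.
POOL = [
    {"id": 1, "explanation": "missing measure word before noun"},
    {"id": 2, "explanation": "wrong word order of adverb and verb"},
    {"id": 3, "explanation": "redundant particle after verb"},
]

if __name__ == "__main__":
    hits = retrieve_by_explanation("measure word is missing before the noun", POOL)
    print([ex["id"] for ex in hits])
```

    The retrieved examples would then be placed in the LLM prompt as reference corrections. Matching on explanations means that two sentences with little lexical overlap can still be paired when they share the same kind of grammatical error.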

  • RESEARCH ARTICLE
    Ke XU, Guangyan ZHOU

    In this paper, we identify the distinction between non-brute-force computation and brute-force computation as the most fundamental problem in computer science. Subsequently, we prove, by the diagonalization method, that constructed self-referential CSPs cannot be solved by non-brute-force computation, which is stronger than P ≠ NP. This constructive method for proving impossibility results is very different from existing approaches in computational complexity theory (and missing from them), but aligns with Gödel’s technique for proving logical impossibility. Just as Gödel showed that proving formal unprovability is feasible in mathematics, our results show that proving computational hardness is not hard in mathematics. Specifically, proving lower bounds for many problems, such as 3-SAT, can be challenging because these problems have various effective strategies available to avoid exhaustive search. However, for self-referential examples that are extremely hard, exhaustive search becomes unavoidable, making its necessity easier to prove. Consequently, this makes the separation between non-brute-force computation and brute-force computation much simpler than that between P and NP. Finally, our results are akin to Gödel’s incompleteness theorem, as they reveal the limits of reasoning and highlight the intrinsic distinction between syntax and semantics.

  • REVIEW ARTICLE
    Shao-Jie QIAO, Han-Lin FAN, Nan HAN, Lan DU, Yu-Han PENG, Rong-Min TANG, Xiao QIN

    Artificial intelligence-enabled database technology, known as AI4DB (Artificial Intelligence for Databases), is an active research area attracting significant attention and innovation. This survey first introduces the background of learning-based database techniques. It then reviews advanced query optimization methods for learning databases, focusing on four popular directions: cardinality/cost estimation, learning-based join order selection, learning-based end-to-end optimizers, and text-to-SQL models. Cardinality/cost estimation is classified into supervised and unsupervised methods based on learning models, with illustrative examples provided to explain the working mechanisms. Detailed descriptions of various query optimizers are also given to elucidate the working mechanisms of each component in learning query optimizers. Additionally, we discuss the challenges and development opportunities of learning query optimizers. The survey further explores text-to-SQL models, a new research area within AI4DB. Finally, we consider the future development prospects of learning databases.

  • REVIEW ARTICLE
    Shuyue WEI, Yongxin TONG, Zimu ZHOU, Yi XU, Jingkai GAO, Tongyu WEI, Tianran HE, Weifeng LV

    Reasoning has long been regarded as a distinctive hallmark of human cognition, and recent advances in the artificial intelligence community have increasingly focused on reasoning large language models (rLLMs). However, due to strict privacy regulations, domain-specific reasoning knowledge is often distributed across multiple data owners, limiting the ability of rLLMs to fully leverage such valuable resources. In this context, federated learning (FL) has gained increasing attention in both academia and industry as a promising privacy-preserving paradigm for addressing the challenges in the data-efficient training of rLLMs.

    In this paper, we conduct a comprehensive survey on federated rLLMs and propose a novel taxonomy based on training signals, including training signals derived from raw data, learned representations, and preference feedback. For each category, we highlight emerging trends in how FL is used to enhance the reasoning capabilities of rLLMs, considering model effectiveness, communication cost, and privacy preservation. Finally, we envision future research directions and challenges based on insights from existing studies.

  • RESEARCH ARTICLE
    Yang LIU, Xiaoxia JIANG, Yuanning CUI, Yu WANG, Wei HU

    Heterogeneous graphs organize data with nodes and edges, and have been widely used in various graph-centric applications. Often, some data are omitted during manual construction, leading to incomplete graphs and degraded performance on downstream tasks. Existing methods recover the missing data based on the data already within a single graph, neglecting the fact that graphs from different sources share some common nodes due to scope overlap. In this paper, we concentrate on the missing data recovery task on multi-source heterogeneous graphs under the incremental scenario and design a novel framework to recover the missing data by fusing multi-source complementary data from previously seen graphs. Our model, named SIKE, is built on a pre-trained language model and graph-specific adapters. To take advantage of the complementary data of multi-source graphs, we propose an embedding-based data fusion method to gather data across graphs. To evaluate the proposed model, we build two new datasets consisting of multi-source heterogeneous graphs. The experimental results show that SIKE achieves significant improvements over competitive baseline models, demonstrating its effectiveness and shedding light on multi-source data fusion for data governance.

  • REVIEW ARTICLE
    Zhi-Min WANG, Mao-Hang RAO, Shang-Hua YE, Wei-Tao SONG, Feng LU

    With the widespread adoption of Extended Reality (XR) headsets, spatial computing technologies are gaining increasing attention. Spatial computing enables interaction with virtual elements through natural input methods such as eye tracking, hand gestures, and voice commands, thus placing natural human-computer interaction at its core. While previous surveys have reviewed conventional XR interaction techniques, recent advancements in natural interaction, particularly driven by artificial intelligence (AI) and large language models (LLMs), have introduced new paradigms and technologies. In this paper, we review research on multimodal natural interaction for wearable XR, focusing on papers published since 2022 in six top venues: ACM CHI, UIST, IMWUT (Ubicomp), IEEE VR, ISMAR, and TVCG. We classify and analyze these studies based on application scenarios, operation types, and interaction modalities. This analysis provides a structured framework for understanding how researchers are designing advanced natural interaction techniques in XR. Based on these findings, we discuss the challenges in natural interaction techniques and suggest potential directions for future research. This review provides valuable insights for researchers aiming to design natural and efficient interaction systems for XR, ultimately contributing to the advancement of spatial computing.


{"submissionFirstDecision":"40","jcrJfStr":"4.6 (2024)","editorEmail":"zhangdf@hep.com.cn"}

Monthly

ISSN 2095-2228 (Print)
ISSN 2095-2236 (Online)
CN 10-1014/TP