Frontiers of Computer Science

REVIEW ARTICLE

Vision-based 3D occupancy prediction in autonomous driving: a review and outlook

Yanan ZHANG, Jinqing ZHANG, Zengran WANG, Junhao XU, Di HUANG

2026, 20(1): 2001301. https://doi.org/10.1007/s11704-024-40443-5

In recent years, autonomous driving has garnered escalating attention for its potential to relieve drivers’ burdens and improve driving safety. Vision-based 3D occupancy prediction, which predicts the spatial occupancy status and semantics of 3D voxel grids around the autonomous vehicle from image inputs, is an emerging perception task well suited to cost-effective perception systems for autonomous driving. Although numerous studies have demonstrated the advantages of 3D occupancy prediction over object-centric perception tasks, a dedicated review of this rapidly developing field is still lacking. In this paper, we first introduce the background of vision-based 3D occupancy prediction and discuss the challenges of the task. Second, we conduct a comprehensive survey of progress in vision-based 3D occupancy prediction from three aspects: feature enhancement, deployment friendliness, and label efficiency, and provide an in-depth analysis of the potential and challenges of each category of methods. Finally, we summarize prevailing research trends and offer an outlook on future directions. To provide a valuable reference for researchers, a regularly updated collection of related papers, datasets, and code is maintained at github.com/zya3d/Awesome-3D-Occupancy-Prediction.

RESEARCH ARTICLE

Improved paraphrase generation via controllable latent diffusion

Wei ZOU, Ziyuan ZHUANG, Xiang GENG, Shujian HUANG, Jia LIU, Jiajun CHEN

2026, 20(1): 2001302. https://doi.org/10.1007/s11704-025-40633-9

Paraphrase generation strives to generate high-quality and diverse expressions of a given text, a domain where diffusion models excel. Though SOTA diffusion generation reconciles generation quality and diversity, textual diffusion suffers from a truncation issue that hinders efficiency and quality control. In this work, we propose Latent Diffusion Paraphraser (LDP), a novel paraphrase generation method that models a controllable diffusion process in a learned latent space. LDP achieves superior generation efficiency compared to its diffusion counterparts. It can use only input segments to enforce paraphrase semantics, improving the results without external features. Experiments show that LDP better reconciles paraphrase generation quality and diversity than baselines. Further analysis shows that our method is also helpful for other similar text generation tasks and for domain adaptation.

RESEARCH ARTICLE

Exploiting large language model with reinforcement learning for generative job recommendations

Zhi ZHENG, Zhao-Peng QIU, Chen ZHU, Xiao HU, Li-Kang WU, Yang SONG, Heng-Shu ZHU, Hui XIONG

2026, 20(1): 2001303. https://doi.org/10.1007/s11704-025-40843-1

With the rapid development of Large Language Models (LLMs), an increasing number of researchers are turning their attention to Generative Recommender Systems (GRSs), which are not constrained by strict candidate sets and are more conducive to exploring user interests. Existing LLM-based GRSs mainly use Supervised Fine-Tuning (SFT) to endow LLMs with the capability to generate candidate items, and then employ similarity-based grounding methods to map the generated results to real-world items. However, SFT-based training is insufficient for LLMs to adequately grasp the knowledge embedded in complex interactive behaviors, and similarity-based grounding methods struggle with long-text matching. Therefore, in this paper, we propose generative job recommendation based on large language models (GIRL). Specifically, we train a model that evaluates the matching degree between a curriculum vitae (CV) and a job description (JD) as a reward model, and we use a proximal policy optimization (PPO)-based reinforcement learning (RL) method to fine-tune the LLM-based recommender. Moreover, we propose a model-based grounding method for JD grounding. Extensive experiments on two real-world datasets demonstrate the superiority of the proposed model over seven baseline methods.

RESEARCH ARTICLE

Topic-enhanced argument mining via mutual learning

Jiasheng SI, Yingjie ZHU, Rui WANG, Wenpeng LU, Yulan HE, Deyu ZHOU

2026, 20(1): 2001304. https://doi.org/10.1007/s11704-025-40460-y

Given a controversial target, such as “nuclear energy”, information-seeking argument mining aims to identify argumentative text from diverse sources. The main challenge in this task is threefold: the insufficiency of contextual information on targets, cross-domain adaptation across varying targets, and implicit argumentative information within the argument. Current approaches primarily address the first two challenges by improving the integration of target-related semantic information with arguments, while there has been little work on modeling all three aspects. To address these challenges, and inspired by the ability of neural topic models to mine the local and global topic information contained in a dataset, we propose a novel topic-enhanced information-seeking argument mining approach that leverages the mutual interaction between the neural topic model and the language model. Specifically, (i) the global topic information is extracted from the corpora to encapsulate the common knowledge across different targets for solving the cross-domain adaptation; (ii) to capture the contextual information on targets, the target is augmented by target-aware subtopics derived from the global topic-word distribution; (iii) to capture the implicit argumentative information within the argument, the local topic information is captured by minimizing the similarity between its local topic distribution and its semantic representation through mutual learning. Experimental results show the superiority of the proposed model over state-of-the-art baselines in both in-domain and cross-domain scenarios.

RESEARCH ARTICLE

Invariant graph learning meets information bottleneck for out-of-distribution generalization

Wenyu MAO, Jiancan WU, Haoyang LIU, Yongduo SUI, Xiang WANG

2026, 20(1): 2001305. https://doi.org/10.1007/s11704-025-40798-3

Graph out-of-distribution (OOD) generalization remains a major challenge in graph learning since graph neural networks (GNNs) often suffer from severe performance degradation under distribution shifts. Invariant learning, aiming to extract invariant features across varied distributions, has recently emerged as a promising approach for OOD generalization. Despite the great success of invariant learning in OOD problems for Euclidean data (e.g., images), the exploration within graph data remains constrained by the complex nature of graphs. The invariant features at both the attribute and structural levels, combined with the absence of prior knowledge regarding environmental factors, make the invariance and sufficiency conditions of invariant learning hard to satisfy on graph data. Existing studies, such as data augmentation or causal intervention, either suffer from disruptions to invariance during the graph manipulation process or face reliability issues due to a lack of supervised signals for causal parts. In this work, we propose a novel framework, called Invariant Graph Learning based on Information bottleneck theory (InfoIGL), to extract the invariant features of graphs and enhance models’ generalization ability to unseen distributions. Specifically, InfoIGL introduces a redundancy filter to compress task-irrelevant information related to environmental factors. Together with our multi-level contrastive learning, we maximize the mutual information among graphs of the same class in the downstream classification tasks, preserving invariant features for prediction to a great extent. An appealing feature of InfoIGL is its strong generalization ability without depending on supervised signals of invariance. Experiments on both synthetic and real-world datasets demonstrate that our method achieves state-of-the-art performance under OOD generalization for graph classification tasks. The source code is available at github.com/maowenyu-11/InfoIGL.

RESEARCH ARTICLE

Unsupervised lightweight 3D convolutional network for enhanced infrared imaging in wearable devices

Biao ZHU, Jun ZHANG, Sirui ZHAO, Zhengye ZHANG, Enhong CHEN

2026, 20(1): 2001306. https://doi.org/10.1007/s11704-025-40948-7

With the increasing frequency of natural disasters and health emergencies, wearable infrared thermal imaging devices are becoming more prevalent in fire protection and medical fields. However, these devices often face imaging performance challenges such as insufficient contrast, dark areas and blurred edges, which significantly limit their practical effectiveness. To tackle these challenges, we propose a novel unsupervised lightweight 3D convolutional network (UL3DCN) specifically designed for enhancing infrared images on wearable devices. In this framework, the task of infrared image enhancement is conceptualized as generating high dynamic range infrared images from the corresponding temperature sequences during thermal equilibrium. To achieve this, we first design a learnable dynamic filtering module tailored for simulating a series of infrared image sequences under varying temperature differences. This module extends a single image from the spatial domain into the spatio-temporal domain. Subsequently, we employ a lightweight 3D convolution module to effectively extract spatio-temporal information from the image sequence. Finally, inspired by Zero-DCE, we utilize the extracted information to estimate pixel values and high-order curves, thereby enhancing the infrared images. Comprehensive experimental results demonstrate that our method achieves outstanding performance and real-time capabilities. Additionally, the proposed UL3DCN model has been successfully integrated into a wearable infrared firefighting mask.

RESEARCH ARTICLE

FedSIN: information network representation based on federated self-adaptive learning

Ang LI, Yawen LI, Zhe XUE

2026, 20(1): 2001307. https://doi.org/10.1007/s11704-025-40529-8

Previous federated learning methods primarily addressed challenges involving Euclidean data, such as images and text, where relationships between data points are linear. However, information networks, as non-Euclidean data, inherently exhibit data heterogeneity. This heterogeneity is further amplified in federated learning environments, where data from multi-party information networks introduces even greater variability. It is also worth noting that the contributions of multi-party information networks to the federated learning process are dynamic. To address these challenges, we propose an Information Network representation method based on Federated Self-adaptive learning (FedSIN), which leverages the importance of neighboring nodes in the network to learn node representations and performs adaptive federated model aggregation. Specifically, FedSIN uses the self-attention mechanism of the graph attention network to capture the influence of neighbor nodes on each node, enabling effective aggregation of neighbor node information for improved node representation. Additionally, FedSIN designs an adaptive federated model aggregation mechanism to evaluate and incorporate the contributions of different clients based on their performance in each communication round. Experimental results on three public datasets demonstrate the superiority of our proposed FedSIN over state-of-the-art information network representation methods.
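
As a rough illustration of performance-based adaptive aggregation, the sketch below weights each client's parameters by a softmax over a per-round score. The softmax rule, the `temperature` parameter, and the function names are assumptions made for illustration; the abstract does not specify FedSIN's actual aggregation rule.

```python
import math


def aggregation_weights(scores, temperature=1.0):
    """Turn per-client scores into weights that sum to 1.

    Hypothetical rule: a softmax over the scores. Higher-scoring
    clients receive larger weights.
    """
    top = max(scores)  # subtract the max for numerical stability
    exps = [math.exp((s - top) / temperature) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def aggregate(client_params, scores, temperature=1.0):
    """Weighted average of client parameter vectors (lists of floats)."""
    weights = aggregation_weights(scores, temperature)
    dim = len(client_params[0])
    return [
        sum(w * params[i] for w, params in zip(weights, client_params))
        for i in range(dim)
    ]
```

With equal scores this reduces to a plain average of the client parameters, which is the behavior of uniform federated averaging; unequal scores shift the global model toward the better-performing clients in that round.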

RESEARCH ARTICLE

Learning contextual information and task alignment for emotion cause extraction in conversation

Jun-Hao FENG, Xia-Bing ZHOU, Wen-Liang CHEN, Min ZHANG

2026, 20(1): 2001308. https://doi.org/10.1007/s11704-025-40931-2

Fine-grained emotion cause extraction, known as Causal Span Extraction (CSE), is challenging because it requires a more detailed analysis of contextual information. Previous research on CSE has focused on the powerful semantic representation capability of pre-trained language models (PLMs) while overlooking the semantic coherence of the speaker and content when emotions arise in conversation. In this paper, we introduce a novel method that learns contextual information and enhances the consistency of cross-task alignment. Specifically, we integrate coreference resolution into the attention mechanism to capture coreference-aware semantic correlations and employ a position relation strategy at both the utterance and token levels to understand the contextual information. Furthermore, by incorporating auxiliary tasks and a novel cross-task alignment approach, we reduce inconsistent predictions across tasks, thereby enabling a comprehensive, multi-dimensional comprehension of conversations. Our method demonstrates a marked improvement over current state-of-the-art models, evidenced by superior performance on two benchmark datasets.

RESEARCH ARTICLE

RPND: a rule guided link prediction model with specific-path selection

Xiu-Lin ZHENG, Pei-Pei LI, Zan ZHANG, Xin-Dong WU

2026, 20(1): 2001309. https://doi.org/10.1007/s11704-025-41288-2

Knowledge graphs (KGs) often suffer from incompleteness, which limits their performance in practice, where a vast number of entities may co-exist. To address this, knowledge graph completion (KGC) has been proposed to infer the missing links between entities. Among KGC approaches, reasoning over relation paths in an incomplete KG is a popular research topic. However, some issues remain to be solved, such as path noise, path sparsity of the KG, the ambiguity of the inferred relation, and the lack of explainability in path representation. To address these challenges simultaneously, we propose a novel rule guided link prediction model with path noise avoidance and disambiguation of the inferred relation, termed RPND. Specifically, we use a path selection strategy to filter noisy paths and reduce the interference of path noise. To alleviate the path sparsity of the KG, we leverage the path overlapping feature of similar relations and combine them based on semantic similarity. For the ambiguity of the inferred relation, we draw on language models such as the Transformer by introducing position embeddings to reflect the order of relations along the path when learning its representation. Meanwhile, we employ logic rules to compose paths at the semantic level to enhance the explainability of path representation. Extensive experiments conducted on benchmark datasets demonstrate the superiority of our proposed RPND model over state-of-the-art baselines.

LETTER

Achieving >97% on GSM8K: deeply understanding the problems makes LLMs better solvers for math word problems

Qihuang ZHONG, Kang WANG, Ziyang XU, Liang DING, Juhua LIU, Bo DU

2026, 20(1): 2001310. https://doi.org/10.1007/s11704-025-41102-z

LETTER

Bidirectional chain-of-thought for zero-shot object navigation

Haonan LUO, Sijia LI, Yijie ZENG, Zihang WANG, Botao JIANG, Xiruo JIANG

2026, 20(1): 2001317. https://doi.org/10.1007/s11704-025-41283-7

RESEARCH ARTICLE

Robust long-tailed learning under label noise

Tong WEI, Jiang-Xin SHI, Min-Ling ZHANG, Yu-Feng LI

2026, 20(1): 2001321. https://doi.org/10.1007/s11704-025-40860-0

Long-tailed learning aims to enhance the generalization performance of underrepresented tail classes. However, previous methods have largely overlooked the prevalence of noisy labels in training data. In this paper, we address the challenge of noisy labels in long-tailed learning. We identify a critical issue: the commonly used small-loss noisy label detection criterion fails to perform effectively in long-tailed class distributions. This failure arises from the inherent bias of deep neural networks, which tend to misclassify tail class examples as head classes, leading to unreliable loss calculations. To mitigate this, we propose a novel small-distance criterion that leverages the robustness of learned representations, enabling more accurate identification of correctly-labeled examples across both head and tail classes. Additionally, to improve training for tail classes, we replace discrete pseudo-labels with label distributions for examples flagged as noisy, resulting in significant performance gains. Based on these contributions, we introduce the robust long-tail learning framework, designed to train models that are resilient to both class imbalance and noisy labels. Extensive experiments on benchmark and real-world datasets demonstrate that our approach outperforms previous methods, offering substantial performance improvements. Our source code is available at github.com/Stomach-ache/RoLT.
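
To make the idea of a distance-based criterion concrete, here is a minimal sketch of one simple variant: each class gets a prototype (the mean feature vector of the examples carrying that label), and an example is treated as likely correctly labeled when its nearest prototype matches its given label. The prototype construction, the nearest-prototype rule, and the function names are assumptions for illustration only; the abstract does not describe the exact criterion used in the paper.

```python
import math
from collections import defaultdict


def class_prototypes(features, labels):
    """Mean feature vector per labeled class (hypothetical prototype choice)."""
    sums = {}
    counts = defaultdict(int)
    for feat, label in zip(features, labels):
        if label not in sums:
            sums[label] = [0.0] * len(feat)
        sums[label] = [a + b for a, b in zip(sums[label], feat)]
        counts[label] += 1
    return {c: [v / counts[c] for v in total] for c, total in sums.items()}


def likely_clean(features, labels):
    """Flag an example as likely clean if its nearest prototype is its own class."""
    protos = class_prototypes(features, labels)
    flags = []
    for feat, label in zip(features, labels):
        nearest = min(protos, key=lambda c: math.dist(feat, protos[c]))
        flags.append(nearest == label)
    return flags
```

Because the decision depends on distances in representation space rather than on the classifier's loss, it does not directly inherit the classifier's bias toward head classes. Note that in this simplified version the prototypes are themselves computed from possibly noisy labels, so heavy noise in a small class can pull its prototype off target.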

LETTER

Streaming algorithms for triangle counting: adversarial robustness and the weighted case

Jing CAO, Yicheng PAN, Pan PENG

2026, 20(1): 2001401. https://doi.org/10.1007/s11704-025-41203-9

CORRESPONDENCE

Comment on “SAT requires exhaustive search”

Eric ALLENDER, Ryan WILLIAMS

2026, 20(1): 2001405. https://doi.org/10.1007/s11704-025-53000-5

An article that was recently published in Frontiers of Computer Science claims to prove that $P$ is not equal to $NP$. In fact, it claims to show that the SAT problem requires at least $2^{\delta n}$ time, for any constant $\delta \in (0, 1)$. We contend that the argument that is presented falls far short of a proof: it makes an assumption about all possible SAT algorithms that is unwarranted.

LETTER

MEFE: microbiome signature identification based on elastic feature extraction

Guosen HOU, Yuxiao FEI, Hao GAO, Xiaoquan SU

2026, 20(1): 2001901. https://doi.org/10.1007/s11704-025-50323-1
