Existing job shop scheduling methods often neglect job mobility and the spatial distribution of machines. This paper addresses the flexible job shop scheduling problem under spatial constraints. Specifically, it incorporates both job movement time and the potential collision risks caused by local job density. The paper defines a spatially constrained scheduling environment with non-sequential machine distribution. The spatial constraints are then refined into moving-distance constraints and local-density constraints. Additionally, a reward function is designed that includes penalties for both movement and density. The paper employs a multi-agent reinforcement learning method that combines dual attention and counterfactual baselines to solve the scheduling problem. Experimental results show that our approach effectively balances temporal and spatial factors, reducing job movement costs and collision risks while achieving the shortest completion time.
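A reward with movement and density penalties, as described above, could be sketched as follows. This is a minimal illustration only: the weights, the density cap, and all names are hypothetical, since the abstract does not give the paper's actual formulation.

```python
def spatial_reward(makespan_gain, move_dist, local_density,
                   w_move=0.1, w_density=0.2, density_cap=3):
    """Illustrative reward: a scheduling gain minus penalties for job
    movement distance and for exceeding a local job-density threshold.
    All weights and the cap are hypothetical placeholder values."""
    move_penalty = w_move * move_dist
    density_penalty = w_density * max(0, local_density - density_cap)
    return makespan_gain - move_penalty - density_penalty
```

An agent that moves a job a long distance into an already crowded area would thus receive a lower reward than one achieving the same makespan gain with a short, uncongested move.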
Large language models (LLMs) have demonstrated tremendous potential in game playing, yet little attention has been paid to their ethical implications in those contexts. This work investigates and analyses the ethical considerations of applying LLMs in game playing, using Werewolf, also known as Mafia, as a case study. Gender bias, which affects game fairness and player experience, has been observed in the behaviour of LLMs. Some roles, such as the Guard and the Werewolf, are more sensitive than others to gender information, manifested as a higher degree of behavioural change. We further examine scenarios in which gender information is implicitly conveyed through names, revealing that LLMs still exhibit discriminatory tendencies even in the absence of explicit gender labels. This research showcases the importance of developing fair and ethical LLMs. Beyond our research findings, we discuss the challenges and opportunities that lie ahead in this field, emphasising the need to dive deeper into the ethical implications of LLMs in gaming and other interactive domains.
Document-level relation extraction (RE) aims to identify the relations between entities across multiple sentences. In real life, new relations constantly emerge in new texts, raising the challenge of continually learning new relations while avoiding forgetting those already learned. Previous continual RE works have primarily focused on the continual learning of sentence-level RE, where each entity pair is associated with a single sentence and annotated with one relation. However, emerging relations may exist between entity pairs spanning multiple sentences or between entity pairs with pre-existing relations, necessitating the application of continual learning to document-level RE. To this end, we consider continual document-level RE and propose a novel model named CDRE to alleviate the partial labeling problem that severely degrades the performance of RE models. Specifically, we propose multi-binary knowledge distillation to transfer the knowledge of learned relations from the previously trained model to the current model. We introduce asymmetric training to coordinate the influence of positive samples and samples with learned yet unannotated relations. Furthermore, we explore the correlation between relations to augment label generation for re-annotating the learned and newly emerging relations in current and memorized samples, respectively. To simulate real-world scenarios, we construct two benchmark datasets derived from two widely-used document-level RE datasets. Experimental results on these datasets validate the superiority of our model CDRE in coping with continual document-level RE.
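The multi-binary knowledge distillation mentioned above can be pictured as treating each relation as an independent binary decision and matching the current (student) model's probabilities to the frozen previous (teacher) model's probabilities on already-learned relations. The sketch below is an assumption-laden illustration of that general idea, not the paper's exact loss; all names are placeholders.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def multi_binary_distill(student_logits, teacher_logits, learned_mask):
    """Per-relation binary cross-entropy between the student's
    probabilities and the teacher's soft targets, averaged over the
    relations flagged as already learned (illustrative sketch)."""
    total, count = 0.0, 0
    for s, t, m in zip(student_logits, teacher_logits, learned_mask):
        if not m:
            continue  # distill only on previously learned relations
        p_t = sigmoid(t)  # teacher's soft target, kept fixed
        p_s = sigmoid(s)
        total += -(p_t * math.log(p_s) + (1 - p_t) * math.log(1 - p_s))
        count += 1
    return total / max(count, 1)
```

Masking out new relations lets the distillation term preserve old knowledge without interfering with learning the newly annotated ones.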
With the rapid development of Large Language Models (LLMs), fine-tuning LLMs with downstream data for better capability transfer has become the mainstream of LLM applications, where Parameter-Efficient Fine-Tuning (PEFT) methods play the most important role. Considering the core architecture of LLMs, the transformer block, existing PEFT methods focus on using limited data to fine-tune a small number of parameters of only key components, such as self-attention and feed-forward networks. They have achieved impressive performance, with representative works being the Low-Rank Adapter (LoRA) and its variants (e.g., AdaLoRA, GLoRA). However, existing PEFT methods still suffer from severe shortcomings: sensitivity to the selection of hyper-parameters (e.g., ranks, scales, etc.) and sensitivity to the initialization of low-rank factors. Inappropriate settings lead to overfitting or underfitting problems when tuning LLMs, resulting in unstable fine-tuning performance. Meanwhile, searching for the optimal hyper-parameters is resource-intensive and experience-dependent. To this end, in this paper, we propose a novel PEFT method, SpecAdapt, which can adapt to various scenarios without sophisticated hyper-parameter tuning. Specifically, to tackle the hyper-parameter sensitivity problem, we design a Singular-guided Weight Decay strategy to control the complexity of fine-tuned parameters. For stable fine-tuning of LLMs, we develop a simple but effective Gradient Normalization module to improve tuning stability. Extensive experiments on multiple transformer-based pre-trained large models across various benchmarks (i.e., two image benchmarks and one language benchmark) demonstrate the superiority of our proposed SpecAdapt (achieving 75.6% average accuracy and outperforming the state-of-the-art methods with fixed hyper-parameters across 19 datasets). We also release the code to support the community.
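Of the two components above, gradient normalization is the simplest to illustrate: rescale a gradient to unit L2 norm so that update magnitudes stay stable regardless of scale. The sketch below is a generic version of this common technique under assumed names; the abstract does not specify SpecAdapt's actual module.

```python
import math

def normalize_gradient(grad, eps=1e-8):
    """Illustrative gradient normalization: rescale a gradient vector
    to (approximately) unit L2 norm so update magnitudes are stable.
    `grad` is a plain list of floats; names are hypothetical."""
    norm = math.sqrt(sum(g * g for g in grad))
    return [g / (norm + eps) for g in grad]
```

In practice such a step would be applied per-parameter-group inside the optimizer loop, decoupling the update direction from the raw gradient scale.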
Knowledge graph completion aims to predict missing factual triples in knowledge graphs, thereby enhancing their completeness. Recent studies have significantly improved the performance of knowledge graph completion by integrating multi-modal information into knowledge graph representation learning. However, two major challenges remain: first, how to effectively align and integrate embeddings from structural, visual, and textual modalities to improve the quality of entity representations; second, how to strengthen the connections among head entities, relations, and tail entities in correct triples, making their associations more cohesive and thereby more clearly distinguishing correct from incorrect triples. To address these challenges, we propose a Dual-level Contrastive Learning model (DualCL) for multi-modal knowledge graph completion. Specifically, our model consists of two levels of contrastive learning. (1) At the entity level, we employ a multi-modal contrastive representation method to align the structural, visual, and textual information of the same entity into a shared embedding space, ensuring semantic consistency across modalities for more effective multi-modal information integration. (2) At the triple level, we enhance the semantic associations among head entities, relations, and tail entities in correct triples through contrastive learning, while optimizing the model’s ability to distinguish between different “entity-relation-entity” combinations. Experimental results demonstrate that our method outperforms recent strong baseline models on multiple link prediction datasets, thereby validating its effectiveness and advantages in knowledge graph completion.
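The entity-level alignment described above is typically realized with an InfoNCE-style contrastive loss: embeddings of the same entity from different modalities are pulled together, others pushed apart. The following is a minimal sketch of that standard loss under assumed inputs, not DualCL's published objective.

```python
import math

def info_nce(sim_row, pos_index, temperature=0.1):
    """InfoNCE-style contrastive loss for one anchor embedding.
    `sim_row` holds similarities to candidate embeddings; `pos_index`
    marks the matching modality embedding of the same entity.
    Temperature and names are illustrative assumptions."""
    logits = [s / temperature for s in sim_row]
    m = max(logits)  # subtract max for numerical stability
    exps = [math.exp(l - m) for l in logits]
    return -math.log(exps[pos_index] / sum(exps))
```

Minimizing this over anchors from each modality drives the structural, visual, and textual views of an entity toward a shared embedding space.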
Traditional source separation methods rely on coarse-grained categorical labels, labeling all vocals collectively without distinguishing individual voices in an audio mixture, which inherently limits the ability to isolate single tracks. While fine-grained annotations could partially address this issue, they demand substantial resources and face challenges in extracting tracks from raw signals. To overcome these limitations, we propose to extract each track by decomposing the patterns of data generation. Specifically, we propose Variational Stochastic Dirichlet Process-VAE, a variational autoencoder framework that replaces the standard variational distribution with a variational stochastic Dirichlet process (VSDP). Within our proposed framework, the encoder, leveraging stick-breaking constructions, adaptively partitions the latent space into clusters, while the decoder, designed to recover each component, achieves implicit signal separation. Its advantage is that the reconstruction target can be shifted from the raw input to its individual components. Experiments demonstrate our method’s efficacy in two scenarios: (1) under coarse-grained source definitions, it reaches near-state-of-the-art performance (SDR=10.3); (2) for fine-grained track separation, it identifies 83% of individual vocal tracks with an average SDR of 7.8, a result other SOTA methods cannot obtain without the help of annotations.
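The stick-breaking construction mentioned above is a standard way to turn a sequence of Beta-distributed fractions into mixture weights: each fraction takes a piece of the remaining "stick". The sketch below shows only this generic construction, not the paper's full VSDP encoder.

```python
def stick_breaking_weights(vs):
    """Convert fractions v_k in (0, 1) into mixture weights via stick
    breaking: pi_k = v_k * prod_{j<k} (1 - v_j). In a VSDP-style
    encoder the v_k would be sampled from learned Beta distributions;
    here they are plain floats for illustration."""
    weights, remaining = [], 1.0
    for v in vs:
        weights.append(v * remaining)  # break off a piece of the stick
        remaining *= (1.0 - v)
    weights.append(remaining)  # mass left on the stick (tail cluster)
    return weights
```

Because each break depends on what remains, the weights always sum to one, and the number of effectively used clusters adapts to the data.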
Real-world data often exhibit a long-tail class distribution, where a small subset of classes dominates the majority of the training samples, while the remaining classes suffer from severe data scarcity. Long-tail learning (LTL) aims to tackle this extreme data imbalance problem and improve generalization across both head and tail classes. Although re-sampling offers a straightforward solution to mitigate class imbalance, prior research has empirically shown its limited effectiveness in modern long-tail learning tasks. To overcome this limitation, we propose Context-Aware RE-sampling (CARE), a novel framework that leverages large pre-trained models to suppress irrelevant contexts as well as enrich the diversity of the training data. Specifically, CARE introduces multiple practical implementations: CARE-DS, which integrates DINO and SAM to segment and transplant objects across images, generating diverse samples while preserving semantic consistency, and CARE-DM, which utilizes diffusion models to synthesize contextually diverse samples conditioned on original images and textual prompts. Extensive experiments demonstrate that CARE effectively mitigates performance deterioration for both head and tail classes, achieving significant generalization improvements over conventional re-sampling methods.
With the advancement of machine learning, domain adaptation has become increasingly important. Traditional research in domain adaptation has primarily focused on Unsupervised Domain Adaptation (UDA) and Semi-Supervised Domain Adaptation (SSDA). However, in many practical applications, it is common to encounter scenarios where both domains have labeled and unlabeled samples, which complicates the handling of domain adaptation. The scarcity of solutions to these scenarios further underscores the necessity of developing new methods to effectively exploit the labeled and unlabeled samples. This paper proposes the problem of Bi-directional Semi-Supervised Domain Adaptation (BiSSDA) and a method of Gradient discrepancy minimization and labeled Class Centroid Align (GCCA) to address this problem. In GCCA, labeled and unlabeled samples from both domains are passed through a generator and two classifiers; the generator is trained adversarially against the two classifiers, and the two domains are aligned via gradient and class centroid alignment. Extensive experiments on three widely used datasets demonstrate that GCCA significantly outperforms CGDM and several previous SSDA methods in exploiting the labeled and unlabeled samples in both domains, and significantly reduces the reliance on labeled data in bi-directional domain adaptation through cooperation between the two domains. The code of the proposed method is available at gitee.com/ymw12345/gcca.
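A common way to measure the gradient discrepancy that GCCA-style methods minimize is the cosine distance between the two classifiers' gradients: when the distance is zero, both classifiers agree on the update direction. The sketch below illustrates that generic measure under assumed inputs; it is not the paper's exact formulation.

```python
import math

def gradient_discrepancy(g1, g2, eps=1e-12):
    """Cosine distance between two gradient vectors (plain lists of
    floats). Minimizing this aligns the classifiers' update directions;
    all names here are illustrative."""
    dot = sum(a * b for a, b in zip(g1, g2))
    n1 = math.sqrt(sum(a * a for a in g1))
    n2 = math.sqrt(sum(b * b for b in g2))
    return 1.0 - dot / (n1 * n2 + eps)
```

The distance is 0 for parallel gradients, 1 for orthogonal ones, and 2 for opposing ones, making it a convenient alignment penalty.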
Improving the general capabilities of Large Language Models (LLMs) is an active research topic. As a common data structure in many real-world domains, graph data must be understood as a crucial part of advancing general intelligence. To this end, we propose a dynamic benchmark named GraphInstruct in this paper, which comprehensively includes 21 classical graph reasoning tasks, providing diverse graph generation pipelines and detailed intermediate reasoning steps for each sample. Based on GraphInstruct, we develop GraphSolver via efficient instruction-tuning, which demonstrates prominent graph understanding capability compared to other open-sourced LLMs. To further endow LLMs with multi-step graph reasoning capability, we propose a label-mask training strategy and build GraphSolver+, which leverages masked supervision on intermediate reasoning tokens to emphasize crucial node-identification signals. As one of the pioneering efforts to enhance the graph understanding and reasoning abilities of LLMs, our work is supported by extensive experiments demonstrating the superiority of GraphSolver and GraphSolver+ over other LLMs. We sincerely hope GraphInstruct will facilitate further research on applying LLMs to graph-structured data. Our code and data are released publicly at github.com/CGCL-codes/GraphInstruct.
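The label-mask training strategy above can be pictured as computing the language-modeling loss only over positions flagged as crucial (e.g., node-identification tokens) while masking out the rest. The sketch below shows that generic masking pattern on plain per-token losses; the names and granularity are assumptions, not GraphSolver+'s actual implementation.

```python
def masked_token_loss(token_losses, mask):
    """Average per-token losses only over positions whose mask entry is
    truthy (e.g., crucial node-identification tokens); all other
    positions contribute nothing. Purely illustrative."""
    kept = [l for l, m in zip(token_losses, mask) if m]
    return sum(kept) / max(len(kept), 1)
```

Concentrating supervision this way lets the model allocate learning signal to the reasoning steps that matter most, rather than spreading it uniformly over every token.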