Text-based motion generation enhances the flexibility of human motion design and editing, enabling applications in animation, virtual reality, and beyond. However, diffusion-based methods for text-to-motion generation often produce low-quality results. Conditional autoregressive approaches built on vector-quantized variational autoencoders (VQ-VAEs) struggle with quantization error and therefore resort to hierarchical or residual quantization. This lengthens the quantized token sequences, forcing the model to predict more tokens from the text input and complicating high-quality generation. To address this, we introduce HyT2M, a novel text-to-motion model based on a hybrid VQ-VAE framework. Our approach decomposes motion into global and local components: local motion is quantized with a single vector quantization layer to preserve fine details, while global motion is reconstructed via residual vector quantization (RVQ) to compensate for errors caused by the limited perceptual range of the local components. This hybrid strategy shortens token sequences while maintaining high reconstruction quality, easing the burden on the second-stage model. Furthermore, we develop a conditional masked transformer with a hybrid cross-guidance module that leverages global motion tokens to enhance local motion predictions, improving accuracy and usability for motion editing. Experiments on the HumanML3D, KIT-ML, and Motion-X datasets indicate that HyT2M achieves competitive results and excels in tasks such as motion completion and long-motion generation.
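The residual quantization idea underlying the trade-off above can be sketched in a few lines: each stage quantizes the residual left by the previous stage, so reconstruction error shrinks per stage at the cost of one extra token per stage. This is an illustrative toy with hand-written codebooks, not the HyT2M implementation.

```python
# Toy residual vector quantization (RVQ) sketch: more stages = lower
# error but a longer token sequence per motion frame.

def nearest(codebook, v):
    """Return (index, codeword) of the codeword closest to v (squared L2)."""
    best = min(range(len(codebook)),
               key=lambda i: sum((a - b) ** 2 for a, b in zip(codebook[i], v)))
    return best, codebook[best]

def rvq_encode(v, codebooks):
    """Quantize v with a stack of codebooks; return one token index per stage."""
    tokens, residual = [], list(v)
    for cb in codebooks:
        i, cw = nearest(cb, residual)
        tokens.append(i)
        residual = [r - c for r, c in zip(residual, cw)]
    return tokens

def rvq_decode(tokens, codebooks):
    """Sum the selected codewords across stages to reconstruct the vector."""
    out = [0.0] * len(codebooks[0][0])
    for t, cb in zip(tokens, codebooks):
        out = [o + c for o, c in zip(out, cb[t])]
    return out
```

With a coarse first codebook and a fine second one, the second stage corrects the residual left by the first, which is exactly why RVQ improves reconstruction at the price of longer token sequences.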
Recently, 3D Gaussian Splatting has achieved impressive performance by explicitly representing scenes and synthesizing high-quality novel views. However, reconstructing accurate Gaussian geometry becomes extremely challenging when using pure RGB images with few-shot inputs. We propose 2.5D-GS, which projects Gaussians into structured 2D spaces and utilizes the 2.5D representations from monocular models to separately optimize the projected depth and normal maps, ultimately achieving consistent and accurate Gaussian geometry. First, we ensure the spatial accuracy of Gaussians with Depth Plane Constraints. Since monocular depth maps capture only rough shapes, Normal Plane Constraints are then applied to refine the orientations of the Gaussians and enhance surface connectivity. Additionally, we introduce Density Ratio-Based Pruning to eliminate redundant Gaussians generated during optimization, leading to compact and efficient scene representations. Extensive experiments on the LLFF, DTU, Blender, and Mip-NeRF360 datasets demonstrate that 2.5D-GS accurately reconstructs scene geometry and renders high-quality novel views with sparse inputs.
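One plausible reading of density-ratio pruning is a relative-contribution test: a Gaussian is dropped when its opacity is only a small fraction of the total density in its neighborhood. The grid bucketing and the opacity-ratio threshold below are assumptions for illustration; the paper's exact criterion may differ.

```python
# Hypothetical density-ratio pruning sketch: bucket Gaussians into grid
# cells, then drop those whose opacity is a small fraction of the cell's
# total (ratio < tau). Illustrative only, not the 2.5D-GS criterion.
from collections import defaultdict

def prune_density_ratio(gaussians, cell, tau):
    """gaussians: list of {'pos': (x, y, z), 'opacity': float}."""
    cells = defaultdict(list)
    for g in gaussians:
        key = tuple(int(c // cell) for c in g["pos"])
        cells[key].append(g)
    kept = []
    for group in cells.values():
        total = sum(g["opacity"] for g in group)
        kept.extend(g for g in group if g["opacity"] / total >= tau)
    return kept
```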
Point cloud completion is a fundamental task in 3D perception and 3D vision. Existing point cloud completion methods typically rely on supervised learning with limited 3D data, resulting in poor generalization and suboptimal recovery in scenarios involving complex shape structures or large missing regions. To overcome these limitations, we propose HybridPC, a novel zero-shot point cloud completion framework that achieves high-fidelity 3D reconstruction without any 3D supervision or task-specific training. HybridPC leverages powerful 2D diffusion priors and a progressive implicit-explicit architecture to address severe incompleteness and complex geometries. The framework comprises three key stages: 1) Edge-aware neural field initialization: ControlNet-guided Stable Diffusion synthesizes multi-view images conditioned on text prompts and orthographic edge projections of the incomplete point cloud, providing strong shape constraints to initialize a coarse NeRF field via Score Distillation Sampling (SDS). 2) Multi-view diffusion collaborative completion: A pre-trained multi-view diffusion model enforces cross-view consistency, collaboratively completing the entire neural radiance field (NeRF) with globally coherent geometry. To reconcile gradient conflicts between ControlNet and multi-view diffusion during joint SDS optimization, a PCGrad-based multi-objective optimization strategy is introduced to balance the structural and semantic guidance, yielding higher-fidelity shape completion. 3) Geometry-aware tetrahedral refinement: The implicit field is converted into a tetrahedral mesh using DMTet, which is further refined via implicit SDS-based normal optimization and explicit geometric constraints on the mesh surface, ensuring structural fidelity to the partial input. Extensive experiments on the ShapeNetPart and Redwood datasets demonstrate that HybridPC outperforms existing supervised and zero-shot methods in both qualitative and quantitative comparisons.
Specifically, HybridPC preserves the input structure more faithfully, completes missing regions more accurately, and shows stronger generalization ability, with particularly significant improvements on real-world scans from the Redwood dataset. Our results show the strong potential of coupling 2D diffusion priors with 3D geometric modeling for scalable, training-free point cloud completion.
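The PCGrad-based strategy in stage 2 resolves gradient conflicts by projection: when two objectives' gradients point in opposing directions (negative dot product), one is projected onto the normal plane of the other so the conflicting component is removed. The standard two-objective PCGrad rule can be sketched as follows (flat Python lists stand in for parameter tensors):

```python
# Standard PCGrad projection for two objectives (here: the structural
# and semantic SDS gradients). Gradients are flat lists of floats.

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def pcgrad(g1, g2):
    """If g1 conflicts with g2 (dot < 0), project g1 onto the normal
    plane of g2; otherwise return g1 unchanged."""
    d = dot(g1, g2)
    if d < 0:
        scale = d / dot(g2, g2)
        return [a - scale * b for a, b in zip(g1, g2)]
    return list(g1)
```

After projection, the returned gradient is orthogonal to the conflicting objective, so a step along it no longer directly increases that objective's loss.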
The visually-rich long document understanding task requires accurate extraction of answers from documents such as manuals and academic papers, which often consist of dozens of text-rich images. Recently, multimodal large language models (MLLMs) have demonstrated strong performance on this task. To alleviate the inefficiency of MLLMs as document length grows, retrieval-augmented methods select key pages and conduct answer generation only on the retrieved pages, reducing computational cost. Despite this significant progress, existing methods still face two inherent challenges. First, relevant pages are usually retrieved based on textual content alone, neglecting spatial layout information. Second, coarse-grained retrieval at the page level leaves a semantic gap between the retrieved pages and the query. In this paper, we propose PDU, a position-aware fine-grained retrieval-augmented model for long document understanding. Specifically, to bridge the semantic gap between the query and full pages, we first develop a fine-grained document encoding module that partitions each document page into chunks and encodes them with MLLMs. Then, we design a position-enhanced similarity calculation approach to compute the similarity between the query and each document chunk for retrieving the most relevant ones. To improve the model's understanding of document layout and structure, we further encode the bounding-box coordinates and page number of each document chunk and add them to the MLLM-derived visual features. Next, we propose a chunk-to-page answer generation method that maps the retrieved chunks back to their corresponding pages and generates the final answer. To support training, we construct a minimal answerable region (MAR) dataset using a bidirectional approximation algorithm to precisely link queries to relevant document chunks.
Our method achieves strong results on public benchmarks, highlighting the value of incorporating layout information in retrieval-augmented document understanding.
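The position-enhanced retrieval step can be sketched minimally: augment each chunk's content embedding with normalized layout features, then rank chunks by cosine similarity to the query. The concatenation encoding below is an assumption for illustration; PDU fuses learned position encodings into MLLM-derived features rather than raw coordinates.

```python
# Minimal sketch of position-aware chunk retrieval. The layout encoding
# (plain concatenation of normalized bbox + page) is a stand-in for the
# learned position encoding described in the abstract.
import math

def with_position(feat, bbox, page, page_count):
    """Append normalized bounding-box coords and page position to a
    chunk's content embedding."""
    x0, y0, x1, y1 = bbox
    return list(feat) + [x0, y0, x1, y1, page / max(page_count, 1)]

def cosine(a, b):
    num = sum(x * y for x, y in zip(a, b))
    den = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return num / den if den else 0.0

def retrieve(query_vec, chunks, k=2):
    """Rank chunks (dicts with 'id' and 'vec') and return top-k chunk ids."""
    ranked = sorted(chunks, key=lambda c: cosine(query_vec, c["vec"]), reverse=True)
    return [c["id"] for c in ranked[:k]]
```

The retrieved chunk ids would then be mapped back to their source pages for answer generation, as in the chunk-to-page step.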
Artists skillfully shape sheets of paper in 3D so that the shadows they cast form different images, a creative art form called Paper Shadow Art. Central to this art is finding a small set of developable surfaces whose shadow images match the desired images. However, it is challenging for users to imagine and generate developable surfaces for their desired images. To this end, we present a computational method for paper shadow art design. The key is to decompose the problem of generating piecewise developable surfaces for multiple images into subproblems, each handling a single image, and then merge the generated piecewise developable surfaces into the final result. Specifically, given an image, we optimize a height field to approach developability under two requirements: 1) for this image, there is no shadow deviation; 2) for the other images, the shadows (projected onto their planes) of the optimized height field fall within them. To better satisfy the second requirement, we develop an iterative algorithm that alternates in each iteration between optimizing the height fields and deforming the input images, both to reduce the deviations. We demonstrate the effectiveness of our method on various examples. Its practicality is further demonstrated by seven physical results fabricated from paper sheets.
The coarse-to-fine feature matching paradigm has proven highly effective in point cloud registration. This paradigm progressively propagates feature correspondences from the coarse level to the fine level through hierarchical feature extraction. However, it is limited by the low discriminability of coarse-level features due to insufficient modeling of global geometric structures, which results in unreliable initial correspondences. Furthermore, relying on single-level features leads to the irreversible loss of fine-grained information, especially in low-overlap scenarios. These limitations present significant challenges in maintaining global geometric consistency and result in a high incidence of feature mismatches. To address these limitations, we propose the HFA-Transformer, a novel Hierarchical Feature Aggregation Transformer framework with two key innovations: (1) a feature enhancement mechanism that jointly encodes spatial and channel-wise characteristics of point clouds, enriching the global feature representation; (2) a Hierarchical Feature Aggregation Module that integrates hierarchical features to refine coarse-level correspondence estimation. Extensive experiments conducted on both indoor and outdoor benchmarks validate the superior performance and robustness of the proposed HFA-Transformer.
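The coarse-to-fine paradigm itself can be sketched simply: match coarse-level features first, then match fine-level points only within matched coarse pairs. This illustrates the general paradigm (and why unreliable coarse matches poison the fine level), not the HFA-Transformer architecture.

```python
# Sketch of the coarse-to-fine matching paradigm: fine matches are only
# searched inside matched coarse pairs, so a bad coarse match cannot be
# recovered at the fine level.

def nn(feats_a, feats_b):
    """For each feature in a, the index of its L2-nearest neighbour in b."""
    def d2(u, v):
        return sum((x - y) ** 2 for x, y in zip(u, v))
    return [min(range(len(feats_b)), key=lambda j: d2(f, feats_b[j]))
            for f in feats_a]

def coarse_to_fine(coarse_a, coarse_b, fine_a, fine_b, assign_a, assign_b):
    """assign_a[k] = coarse index that fine point k of cloud A belongs to
    (likewise assign_b). Returns fine-level (i, j) correspondence pairs."""
    matches = []
    for i, j in enumerate(nn(coarse_a, coarse_b)):
        pa = [k for k, a in enumerate(assign_a) if a == i]
        pb = [k for k, b in enumerate(assign_b) if b == j]
        if not pa or not pb:
            continue
        sub = nn([fine_a[k] for k in pa], [fine_b[k] for k in pb])
        matches.extend((pa[u], pb[v]) for u, v in enumerate(sub))
    return matches
```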
As multi-source data sharing becomes increasingly prevalent in the digital economy, multi-source multi-client dynamic searchable symmetric encryption (MM-DSSE) has received significant attention. However, the complex key management of MM-DSSE exacerbates the cascading effect of key compromise risks. Existing MM-DSSE schemes satisfy only forward privacy and rely on the idealized assumption that keys are never compromised. We study the key compromise threat in MM-DSSE and formally define post-compromise security for MM-DSSE with respect to leakage functions. We introduce Mosaic, a framework for MM-DSSE that supports non-interactive key updates for data sources and clients. Mosaic ensures data security even in the event of key compromise at any client, data source, or management center. Additionally, we construct MosaicR, an instance of Mosaic that supports range search. Both Mosaic and MosaicR satisfy forward and type-II backward privacy. We conduct comprehensive experimental evaluations on real-world datasets. The results show that Mosaic and MosaicR provide strong security with competitive performance. Compared with Bamboo, the state-of-the-art single-user DSSE scheme with post-compromise security, Mosaic achieves a 79.21% improvement in total search efficiency. The index storage overhead of MosaicR is reduced by 49.98% compared with the range search scheme (RS)².
The widespread adoption of Large Language Models (LLMs) raises critical concerns about the factual accuracy of their outputs, especially in high-risk domains such as biomedicine, law, and education. Existing evaluation methods designed for short texts often fail on long-form content due to complex reasoning chains, intertwined perspectives, and cumulative information. To address this, we propose a systematic approach integrating large-scale long-form datasets, multi-agent verification mechanisms, and weighted evaluation metrics. We construct LongHalluQA, a Chinese long-form factuality dataset, and develop MAD-Fact, a debate-based multi-agent verification system. We introduce a fact importance hierarchy to capture the varying significance of claims in long-form texts. Experiments on two benchmarks show that larger LLMs generally maintain higher factual consistency, while Chinese-developed models excel on Chinese content. Our work provides a structured framework for evaluating and enhancing factual reliability in long-form LLM outputs, guiding their safe deployment in sensitive domains.
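A fact importance hierarchy naturally induces an importance-weighted factuality score: claims are weighted by their significance, so getting a central claim wrong costs more than a peripheral one. The sketch below shows the aggregation idea only; the specific weights and scoring rule are assumptions, not the paper's metric.

```python
# Illustrative importance-weighted factuality score: each claim carries
# an importance weight and a verifier verdict; the score is the
# weighted fraction of supported claims.

def weighted_factuality(claims):
    """claims: list of (importance_weight, is_supported) pairs."""
    total = sum(w for w, _ in claims)
    if total == 0:
        return 0.0
    return sum(w for w, ok in claims if ok) / total
```

Under this rule, one unsupported high-weight claim lowers the score more than several unsupported low-weight ones, which is the behaviour a fact importance hierarchy is meant to capture.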
Honeywords are decoys stored alongside real passwords in credential databases. A real-world application of honeywords is honeyword-based authentication, which detects malicious login attempts by exploiting these deceptive fake passwords. Existing honeyword-based authentication systems face two key restrictions: 1) the authentication server is a single point of full trust (i.e., it must not be intruded upon or colluded with by attackers), and 2) the stored real passwords are vulnerable to tweaking attacks once attackers learn the users' passwords from other sources. To address these challenges, we introduce SecHive, a secure three-layer honeyword-based authentication system with a hash-query server. SecHive preserves real password security even when the authentication server is semi-honest rather than fully honest. Moreover, we design a new honeyword generation method, embedded in SecHive, to detect tweaking attacks effectively. Our extensive experimental results show that SecHive improves security over state-of-the-art honeyword-based authentication systems, achieving in particular at least a 7.39x improvement in the accuracy of detecting tweaking attacks.
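The classic honeyword mechanism (in the Juels-Rivest style) that systems like SecHive build on can be sketched briefly: the real password is shuffled among decoys, the index of the real one is kept by a separate checker, and a login with a decoy raises an alarm. SecHive's three-layer, semi-honest design goes well beyond this, but the core index check looks like:

```python
# Classic honeyword check (Juels-Rivest style), for illustration only:
# the secret index would live at a separate honeychecker, not with the
# sweetword list.
import random

def make_sweetwords(real, honeywords, rng):
    """Shuffle the real password among decoys; return the list and the
    secret index of the real password."""
    sweet = honeywords + [real]
    rng.shuffle(sweet)
    return sweet, sweet.index(real)

def check_login(submitted, sweetwords, real_index):
    """'ok' for the real password, 'alarm' on a honeyword hit (likely
    database breach), 'reject' otherwise."""
    if submitted not in sweetwords:
        return "reject"
    return "ok" if sweetwords.index(submitted) == real_index else "alarm"
```

An attacker who steals the credential database sees all sweetwords but not the index, so submitting any stolen decoy triggers the alarm, which is the detection signal honeyword systems rely on.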