2026-07-15 2026, Volume 20 Issue 7

  • RESEARCH ARTICLE
    Yining ZHENG , Haiyang WEI , Jiahao LU , Linqi YIN , Yunke ZHANG , Chengguo XU , Hetao CUI , Tianxiang SUN , Shuang CHEN , Xipeng QIU

    Large Language Models (LLMs) have demonstrated the capability to utilize tools after training. However, there remains limited understanding of how to optimally enhance this ability. In this paper, we focus on the in-context tool use of LLMs and investigate effective methods to enable and improve this capability. Through preliminary analysis, three key factors influencing in-context tool use are identified: (1) the number of tools, (2) the number of instances per tool, and (3) model parameter size. Moreover, RapidTools, a large and high-quality tool-use dataset, is constructed to investigate these factors through two experimental series that vary the number of tools and the number of instances per tool in the training data. Experimental results show that increasing the model parameter size and the number of tools in the training data consistently enhances performance, whereas increasing the number of instances per tool produces mixed effects. These findings offer direction for future work on tool use in LLMs.

  • RESEARCH ARTICLE
    Ruiqing CHU , Xiao FU , Bin LUO , Jin SHI , Xiaoyang ZHOU

    Robot vision systems are integral to the autonomous functioning of robots, enabling tasks such as object recognition, navigation, and interaction with the environment. Nonetheless, these systems are highly prone to data poisoning and adversarial attacks, which can undermine their effectiveness and reliability. This paper investigates the relationship between these two types of attacks, with a particular focus on their similarities in feature space distribution and sensitivity to mutations in robot vision models. By enhancing existing adversarial example detection methods, we make them more effective at defending against data poisoning attacks in robot vision systems. Experimental results show that our improved defense methods not only protect against various types of data poisoning attacks but often outperform techniques specifically designed for such attacks, significantly enhancing the robustness and security of robot vision systems in real-world scenarios.

  • RESEARCH ARTICLE
    Youjiang FANG , Liang ZHANG , Shihao WANG , Wenyuan ZHANG , Yuxin WANG , Yuanyuan LIU , Xiaopeng WEI , Xin YANG

    Recent work on multimodal sarcasm detection (MSD) has made significant progress in understanding the interplay between textual and visual cues. However, existing methods tend to overemphasize cross-modal semantic alignment, consequently neglecting sarcasm cues that are independently embedded within each modality. In this paper, we present DCPNet, a novel Dual-channel Cross-modal Perception Network, to integrate unimodal and cross-modal features via the incorporation of comprehensive structural semantics. To capture rich topological relationships within each modality, we introduce a Graph Topology Extraction and Enhancement Module (GTEE) that builds graph structures from both text and image features, facilitating deeper semantic representation. Additionally, we propose a Cross-Modal Multi-Scale Feature Fusion (CMFF) module that aligns and integrates features from both text and image at multiple scales, ensuring the capture of comprehensive contextual information. An attention mechanism is incorporated to assign appropriate weights to the textual and visual features, thereby optimizing the fusion process for more accurate sarcasm detection. Extensive experiments conducted on the MMSD and MMSD2.0 benchmark datasets demonstrate that DCPNet outperforms existing state-of-the-art (SOTA) methods in both accuracy and robustness.

  • RESEARCH ARTICLE
    Yong CHEN , Wenjie LIU , Xiaoning WU , Nuo CHEN , Zhi ZHENG , Tong XU , Enhong CHEN

    Dynamic updating of intelligence knowledge graphs has emerged as a significant research topic for a wide range of applications. However, as intelligence data continuously accumulates, the dynamic update process of knowledge graphs suffers from inaccuracy caused by the complexity of incremental data and by noise interference. To address this issue, we propose a novel Graph Embedding-based Dynamic Update Method (GEDUM) for intelligence knowledge graphs, which comprehensively considers the dynamic evolution characteristics of intelligence data and optimizes the updating of knowledge graphs through embedding networks. Specifically, we design a Local-to-Global Feature Aggregation Module (L2GFAM) for learning global graph embeddings, deeply exploring and optimizing intrinsic features of graph nodes and edges. Building on this, an Attention-guided Weighted Fusion Strategy (AWFS) is proposed to efficiently merge and update embeddings of local subgraphs and newly added graph components, taking into account the correlation and complementarity between new and existing data. Extensive validation on a real-world dataset demonstrates the significant superiority of our proposed solution over traditional methods in handling dynamically evolving intelligence data.

  • REVIEW ARTICLE
    Zhihang YI , Hairong WANG , Fangping CHEN , Zhaojing XU , Jianling YANG

    Recent advances in deep learning have significantly improved recommendation systems. However, these methods often rely heavily on labeled data, leaving challenges like data sparsity and the cold-start problem unresolved. Self-supervised learning, particularly Graph Contrastive Learning (GCL), has emerged as a powerful approach to mitigate these issues by generating informative views from unlabeled data, attracting considerable attention in recent years. This survey provides a timely and comprehensive review of current GCL-based recommendation methods. First, it introduces a comprehensive framework and taxonomy for view construction in GCL for recommendation systems, dividing it into three main types: structure generation, feature generation, and modality generation. Each category is analyzed in detail, offering insights into their methodologies, strengths, and limitations. Comparative experiments and visualization experiments are conducted on three public datasets, analyzing the complexity of various methods to guide the selection of appropriate approaches. The survey also highlights existing limitations and proposes future research directions along with potential roadmaps to inspire innovative solutions in recommendation systems.

  • LETTER
    Yujia HUANG , Jinran WU , Zhe DING , Xi’an LI
  • LETTER
    Xiang GENG , Ming ZHU , Jiahuan LI , Zhejian LAI , Wei ZOU , Shuaijie SHE , Jiaxin GUO , Xiaofeng ZHAO , Yinglu LI , Yuang LI , Chang SU , Yanqing ZHAO , Xinglin LYU , Min ZHANG , Jiajun CHEN , Hao YANG , Shujian HUANG
  • LETTER
    Yong LUO , Yan HUANG , Songfeng LU , Xiaofei YIN , Shaorui XIE , Yiting WENG
  • RESEARCH ARTICLE
    Chao LI , Zi-Yuan LIANG , Fan ZHANG , Jian LONG , Bing-Sheng ZHANG , Jian LIU

    Private information retrieval (PIR) allows a client to privately request a block of data from a database such that no information about the queried block is revealed to the database owner. With the rapid rise of cloud computing, data is often shared across multiple servers, making multi-server PIR a promising privacy-enhancing technology. As the demand for faster keyword PIR protocols increases, current single-server PIR schemes suffer from significant computational and communication costs, while two-server PIR schemes demonstrate superior performance in this regard. In this paper, we address the problem of keyword PIR against a semi-honest adversary who can corrupt at most one party in our protocols. A feasible two-server scheme, DPF-PIR, is presented, inspired by the original employment of the distributed point function. Without the need to download a “hint” about the database, DPF-PIR achieves throughput comparable to that of the state-of-the-art single-server scheme, SimplePIR, and is 25.5× faster than previous schemes. Meanwhile, the communication cost of DPF-PIR, which exhibits logarithmic complexity, is significantly lower than that of other schemes; for example, it is less than 2% of the communication cost of SimplePIR. We also present a variant of our scheme, PDPF-PIR, rendering the non-collusion assumption more acceptable in practice. Despite the decreased throughput due to heavier computational costs, PDPF-PIR is still at least 2× faster than previous single-server schemes.
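The two-server idea in the abstract above can be illustrated with a minimal sketch. This is not DPF-PIR itself: it is the classic linear-communication two-server scheme for index (not keyword) retrieval, in which the client splits a one-hot selection vector into two XOR shares. A distributed point function can be thought of as a way to compress such shares into short keys. All function names here (`make_queries`, `answer`, `reconstruct`, `retrieve`) are hypothetical, and the sketch assumes equal-length blocks and two non-colluding, semi-honest servers.

```python
import secrets


def make_queries(n, index):
    """Split the one-hot selection vector for `index` into two XOR shares.

    Each share on its own is a uniformly random bit vector, so a single
    server learns nothing about `index`.
    """
    share_a = [secrets.randbelow(2) for _ in range(n)]
    share_b = [bit ^ (1 if j == index else 0) for j, bit in enumerate(share_a)]
    return share_a, share_b


def answer(database, share):
    """Server side: XOR together every block whose share bit is 1."""
    acc = bytes(len(database[0]))  # all-zero block of the right length
    for block, bit in zip(database, share):
        if bit:
            acc = bytes(x ^ y for x, y in zip(acc, block))
    return acc


def reconstruct(ans_a, ans_b):
    """Client side: XOR the two answers; all blocks except the target cancel."""
    return bytes(x ^ y for x, y in zip(ans_a, ans_b))


def retrieve(database, index):
    """Run the whole exchange locally (both servers hold the same database)."""
    query_a, query_b = make_queries(len(database), index)
    return reconstruct(answer(database, query_a), answer(database, query_b))
```

In this naive form each query is as long as the database has blocks, which is the cost that a logarithmic-size key would avoid.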

  • REVIEW ARTICLE
    Yan JIA , Yuxin SONG , Zihou LIU , Qingyin TAN , Yang SONG , Yu ZHANG , Zheli LIU

    The Consumer Internet of Things (CIoT), a notable segment within the IoT domain, involves the integration of IoT technology into consumer electronics and devices, such as smart homes and smart wearables. Compared to traditional IoT fields, CIoT differs notably in target users, product types, and design approaches. While offering convenience to users, it also raises new security and privacy concerns. Network traffic analysis, a widely used technique in the security community, has been extensively applied to investigate these concerns about CIoT. Compared to traditional network traffic analysis in fields like mobile apps and websites, CIoT introduces unique characteristics that pose new challenges and research opportunities. Researchers have made significant contributions in this area. To aid researchers in understanding the application of traffic analysis tools for assessing CIoT security and privacy risks, this survey reviews 310 publications on traffic analysis within the CIoT security and privacy domain from January 2018 to June 2024, focusing on three research questions. Our work: 1) outlines the CIoT traffic analysis process and highlights its differences from general network traffic analysis; 2) summarizes and classifies existing research into four categories according to its application objectives: device fingerprinting, user activity inference, malicious traffic detection, and measurement; 3) explores emerging challenges and potential future research directions based on each step of the CIoT traffic analysis process. This will provide new insights to the community and guide the industry towards safer product designs.

  • LETTER
    Hutao SONG , Hua GUO , Fengju GAO , Xiyong ZHANG , Jianwei LIU
  • RESEARCH ARTICLE
    Naichao WANG , Yihe DIWU , Mingchen FENG , Yuchen ZHANG , Xiujuan LEI

    Drug-target interaction (DTI) is critical for drug discovery, providing insights into novel therapies. The development of language models has provided strong support for DTI prediction. However, Transformer-based self-attention mechanisms often fail to capture fine-grained drug-target interactions, and single-mode feature representations limit the ability to fully characterize DTI. To address these issues, this study proposes a novel prediction method, DTIBFAI, based on BERT and the Informer model. DTIBFAI preprocesses drug and protein sequences using ChemBERTa and BioBERT, while incorporating molecular fingerprints and dipeptide composition features to augment the richness and representativeness of the features. Additionally, this study integrates a modified Informer for the DTI prediction problem. The modified Informer augments feature embeddings and effectively captures the complex interaction patterns within sequence data. The performance of the DTIBFAI model is evaluated by comparing it with several state-of-the-art methods for drug-target interaction prediction. Experimental results demonstrate that DTIBFAI significantly outperforms these methods on the evaluated datasets, achieving AUROC and AUPRC scores of 0.9661 and 0.9673, respectively. Case studies reveal the model’s ability to identify novel DTIs, including previously unrecorded interactions, validated for their biological plausibility. These results demonstrate the potential of DTIBFAI in advancing DTI prediction and its application in drug discovery. The code and data of DTIBFAI are available at github.com/KNDF001/DTIBFAI-Drug-target-Interaction-Prediction-Based-on-BERT-and-Feature-Augment-of-Informer.
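As an illustration of one of the hand-crafted features named in the abstract above, the sketch below computes a dipeptide composition vector for a protein sequence using the common textbook definition: the relative frequency of each of the 400 ordered pairs of standard amino acids. It is not taken from the DTIBFAI code base; the function name `dipeptide_composition`, the choice to skip pairs containing non-standard residues, and the normalization by the number of counted pairs are all assumptions made for this example.

```python
from itertools import product

# The 20 standard amino acids in one-letter code.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
# All 400 ordered pairs, in a fixed order that defines the feature layout.
DIPEPTIDES = ["".join(pair) for pair in product(AMINO_ACIDS, repeat=2)]


def dipeptide_composition(sequence):
    """Return a 400-dimensional vector of dipeptide frequencies.

    Assumption: overlapping pairs are counted, pairs containing a
    non-standard residue (e.g. 'X') are skipped, and frequencies are
    normalized by the number of pairs actually counted.
    """
    sequence = sequence.upper()
    counts = dict.fromkeys(DIPEPTIDES, 0)
    total = 0
    for i in range(len(sequence) - 1):
        pair = sequence[i:i + 2]
        if pair in counts:
            counts[pair] += 1
            total += 1
    if total == 0:
        return [0.0] * len(DIPEPTIDES)
    return [counts[d] / total for d in DIPEPTIDES]
```

A vector of this kind would typically be concatenated with learned sequence embeddings before being passed to a downstream predictor.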

  • RESEARCH ARTICLE
    Xu ZHOU , Xusheng XU , Shenggen ZHENG , Le LUO

    Distributed quantum computation has garnered immense attention in the noisy intermediate-scale quantum (NISQ) era, where each computational node necessitates fewer qubits and quantum gates. In this paper, we focus on a generalized search problem involving multiple targets within an unordered database and propose a Distributed Exact Generalized Grover’s Algorithm (DEGGA) to address this challenge by decomposing it into an arbitrary number $t$ of components, where $2 \le t \le n$. Specifically, (1) our algorithm ensures accuracy, with a theoretical probability of identifying the target states of 100%; (2) if the number of targets is fixed, the pivotal factor influencing the circuit depth of DEGGA is the partitioning strategy, rather than the magnitude of $n$; (3) the maximum number of qubits required by our method at a single node is $\max(n_0, n_1, \ldots, n_{t-1})$, where $n_j$ represents the number of qubits for the $j$th node and satisfies $\sum_{j=0}^{t-1} n_j = n$, eliminating the need for auxiliary qubits; (4) we elucidate the resolutions (two-node and three-node) of a particular generalized search issue incorporating two goal strings (000000 and 111111) by applying DEGGA. The feasibility and effectiveness of our suggested approach are further demonstrated by executing the quantum circuits on MindSpore Quantum (a quantum simulation software). Finally, through the decomposition of multi-qubit gates, DEGGA diminishes the utilization of quantum gates by 90.7% and decreases the circuit depth by 91.3% in comparison to the modified Grover’s algorithm by Long. It is increasingly evident that distributed quantum algorithms offer augmented practicality.
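For context on why an exact variant matters, the sketch below classically simulates the textbook (non-exact, non-distributed) Grover iteration on the two goal strings mentioned in the abstract above and compares the result with the standard closed form $\sin^2((2k+1)\theta)$, where $\theta = \arcsin\sqrt{M/N}$. It is not DEGGA and not Long's modified algorithm; the function names and the choice of four iterations are assumptions made for this illustration.

```python
import math


def grover_success_probability(n_qubits, targets, iterations):
    """Simulate textbook Grover search on a real-valued state vector.

    `targets` is a list of bit strings; the return value is the total
    probability of measuring any of them after `iterations` rounds.
    """
    size = 2 ** n_qubits
    marked = {int(t, 2) for t in targets}
    amps = [1 / math.sqrt(size)] * size  # uniform superposition
    for _ in range(iterations):
        # Oracle: flip the sign of every marked amplitude.
        amps = [-a if i in marked else a for i, a in enumerate(amps)]
        # Diffusion: reflect every amplitude about the mean.
        mean = sum(amps) / size
        amps = [2 * mean - a for a in amps]
    return sum(amps[i] ** 2 for i in marked)


def closed_form(n_qubits, n_targets, iterations):
    """Standard success probability sin^2((2k+1) * theta)."""
    theta = math.asin(math.sqrt(n_targets / 2 ** n_qubits))
    return math.sin((2 * iterations + 1) * theta) ** 2


if __name__ == "__main__":
    p = grover_success_probability(6, ["000000", "111111"], 4)
    print(p, closed_form(6, 2, 4))
```

With six qubits and two targets, the textbook iteration gets close to, but does not reach, certainty, which is the gap that exact variants of Grover's algorithm are designed to close.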


Monthly

ISSN 2095-2228 (Print)
ISSN 2095-2236 (Online)
CN 10-1014/TP