2025-04-15 2025, Volume 26 Issue 1
  • Perspective
    Wenguan WANG, Yi YANG, Yunhe PAN

    Visual knowledge is a new form of knowledge representation that can encapsulate visual concepts and their relations in a succinct, comprehensive, and interpretable manner, with a deep root in cognitive psychology. As the knowledge of the visual world has been identified as an indispensable component of human cognition and intelligence, visual knowledge is poised to have a pivotal role in establishing machine intelligence. With the recent advance of artificial intelligence (AI) techniques, large AI models (or foundation models) have emerged as a potent tool capable of extracting versatile patterns from broad data as implicit knowledge, and abstracting them into an enormous number of numeric parameters. To pave the way for creating visual-knowledge-empowered AI machines in this coming wave, we present a timely review that investigates the origins and development of visual knowledge in the pre-big-model era, and accentuates the opportunities and unique role of visual knowledge in the big model era.

  • Perspective
    Jing YANG, Xingyuan DAI, Yisheng LV, Levente KOVÁCS, Fei-Yue WANG
  • Chao JING, Jianwu XU

    Although collaborative edge computing (CEC) systems are beneficial in enhancing the performance of mobile edge computing (MEC), the issue of user privacy leakage becomes prominent during task offloading. To address this issue, we design a privacy-preservation-aware delay optimization task-offloading algorithm (PPDO) in a CEC system. By considering location and usage pattern privacy protection, we establish a privacy task model to interfere with the edge server and ensure user privacy. To address the extra delay arising from privacy protection, we subsequently leverage a Markov decision process (MDP) policy-iteration-based algorithm to minimize delays without compromising privacy. To accelerate the MDP operation, we also develop an extension that improves the PPDO by optimizing the action set. Finally, a comprehensive simulation is conducted using the edge user allocation (EUA) dataset. The results demonstrate that PPDO achieves an optimal trade-off between privacy protection and delay, with a minimum delay compared with existing algorithms. Moreover, we examine the advantages and disadvantages of the improved PPDO.
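
The policy-iteration step mentioned in the abstract above can be illustrated with a generic textbook sketch. This is not the PPDO algorithm itself: the two states, the two actions, the transition probabilities, and the delay values below are all hypothetical, chosen only to show how policy iteration picks the delay-minimizing offloading action when the reward is the negative delay.

```python
def policy_iteration(states, actions, transition, reward, gamma=0.9, tol=1e-8):
    """Generic policy iteration for a small finite MDP.

    transition[s][a] is a list of (probability, next_state) pairs.
    reward[s][a] is the immediate reward (here: negative delay).
    """
    def q_value(s, a, value):
        return reward[s][a] + gamma * sum(p * value[s2] for p, s2 in transition[s][a])

    policy = {s: actions[0] for s in states}
    value = {s: 0.0 for s in states}
    while True:
        # Policy evaluation: sweep the Bellman expectation update to convergence.
        while True:
            delta = 0.0
            for s in states:
                new_v = q_value(s, policy[s], value)
                delta = max(delta, abs(new_v - value[s]))
                value[s] = new_v
            if delta < tol:
                break
        # Policy improvement: switch to a strictly better action where one exists.
        stable = True
        for s in states:
            best = max(actions, key=lambda a: q_value(s, a, value))
            if q_value(s, best, value) > q_value(s, policy[s], value) + 1e-12:
                policy[s] = best
                stable = False
        if stable:
            return policy, value


# Hypothetical toy model: the edge server is either "idle" or "busy".
states = ["idle", "busy"]
actions = ["local", "offload"]
transition = {
    "idle": {"local": [(1.0, "idle")], "offload": [(0.8, "idle"), (0.2, "busy")]},
    "busy": {"local": [(1.0, "idle")], "offload": [(1.0, "busy")]},
}
# Rewards are negative delays (arbitrary units, assumed for illustration).
reward = {
    "idle": {"local": -4.0, "offload": -1.0},
    "busy": {"local": -4.0, "offload": -6.0},
}

policy, value = policy_iteration(states, actions, transition, reward)
print(policy)
```

Shrinking the action set, as the abstract's extension does, reduces the work in the improvement step, since each state only compares the actions that remain.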

  • Ruipeng ZHANG, Ziqing FAN, Jiangchao YAO, Ya ZHANG, Yanfeng WANG

    Cross-silo federated learning (FL), which benefits from relatively abundant data and rich computing power, is drawing increasing focus due to the significant transformations that foundation models (FMs) are instigating in the artificial intelligence field. The intensified data heterogeneity issue of this area, unlike that in cross-device FL, is caused mainly by substantial data volumes and distribution shifts across clients, which requires algorithms to comprehensively consider the personalization and generalization balance. In this paper, we aim to address the objective of generalized and personalized federated learning (GPFL) by enhancing the global model’s cross-domain generalization capabilities and simultaneously improving the personalization performance of local training clients. By investigating the fairness of performance distribution within the federation system, we explore a new connection between generalization gap and aggregation weights established in previous studies, culminating in the fairness-guided federated training for generalization and personalization (FFT-GP) approach. FFT-GP integrates a fairness-aware aggregation (FAA) approach to minimize the generalization gap variance among training clients and a meta-learning strategy that aligns local training with the global model’s feature distribution, thereby balancing generalization and personalization. Our extensive experimental results demonstrate FFT-GP’s superior efficacy compared to existing models, showcasing its potential to enhance FL systems across a variety of practical scenarios.

  • Shuai REN, Hao GONG, Suya ZHENG

    Three-dimensional (3D) point cloud information hiding algorithms are mainly concentrated in the spatial domain. Existing spatial domain steganalysis algorithms are subject to more disturbing factors during the analysis and detection process, and can only be applied to 3D mesh objects, so there is a lack of steganalysis algorithms for 3D point cloud objects. To change the fact that steganalysis is limited to 3D mesh and eliminate the redundant features in the 3D mesh steganalysis feature set, we propose a 3D point cloud steganalysis algorithm based on composite operator feature enhancement. First, the 3D point cloud is normalized and smoothed. Second, the feature points that may contain secret information in 3D point clouds and their neighboring points are extracted as the feature enhancement region by the improved 3DHarris-ISS composite operator. Feature enhancement is performed in the feature enhancement region to form a feature-enhanced 3D point cloud, which highlights the feature points while suppressing the interference created by the rest of the vertices. Third, the existing 3D mesh feature set is screened to reduce the data redundancy of more relevant features, and the newly proposed local neighborhood feature set is added to the screened feature set to form the 3D point cloud steganography feature set POINT72. Finally, the steganographic features are extracted from the enhanced 3D point cloud using the POINT72 feature set, and steganalysis experiments are carried out. Experimental analysis shows that the algorithm can accurately analyze the 3D point cloud’s spatial steganography and determine whether the 3D point cloud contains hidden information, so the accuracy of 3D point cloud steganalysis, under the prerequisite of missing edge and face information, is close to that of the existing 3D mesh steganalysis algorithms.
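
The abstract above does not say how the point cloud is normalized. As a minimal sketch under one common assumption (translate the centroid to the origin, then scale so the farthest point lies on the unit sphere), the first step might look like the following. The function name `normalize_point_cloud` is hypothetical, and the smoothing step is not shown.

```python
import math


def normalize_point_cloud(points):
    """Center a list of (x, y, z) points and scale them into the unit sphere.

    Assumed convention: centroid at the origin, maximum radius equal to 1.
    """
    n = len(points)
    centroid = tuple(sum(p[i] for p in points) / n for i in range(3))
    centered = [tuple(p[i] - centroid[i] for i in range(3)) for p in points]
    radius = max(math.sqrt(sum(c * c for c in p)) for p in centered)
    if radius == 0.0:
        # All points coincide, so there is nothing to scale.
        return centered
    return [tuple(c / radius for c in p) for p in centered]


cloud = [(0.0, 0.0, 0.0), (2.0, 0.0, 0.0), (0.0, 2.0, 0.0), (2.0, 2.0, 0.0)]
normalized = normalize_point_cloud(cloud)
```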

  • Binkun LIU, Yu KANG, Yang CAO, Yunbo ZHAO, Zhenyi XU

    Recently, deep-learning-based city flow prediction has been extensively used in the establishment of smart cities. These methods are data-hungry, making them unscalable to areas lacking data. Although transfer learning can use data-rich source domains to assist target domain cities in city flow prediction, the performance of existing methods cannot meet the needs of actual use, because the long-distance road network connectivity is ignored. To solve this problem, we propose a transfer learning method based on spatiotemporal graph convolution, in which we construct a co-occurrence space between the source and target domains, and then align the mapping of the source and target domains’ data in this space, to achieve the transfer learning of the source city flow prediction model on the target domain. Specifically, a dynamic spatiotemporal graph convolution module along with a temporal encoder is devised to simultaneously capture the concurrent spatiotemporal features, which imply the inherent relationship among the road network structures, human travel habits, and city bike flow. Then, these concurrent features are leveraged as cross-city invariant representations and nonlinearly spanned to a co-occurrence space. The target domain features are thereby aligned with the source domain features in the co-occurrence space by using a Mahalanobis distance loss, to achieve cross-city bike flow prediction. The proposed method is evaluated on the public bike flow datasets in Chicago, New York, and Washington in 2015, and significantly outperforms state-of-the-art techniques.
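
The Mahalanobis distance used in the alignment loss above has the standard form d(u, v) = sqrt((u - v)^T S^{-1} (u - v)), where S is a covariance matrix. The sketch below only evaluates that distance for two feature vectors in plain Python. How the paper estimates S, and how the distance enters the training loss, is not stated in the abstract, so the helper names and the example matrices are assumptions.

```python
import math


def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("covariance matrix is singular")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    # Back substitution on the upper-triangular system.
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def mahalanobis(u, v, cov):
    """Mahalanobis distance between vectors u and v under covariance cov."""
    d = [ui - vi for ui, vi in zip(u, v)]
    y = solve_linear(cov, d)  # y = cov^{-1} d, without forming the inverse
    return math.sqrt(sum(di * yi for di, yi in zip(d, y)))


# With an identity covariance the distance reduces to the Euclidean distance.
print(mahalanobis([0.0, 0.0], [3.0, 4.0], [[1.0, 0.0], [0.0, 1.0]]))
```

Unlike the Euclidean distance, this measure downweights directions in which the features vary strongly, which is one plausible reason to prefer it when aligning features from different cities.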

  • Yinhong XIANG, Kaiqing ZHOU, Arezoo SARKHEYLI-HÄGELE, Yusliza YUSOFF, Diwen KANG, Azlan Mohd ZAIN

    The state space explosion, a challenge analogous to that encountered in a Petri net (PN), has constrained the extensive study of fuzzy Petri nets (FPNs). Current reasoning algorithms employing FPNs, which operate through forward, backward, and bidirectional mechanisms, are examined. These algorithms streamline the inference process by eliminating irrelevant components of the FPN. However, as the scale of the FPN grows, the complexity of these algorithms escalates sharply, posing a significant challenge for practical applications. To address the state explosion issue, this work introduces a parallel bidirectional reasoning algorithm for an FPN that utilizes reverse and decomposition strategies to optimize the implementation process. The algorithm involves hierarchically dividing a large-scale FPN into two sub-FPNs, followed by a converse operation to generate the reversal sub-FPN for the right-sub-FPN. The detailed mapping between the original and reversed FPNs is thoroughly discussed. Parallel reasoning operations are then conducted on the left-sub-FPN and the resulting reversal right-sub-FPN, with the final result derived by computing the Euclidean distance between the outcomes from the output places of the two sub-FPNs. A case study is presented to illustrate the implementation process, demonstrating the algorithm’s significant enhancement of inference efficiency and substantial reduction in execution time.

  • Yanqi SHI, Peng LIANG, Hao ZHENG, Linbo QIAO, Dongsheng LI

    Large-scale deep learning models are trained in a distributed manner due to memory and computing resource limitations. Few existing strategy generation approaches take optimal memory minimization as the objective. To fill this gap, we propose a novel algorithm that generates optimal parallelism strategies with the constraint of minimal memory redundancy. We propose a novel redundant memory cost model to calculate the memory overhead of each operator in a given parallel strategy. To generate the optimal parallelism strategy, we formulate the parallelism strategy search problem as an integer linear programming problem and use an efficient solver to find minimal-memory intra-operator parallelism strategies. Furthermore, the proposed algorithm has been extended and implemented in a multi-dimensional parallel training framework and is characterized by high throughput and minimal memory redundancy. Experimental results demonstrate that our approach achieves memory savings of up to 67% compared to the latest Megatron-LM strategies; meanwhile, the throughput gap between our approach and its counterparts is small.

  • Qingsong ZHOU, Jialong QIAN, Zhongping YANG, Chao HUANG, Qinxian CHEN, Yibo XU, Zhengkai WEI

    Distributed precision jamming (DPJ) is a novel blanket jamming concept in electronic warfare, which delivers the jamming resource to the opponent equipment precisely and ensures that friendly devices are not affected. Robust jamming performance and low hardware burden on the jammers are crucial for practical DPJ implementation. To achieve these goals, we study the robust design of wideband constant modulus (CM) discrete phase waveform for DPJ, where the worst-case combined power spectrum (CPS) of both the opponent and friendly devices is considered in the objective function, and the CM discrete phase constraints are used to design the wideband waveform. Specifically, the resultant mathematical model is a large-scale minimax multi-objective optimization problem (MOP) with CM and discrete phase constraints. To tackle the challenging MOP, we transform it into a single-objective minimization problem using the Lp-norm and Pareto framework. For the approximation problem, we propose the Riemannian conjugate gradient for CM discrete phase constraints (RCG-CMDPC) algorithm with low computational complexity, which leverages the complex circle manifold and a projection method to satisfy the CM discrete phase constraints within the RCG framework. Numerical examples demonstrate the superior robust DPJ effectiveness and computational efficiency compared to other competing algorithms.
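
The constant modulus (CM) discrete phase constraint described above restricts every waveform sample to a fixed magnitude and to one of a finite set of phases. As a minimal sketch of one way to enforce it, the function below maps each complex sample to the nearest allowed phase. It assumes M uniformly spaced phases 2πk/M; the projection actually used inside RCG-CMDPC may differ, and the function name is hypothetical.

```python
import cmath
import math


def project_cm_discrete_phase(x, num_phases, modulus=1.0):
    """Map each complex sample to the closest point of the form
    modulus * exp(j * 2 * pi * k / num_phases), with k = 0, ..., num_phases - 1.
    """
    step = 2.0 * math.pi / num_phases
    projected = []
    for z in x:
        # cmath.phase returns an angle in [-pi, pi]; the modulo wraps
        # negative indices back into the range 0..num_phases-1.
        k = round(cmath.phase(z) / step) % num_phases
        projected.append(modulus * cmath.exp(1j * k * step))
    return projected


# Quadrature phases (M = 4): the allowed values are 1, j, -1, -j.
waveform = [1 + 0.1j, -1 + 0.1j, 0.1 - 2j]
quantized = project_cm_discrete_phase(waveform, num_phases=4)
```

Because all allowed points lie on the same circle, picking the nearest angle also picks the nearest point in Euclidean distance, so this is the entrywise projection onto the constraint set.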

  • Haiquan LU, Yong ZENG

    Delay alignment modulation (DAM) was recently proposed as an effective technique to address the intersymbol interference (ISI) issue, which circumvents the conventional channel equalization and multi-carrier transmission. Moreover, wireless communications are vulnerable to malicious eavesdropping and attacks due to their inherent open and broadcast nature. In particular, DAM not only eliminates the ISI at the desired receiver but may also introduce ISI to other locations, and thus is quite promising for secure communications. This paper considers near-field secure wireless communication with DAM. To gain useful insights, it is first shown that when the antenna number of Alice is much larger than the number of multipaths for Bob and Eve, the delay compensation and low-complexity path-based maximal-ratio transmission (MRT) beamforming achieve a communication free of ISI and information leakage, owing to the asymptotically orthogonal property brought by the near-field nonuniform spherical wave (NUSW). The secrecy rate performance of path-based zero-forcing (ZF) beamforming toward ISI-free communication is then evaluated. Furthermore, the path-based optimized DAM beamforming scheme is proposed to maximize the secrecy rate, by considering the general case in the presence of some tolerable ISI. As a comparison, the benchmarking scheme of the artificial noise (AN) based orthogonal frequency-division multiplexing (OFDM) is considered. Simulation results show that DAM achieves a higher secrecy rate and lower peak-to-average-power ratio (PAPR) than the AN-based OFDM.
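
The two metrics compared above have standard textbook definitions, sketched here for reference. The secrecy rate is taken as the positive part of the gap between the rates of Bob and Eve, and the PAPR as the peak sample power over the mean sample power. The SINR values and sample sequences are made-up inputs, and the paper's exact expressions (for example, how residual ISI enters the SINR) are not given in the abstract.

```python
import math


def secrecy_rate(sinr_bob, sinr_eve):
    """Secrecy rate in bit/s/Hz: [log2(1 + SINR_Bob) - log2(1 + SINR_Eve)]^+."""
    return max(0.0, math.log2(1.0 + sinr_bob) - math.log2(1.0 + sinr_eve))


def papr_db(samples):
    """Peak-to-average power ratio of a complex sample sequence, in dB."""
    powers = [abs(s) ** 2 for s in samples]
    return 10.0 * math.log10(max(powers) / (sum(powers) / len(powers)))


# Linear (not dB) SINRs, chosen only for illustration.
print(secrecy_rate(15.0, 3.0))          # Bob's link is better than Eve's
print(secrecy_rate(1.0, 3.0))           # Eve's link is better: rate clips to 0
print(papr_db([1, 1j, -1, -1j]))        # constant-envelope sequence
```

A constant-envelope sequence has a PAPR of 0 dB, which is the lower bound. Multi-carrier signals such as OFDM typically sit well above it.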

  • Zhongpeng NI, Jing XIA, Xinyu ZHOU, Wa KONG, Wence ZHANG, Xiaowei ZHU

    The present paper proposes an optimization design method for the Doherty output matching network (OMN) using impedance-phase hybrid objective function constraints, which possesses the capability of enhancing the efficiency consistency of the Doherty power amplifier (DPA) using integrated enhancing reactance (IER) during the back-off power (BOP) range. By calculating the desired reactance for an extended BOP range and combining it with the two-impedance matching method, the S-parameters of the OMN are obtained. Meanwhile, the impedance and phase constraints of the OMN are proposed to narrow the distribution range of the IER. Furthermore, a fragment-type structure is employed in the OMN optimization so as to enhance the flexibility of the circuit optimization design. To validate the proposed method, a 1.7-2.5 GHz symmetric DPA with a large BOP range was designed and fabricated. Measurement results demonstrate that across the entire operating frequency band, the saturated output power is >44 dBm, and the efficiency ranges from 45% to 55% at a 9-dB BOP.
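
For readers less used to logarithmic power units, the figures quoted above can be converted with the standard relation P[W] = 10^((P[dBm] - 30) / 10). The short sketch below applies it to the 44 dBm lower bound on saturated power and to the level 9 dB below that bound. The function name is a hypothetical helper.

```python
def dbm_to_watts(power_dbm):
    """Convert a power level in dBm to watts (0 dBm = 1 mW)."""
    return 10.0 ** ((power_dbm - 30.0) / 10.0)


saturated_w = dbm_to_watts(44.0)        # lower bound on saturated output power
backoff_w = dbm_to_watts(44.0 - 9.0)    # the same bound at 9-dB back-off
print(f"{saturated_w:.1f} W saturated, {backoff_w:.2f} W at 9-dB back-off")
```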