2025-04-22 2025, Volume 26 Issue 3
  • Review
    Yu TANG, Linbo QIAO, Lujia YIN, Peng LIANG, Ao SHEN, Zhilin YANG, Lizhi ZHANG, Dongsheng LI

    Large-scale models have gained significant attention in a wide range of fields, such as computer vision and natural language processing, due to their effectiveness across various applications. However, a notable hurdle in training these models is the limited memory capacity of graphics processing units (GPUs). In this paper, we present a comprehensive survey of training large-scale models with limited GPU memory. We first examine the factors that contribute to GPU memory consumption during training, namely model parameters, model states, and model activations. We then give an in-depth overview of the research that addresses each of these factors. Finally, we present an outlook on the future of memory optimization in training large-scale language models, emphasizing the need for continued research and innovation in this area. This survey serves as a resource for researchers and practitioners who want to understand the challenges and advances in training large-scale language models with limited GPU memory.
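
As a rough illustration of the three memory consumers named in this abstract, the sketch below adds them up for a given parameter count. It is not taken from the survey. The per-parameter byte counts are assumptions (mixed-precision training with an Adam-style optimizer), "model states" is assumed here to mean gradients plus optimizer states, and the names `MemoryEstimate` and `estimate_training_memory` are hypothetical. Activation memory depends on architecture, batch size, and sequence length, so it is passed in rather than derived.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MemoryEstimate:
    """Byte counts for the three consumers named in the abstract."""
    parameters: int
    model_states: int
    activations: int

    @property
    def total(self) -> int:
        return self.parameters + self.model_states + self.activations


def estimate_training_memory(
    num_params: int,
    activation_bytes: int,
    param_bytes: int = 2,       # assumption: 16-bit working weights
    grad_bytes: int = 2,        # assumption: 16-bit gradients
    optimizer_bytes: int = 12,  # assumption: fp32 master weights + two fp32 Adam moments
) -> MemoryEstimate:
    """Back-of-the-envelope estimate; real frameworks add buffers and fragmentation."""
    return MemoryEstimate(
        parameters=num_params * param_bytes,
        model_states=num_params * (grad_bytes + optimizer_bytes),
        activations=activation_bytes,
    )


if __name__ == "__main__":
    est = estimate_training_memory(num_params=1_000_000_000, activation_bytes=0)
    print(f"{est.total / 1e9:.1f} GB before activations")
```

Under these assumed byte counts, a model with one billion parameters already needs about 16 GB before any activations are stored, which is why all three consumers have to be addressed.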

  • Review
    Bin XIN, Sai LU, Qing WANG, Fang DENG

    The flexible job shop scheduling problem for processing machines and transportation vehicles (FJSP_PT) has garnered significant attention from academia and industry. Due to the inclusion of transportation vehicle scheduling in the scheduling problem of flexible manufacturing systems, solving FJSP_PT becomes more challenging and significantly more practically relevant compared to the flexible job shop scheduling problem. We summarize the assumptions, constraints, objective functions, and benchmarks of FJSP_PT. Then, statistical analysis is conducted on the literature up to 2023, including journals, number of articles published each year, and solution algorithms. We analyze recent literature on FJSP_PT, categorizing it based on algorithms into exact algorithms, heuristic algorithms, meta-heuristic algorithms, and swarm intelligence based algorithms. Finally, the research trends and challenges faced by FJSP_PT are summarized.

  • Yinghao LI, Heyan HUANG, Baojun WANG, Yang GAO

    Chinese spelling correction (CSC) is a task that aims to detect and correct the spelling errors that may occur in Chinese texts. However, the Chinese language exhibits a high degree of complexity, characterized by the presence of multiple phonetic representations known as pinyin, which possess distinct tonal variations that can correspond to various characters. Given the complexity inherent in the Chinese language, the CSC task becomes imperative for ensuring the accuracy and clarity of written communication. Recent research has incorporated external knowledge into models using phonological and visual modalities. However, these methods do not effectively target the use of modality information at the different types of errors. In this paper, we propose a multimodal pretrained language model called DRMSpell for CSC, which takes into consideration the interaction between the modalities. A dynamically reweighting multimodality (DRM) module is introduced to reweight various modalities for obtaining more multimodal information. To fully use the multimodal information obtained and to further strengthen the model, an independent-modality masking strategy (IMS) is proposed to independently mask three modalities of a token in the pretraining stage. Our method achieves state-of-the-art performance on most metrics of widely used benchmarks. Experiments demonstrate that our method is capable of modeling the interactive information between modalities and is also robust to incorrect modal information.

  • Zhichao WANG, Xinhai CHEN, Junjun YAN, Jie LIU

    In computational fluid dynamics (CFD), mesh-smoothing methods are widely used to refine the mesh quality for achieving high-precision numerical simulations. Specifically, optimization-based smoothing is used for high-quality mesh smoothing, but it incurs significant computational overhead. Pioneering works have improved its smoothing efficiency by adopting supervised learning to learn smoothing methods from high-quality meshes. However, they have difficulty smoothing mesh nodes with varying degrees and require data augmentation to address the node input sequence problem. Additionally, the required labeled high-quality meshes further limit the applicability of these methods. In this paper, we present graph-based smoothing mesh net (GMSNet), a lightweight neural network model for intelligent mesh smoothing. GMSNet adopts graph neural networks (GNNs) to extract features of the node’s neighbors and outputs the optimal node position. During smoothing, we also introduce a fault-tolerance mechanism to prevent GMSNet from generating negative volume elements. With a lightweight model, GMSNet can effectively smooth mesh nodes with varying degrees and remain unaffected by the order of input data. A novel loss function, MetricLoss, is developed to eliminate the need for high-quality meshes, which provides stable and rapid convergence during training. We compare GMSNet with commonly used mesh-smoothing methods on two-dimensional (2D) triangle meshes. Experimental results show that GMSNet achieves outstanding mesh-smoothing performance with 5% of the model parameters of the previous model, and offers a speedup of 13.56 times over optimization-based smoothing.
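
To make the idea of smoothing with a guard against negative-volume elements concrete, here is a minimal sketch that swaps in plain Laplacian smoothing for the neural network. It is not GMSNet and not the authors' fault-tolerance mechanism. It only shows the general pattern of proposing a new node position and rejecting it when an incident 2D triangle would become inverted. The function names and the tiny example meshes are hypothetical.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float]
Triangle = Tuple[int, int, int]


def signed_area(a: Point, b: Point, c: Point) -> float:
    """Signed area of triangle abc; positive when the vertices run counter-clockwise."""
    return 0.5 * ((b[0] - a[0]) * (c[1] - a[1]) - (c[0] - a[0]) * (b[1] - a[1]))


def smooth_node(nodes: List[Point], triangles: Sequence[Triangle], node: int) -> Point:
    """Move one free node to the mean of its neighbors unless that inverts an element."""
    incident = [t for t in triangles if node in t]
    neighbors = sorted({v for t in incident for v in t if v != node})
    if not neighbors:
        return nodes[node]
    candidate = (
        sum(nodes[v][0] for v in neighbors) / len(neighbors),
        sum(nodes[v][1] for v in neighbors) / len(neighbors),
    )
    trial = list(nodes)
    trial[node] = candidate
    # Guard: keep the old position if any incident triangle would lose positive area.
    if all(signed_area(*(trial[v] for v in t)) > 0.0 for t in incident):
        return candidate
    return nodes[node]


if __name__ == "__main__":
    # Unit square with an off-center interior node (index 4); triangles are counter-clockwise.
    square = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0), (0.8, 0.7)]
    fan = [(0, 1, 4), (1, 2, 4), (2, 3, 4), (3, 0, 4)]
    print(smooth_node(square, fan, 4))
```

On a convex neighborhood such as the square above, the node moves to the neighbor average. On a concave neighborhood, the average can fall outside the valid region, and the guard leaves the node where it was.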

  • Yuxi HAN, Dequan LI, Yang YANG

    Deep reinforcement learning has shown remarkable capabilities in visual tasks, but it generalizes poorly when the input images contain interference signals; trained agents are therefore hard to apply in a new environment. To enable agents to distinguish between noise signals and important pixels in images, data augmentation techniques and auxiliary networks have proven to be effective solutions. We introduce a novel algorithm, namely, saliency-extracted Q-value by augmentation (SEQA), which encourages the agent to explore unknown states more comprehensively and focus its attention on important information. Specifically, SEQA masks out interfering features and extracts salient features and then updates the mask decoder network with critic losses to encourage the agent to focus on important features and make correct decisions. We evaluate our algorithm on the DeepMind Control generalization benchmark (DMControl-GB), and the experimental results show that our algorithm greatly improves training efficiency and stability. Meanwhile, our algorithm is superior to state-of-the-art reinforcement learning methods in terms of sample efficiency and generalization in most DMControl-GB tasks.

  • Changwen DING, Chuntao SHAO, Siteng ZHOU, Di ZHOU, Runle DU, Jiaqi LIU

    We propose a distributed labeled multi-Bernoulli (LMB) filter based on an efficient label matching method. Conventional distributed LMB filter fusion assumes that the labels among local densities have already been matched. However, considering that the label space of each local posterior is independent, such an assumption is not practical in many applications. To achieve distributed fusion practically, we propose an efficient label matching method derived from the divergence of the arithmetic average (AA) mechanism, and subsequently label-wise LMB filter fusion is performed according to the matching results. Compared with existing label matching methods, the proposed method shows higher performance, especially in low detection probability scenarios. Moreover, to guarantee the consistency and completeness of the fusion outcome, the overall fusion procedure is divided into four stages: pre-fusion, label determination, posterior complement, and uniqueness check. The performance of the proposed label matching distributed LMB filter fusion is demonstrated in a challenging nonlinear bearings-only multi-target tracking (MTT) scenario.

  • Mengyu ZHANG, Zhenxue HE, Yijin WANG, Xiaojun ZHAO, Xiaodan ZHANG, Limin XIAO, Xiang WANG

    The power optimization of mixed polarity Reed–Muller (MPRM) logic circuits is a classic combinatorial optimization problem. Existing optimization approaches often suffer from slow convergence and a propensity to converge to local optima, limiting their effectiveness in achieving optimal power efficiency. First, we propose a novel multi-strategy fusion memetic algorithm (MFMA). MFMA integrates global exploration via the chimp optimization algorithm with local exploration using the coati optimization algorithm based on the optimal position learning and adaptive weight factor (COA-OLA), complemented by population management through truncation selection. Second, leveraging MFMA, we propose a power optimization approach for MPRM logic circuits that searches for the best polarity configuration to minimize circuit power. Experimental results based on Microelectronics Center of North Carolina (MCNC) benchmark circuits demonstrate significant improvements over existing power optimization approaches. MFMA achieves a maximum power saving rate of 72.30% and an average optimization rate of 43.37%; it finds higher-quality solutions faster, validating its effectiveness and superiority in power optimization.

  • Jia DUAN, Luanyun HU, Qiumei XIAO, Meiting LIU, Wenxin YU

    In traditional chaotic encryption algorithms, the state of the chaotic system is strongly correlated with its initial state and parameters, which may lead to periodicity in chaotic sequences. In response, the chaos long short-term memory (Chaos-LSTM) model is constructed by combining chaotic systems with LSTM neural networks. The chaos sequence proliferation (CSP) algorithm is constructed to address the problem that the limited computational accuracy of computers can lead to periodicity in long chaotic sequences, making them unsuitable for encrypting objects with large amounts of data. By combining the Chaos-LSTM model and the CSP algorithm, a geographic information encryption system is proposed. First, the Chaos-LSTM model is used to output chaotic sequences with high spectral entropy (SE) complexity. Then, a shorter chaotic sequence is selected and proliferated using the CSP algorithm to generate chaotic proliferation sequences that match the encrypted object; the randomness of these sequences is analyzed and tested. Finally, using geographic images as encryption objects, the chaotic proliferation sequence is combined with the scrambling and diffusion algorithms to form the encryption system, which is implemented on the ZYNQ platform. Software testing and hardware experiments demonstrate the system’s excellent confidentiality performance and scalability, making it suitable for the confidentiality needs of various encryption objects, with outstanding application value.
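
The scrambling-then-diffusion pipeline mentioned in this abstract can be illustrated with a toy sketch. It uses a plain logistic map as the chaotic source. It does not reproduce the Chaos-LSTM model or the CSP algorithm, and it is not a secure cipher. The key values `x0` and `r`, the burn-in length, and all function names are assumptions chosen for illustration.

```python
from typing import List


def logistic_sequence(x0: float, r: float, n: int, burn_in: int = 100) -> List[float]:
    """Iterate x -> r*x*(1-x) and return n values after discarding a transient."""
    x = x0
    for _ in range(burn_in):
        x = r * x * (1.0 - x)
    values = []
    for _ in range(n):
        x = r * x * (1.0 - x)
        values.append(x)
    return values


def _permutation(seq: List[float]) -> List[int]:
    """Scrambling order: positions sorted by their chaotic value."""
    return sorted(range(len(seq)), key=seq.__getitem__)


def _keystream(seq: List[float]) -> List[int]:
    """Quantize each chaotic value to one key byte."""
    return [int(v * 256) % 256 for v in seq]


def encrypt(data: bytes, x0: float = 0.3141, r: float = 3.99) -> bytes:
    seq = logistic_sequence(x0, r, len(data))
    scrambled = bytes(data[i] for i in _permutation(seq))  # scrambling: reorder bytes
    out = bytearray()
    prev = 0
    for b, k in zip(scrambled, _keystream(seq)):
        prev = b ^ k ^ prev  # diffusion: each cipher byte depends on the previous one
        out.append(prev)
    return bytes(out)


def decrypt(cipher: bytes, x0: float = 0.3141, r: float = 3.99) -> bytes:
    seq = logistic_sequence(x0, r, len(cipher))
    scrambled = bytearray()
    prev = 0
    for c, k in zip(cipher, _keystream(seq)):
        scrambled.append(c ^ k ^ prev)  # undo diffusion
        prev = c
    plain = bytearray(len(cipher))
    for pos, src in enumerate(_permutation(seq)):
        plain[src] = scrambled[pos]  # undo scrambling
    return bytes(plain)


if __name__ == "__main__":
    message = b"geographic image bytes"
    assert decrypt(encrypt(message)) == message
```

Because both sides regenerate the same sequence from the same `x0` and `r`, decryption only has to reverse the two steps in the opposite order. The finite floating-point precision that this sketch relies on is the same limitation the abstract identifies as a cause of periodicity in long sequences.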

  • Mai TANG, Wenqiang XIA, Jiuqiang DENG, Yao MAO

    Electro-optical tracking systems have been widely used in the cutting-edge domains of free space environment detection and communication owing to their exceptional performance. However, external disturbances often significantly impact the working accuracy of these systems. As their scope of application continues to broaden, increasingly complex operating conditions introduce more intricate environments and disturbances. This paper introduces a composite control structure of an enhanced error-based observer, rooted in the repetitive control strategy, tailored for two types of complex disturbances: periodic harmonic disturbance and narrow-band peak periodic disturbance. This structure not only ensures the system’s stability, but also suppresses periodic disturbances across multiple frequencies, effectively addressing the challenge that current disturbance suppression methods face in mitigating complex periodic disturbances. Moreover, necessary proofs are provided and an experimental platform is established for the electro-optical system, demonstrating the efficacy and reliability of the proposed control methods under various conditions.

  • Yang YANG, Fanming HUANG, Dong YUE

    This paper investigates a privacy-preserving consensus tracking problem for a class of nonstrict-feedback discrete-time multi-agent systems (MASs). An improved Liu cryptosystem is developed to alleviate the errors between encryption and decryption on the plaintext, which ensures satisfactory recovery of the plaintext information. A reinforcement learning (RL) technique is then employed to compensate for unknown dynamics and errors between true signals and decrypted ones. Based on backstepping and graph theory, an RL-based privacy-preserving consensus tracking control strategy is further designed. By virtue of graph theory and Lyapunov stability theory, it is shown that the consensus tracking errors and all signals in the MAS are ultimately bounded. Finally, simulation examples are presented to verify the effectiveness of the control strategy.

  • Correspondence
    Hao ZHOU, Liping CAO, Qi WEI, Zhenyu SHU, Yiwei JIANG