2025-09-30, Volume 5 Issue 3

  • Research Article
    Kening Zhang, Zhou Chen, Jingsong Wang, Chuansai Zhou

    The ionosphere is a critical region of near-Earth space, directly influencing satellite navigation and shortwave communication quality. The total electron content (TEC) is a key parameter in ionospheric physics, and AI-based research on TEC has become a major focus in space weather studies. However, current AI models often function as “black boxes” with limited physical interpretability, hindering our understanding of ionospheric dynamics. We employed two mainstream neural networks that combine deep learning with partial differential equations (PDEs), namely PDE-Net2 (a deep learning technique capable of automatically extracting PDEs from data) and physics-informed neural networks, together with SINDy (a traditional method for sparse identification of PDEs), and compared the performance of these methods in reconstructing ionospheric TEC data. The comparison shows that PDE-Net2 significantly outperforms the other methods across the evaluated metrics for TEC reconstruction. By directly extracting PDEs from PDE-Net2 and analyzing their expressions, we found that the longitudinal convection term (e.g., $\frac{\partial u}{\partial x}$) and the latitudinal diffusion term (e.g., $\frac{\partial^2 u}{\partial y^2}$) have the largest coefficients. This suggests that the longitudinal electron transport process in the ionosphere is the most dominant, potentially linked to the effects of longitudinal winds and diurnal solar radiation variations. Additionally, the latitudinal diffusion process plays an important role, which may involve nonlinear coupling between the Earth’s magnetosphere and ionosphere.
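
    For readers who want a concrete picture of the sparse-identification baseline, the sketch below implements the sequential thresholded least-squares step at the heart of SINDy-style methods on a synthetic one-dimensional advection-diffusion field. The grid, candidate term library, coefficients, and threshold are illustrative assumptions, not the authors' setup for TEC data.

```python
import numpy as np

# Synthetic field obeying u_t = -c * u_x + d * u_xx (coefficients chosen for the demo).
c, d = 1.0, 0.1
x = np.linspace(0, 2 * np.pi, 128)
t = np.linspace(0, 1, 200)
dx, dt = x[1] - x[0], t[1] - t[0]
X, T = np.meshgrid(x, t)                                    # arrays of shape (nt, nx)
u = (np.exp(-d * T) * np.sin(X - c * T)
     + 0.5 * np.exp(-4 * d * T) * np.sin(2 * (X - c * T)))  # exact two-mode solution

# Second-order finite-difference derivatives on interior points.
u_t  = (u[2:, 1:-1] - u[:-2, 1:-1]) / (2 * dt)
u_x  = (u[1:-1, 2:] - u[1:-1, :-2]) / (2 * dx)
u_xx = (u[1:-1, 2:] - 2 * u[1:-1, 1:-1] + u[1:-1, :-2]) / dx**2
u_c  = u[1:-1, 1:-1]

# Candidate library of terms; only u_x and u_xx actually drive the dynamics.
names = ["u", "u_x", "u_xx", "u*u_x"]
library = np.column_stack([f.ravel() for f in (u_c, u_x, u_xx, u_c * u_x)])
target = u_t.ravel()

# Sequential thresholded least squares: refit while zeroing out small coefficients.
coeffs = np.linalg.lstsq(library, target, rcond=None)[0]
for _ in range(10):
    small = np.abs(coeffs) < 0.05                           # sparsity threshold (assumed)
    coeffs[small] = 0.0
    keep = ~small
    if keep.any():
        coeffs[keep] = np.linalg.lstsq(library[:, keep], target, rcond=None)[0]

print(dict(zip(names, np.round(coeffs, 3))))
# Typically recovers coefficients close to u_x: -1.0 and u_xx: 0.1, with the rest near zero.
```

    PDE-Net2, by contrast, learns its differential operators as constrained convolution filters together with a symbolic network for the response function rather than fixing a term library in advance, which is one reason it can adapt more flexibly to data such as TEC maps.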

  • Review
    Xiaoqiang Hu, Zijia Zhen, Yunxiao Xu, Changsheng Zhu, Huimin Peng, Cheng Ma, Min Yang, Qi Zhu

    As an advanced human-computer interaction mode that simulates the physical sense of touch, force tactile interaction technology is gradually becoming a bridge connecting the digital and physical worlds. This paper summarizes the development of force tactile interaction technology through successive technical innovations, from early mechanical devices to modern intelligent tactile systems. It focuses on the content of the China Force Touch Technology and Application Conference, presenting the technological frontier and interdisciplinary exchange, and surveys the research and application teams as well as the associated software copyrights and patents. In addition, the paper describes the wide application of force tactile technology in the metaverse, robotics, intelligent devices, automotive driving, education, and other fields, predicts its far-reaching impact on society and future development trends, and emphasizes the importance of intelligent, personalized, and cross-field integration.

  • Review
    Zeyu Gao, Zhiyuan Liao, Chunquan Li

    As an emerging robotics technology, soft robots have received extensive attention and research in recent years and are gradually gaining prominence in the field of robotics by virtue of their soft materials and unique drive methods. They are able to adapt to a variety of unstructured environments and interact with humans in a safer way, opening up a whole new direction for the development of robotics. As an important component of the human-machine interface (HMI), the sensor's performance also determines the smoothness and accuracy of human-machine interaction. The emergence of new materials and the maturation of various technologies and algorithms have made the HMIs of soft robot systems increasingly refined, bringing more possibilities to soft robots. This review introduces the development of HMIs in recent years, discusses the role HMIs play in soft robot systems, and outlines some optimization solutions.

  • Research Article
    Han Yang, Di Wang, Wenjie Pan, Chaoying Jiang, Weichang Gao, Xiaoji Luo, Zugui Tu

    Soil microbial communities are crucial for essential ecosystem functions such as nitrogen cycling and organic matter decomposition. However, accurately classifying their gene sequences remains challenging because taxonomic hierarchies, environmental variability, and community structural dynamics are often insufficiently modeled. Current methods predominantly focus on intra-sequence nucleotide features while neglecting the community’s hierarchical taxonomy. To address these gaps, we analyzed soil samples collected from the loess regions of Guizhou and investigated dynamic changes in microbial community composition across plant growth stages. We propose MicroGraphBERT, a deep learning framework that synergizes DNABERT’s context-aware embeddings with taxonomy-aware priors via a graph attention network, enabling joint modeling of sequence and ecological features for microbial classification. Trained on high-throughput sequencing data from the Guizhou loess regions, MicroGraphBERT integrates nucleotide-level contextual semantics from DNABERT with cross-species relational learning through a graph attention network to capture both sequence features and taxonomic hierarchies. This approach identifies complex microbial patterns under varying soil conditions, achieving a classification accuracy of 98.72%. Our work advances precision microbiome analytics by providing a scalable solution for soil health monitoring, intelligent fertilizer optimization, and sustainable agroecosystem management.
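
    As a rough illustration of how sequence-level embeddings can be fused with relational structure, the sketch below combines precomputed per-sequence embeddings (standing in for DNABERT outputs) with a single PyTorch Geometric graph attention layer over a taxonomy-style graph. The dimensions, graph, and fusion head are placeholder assumptions, not MicroGraphBERT's actual architecture.

```python
import torch
import torch.nn as nn
from torch_geometric.nn import GATConv  # requires torch and torch_geometric

class SeqTaxonomyClassifier(nn.Module):
    """Fuse per-sequence embeddings with a graph attention view of their relations."""

    def __init__(self, emb_dim: int, hidden: int, num_classes: int, heads: int = 4):
        super().__init__()
        self.gat = GATConv(emb_dim, hidden, heads=heads, concat=True)
        self.head = nn.Sequential(
            nn.Linear(emb_dim + hidden * heads, hidden),   # concat sequence + graph views
            nn.ReLU(),
            nn.Linear(hidden, num_classes),
        )

    def forward(self, seq_emb: torch.Tensor, edge_index: torch.Tensor) -> torch.Tensor:
        graph_emb = self.gat(seq_emb, edge_index).relu()   # relation-aware representation
        return self.head(torch.cat([seq_emb, graph_emb], dim=-1))

# Smoke test with random stand-in data: 10 sequences, 768-d embeddings, 5 taxa.
seq_emb = torch.randn(10, 768)                 # placeholder for DNABERT embeddings
edge_index = torch.randint(0, 10, (2, 40))     # placeholder taxonomy / co-occurrence edges
logits = SeqTaxonomyClassifier(768, 64, 5)(seq_emb, edge_index)
print(logits.shape)                            # torch.Size([10, 5])
```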

  • Research Article
    Xinqiang Chen, Peiyang Wu, Yuzhen Wu, Loay Aboud, Octavian Postolache, Zichuang Wang

    With the rapid development of global maritime trade, ship trajectory prediction plays an increasingly important role in maritime safety, efficiency optimization, and the development of green shipping. However, the complexity of the marine environment, multi-factor influences, and automatic identification system (AIS) data quality issues pose significant challenges to trajectory prediction. This study proposes a ship trajectory prediction model based on the Crossformer architecture comprising three core components, Dimension-Segment-Wise embedding, a Two-Stage Attention layer, and a Hierarchical Encoder-Decoder structure, which efficiently captures spatiotemporal dependencies in ship movement patterns. Through experiments on public AIS datasets, we validated the model in two navigation scenarios (complex turning and smooth sailing) and conducted comprehensive comparisons with traditional models such as the gated recurrent unit (GRU), long short-term memory (LSTM), and temporal graph convolutional network (TGCN). Experimental results demonstrate that Crossformer significantly outperforms the comparative models across multiple evaluation metrics, including average Euclidean distance error (ADE), mean square error (MSE), root mean square error (RMSE), and mean absolute error (MAE), reducing average error by over 60% in complex scenarios and up to 70% in smooth scenarios. For Case 1, Crossformer achieved the lowest values across all metrics, with an ADE of 2.35 × 10⁻², MSE of 7.00 × 10⁻⁴, RMSE of 2.58 × 10⁻², and MAE of 2.35 × 10⁻², substantially outperforming the GRU, LSTM, and TGCN models. For Case 2, Crossformer similarly excelled, with an ADE of 1.64 × 10⁻², MSE of 4.00 × 10⁻⁴, RMSE of 2.06 × 10⁻², and MAE of 1.64 × 10⁻². The model maintains low error levels in predicting both latitude and longitude, exhibiting excellent multi-dimensional prediction capability and robustness. This research not only provides a high-precision solution for ship trajectory prediction but also establishes an important technical foundation for intelligent ship scheduling, maritime traffic management, and navigation safety assurance.
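
    The evaluation metrics cited above are standard; a minimal sketch of how ADE, MSE, RMSE, and MAE are typically computed for predicted versus ground-truth trajectories follows (the paper's exact preprocessing and coordinate normalization may differ).

```python
import numpy as np

def trajectory_metrics(pred: np.ndarray, true: np.ndarray) -> dict:
    """Compute ADE, MSE, RMSE, and MAE for (T, 2) arrays of [lat, lon] points."""
    err = pred - true
    ade = np.mean(np.linalg.norm(err, axis=-1))    # mean per-step Euclidean distance
    mse = np.mean(err ** 2)
    return {"ADE": ade, "MSE": mse, "RMSE": np.sqrt(mse), "MAE": np.mean(np.abs(err))}

# Example on made-up normalized coordinates (not the paper's AIS data).
true = np.cumsum(np.full((20, 2), 0.01), axis=0)
pred = true + np.random.default_rng(0).normal(scale=0.02, size=true.shape)
print({k: round(float(v), 4) for k, v in trajectory_metrics(pred, true).items()})
```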

  • Review
    Zhuorui Wang, Mingkai Chen, Qian Liu

    With collaborative advances in wireless communication, artificial intelligence, and sensor technologies, robotic systems are undergoing a revolutionary evolution from single-function actuators to intelligent task-processing platforms. In complex dynamic environments, the limitations of conventional unimodal perception have become increasingly apparent, as it struggles to meet the precision requirements of object attribute recognition and environmental interaction. Deeply integrated multimodal perception technologies will therefore emerge as a predominant trend, in which cross-modal communication between vision and tactile sensing represents a critical breakthrough direction for enhancing robotic environmental cognition. Currently, research on multimodal visual-tactile communication remains scarce. Therefore, this paper conducts a comprehensive survey of this emerging field. First, it systematically summarizes mature video and tactile communication frameworks. Subsequently, it analyzes current implementations of single-modal streaming transmission for visual and tactile data and investigates the state-of-the-art in multimodal visual-tactile communication. Finally, it briefly explores the promising prospects of visual-tactile communication technology, highlighting its transformative potential to enable context-aware robotic manipulation and adaptive human-robot collaboration.

  • Review
    Si Chen, Tonghe Yuan, Lujin Xu, Weimin Ru, Dongqing Wang

    This paper provides an overview of the development of haptic texture reproduction technology, focusing on methods such as vibration-based, ultrasonic, and electrostatic systems. It also explores how artificial intelligence (AI) and deep learning contribute to enhancing the adaptability and personalization of tactile feedback. The paper emphasizes the importance of understanding tactile perception mechanisms, particularly the role of Piezo proteins and the interaction between receptors and their microenvironment, in improving feedback system accuracy. Despite technological advancements, the accurate reproduction of fine textures and high-frequency vibrations remains a challenge. The review underscores that interdisciplinary research spanning neuroscience, materials science, and AI is crucial for future advances in haptic systems.

  • Review
    Shilong Sun, Haodong Huang, Chiyao Li

    Humanoid robots are attracting increasing global attention owing to their potential applications and advances in embodied intelligence. Enhancing their practical usability remains a major challenge that requires robust frameworks that can reliably execute tasks. This review systematically categorizes and summarizes existing methods for motion control and planning in humanoid robots, dividing the control approaches into traditional dynamics-based and modern learning-based methods. It also examines the navigation and obstacle-avoidance capabilities of humanoid robots. By providing a detailed comparison of the advantages and limitations of various control methods, this review offers a comprehensive understanding of current technological progress, real-world application challenges, and future development directions in humanoid robotics. Key topics include the principles and applications of simplified dynamic models, widely used control algorithms, reinforcement learning, imitation learning, and the integration of large language models. This review highlights the importance of both traditional and innovative approaches in advancing the adaptability, efficiency, and overall performance of humanoid robots.

  • Correction
    Xinqiang Chen, Chen Chen, Huafeng Wu, Octavian Postolache, Yuzhen Wu
  • Research Article
    Xinqiang Chen, Rui Yang, Yuzhen Wu, Han Zhang, Prakash Ranjitkar, Octavian Postolache, Yiwen Zheng, Zichuang Wang

    Aiming to address the problems of insufficient ship detection accuracy and a high miss rate for small targets in water transport traffic situational awareness under low-light conditions, this paper proposes an EG-YOLO+ framework that integrates ship image enhancement and ship detection. The method achieves adaptive enhancement of low-light images through the unsupervised EnlightenGAN generative adversarial network for low-light image enhancement, effectively solving the problem of detail loss that traditional methods suffer under extreme lighting conditions. Subsequently, based on the latest You Only Look Once version 11 (YOLOv11) architecture, it introduces the squeeze-and-excitation channel attention mechanism, significantly improving the detection accuracy of small-target ships through dynamic reweighting of feature channels. Experimental results on the self-constructed maritime dataset show that the proposed method can effectively identify image targets in low-light environments, even small targets, with a 6-percentage-point improvement in mAP over the baseline YOLOv11.
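
    The squeeze-and-excitation mechanism referred to above is the standard channel attention block of Hu et al.; a minimal PyTorch sketch is shown below, with the reduction ratio and the insertion point in the YOLOv11 backbone left as assumptions.

```python
import torch
import torch.nn as nn

class SqueezeExcitation(nn.Module):
    """Generic squeeze-and-excitation channel attention block."""

    def __init__(self, channels: int, reduction: int = 16):   # reduction ratio assumed
        super().__init__()
        self.fc = nn.Sequential(
            nn.Linear(channels, channels // reduction),
            nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels),
            nn.Sigmoid(),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, c, _, _ = x.shape
        weights = self.fc(x.mean(dim=(2, 3)))      # squeeze: global average pooling
        return x * weights.view(b, c, 1, 1)        # excite: reweight feature channels

# Smoke test on a dummy feature map.
print(SqueezeExcitation(64)(torch.randn(2, 64, 40, 40)).shape)  # torch.Size([2, 64, 40, 40])
```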

  • Research Article
    Jianjun Ni, Jiamei Shi, Qiao Zhan, Ziru Zhang, Yang Gu

    The condition of the retinal vessels is associated with various diseases, such as diabetes and cardiovascular and cerebrovascular diseases. Accurate and early diagnosis of eye diseases is important to human health. Recently, deep learning has been widely used in retinal vessel segmentation. However, problems such as complex vessel structures, low contrast, and blurred boundaries affect the accuracy of segmentation. To address these problems, this paper proposes an improved model based on U-Net. In the proposed model, a pyramid pooling structure is introduced to help the network capture contextual information of the images at different levels, thus enlarging the receptive field. In the decoder, a dual attention block module is designed to improve the perception and selection of fine vessel features while reducing the interference of redundant information. In addition, an optimization method for morphological processing in image pre-processing is proposed, which can enhance segmentation details while removing some background noise. Experiments are conducted on three recognized datasets, namely DRIVE, CHASE-DB1, and STARE. The results show that our model has excellent performance in retinal vessel segmentation tasks.
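
    For context, the pyramid pooling mentioned above is typically implemented as in PSPNet: pool the feature map at several bin sizes, project and upsample each pooled map, and concatenate the results with the input to enlarge the receptive field. The sketch below illustrates this generic module; the bin sizes and channel widths are assumptions rather than the paper's exact configuration.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class PyramidPooling(nn.Module):
    """PSPNet-style pyramid pooling over a convolutional feature map."""

    def __init__(self, in_ch: int, bins=(1, 2, 3, 6)):
        super().__init__()
        out_ch = in_ch // len(bins)
        self.stages = nn.ModuleList([
            nn.Sequential(
                nn.AdaptiveAvgPool2d(b),                       # pool to a b x b grid
                nn.Conv2d(in_ch, out_ch, kernel_size=1, bias=False),
                nn.BatchNorm2d(out_ch),
                nn.ReLU(inplace=True),
            )
            for b in bins
        ])

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        h, w = x.shape[2:]
        pooled = [
            F.interpolate(stage(x), size=(h, w), mode="bilinear", align_corners=False)
            for stage in self.stages
        ]
        return torch.cat([x, *pooled], dim=1)                  # in_ch + in_ch channels

# Smoke test: a 64-channel input becomes 128 channels with unchanged spatial size.
print(PyramidPooling(64)(torch.randn(2, 64, 48, 48)).shape)    # torch.Size([2, 128, 48, 48])
```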

  • Research Article
    Ronghua Zhang, Xincheng Xu, Yang Lu, Xin Xu, Xinglong Zhang, Qingwen Ma

    Cooperative navigation of multiple autonomous vehicles (MAVs) at unsignalized intersections remains a core challenge in intelligent transportation systems. This paper proposes a learning-based cooperative decision-making and control (LCDMC) method for MAVs, which improves policy learning efficiency and ensures safe and efficient cooperative navigation. The proposed LCDMC algorithm decomposes the global value function into two components, a local utility function and a joint-action utility function among vehicles, and comprises an offline policy learning phase and an online deployment phase. During the offline phase, the kernel-based least-squares policy iteration method is employed to learn localized decision-making policies from high-dimensional samples. In the online deployment phase, a coordination graph for the MAVs is developed, and a collaborative utility function characterizing joint-action performance is designed. To obtain optimized decision actions, the local utility function is integrated with a message propagation mechanism, and the resulting decision actions are converted into velocity commands. Furthermore, a receding-horizon reinforcement learning approach is designed to achieve trajectory tracking control of the autonomous vehicles. Finally, to verify the effectiveness of the proposed method, numerical simulations of MAVs are performed, and the results demonstrate that the proposed LCDMC method exhibits superior performance in both traffic efficiency and safety for cooperative navigation of MAVs at unsignalized intersections.
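
    A common form of such a coordination-graph value decomposition (the abstract does not give the paper's exact functional form) is $Q(s, \mathbf{a}) \approx \sum_{i} f_i(s, a_i) + \sum_{(i,j) \in \mathcal{E}} f_{ij}(s, a_i, a_j)$, where $f_i$ is the local utility of vehicle $i$, $f_{ij}$ is the joint-action utility attached to edge $(i, j)$ of the coordination graph $\mathcal{E}$, and the maximizing joint action is obtained by message passing over the graph.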

  • Review
    Li Zhang, Shubo Qin, Junfei Li, Simon X. Yang, Xiaofei Li, Huqiang Sun, Jun Wang, Xiaobing Liu, Kun Yang

    As fundamental prerequisites for the operation and maintenance (O&M) of hydroelectric units, situation awareness and health management have emerged as research hotspots in recent years. Bioinspired intelligence, with advantages such as high efficiency, environmental adaptability, robust performance, and transferability, provides new research ideas, methods, and applications for the O&M of hydroelectric units, especially in situation awareness and health management. This paper reviews the prospects, current applications, and technical challenges of bioinspired intelligence in situation awareness and health management of hydroelectric units from the perspective of reliability-centered maintenance (RCM). First, the technical requirements and features of situation awareness and health management for hydroelectric units in RCM are elucidated. Next, the technical frameworks of hydroelectric units are reviewed from the perspective of bioinspired intelligence. A detailed discussion is then provided regarding the relevant implementation strategies in multiple domains, including real-time monitoring, multi-source signal fusion, state characteristic extraction, intelligent health diagnostics, maintenance decision-making optimization, and smart O&M systems. Finally, future trends and development opportunities in applying bioinspired intelligence to situation awareness and health management of hydroelectric units are proposed: integrating the advantages of bioinspired intelligence with the engineering requirements of RCM and innovating approaches for intelligent O&M, which would provide further support for safe, reliable, and efficient energy systems.

  • Research Article
    Zheng Yao, Puqing Chang, Qiwu Zhu, Wenjie Sun

    In developing Wi-Fi indoor positioning systems for large-scale complex environments, a fundamental challenge is signal noise, which causes high-frequency fluctuations in the measured data and substantially degrades positioning accuracy. To address this limitation, we propose an improved hierarchical positioning model combining a Gaussian mixture model (GMM) regional classifier with random forest secondary classifiers. During the offline phase, recognizing that Wi-Fi signal strength typically follows Gaussian distributions, we employed a GMM to partition the target area into non-overlapping sub-regions with similar signal strength characteristics. For each sub-region, we then trained dedicated random forest classifiers. In the online phase, the system first identifies the probable sub-region using the GMM classifier before applying the corresponding random forest classifier for precise location estimation. We evaluated our approach in an indoor parking lot featuring an irregular layout, numerous solid walls, scattered access point distribution, and intermittent electromagnetic interference. Experimental results demonstrated that our hierarchical model delivers satisfactory performance for indoor location-based services in such challenging large-scale environments.
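
    A minimal two-stage sketch of this kind of hierarchical fingerprinting pipeline, using scikit-learn's GaussianMixture and RandomForestClassifier on synthetic received-signal-strength data, is shown below; the data shapes, number of sub-regions, and location labels are placeholders rather than the paper's deployment.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.mixture import GaussianMixture

# Synthetic fingerprints: 600 scans of RSSI from 12 access points, 30 reference points.
rng = np.random.default_rng(0)
rssi = rng.normal(-70, 8, size=(600, 12))
location = rng.integers(0, 30, size=600)

# Offline phase: partition the area into sub-regions with similar signal statistics,
# then train one dedicated random forest per sub-region.
gmm = GaussianMixture(n_components=4, random_state=0).fit(rssi)
region = gmm.predict(rssi)
forests = {
    r: RandomForestClassifier(n_estimators=100, random_state=0).fit(
        rssi[region == r], location[region == r]
    )
    for r in np.unique(region)
}

# Online phase: route a new scan to its most probable sub-region, then localize.
scan = rssi[:1]
r = int(gmm.predict(scan)[0])
print("sub-region:", r, "predicted reference point:", forests[r].predict(scan)[0])
```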

  • Research Article
    Ye Xu, Huarong Zhao, Yi Gao, Hongnian Yu, Li Peng

    This paper investigates a dynamic event-triggered model-free adaptive control (MFAC) method for a three-degree-of-freedom helicopter subjected to aperiodic denial-of-service (DoS) attacks to perform attitude control tasks. First, a redefined output is designed to satisfy the quasi-linearization requirement of MFAC theory, and a differential signal is designed to reduce the impact of the redefined output. Then, a dynamic event-triggered strategy is formulated, including a dynamic condition that reduces the communication frequency. Additionally, a DoS attack compensation method is developed, which effectively mitigates the effects of aperiodic DoS attacks. Moreover, the convergence of the tracking error of the controlled helicopter under the designed method is strictly proven. Finally, simulation results further demonstrate the effectiveness of the designed scheme.

  • Review
    Zhengyi Lu, Yunhong Liao, Jia Li

    Translation-based multimodal learning addresses the challenge of reasoning across heterogeneous data modalities by enabling translation between modalities or into a shared latent space. In this survey, we categorize the field into two primary paradigms: end-to-end translation and representation-level translation. End-to-end methods leverage architectures such as encoder–decoder networks, conditional generative adversarial networks, diffusion models, and text-to-image generators to learn direct mappings between modalities. These approaches achieve high perceptual fidelity but often depend on large paired datasets and entail substantial computational overhead. In contrast, representation-level methods focus on aligning multimodal signals within a common embedding space using techniques such as multimodal transformers, graph-based fusion, and self-supervised objectives, resulting in robustness to noisy inputs and missing data. We distill insights from over forty benchmark studies and highlight two notable recent models. The Explainable Diffusion Model via Schrödinger Bridge Multimodal Image Translation (xDSBMIT) framework employs stochastic diffusion combined with the Schrödinger Bridge to enable stable synthetic aperture radar-to-electro-optical image translation under limited data conditions, while TransTrans utilizes modality-specific backbones with a translation-driven transformer to impute missing views in multimodal sentiment analysis tasks. Both methods demonstrate superior performance on benchmarks such as UNICORN-2008 and CMU-MOSI, illustrating the efficacy of integrating optimal transport theory (via the Schrödinger Bridge in xDSBMIT) with transformer-based cross-modal attention mechanisms (in TransTrans). Finally, we identify open challenges and future directions, including the development of hybrid diffusion–transformer pipelines, cross-domain generalization to emerging modalities such as light detection and ranging and hyperspectral imaging, and the necessity for transparent, ethically guided generation techniques. This survey aims to inform the design of versatile, trustworthy multimodal systems.

  • Research Article
    Xinran Ba, Xinguang Zhang, Shufeng Li, Jin Yuan, Jun Hu

    Current semantic communication systems primarily use single-modal data and face challenges such as intermodal information loss and insufficient fusion, limiting their ability to meet personalized demands in complex scenarios. To address these limitations, this study proposes a novel multimodal semantic communication system based on graph neural networks. The system integrates graph convolutional networks and graph attention networks to collaboratively process multimodal data and leverages knowledge graphs to enhance semantic associations between image and text modalities. A multilayer bidirectional cross-attention mechanism is introduced to mine fine-grained semantic relationships across modalities. Shapley-value-based dynamic weight allocation optimizes intermodal feature contributions. In addition, a long short-term memory-based semantic correction network is designed to mitigate distortion caused by physical and semantic noise. Experiments performed using multimodal tasks (emotion analysis and visual question answering) demonstrate the superior performance of the system. Under low signal-to-noise ratio conditions, the proposed BERT-ResNet and GCN–GAT enhanced deep semantic communication (BR-GG-DeepSC) model achieves higher accuracy than conventional methods, while reducing the total number of transmitted symbols to approximately 33% of that in conventional approaches. These results validate the robustness, efficiency, and potential of the proposed system for practical deployment in resource-constrained environments.
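
    For the Shapley-value-based weighting mentioned above, the sketch below computes exact Shapley values from hypothetical per-coalition accuracies for an image and a text modality and normalizes them into fusion weights; the coalition scores are invented for illustration and are not results from the paper.

```python
from itertools import combinations
from math import factorial

def shapley_values(players, value):
    """Exact Shapley values: weighted average marginal contribution of each player."""
    n = len(players)
    phi = {}
    for p in players:
        others = [q for q in players if q != p]
        total = 0.0
        for k in range(n):
            for subset in combinations(others, k):
                s = frozenset(subset)
                weight = factorial(k) * factorial(n - k - 1) / factorial(n)
                total += weight * (value(s | {p}) - value(s))
        phi[p] = total
    return phi

# Hypothetical validation accuracies per modality coalition (illustrative only).
acc = {frozenset(): 0.50, frozenset({"image"}): 0.72,
       frozenset({"text"}): 0.78, frozenset({"image", "text"}): 0.90}
phi = shapley_values(["image", "text"], lambda s: acc[frozenset(s)])
weights = {m: v / sum(phi.values()) for m, v in phi.items()}   # normalized fusion weights
print(phi)      # approximately {'image': 0.17, 'text': 0.23}
print(weights)  # approximately {'image': 0.425, 'text': 0.575}
```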