2026-06-30 2026, Volume 17 Issue 2

  • research-article
    Xingjian ZHONG, Peng WANG, Yue LI, Lin LI, Luhua FU, Changku SUN

    Machine vision-based methods have been widely applied to the detection of aircraft skin damage. During drone inspection, a key step is to spatially locate high-resolution detail images of the aircraft skin, captured from multiple angles, on a three-dimensional point cloud model of the aircraft. This relies on the rigid registration of the point cloud of image center position coordinates with the aircraft 3D point cloud. To address the low accuracy and poor robustness of existing registration algorithms when dealing with heterogeneous point clouds with significant differences in density and low overlap, this paper presents a novel cross-source point cloud registration network. The network integrates multi-scale information from the point cloud and employs an attention mechanism to identify representative overlapping points. First, the network establishes initial correspondences using the multi-scale geometric features and positional information of the point cloud. Then, an overlapping feature guidance module predicts the overlapping score of the point cloud. By utilizing information interaction through the attention mechanism, the network combines point overlapping scores with fused features to filter out representative overlapping points, achieving precise correspondences in the point cloud. The network employs weighted singular value decomposition (SVD) to estimate two sets of transformation matrices, yielding the relative pose parameters of the point cloud. Experiments were conducted in an unsupervised manner. The experimental results on the ModelNet40 dataset and the aero object dataset aircraft measurement data showed that, compared to existing traditional and learning-based methods, this approach demonstrated excellent performance in terms of registration accuracy and robustness.

  • research-article
    Haoqing QIU, Hongwei GE, Ting LI

    To address the problems of blurred edges and incomplete objects in salient object detection results, a salient object detection method based on object integrity enhancement guided edge information is proposed. Firstly, a diversity feature extraction module was proposed to capture the features of complex and variable salient objects through various convolutional operations, thereby enriching the feature representation of the model. Then, an object integrity enhancement module was designed to process the initial fused multi-level features in parallel, and the integrity information of salient objects was further enhanced by exploring spatial and channel branches. Finally, an edge feature enhancement module was employed, which used the deep edge prediction features to guide the feature map to pay more attention to the foreground and background regions and edge information, improving the model’s edge perception capability. Experiments on four public datasets, including ECSSD and DUTS-TE, showed that the proposed algorithm achieved higher detection accuracy than other advanced algorithms on several metrics; for example, its S-measure and F-measure on the DUTS-TE dataset were 0.859 and 0.895, respectively. The proposed algorithm demonstrated superior capability in the perception and refinement of salient object boundaries, further enhancing its robustness in complex scenes.

  • research-article
    Shangsi DING, Guiqin YANG, Bingkun GAN

    Aiming at the problems of low target pixels and intricate background in small target detection in infrared scenes, a target detection model based on multi-scale feature extraction with YOLOv8 was proposed. Firstly, all downsampling convolutions in the network were replaced with the Haar wavelet downsampling (HWD) module to better preserve fine-grained details in infrared imagery during downsampling. Secondly, the spatial pyramid pooling-fast (SPPF) module was improved by introducing separable convolutions, which expanded the receptive field in both horizontal and vertical directions, enabling more comprehensive spatial information capture. Furthermore, a novel C2f_CDWR module was designed using dilated convolutions with varying dilation rates to achieve adaptive feature extraction across multiple receptive fields, thus enhancing detection performance for objects of different sizes. Finally, to improve localization accuracy, the original CIoU loss in YOLOv8 was replaced with Inner-SIoU, which effectively improved bounding box regression accuracy and significantly boosted the model’s capability in detecting small infrared targets. The experimental evaluation on the HIT-UAV dataset shows that the precision of the enhanced YOLOv8 model is 90.5%, the recall rate is 75.9%, and the mean average precision is 85.7%. In terms of infrared target detection, its performance was significantly better than that of the baseline YOLOv8 model and other benchmark models.
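
    The Haar wavelet step behind this kind of downsampling can be sketched as follows. This is a minimal illustration of a one-level 2D Haar transform only, assuming an even-sized grayscale image stored as a list of rows; the function name `haar_downsample` is hypothetical, and the sketch does not reproduce the HWD module or any learned layers used in the paper.

```python
def haar_downsample(img):
    """One-level orthonormal 2D Haar transform of an even-sized grayscale image.

    `img` is a list of equal-length rows. Returns four half-resolution
    sub-bands: the 2x2 block average term plus three difference terms.
    """
    h, w = len(img), len(img[0])
    if h % 2 or w % 2:
        raise ValueError("height and width must be even")
    approx, col_diff, row_diff, diag_diff = [], [], [], []
    for r in range(0, h, 2):
        ra, rc, rr, rd = [], [], [], []
        for c in range(0, w, 2):
            a, b = img[r][c], img[r][c + 1]          # top-left, top-right
            d, e = img[r + 1][c], img[r + 1][c + 1]  # bottom-left, bottom-right
            ra.append((a + b + d + e) / 2)  # low-pass in both directions
            rc.append((a - b + d - e) / 2)  # difference between left and right columns
            rr.append((a + b - d - e) / 2)  # difference between top and bottom rows
            rd.append((a - b - d + e) / 2)  # diagonal difference
        approx.append(ra)
        col_diff.append(rc)
        row_diff.append(rr)
        diag_diff.append(rd)
    return approx, col_diff, row_diff, diag_diff
```

    Because the transform is orthonormal, the four half-resolution sub-bands together carry the same energy as the input block, which is why stacking them as channels halves the resolution without discarding pixel information, unlike strided convolution or pooling.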

  • research-article
    Baoqiang DU, Xiang WANG, Dunzhi YAO

    A precise frequency measurement and analysis method utilizing the concept of the different frequency phase synchronization fuzzy region is presented based on the principle of phase synchronization detection. Initially, the frequency of the measured signal was roughly estimated by using the traditional high-precision time and frequency detection technology. Subsequently, the frequency estimated was fed into a direct digital synthesizer (DDS) to generate a real-time frequency standard signal that exhibited a slight frequency deviation relative to the measured signal. The edge pulses of the different frequency phase synchronization fuzzy region served as counter switch signals, enabling the counting of both the detected signal and the real-time frequency standard signal within a specified gate time. Through subsequent data processing of the obtained values, the frequency, frequency difference, and frequency accuracy of the detected signal could be determined. Experimental results demonstrate that the system based on this method achieves a frequency stability of 10⁻¹³ at 1 s, with a frequency deviation of less than 3 Hz. Compared with traditional frequency measurement and analysis methods, this approach has many advantages, especially its fast response time of less than 1 ms, high measurement accuracy of more than 10⁻¹¹ at 1 s, high integration with only one FPGA chip, and cost of less than 1 000 yuan. It is widely applied in the fields of time and frequency services and security technology of the Beidou satellite (BDS) navigation system, such as BDS pseudo-range measurement, Beidou positioning, navigation and time services, as well as precise time and frequency measurement and control, etc.

  • research-article
    Xuan YU, Jiaqing MA, Leiyi YU

    Image dehazing remains a highly ill-posed problem in low-level computer vision, primarily due to the complex coupling of spatially variant haze and high-frequency background details. Existing deep learning-based methods often struggle to balance the trade-off between aggressive haze removal and the preservation of fine textures, frequently resulting in color distortion, halos, or detail loss. To address these limitations and achieve high-fidelity restoration, this paper proposes a parallel feature fusion framework driven by a haze-detail collaboration mechanism. Specifically, the proposed framework adopts a dual-branch parallel architecture to disentangle the dehazing process. The upper branch functions as a haze layer extraction network. It employs a Res2Net-based encoder to capture multi-scale semantic features and integrates a novel deformable convolution-residual hybrid attention module. By dynamically adjusting the receptive fields, this module precisely characterizes non-uniform haze distributions and models long-range dependencies. Simultaneously, the lower branch serves as a detail compensation network, leveraging context detail information blocks with multi-scale dilated convolutions to aggregate contextual cues and reinforce the representation of high-frequency textural details. Subsequently, a fusion network performs adaptive feature integration, effectively merging the extracted haze features with the enhanced detail information to reconstruct the haze-free image. To ensure robust training, a dual-supervision mechanism is introduced, combining a feature regularization loss to align feature distributions in the latent space and a reconstruction loss to constrain pixel-level content fidelity. Extensive quantitative and qualitative experiments are conducted on both synthetic benchmarks and real-world datasets. The results demonstrated that the proposed algorithm delivered superior performance, achieving higher peak signal-to-noise ratio and structural similarity scores compared to state-of-the-art methods. Visual comparisons further confirmed that our method effectively removed dense haze while recovering vivid colors and sharp structural details without introducing artifacts.

  • research-article
    Longpan CAO, Xin ZHOU, Yongbo SI, Yuqian YAN, Guangwu CHEN

    The ensemble Kalman filter (EnKF) has emerged as a popular data fusion filtering method in vehicle-mounted global navigation satellite system/strapdown inertial navigation system (GNSS/SINS) integrated navigation systems. It employs Monte Carlo methods based on sample estimates to approximate the system’s state distribution. However, the EnKF typically assumes a Gaussian distribution for the state distribution, and this assumption may fail in non-Gaussian scenarios. To address this issue, this paper proposes a Cauchy robust ensemble Kalman filter (CREnKF) that dynamically identifies and suppresses outliers through the Cauchy weighting function, and reduces the impact of non-Gaussian noise by combining residual direct weighting and observation covariance reconstruction dual-path robustness strategies. The algorithm was applied to a GNSS/SINS integrated navigation system and tested through simulation experiments and in-vehicle experiments. The experimental results show that the position RMSE of this scheme in a non-Gaussian noise environment is decreased by 82%, 81%, and 63% relative to EKF, EnKF, and EnKF robust with Huber Kernel function, respectively, effectively enhancing the positioning accuracy of the integrated navigation system.
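
    The role of a Cauchy weighting function can be sketched as follows. This is a minimal scalar illustration, assuming the standard Cauchy M-estimator weight w = 1 / (1 + (r / (c·σ))²); the tuning constant c = 2.385 is a commonly used default and an assumption here, and the function names are hypothetical. The abstract does not give the exact CREnKF formulas, so this is not the paper's implementation.

```python
def cauchy_weight(residual, sigma, c=2.385):
    """Cauchy M-estimator weight for a scalar measurement residual.

    The weight is 1 for a zero residual and decays smoothly toward 0 as the
    residual grows relative to c * sigma, so outliers are down-weighted
    rather than rejected outright.
    """
    u = residual / (c * sigma)
    return 1.0 / (1.0 + u * u)


def inflated_variance(residual, sigma, c=2.385):
    """Observation variance rescaled by the Cauchy weight (R / w).

    A large residual inflates the assumed measurement noise, which shrinks
    the filter gain for that observation.
    """
    return sigma ** 2 / cauchy_weight(residual, sigma, c)
```

    For example, a residual equal to c·σ receives a weight of 0.5, which doubles the variance assumed for that observation; a residual of zero leaves the nominal variance unchanged.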

  • research-article
    Jing REN, Xiuhui TAN, Yanping BAI, Peng WANG, Rong CHENG, Feng ZHANG, Ting XU

    The application of deep learning to direction of arrival (DOA) estimation is of great significance in the field of array signal processing. The use of deep learning for DOA estimation of vector hydrophone array usually directly inputs the covariance matrix of the signal as the signal feature into the network, but this method has limitations such as high data requirements and high computational complexity. This paper proposes a DOA estimation method for vector hydrophone array based on a convolutional sparse autoencoder under sparse prior conditions. This method adds an L1 norm regularization term to the convolutional layer of a convolutional autoencoder to achieve sparsity constraints, and establishes a convolutional sparse autoencoder. At the same time, a residual compensation mechanism is introduced to avoid overfitting and loss of details during the training process. Subsequently, the columns of the signal covariance matrix of the vector hydrophone array are treated as under-sampled noisy linear measurements of the spatial spectrum, and are input into a convolutional sparse autoencoder for feature extraction and reconstruction. Finally, the obtained features are used as inputs for training a convolutional neural network to achieve multi-source DOA estimation. Furthermore, to address the shortcomings of classification methods in off-grid situations, we propose a DOA regression estimation method based on the convolutional sparse autoencoder. The simulation results show that under complex conditions such as low signal-to-noise ratio and a small number of snapshots, the classification method proposed in this paper outperforms various deep learning algorithms and traditional algorithms mentioned in the literature in terms of estimation performance. In addition, the proposed regression method can further improve the DOA estimation performance in off-grid scenarios.

  • research-article
    Jiying LI, Yu LIU, Jie LIU

    When images are captured under hazy conditions, light is attenuated and deflected by particle scattering, resulting in reduced brightness and color distortion, which affects the imaging quality of the visual system. This paper proposes a defogging method that combines super-pixel segmentation and transmission optimization. First, the complexity of the haze image was calculated using color entropy to adaptively determine the number of super-pixel blocks. The simple linear iterative clustering (SLIC) super-pixel segmentation method was used to obtain super-pixel blocks with the same features, and the super-pixel block with the highest score was selected as a candidate block to accurately estimate the atmospheric light value. Then, the transmission was estimated using the multiscale dark channel prior and the non-local haze-lines prior, and the initial fused transmission was obtained by wavelet transform. In addition, a guided filter based on unsharp masking was introduced to further improve the transmission estimation accuracy. Finally, an atmospheric scattering model was used to invert the haze-free image. A large number of quantitative and qualitative experiments were conducted on three datasets, and the results show that the proposed algorithm achieves a better defogging effect, with particularly prominent restoration in the sky region.
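
    The final inversion step can be sketched with the widely used atmospheric scattering model I = J·t + A·(1 − t), where I is the observed intensity, J the haze-free intensity, t the transmission, and A the atmospheric light. This is a minimal per-pixel illustration on scalar intensities in [0, 1]; the function names and the lower bound `t_min` are assumptions, and the sketch does not cover the paper's atmospheric light or transmission estimation.

```python
def add_haze(j, t, a):
    """Forward scattering model: observed intensity from scene radiance j,
    transmission t, and atmospheric light a."""
    return j * t + a * (1.0 - t)


def remove_haze(i, t, a, t_min=0.1):
    """Invert the scattering model for one pixel.

    The transmission is clamped from below by t_min so that noise is not
    amplified without bound where the haze is dense (t close to 0).
    """
    return (i - a) / max(t, t_min) + a
```

    With an accurate transmission and atmospheric light the round trip is exact, for example `remove_haze(add_haze(0.2, 0.5, 0.9), 0.5, 0.9)` recovers 0.2 up to floating-point error. This is why the quality of the restored image depends almost entirely on how well t and A are estimated.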

  • research-article
    Tianxiang YANG, Lingjun MENG, Hong JIN, Wenjie FENG, Xinhao LIU

    Monocular depth estimation aims to predict depth information within a scene from a single RGB image, but many models remain computationally intensive for real-time inference on resource-constrained edge devices. This paper presents a lightweight self-supervised monocular depth estimation network that balances accuracy and efficiency through targeted encoder–decoder design. The encoder employed a synergistic modeling approach combining decomposable large-kernel convolutions and local depthwise convolutions to capture both long-range context and local details with low computational overhead. The decoder utilized cross-scale feature differences as guidance to dynamically fuse multi-scale features, enhancing detail recovery and geometric consistency under lightweight constraints. In addition, a temporal soft fusion reprojection loss was employed to better leverage the complementary information of forward and backward frames, improving the robustness of self-supervised training. The model contained 3.0 M parameters and required 3.5 GFLOPs of computation. On KITTI, it achieves Abs Rel=0.105 and δ1=0.892. On Make3D, it achieves Abs Rel=0.308 in a zero-shot setting. On a Rockchip RK3588S, a hybrid-quantized multi-thread implementation runs at 67 frames/s. The results demonstrated that the proposed method achieved a favorable accuracy–efficiency balance on edge devices, making it suitable for real-time monocular depth estimation tasks.
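
    The two reported metrics can be sketched as follows, using their usual definitions in the depth estimation literature: Abs Rel is the mean of |d − d*| / d* and δ1 is the fraction of pixels with max(d / d*, d* / d) < 1.25, where d is the predicted and d* the ground-truth depth. The function names are hypothetical, and the sketch assumes flat lists of strictly positive, already valid depths; benchmark-specific depth caps and crops are not shown.

```python
def abs_rel(pred, gt):
    """Mean absolute relative error between predicted and ground-truth depths."""
    return sum(abs(p - g) / g for p, g in zip(pred, gt)) / len(gt)


def delta1(pred, gt, threshold=1.25):
    """Fraction of pixels whose depth ratio max(p/g, g/p) is below the threshold."""
    hits = sum(1 for p, g in zip(pred, gt) if max(p / g, g / p) < threshold)
    return hits / len(gt)
```

    Lower Abs Rel and higher δ1 indicate better predictions, so the KITTI figures above correspond to an average relative error of about 10.5% with 89.2% of pixels within a factor of 1.25 of the ground truth.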

  • research-article
    Huijia WU, Yafeng HAO, Pu ZHU, Fupeng MA, Yujie HUANG, Wenyu NIU, Shiqi LI, Luxiao SANG, Tengteng LI

    Terahertz (THz) waves exhibit distinctive advantages for biomedical sensing, while metasurface technology offers an effective route toward high-performance THz sensors. A high-sensitivity THz metasurface absorber sensor for biological sample detection is proposed and numerically investigated. The sensor adopts a metal-insulator-metal (MIM) configuration, in which four centrosymmetrically arranged split metal rings form a multimode resonator. Copper is employed as the metallic layer, and PTFE is selected as the dielectric spacer. The electromagnetic response of the sensor is analyzed using CST Microwave Studio based on the finite integration technique (FIT). Simulation results demonstrate that the proposed sensor supports two independent near-perfect absorption resonances at 3.1564 THz and 3.724 THz, with peak absorption rates of 99.75% and 99.84%, respectively. Owing to the strong localized field enhancement and multimode coupling effects, the resonant frequencies exhibit pronounced sensitivity to variations in the surrounding refractive index. Within the typical refractive-index range of biomedical samples, the maximum sensitivity reaches 241 GHz/RIU, accompanied by a maximum figure of merit (FOM) of 8.02. In addition, the absorption performance remains above 99% for polarization angles from 0° to 90°, with negligible resonance shift, indicating excellent polarization insensitivity. Benefiting from the use of low-dielectric-constant materials, the proposed sensor shows good material compatibility and portability. These results suggest that the designed metasurface absorber provides a promising platform for high-sensitivity terahertz biosensing applications.
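
    The two sensing metrics can be related with a short worked sketch. It assumes the common definitions S = Δf / Δn for sensitivity and FOM = S / FWHM, where FWHM is the full width at half maximum of the resonance; the abstract does not state which definitions the authors used. The refractive indices and the shifted frequency in the usage note are hypothetical values chosen only to illustrate the arithmetic.

```python
def sensitivity_ghz_per_riu(f1_thz, f2_thz, n1, n2):
    """Refractive-index sensitivity: resonance shift (GHz) per refractive index unit."""
    return abs(f2_thz - f1_thz) * 1000.0 / abs(n2 - n1)


def figure_of_merit(sensitivity, fwhm_ghz):
    """Figure of merit: sensitivity divided by the resonance linewidth (FWHM, GHz)."""
    return sensitivity / fwhm_ghz


def implied_fwhm_ghz(sensitivity, fom):
    """Linewidth that a given sensitivity and FOM would imply under FOM = S / FWHM."""
    return sensitivity / fom
```

    As a hypothetical example, a resonance moving from 3.724 THz to 3.6758 THz when the index rises from 1.0 to 1.2 gives 241 GHz/RIU. Under the assumed definition, the reported 241 GHz/RIU and FOM of 8.02 would correspond to a linewidth of roughly 30 GHz, although the two maxima need not come from the same resonance.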

  • research-article
    Jun LI, Yi MA

    Due to the interference of strong noise, feature extraction faces the challenge of limited information, which is not conducive to motor equipment fault diagnosis. This paper proposes a fault diagnosis method based on the empirical wavelet transform (EWT) and an improved ConvNeXt network. Firstly, the modal components were extracted from the signals of different sensors using the EWT, noise was removed, and the signals were then reconstructed. Secondly, the short-time Fourier transform (STFT) was used to convert the denoised and reconstructed one-dimensional signal into a two-dimensional time-frequency spectrum image that enhanced signal features. Single-channel images generated by a single sensor were fused to form multi-channel images, thereby boosting the feature extraction capability of the ConvNeXt network. Additionally, the Ghost convolution module and the efficient local attention (ELA) mechanism were introduced into the ConvNeXt-T (ConvNeXt-Tiny) network, further enhancing the network’s performance. Experimental validation was conducted on application examples of various fault diagnostic devices, and comparisons were made with existing mainstream deep learning methods such as SE-InceptionV3, CBAM-ResNet, and CNN-LSTM. Experimental results confirmed that, under different noise environments and variable operating conditions, the proposed method achieved better diagnostic accuracy and enhanced generalization performance.

  • research-article
    Xingliang SU, Ke XU, Kaixing ZHANG, Jianbin LIAO, Guoqiang LI, Jin YAN

    Gearboxes with complex structures are key components in rotating machinery, and their failures exhibit both sporadic and coupling characteristics. However, in engineering practice, gearbox fault data are extremely scarce, and the sample size for composite faults is effectively zero. Therefore, it is of great significance to investigate fault diagnosis modeling methods under scarce fault samples and zero composite fault samples. This paper designs a new loss function for single single-fault samples and zero composite fault samples based on the prior knowledge of similarity between time series monitoring signals and the difference between single faults, as well as the prior knowledge that composite faults are composed of single faults. This enables the effective optimization of the feature encoder. On this basis, a diagnostic algorithm for single faults and composite faults of gearboxes is constructed, achieving effective diagnosis of various states of gearboxes. This paper validates the proposed method using two gearbox fault experimental platforms. The results show that the proposed method can construct and obtain the gearbox fault diagnosis model under single single-fault samples and zero composite fault samples.

  • research-article
    Tiantian LI, Zhen XUE, Liangliang ZHANG, Xu LIAN

    Label scarcity and long-tailed distribution imbalance are significant challenges in industrial equipment monitoring. Currently, self-supervised learning methods are affected by sample quantity bias and semantic confusion under complex operating conditions, which limits their ability to represent sparse critical states. To address these issues, we propose a co-evolutionary prototypical contrastive learning (EPCL) framework. Through progressive learning from coarse-grained semantic discovery to fine-grained discriminative enhancement, this framework enables an in-depth analysis of the intrinsic structure of long-tailed data. Specifically, an adaptive prototype-based clustering algorithm based on optimal transport theory is introduced, thereby achieving unbiased representation learning through data-driven dynamic priors. Furthermore, a semantic-aware and hierarchical negative sample weighting scheme is designed to optimize discriminative boundaries while mitigating class imbalance by enforcing prototype consistency constraints and employing an adaptive weighting strategy. Extensive experiments were conducted on several public long-tailed visual benchmarks, including CIFAR10-LT, CIFAR100-LT, and ImageNet-100-LT, as well as the industrial fault diagnosis dataset. The results demonstrated that the EPCL achieved better performance than fifteen mainstream self-supervised methods (e.g., SimCLR and SwAV) in both linear evaluation and few-shot classification tasks. On the CIFAR100-LT dataset, the EPCL improved the tail-class accuracy by 4.56% compared to SimCLR. Ablation studies and visualization results verified the effectiveness and generalization ability of the framework. This work offers a promising insight and practical solution for representation learning from unlabeled long-tailed measurement data.

  • research-article
    Jiaqi WANG, Xiaopeng YANG, Yuxi WANG, Qiulin TAN

    Artificial tactile perception sensors have great application value in robotics, medical health, intelligent prosthetics, and human-computer interaction. However, existing sensors encounter difficulties in achieving high sensitivity, good linear response, and high spatial resolution, which has become a key problem restricting their practical application. Here, inspired by the tube foot system in starfish, we report the design, fabrication, and performance of a multi-level dome structure tactile sensor (MLDSTS) that provides simultaneous high sensitivity, excellent linearity, stability, and high resolution. Through a series of experiments, a comprehensive and systematic characterization and analysis of its sensing performance were conducted. The experimental results showed that the constructed multi-level dome array structure could effectively expand the linear detection range of the sensor while stably maintaining high spatial resolution. Its linear range was increased by about 2 times compared with the single-level dome structure, and realized segmented sensitive response. The sensor could detect weak external forces as low as 0.1 N, with a sensitivity of 1.309 V/N in the range of 0.1-3 N and 0.447 V/N in the range of 4-10 N, possessing both high precision and large-range detection capabilities. Through tests in various practical application scenarios such as respiration monitoring, water droplet detection, fabric texture recognition, and knuckle movement monitoring, it was confirmed that the sensor exhibited excellent stability and detection accuracy in different pressure scenarios, and its performance was significantly superior to that of single-dome structure sensors. The research results indicated that the multi-level dome structure provided an effective strategy to solve the industry problem that it was difficult to balance the sensitivity, linearity, and resolution of tactile sensors. With its comprehensively excellent performance, the MLDSTS sensor has broad application prospects in the fields of flexible electronics, intelligent health monitoring, human-computer interaction, and intelligent robots.

  • research-article
    Qiuyuan SUN, Zhifei YAO, Jianheng SUN, Bohui ZHAO, Ketong ZHANG, Haoran YAO, Jiuyan WEI, Huanfei WEN, Jun TANG, Zongmin MA, Jun LIU

    Atomic force microscopy (AFM) probe vibration monitoring is essential for achieving accurate nanoscale imaging and reliable signal interpretation. This paper presents a low-noise vibration detection method based on a real-time FPGA-LabVIEW homebuilt system. The FPGA receives signals from a quadrant photodiode (QPD), performs analog-to-digital conversion and parallel processing, and integrates cascaded digital filters for noise reduction. A finite impulse response (FIR) low-pass filter extracts the static spot position, while an infinite impulse response (IIR) band-pass filter preserves the probe’s resonance vibrations. Compared with conventional analog detection, the proposed system reduces background noise by approximately 50% (measured as 50.23%), enhances the signal-to-noise ratio (SNR) from 15 dB to 20 dB, and maintains FPGA signal-processing latency below 5 μs. This work demonstrates that the proposed real-time FPGA-LabVIEW AFM noise optimization system significantly improves signal-to-noise ratio and real-time performance, providing a practical solution for high-precision, low-noise AFM imaging.
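
    The division of labor between the two filters can be sketched in software as follows. This is a minimal floating-point illustration, assuming a moving-average FIR low-pass and a second-order IIR band-pass with unity gain at its center frequency (coefficients from the widely used audio EQ cookbook biquad formulas); the abstract does not specify the filter orders, coefficients, or fixed-point FPGA design, and the function names are hypothetical.

```python
import math


def fir_moving_average(x, taps):
    """FIR low-pass: average of the latest `taps` samples (zero initial state)."""
    out = []
    for n in range(len(x)):
        window = x[max(0, n - taps + 1): n + 1]
        out.append(sum(window) / taps)
    return out


def iir_bandpass(x, f0, fs, q):
    """Second-order IIR band-pass (biquad) with unity gain at f0.

    f0 is the center frequency, fs the sampling rate, q the quality factor.
    """
    w0 = 2.0 * math.pi * f0 / fs
    alpha = math.sin(w0) / (2.0 * q)
    a0 = 1.0 + alpha
    b0, b2 = alpha / a0, -alpha / a0  # b1 is zero for this band-pass form
    a1, a2 = -2.0 * math.cos(w0) / a0, (1.0 - alpha) / a0
    x1 = x2 = y1 = y2 = 0.0
    out = []
    for xn in x:
        yn = b0 * xn + b2 * x2 - a1 * y1 - a2 * y2
        x2, x1 = x1, xn
        y2, y1 = y1, yn
        out.append(yn)
    return out
```

    Applied to a test signal made of a constant offset plus a sine at the band-pass center frequency, the FIR output settles to the offset (the static spot position) when the window spans a whole number of periods, while the IIR output settles to the sine alone (the resonance vibration) with the offset removed.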