Journal of Beijing Institute of Technology

2025-08-15 2025, Volume 34 Issue 4

DWDet: A Fine-Grained Object Detection Algorithm for Remote Sensing Aircraft

Meijing Gao, Yonghao Yan, Xiangrui Fan, Huanyu Sun, Sibo Chen, Xu Chen, Bingzhou Sun, Ning Guan

2025, 34(4): 337-349. https://doi.org/10.15918/j.jbit1004-0579.2024.118

Fine-grained aircraft target detection in remote sensing holds significant research value and has practical applications, particularly in military defense and precision strikes. Given the complexity of remote sensing images, where targets are often small and similar within categories, detecting these fine-grained targets is challenging. To address this, we construct a fine-grained dataset of remote sensing aircraft. To handle the pronounced long-tailed category distribution and the large variation in target size in such data, we propose DWDet, a fine-grained target detection and recognition algorithm. First, to counter the unbalanced category distribution, we adopt an adaptive sampling strategy. Second, we construct a deformable convolutional block and improve the decoupled head structure to improve the model’s detection of deformed targets. Third, we design a localization loss function that improves the model’s localization ability for targets of different scales. Experimental results show that our algorithm improves overall accuracy by 4.1% over the baseline model and improves the detection accuracy of small targets by 12.2%. Ablation and comparison experiments further support the effectiveness of our algorithm.

A High-Order Modulation Signal Classification Method Based on a Fourier Analysis Network Integrated with an Attention Mechanism

Yuepeng Li, Xiaogang Tang, Binquan Zhang, Lu Wang, Hao Huan

2025, 34(4): 350-361. https://doi.org/10.15918/j.jbit1004-0579.2025.020

In modern wireless communication and electromagnetic control, automatic modulation classification (AMC) of orthogonal frequency division multiplexing (OFDM) signals plays an important role. However, under Doppler frequency shift and complex multipath channel conditions, extracting discriminative features from high-order modulation signals and ensuring model interpretability remain challenging. To address these issues, this paper proposes a Fourier attention network (FAttNet), which combines an attention mechanism with a Fourier analysis network (FAN). Specifically, the method directly converts the input signal to the frequency domain using the FAN, thereby obtaining frequency features that reflect the periodic variations in amplitude and phase. A built-in attention mechanism then automatically calculates the weights for each frequency band, focusing on the most discriminative components. This approach improves both classification accuracy and model interpretability. Experimental validation was conducted via high-order modulation simulation using an RF testbed. The results show that under three different Doppler frequency shifts and complex multipath channel conditions, with a signal-to-noise ratio of 10 dB, the classification accuracy can reach 89.1%, 90.4% and 90%, all of which are superior to the current mainstream methods. The proposed approach offers practical value for dynamic spectrum access and signal security detection, and it makes important theoretical contributions to the application of deep learning in complex electromagnetic signal recognition.

2D-DOA for a Monostatic ULA EMVS-MIMO Radar Based on RC-ESPRIT

2025, 34(4): 362-372. https://doi.org/10.15918/j.jbit1004-0579.2024.074

Electromagnetic vector sensor (EMVS) embedded multiple-input multiple-output (MIMO) radar is an emerging technology that enables two-dimensional (2D) direction of arrival (DOA) estimation. In this paper, we propose a low-complexity estimation of signal parameters via rotational invariance techniques (ESPRIT) algorithm for monostatic uniform linear array (ULA) EMVS-MIMO radar, enabling rapid estimation of 2D target angles. First, a selection matrix is applied to the array data to reduce complexity by eliminating redundancy. Next, the rotational-invariance propagator method (PM) is used to estimate the elevation angle; because the array is sparse, this estimate is ambiguous. Then, the vector cross-product (VCP) technique is employed to achieve unambiguous 2D-DOA estimation. Finally, the two sets of estimates are combined to obtain a high-resolution, unambiguous elevation angle estimate. The proposed algorithm is applicable to large-scale and sparse EMVS-MIMO radar systems and provides higher estimation accuracy than existing ESPRIT algorithms. The effectiveness of the algorithm is verified through MATLAB simulations.
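
For orientation, the rotational-invariance idea behind ESPRIT can be sketched in its simplest setting: a scalar ULA with half-wavelength spacing and one angle per source. This is not the RC-ESPRIT algorithm of the paper (there is no EMVS, selection matrix, propagator method, or VCP step here). The function name `esprit_ula`, the array size, the noise level, and the test angles are all illustrative assumptions.

```python
import numpy as np

def esprit_ula(X, n_sources, spacing=0.5):
    """Least-squares ESPRIT for a scalar ULA (illustrative sketch only).

    X: (n_sensors, n_snapshots) complex snapshot matrix.
    spacing: element spacing in wavelengths (0.5 is an assumption).
    Returns the estimated angles in degrees, sorted ascending.
    """
    n_snapshots = X.shape[1]
    R = X @ X.conj().T / n_snapshots           # sample covariance matrix
    _, eigvecs = np.linalg.eigh(R)             # eigenvalues in ascending order
    Es = eigvecs[:, -n_sources:]               # signal subspace
    # The two overlapping subarrays differ by a one-element shift,
    # so their subspaces are related by a rotation Psi.
    Psi = np.linalg.pinv(Es[:-1]) @ Es[1:]
    phases = np.angle(np.linalg.eigvals(Psi))  # equals 2*pi*spacing*sin(theta)
    return np.sort(np.degrees(np.arcsin(phases / (2 * np.pi * spacing))))

# Hypothetical scenario: 8 sensors, 200 snapshots, two sources, weak noise.
rng = np.random.default_rng(0)
true_deg = np.array([-20.0, 35.0])
m = np.arange(8)[:, None]
A = np.exp(2j * np.pi * 0.5 * m * np.sin(np.radians(true_deg))[None, :])
S = rng.standard_normal((2, 200)) + 1j * rng.standard_normal((2, 200))
noise = 0.01 * (rng.standard_normal((8, 200)) + 1j * rng.standard_normal((8, 200)))
est_deg = esprit_ula(A @ S + noise, n_sources=2)
```

With this spacing the phase-to-angle mapping is one-to-one. With element spacing above half a wavelength the same phase can correspond to several angles, which is the kind of ambiguity the abstract attributes to the sparse array.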

Bayesian Inference of Hit Probability of Ammunition Based on Normal-Inverse Wishart Distribution

Meng Yang, Weimin Ye, Huaiqiang Zhang, Aming Ye

2025, 34(4): 373-387. https://doi.org/10.15918/j.jbit1004-0579.2024.113

To address the high experimental cost of ammunition, the scarcity of field test data, and the difficulty of applying classical statistical methods to ammunition hit probability estimation, this paper assumes that projectile dispersion follows a two-dimensional joint normal distribution and proposes a new Bayesian inference method for ammunition hit probability based on the normal-inverse Wishart distribution. First, the conjugate joint prior distribution of the projectile dispersion characteristic parameters is taken to be a normal-inverse Wishart distribution, and its hyperparameters are estimated from simulation experiment data and historical measured data. Second, the field test data are combined with the prior through the Bayesian formula to obtain the joint posterior distribution of the projectile dispersion characteristic parameters, from which the hit probability of the ammunition is estimated. Finally, compared with the binomial distribution method, the proposed method accounts for the dispersion information of the projectiles and therefore makes fuller use of the hit probability information, and its hit probability results are closer to the field shooting test samples. The method has strong applicability and is conducive to obtaining more accurate hit probability estimates.
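
The conjugate update this abstract relies on is the textbook normal-inverse Wishart posterior for multivariate normal data, sketched below. The function name `niw_posterior`, the prior hyperparameters, and the four impact points are hypothetical; how the paper sets its hyperparameters and turns the posterior into a hit probability is not reproduced here.

```python
import numpy as np

def niw_posterior(mu0, kappa0, nu0, Psi0, X):
    """Conjugate normal-inverse Wishart update for i.i.d. normal data.

    Prior: NIW(mu0, kappa0, nu0, Psi0). X has shape (n, d).
    Returns the posterior hyperparameters (mu_n, kappa_n, nu_n, Psi_n).
    """
    X = np.asarray(X, dtype=float)
    n = X.shape[0]
    xbar = X.mean(axis=0)
    centered = X - xbar
    S = centered.T @ centered                  # scatter about the sample mean
    kappa_n = kappa0 + n
    nu_n = nu0 + n
    mu_n = (kappa0 * mu0 + n * xbar) / kappa_n
    diff = (xbar - mu0)[:, None]
    Psi_n = Psi0 + S + (kappa0 * n / kappa_n) * (diff @ diff.T)
    return mu_n, kappa_n, nu_n, Psi_n

# Hypothetical prior and 2D impact points (arbitrary units).
mu0 = np.zeros(2)
kappa0, nu0, Psi0 = 2.0, 4.0, np.eye(2)
shots = np.array([[1.0, 0.0], [3.0, 2.0], [2.0, 1.0], [2.0, 1.0]])
mu_n, kappa_n, nu_n, Psi_n = niw_posterior(mu0, kappa0, nu0, Psi0, shots)
```

The posterior mean of the dispersion covariance is `Psi_n / (nu_n - d - 1)` whenever `nu_n > d + 1`, with `d = 2` here. A small `kappa0` and `nu0` let a handful of field shots dominate, while larger values keep the estimate closer to the prior.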

A Fast Automatic Road Crack Segmentation Method Based on Deep Learning with Model Compression Framework

2025, 34(4): 388-404. https://doi.org/10.15918/j.jbit1004-0579.2025.012

Computer-vision and deep-learning techniques are widely applied to detect, monitor, and assess pavement conditions, including road crack detection. Traditional methods fail to achieve satisfactory accuracy and generalization performance in crack detection, while complex network models generate redundant feature maps and incur high computational complexity. Therefore, this paper proposes a novel deep-learning-based model compression framework for road crack detection that improves both detection efficiency and accuracy. A distillation loss function is proposed to compress the teacher model, followed by channel pruning. Meanwhile, a multi-dilation model is proposed to improve the accuracy of the pruned model. The proposed method is tested on the public CrackForest dataset (CFD). The experimental results show that the proposed method is more efficient and accurate than other state-of-the-art methods.

PPFormer: Patch Prototype Transformer for Semantic Segmentation

2025, 34(4): 405-417. https://doi.org/10.15918/j.jbit1004-0579.2025.028

Since the introduction of vision Transformers into the computer vision field, many vision tasks, such as semantic segmentation, have undergone radical changes. Although the Transformer enhances the correlation among the local features of an image object in the hidden space through the attention mechanism, it is difficult for a segmentation head to accomplish mask prediction for dense embeddings of multi-category and multi-local features. We present the patch prototype vision Transformer (PPFormer), a Transformer architecture for semantic segmentation based on knowledge-embedded patch prototypes. 1) The hierarchical Transformer encoder generates multi-scale and multi-layered patch features, using seamless patch projection to obtain information from multi-scale patches and feature-clustered self-attention to enhance the interplay of multi-layered visual information with implicit position encodings. 2) PPFormer utilizes a non-parametric prototype decoder to extract region observations, which represent significant parts of the objects, by means of non-learnable patch prototypes, and then calculates the similarity between patch prototypes and pixel embeddings. The proposed contrasting patch prototype alignment module, which uses new patch prototypes to update the prototype bank, effectively maintains class boundaries for prototypes. For different application scenarios, we provide PPFormer-S, PPFormer-M and PPFormer-L, which differ in model scale. Experimental results demonstrate that PPFormer can outperform fully convolutional network (FCN)-based and attention-based semantic segmentation models on the PASCAL VOC 2012, ADE20k, and Cityscapes datasets.
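
The similarity step of a non-parametric prototype decoder can be sketched as a nearest-prototype rule: each pixel embedding is compared with every prototype by cosine similarity and takes the class of the best match. This is a generic illustration, not PPFormer's decoder; the function name `prototype_segment`, the tensor shapes, and the toy values are assumptions, and the prototype bank update is omitted.

```python
import numpy as np

def prototype_segment(pixel_emb, prototypes):
    """Assign each pixel the class of its most similar prototype.

    pixel_emb:  (H, W, D) pixel embeddings (assumed non-zero).
    prototypes: (C, K, D), K fixed prototypes for each of C classes.
    Returns an (H, W) integer label map.
    """
    def l2_normalize(a):
        return a / np.linalg.norm(a, axis=-1, keepdims=True)

    p = l2_normalize(pixel_emb)
    q = l2_normalize(prototypes)
    sim = np.einsum("hwd,ckd->hwck", p, q)     # cosine similarity per prototype
    class_score = sim.max(axis=-1)             # best prototype within each class
    return class_score.argmax(axis=-1)

# Toy example: two classes, one prototype each, a 1x2 "image".
prototypes = np.array([[[1.0, 0.0]], [[0.0, 1.0]]])
pixel_emb = np.array([[[2.0, 0.1], [0.2, 3.0]]])
labels = prototype_segment(pixel_emb, prototypes)
```

Because the prototypes are compared rather than learned as classifier weights, adding a class in this sketch only requires adding rows to `prototypes`.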

Robotic System for Total Hip Arthroplasty Based on Self-Positioning and Grinding

2025, 34(4): 418-432. https://doi.org/10.15918/j.jbit1004-0579.2024.049

Total hip arthroplasty (THA) has limitations in grinding angles, prosthesis placement, and thickness variations. THA robotics offer promise but face challenges such as the need to control the robotic arm manually for precise positioning and the risk of over-grinding under manual control. This paper presents a THA surgical robot system with automatic positioning and automatic grinding and filing functions. It achieves precise positioning during surgery by using singular value decomposition with initial value screening and sliding mode control (SMC), and ensures uniformity, stability, and controlled filing thickness through the designed end grinding and filing actuator system. Experiments show that the average position errors in the x, y, and z directions are 0.692 mm, 0.512 mm, and 0.66 mm respectively, and the Euclidean distance error is 1.322 mm. The average angle error is less than 1.136°. The end effector can perform automatic grinding according to the predetermined planning value within the safe force threshold of 30 N. This THA surgical robot system can meet the requirements of hip replacement surgery for accuracy, driving ability, and robustness.
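
One common way singular value decomposition enters positioning problems is rigid point-set registration (the Kabsch method), sketched below together with the mean Euclidean error used as an accuracy measure. Whether the paper's positioning step takes this form is an assumption; the function name `rigid_register`, the fiducial points, and the test transform are hypothetical, and the initial value screening and sliding mode control are not shown.

```python
import numpy as np

def rigid_register(src, dst):
    """Least-squares rigid transform (R, t) such that dst ~ src @ R.T + t.

    src, dst: (n, 3) arrays of corresponding points.
    """
    src_c = src.mean(axis=0)
    dst_c = dst.mean(axis=0)
    H = (src - src_c).T @ (dst - dst_c)        # 3x3 cross-covariance matrix
    U, _, Vt = np.linalg.svd(H)
    d = np.sign(np.linalg.det(Vt.T @ U.T))     # guard against a reflection
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    t = dst_c - R @ src_c
    return R, t

# Hypothetical fiducials: rotate 30 degrees about z, then translate (mm).
angle = np.radians(30.0)
R_true = np.array([[np.cos(angle), -np.sin(angle), 0.0],
                   [np.sin(angle),  np.cos(angle), 0.0],
                   [0.0,            0.0,           1.0]])
t_true = np.array([5.0, -2.0, 1.5])
src = np.array([[0.0, 0.0, 0.0], [1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]])
dst = src @ R_true.T + t_true

R, t = rigid_register(src, dst)
# Mean Euclidean distance between registered and target points.
mean_error = np.linalg.norm(src @ R.T + t - dst, axis=1).mean()
```

The mean of per-point Euclidean distances is never smaller than the Euclidean norm of the per-axis mean absolute errors, so a reported Euclidean error can exceed the value computed from the three axis averages alone.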
