WaveMFFM: wavelet-guided multi-feature fusion module for X-ray prohibited item detection

SUN Peng; CHEN Guangfeng

doi:10.19884/j.1672-5220.202501013

Journal of Donghua University (English Edition), 2026, Vol. 43, Issue 2: 112-119. DOI: 10.19884/j.1672-5220.202501013

Information Technology and Artificial Intelligence

Abstract

To improve the accuracy of prohibited item detection in X-ray images, this study proposes a wavelet-guided multi-feature fusion module (WaveMFFM), a plug-and-play module that can be integrated into existing detectors. WaveMFFM introduces the wavelet transform through a de-occlusion wavelet convolution (DOWC) structure, which uses a frequency-domain decoupling mechanism to dynamically integrate low-frequency global contour information with high-frequency detailed texture features. This design addresses the feature confusion that conventional convolutions suffer under occlusion and lets edge features and region-specific deep features reinforce each other, which improves the discriminative power of the detection features. Experiments with YOLOv8, ViT, and SSD detectors show that WaveMFFM mitigates occlusion problems and improves the prohibited item detection performance of these representative methods.
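
The frequency decoupling described in the abstract can be illustrated with a single-level 2D Haar wavelet transform. The following is a minimal sketch, not the authors' DOWC structure: the choice of the Haar wavelet, the function names `haar_dwt2` and `haar_idwt2`, and the sub-band variable names are all assumptions made for illustration. It only shows how one feature map separates into a low-frequency approximation (coarse contours) and three high-frequency detail bands (edges and texture), and that the split loses no information.

```python
import numpy as np

def haar_dwt2(image):
    """Single-level 2D Haar transform (illustrative; not the paper's DOWC)."""
    x = np.asarray(image, dtype=np.float64)
    if x.ndim != 2 or x.shape[0] % 2 or x.shape[1] % 2:
        raise ValueError("expected a 2D array with even height and width")
    # The four pixels of every non-overlapping 2x2 block.
    a = x[0::2, 0::2]  # top-left
    b = x[0::2, 1::2]  # top-right
    c = x[1::2, 0::2]  # bottom-left
    d = x[1::2, 1::2]  # bottom-right
    low = (a + b + c + d) / 2.0          # low-frequency approximation
    row_detail = (a + b - c - d) / 2.0   # top row minus bottom row
    col_detail = (a - b + c - d) / 2.0   # left column minus right column
    diag_detail = (a - b - c + d) / 2.0  # diagonal difference
    return low, row_detail, col_detail, diag_detail

def haar_idwt2(low, row_detail, col_detail, diag_detail):
    """Invert haar_dwt2, recovering the original array."""
    h, w = low.shape
    out = np.empty((2 * h, 2 * w), dtype=np.float64)
    out[0::2, 0::2] = (low + row_detail + col_detail + diag_detail) / 2.0
    out[0::2, 1::2] = (low + row_detail - col_detail - diag_detail) / 2.0
    out[1::2, 0::2] = (low - row_detail + col_detail - diag_detail) / 2.0
    out[1::2, 1::2] = (low - row_detail - col_detail + diag_detail) / 2.0
    return out

if __name__ == "__main__":
    # Hypothetical 4x4 feature map with a vertical edge in the middle.
    feature_map = np.array([[0, 0, 1, 1]] * 4, dtype=float)
    low, row_d, col_d, diag_d = haar_dwt2(feature_map)
    restored = haar_idwt2(low, row_d, col_d, diag_d)
    print(np.allclose(restored, feature_map))
```

Each sub-band has half the spatial resolution of the input, so a module built on such a decomposition could process contour and texture information separately before fusing them; how WaveMFFM actually weights and fuses the bands is described in the full paper, not in this sketch.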

Keywords

object detection / feature fusion / wavelet transform / prohibited item / X-ray

Cite this article

SUN Peng, CHEN Guangfeng. WaveMFFM: wavelet-guided multi-feature fusion module for X-ray prohibited item detection. Journal of Donghua University (English Edition), 2026, 43(2): 112-119. DOI: 10.19884/j.1672-5220.202501013
