Improving sound event detection through enhanced feature extraction and attention mechanisms

Dongping ZHANG, Siyi WU, Zhanhong LU, Zhehao ZHANG, Haimiao HU, Jiabin YU

Front. Comput. Sci., 2025, Vol. 19, Issue 10: 1910707

DOI: 10.1007/s11704-025-41108-7
Image and Graphics
LETTER

Dongping ZHANG, Siyi WU, Zhanhong LU, Zhehao ZHANG, Haimiao HU, Jiabin YU. Improving sound event detection through enhanced feature extraction and attention mechanisms. Front. Comput. Sci., 2025, 19(10): 1910707. DOI: 10.1007/s11704-025-41108-7

RIGHTS & PERMISSIONS

Higher Education Press
