Improving sound event detection through enhanced feature extraction and attention mechanisms

Dongping ZHANG; Siyi WU; Zhanhong LU; Zhehao ZHANG; Haimiao HU; Jiabin YU

doi:10.1007/s11704-025-41108-7

Front. Comput. Sci., 2025, Vol. 19, Issue 10: 1910707. DOI: 10.1007/s11704-025-41108-7

Image and Graphics

LETTER

Dongping ZHANG, Siyi WU, Zhanhong LU, Zhehao ZHANG, Haimiao HU, Jiabin YU. Improving sound event detection through enhanced feature extraction and attention mechanisms. Front. Comput. Sci., 2025, 19(10): 1910707. DOI: 10.1007/s11704-025-41108-7
