Dynamic prompting class distribution optimization for semi-supervised sound event detection

Lijian GAO; Qing ZHU; Yaxin SHEN; Qirong MAO; Yongzhao ZHAN

doi:10.1631/FITEE.2400061

Front. Inform. Technol. Electron. Eng, 2025, Vol. 26, Issue 4: 556-567. DOI: 10.1631/FITEE.2400061

Abstract

Semi-supervised sound event detection (SSED) tasks typically leverage a large amount of unlabeled and synthetic data to facilitate model generalization during training, reducing overfitting on a limited set of labeled data. However, the generalization training process often encounters noisy interference introduced by pseudo-labels or domain knowledge gaps. To alleviate noisy interference in class distribution learning, we propose an efficient semi-supervised class distribution learning method based on dynamic prompt tuning, named prompting class distribution optimization (PADO). Specifically, when modeling real labeled data, PADO dynamically incorporates independent learnable prompt tokens to explore prior knowledge about the true distribution. This prior knowledge then serves as prompt information, dynamically interacting with the posterior noisy-class distribution information. In this way, PADO achieves class distribution optimization while maintaining model generalization, leading to a significant improvement in the efficiency of class distribution learning. Compared with state-of-the-art methods on the SSED datasets from the DCASE 2019, 2020, and 2021 challenges, PADO achieves significant performance improvements. Furthermore, it is readily extendable to other benchmark models.

Keywords

Prompt tuning / Class distribution learning / Semi-supervised learning / Sound event detection

Cite this article

Lijian GAO, Qing ZHU, Yaxin SHEN, Qirong MAO, Yongzhao ZHAN. Dynamic prompting class distribution optimization for semi-supervised sound event detection. Front. Inform. Technol. Electron. Eng, 2025, 26(4): 556-567. DOI: 10.1631/FITEE.2400061