Selective Multiple Classifiers for Weakly Supervised Semantic Segmentation

Zilin Guo, Dongyue Wu, Changxin Gao, Nong Sang

CAAI Transactions on Intelligence Technology, 2025, Vol. 10, Issue 6: 1688-1702.

DOI: 10.1049/cit2.70042
ORIGINAL RESEARCH

Abstract

Existing weakly supervised semantic segmentation (WSSS) methods based on image-level labels typically rely on class activation maps (CAMs), which measure the relationships between features and classifiers. However, CAMs focus only on the most discriminative regions of an image, so they cover objects poorly. We attribute this to two factors: the limited recognition ability of a single classifier, and the negative impact of magnitudes during CAM normalisation. To address these issues, we propose constructing selective multiple classifiers (SMC). During training, we extract multiple prototypes for each class and store them in the corresponding memory bank. These prototypes are divided into foreground and background prototypes: the former identify foreground objects, and the latter prevent the false activation of background pixels. At inference, multiple prototypes are adaptively selected from the memory bank for each image to form the SMC. CAMs are then generated by measuring the angle between the SMC and the features. Adaptively constructing multiple classifiers for each image enhances the recognition ability of the classifiers, while relying only on angle measurement to generate CAMs alleviates the suppression caused by magnitudes. Furthermore, SMC can be integrated into other WSSS approaches to help them generate better CAMs. Extensive experiments on standard WSSS benchmarks such as PASCAL VOC 2012 and MS COCO 2014 demonstrate the superiority of the proposed method.
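The angle-based CAM generation described above can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function names (`l2_normalize`, `select_prototypes`, `cosine_cam`), the top-k selection rule, the maximum over prototypes, and the clipping to [0, 1] are all assumptions made for illustration, since the abstract does not specify these details. The sketch only shows why a cosine (angle) measure is unaffected by feature magnitude.

```python
import numpy as np


def l2_normalize(x, axis=-1, eps=1e-8):
    """Scale vectors along `axis` to (approximately) unit length."""
    return x / (np.linalg.norm(x, axis=axis, keepdims=True) + eps)


def select_prototypes(bank, features, k):
    """Pick k prototypes from a class memory bank for one image.

    ASSUMPTION: selection is by cosine similarity to the image's mean
    feature; the abstract does not state the actual selection rule.
    bank: (N, D) prototypes, features: (H, W, D) pixel features.
    """
    mean_feat = l2_normalize(features.reshape(-1, features.shape[-1]).mean(axis=0))
    scores = l2_normalize(bank) @ mean_feat      # (N,) cosine scores
    top = np.argsort(-scores)[:k]                # indices of the k best matches
    return bank[top]


def cosine_cam(features, prototypes):
    """Activation map from the angle between features and prototypes.

    ASSUMPTION: per-pixel score is the maximum cosine similarity over the
    selected prototypes, clipped to [0, 1].
    features: (H, W, D), prototypes: (K, D) -> map of shape (H, W).
    """
    f = l2_normalize(features)                   # magnitudes removed here
    p = l2_normalize(prototypes)
    sim = f @ p.T                                # (H, W, K) cosine similarities
    return np.clip(sim.max(axis=-1), 0.0, 1.0)


# Toy usage with random data (shapes are arbitrary).
rng = np.random.default_rng(0)
feats = rng.normal(size=(4, 5, 8))               # H=4, W=5, D=8
bank = rng.normal(size=(6, 8))                   # 6 stored prototypes
smc = select_prototypes(bank, feats, k=3)
cam = cosine_cam(feats, smc)
```

Because both the features and the prototypes are normalised before the dot product, multiplying the features by any positive constant leaves `cam` unchanged, which is the magnitude-independence property the abstract relies on.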

Keywords

image segmentation / multiple classifiers / weakly supervised learning

Cite this article

Zilin Guo, Dongyue Wu, Changxin Gao, Nong Sang. Selective Multiple Classifiers for Weakly Supervised Semantic Segmentation. CAAI Transactions on Intelligence Technology, 2025, 10(6): 1688-1702. DOI: 10.1049/cit2.70042

Funding

National Natural Science Foundation of China (Grant 62176097)

National Natural Science Foundation of China (Grant 61433007)

Fundamental Research Funds for the Central Universities (Grant 2019kfyXKJC024)

111 Project on Computational Intelligence and Intelligent Control (Grant B18024)
