SSA: semantic structure aware inference on CNN networks for weakly pixel-wise dense predictions without cost
Yanpeng SUN, Zechao LI
Front. Comput. Sci., 2025, Vol. 19, Issue (2): 192702
Pixel-wise dense prediction tasks under weak supervision currently rely on Class Attention Maps (CAMs) to generate pseudo masks that serve as ground truth. However, existing methods often add trainable modules to expand the immature CAMs, which can incur significant computational overhead and complicate training. In this work, we investigate the semantic structure information concealed within CNNs and propose a semantic structure aware inference (SSA) method that uses this information to obtain high-quality CAMs without any additional training cost. Specifically, a semantic structure modeling module (SSM) is first proposed to generate a class-agnostic semantic correlation representation, in which each item denotes the affinity degree between one category of objects and all the others. Then, the immature CAMs are refined through a dot product operation that uses the semantic structure information. Finally, the polished CAMs from different backbone stages are fused as the output. Because SSA is parameter-free and adds no training cost, it is suitable for various weakly supervised pixel-wise dense prediction tasks. We conducted extensive experiments on weakly supervised object localization and weakly supervised semantic segmentation, and the results confirm the effectiveness of SSA.
class attention maps / semantic structure / weakly-supervised object localization / weakly-supervised semantic segmentation
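The three steps named in the abstract (build a correlation representation, refine the CAM with a dot product, fuse across stages) can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the abstract does not say how the correlation is computed or how stages are fused, so the cosine similarity between spatial feature vectors, the row normalisation, the mean fusion, and the function names `semantic_affinity`, `refine_cam`, and `fuse_cams` are all assumptions made for illustration.

```python
import numpy as np


def semantic_affinity(features):
    """Assumed correlation: cosine similarity between spatial positions.

    features: array of shape (C, H, W) from one backbone stage.
    Returns a row-normalised, non-negative (H*W, H*W) affinity matrix.
    """
    c, h, w = features.shape
    flat = features.reshape(c, h * w).T                 # (HW, C)
    norms = np.linalg.norm(flat, axis=1, keepdims=True)
    unit = flat / np.maximum(norms, 1e-8)
    affinity = np.maximum(unit @ unit.T, 0.0)           # clip negative similarities
    row_sums = affinity.sum(axis=1, keepdims=True)
    return affinity / np.maximum(row_sums, 1e-8)


def refine_cam(cam, affinity):
    """Refine a CAM of shape (K, H, W) with a dot product against the affinity."""
    k, h, w = cam.shape
    refined = cam.reshape(k, h * w) @ affinity.T        # (K, HW)
    return refined.reshape(k, h, w)


def fuse_cams(cams):
    """Assumed fusion: mean of same-sized CAMs, min-max normalised per class."""
    fused = np.mean(np.stack(cams), axis=0)
    lo = fused.min(axis=(1, 2), keepdims=True)
    hi = fused.max(axis=(1, 2), keepdims=True)
    return (fused - lo) / np.maximum(hi - lo, 1e-8)


# Toy usage with random data standing in for two backbone stages.
rng = np.random.default_rng(0)
cam = rng.random(size=(3, 4, 4))                        # 3 classes, 4x4 map
stage_features = [rng.normal(size=(8, 4, 4)), rng.normal(size=(16, 4, 4))]
refined = [refine_cam(cam, semantic_affinity(f)) for f in stage_features]
output = fuse_cams(refined)
```

No step in this sketch has learnable parameters, which matches the parameter-free property the abstract claims. In practice, feature maps from different stages usually differ in resolution, so a resizing step (omitted here) would be needed before fusion.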
Higher Education Press