Neighbourhood clustering for enhancing domain-adaptive zero-shot learning and beyond
Jinkun Jiang, Yuezun Li, Jiaao Yu, Long Chen, Junyu Dong
Intelligent Marine Technology and Systems, 2025, Vol. 3, Issue 1
Domain-adaptive zero-shot learning is an emerging and challenging task that extends zero-shot learning to scenarios where the source and target domains have different distributions. To address this problem, existing methods typically rely on category prototypes learned from the semantic embeddings of class labels and leverage pairwise semantic relationships to align source and target samples in a shared feature space. While promising, these methods face two key limitations: (1) they consider only pairwise relationships, so they cannot capture the complex, high-order structural correlations among category prototypes; and (2) they align samples only with the category prototypes, overlooking the intrinsic correlations among samples and thereby weakening the alignment. To tackle these issues, we propose the neighbourhood clustering for enhancing (NCE) learning framework, which fully exploits the correlations among both category prototypes and samples. For category prototypes, we adopt a hypergraph-based approach to capture high-order correlations that go beyond simple pairwise relationships. During alignment, we incorporate both intraclass and interclass correlations among samples. Experimental results on the I2AwA and I2WebV datasets demonstrate that our method significantly outperforms state-of-the-art methods. To further validate its effectiveness in more challenging scenarios, we apply it to underwater imagery, where it markedly improves the accuracy and robustness of underwater image recognition.
Unsupervised domain adaptation / Zero-shot learning / Domain-adaptive zero-shot learning / Underwater application
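The abstract outlines two technical components: hypergraph-based propagation that captures high-order correlations among category prototypes, and an alignment step that exploits intraclass and interclass correlations among samples. Below is a minimal PyTorch sketch of what these two pieces could look like; it is not the authors' implementation, and every function name, tensor shape, and hyperparameter is an illustrative assumption.

```python
# Minimal sketch (not the authors' code): HGNN-style propagation over class
# prototypes plus a neighbourhood-aware alignment loss. All names, shapes and
# hyperparameters are illustrative assumptions.
import torch
import torch.nn.functional as F


def hypergraph_propagate(prototypes, incidence):
    """One simplified HGNN propagation step: X' = Dv^-1/2 H De^-1 H^T Dv^-1/2 X
    (hyperedge weights and the learnable projection are omitted here).

    prototypes: (C, d) class-prototype embeddings (e.g. from label word vectors)
    incidence:  (C, E) binary incidence matrix; each hyperedge connects a group
                of classes, so propagation mixes high-order (not just pairwise)
                relations among prototypes.
    """
    incidence = incidence.float()
    Dv = incidence.sum(dim=1).clamp(min=1)      # vertex (class) degrees, (C,)
    De = incidence.sum(dim=0).clamp(min=1)      # hyperedge degrees, (E,)
    Dv_inv_sqrt = Dv.pow(-0.5).diag()
    De_inv = De.pow(-1.0).diag()
    theta = Dv_inv_sqrt @ incidence @ De_inv @ incidence.t() @ Dv_inv_sqrt
    return theta @ prototypes                   # refined prototypes, (C, d)


def neighbourhood_alignment_loss(features, prototypes, pseudo_labels, tau=0.1):
    """Pull samples toward same-class prototypes and neighbours, push apart
    samples from different classes (intraclass attraction + interclass
    dispersion)."""
    feats = F.normalize(features, dim=1)        # (N, d)
    protos = F.normalize(prototypes, dim=1)     # (C, d)
    # Sample-to-prototype term: standard prototype-classification loss.
    logits = feats @ protos.t() / tau
    proto_loss = F.cross_entropy(logits, pseudo_labels)
    # Sample-to-sample term: encourage agreement with same-pseudo-label
    # neighbours; the softmax denominator penalises cross-class similarity.
    sim = feats @ feats.t() / tau
    same = (pseudo_labels[:, None] == pseudo_labels[None, :]).float()
    same.fill_diagonal_(0)
    log_p = F.log_softmax(sim, dim=1)
    nbr_loss = -(log_p * same).sum(1) / same.sum(1).clamp(min=1)
    return proto_loss + nbr_loss.mean()


# Example usage with random tensors (all sizes are assumptions):
C, E, d, N = 10, 15, 300, 32
protos = hypergraph_propagate(torch.randn(C, d), (torch.rand(C, E) > 0.7).float())
feats = torch.randn(N, d)
labels = torch.randint(0, C, (N,))
loss = neighbourhood_alignment_loss(feats, protos, labels)
```

In such a sketch, the hyperedges would typically be built from groups of semantically related classes (for example, k-nearest neighbours in the label embedding space), so that each prototype aggregates information from sets of related classes rather than from single pairs.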