Visual polysemy and synonymy: toward near-duplicate image retrieval

Manni DUAN; Xiuqing WU

doi:10.1007/s11460-010-0099-6

Frontiers of Electrical and Electronic Engineering, 2010, Vol. 5, Issue 4: 419-429


RESEARCH ARTICLE


Department of Electronic Engineering and Information Science, University of Science and Technology of China, Hefei 230027, China

Received date: 23 Aug 2009

Accepted date: 26 Apr 2010

Published date: 05 Dec 2010

Copyright: 2014 Higher Education Press and Springer-Verlag Berlin Heidelberg


Abstract

Near-duplicate image retrieval aims to find all images that are duplicates or near duplicates of a query image. One of the most popular and practical approaches is based on the bag-of-words (BoW) model. However, the fundamental deficiency of current BoW methods is the gap between visual words and an image's semantic meaning. A similar problem also affects text retrieval, where a prevalent remedy is to eliminate text synonymy and polysemy and thereby improve overall performance. Our approach borrows this idea from text retrieval and tries to overcome the deficiencies of the BoW model by treating the semantic gap as a problem of visual synonymy and visual polysemy. We use visual synonymy in a very general sense to describe the fact that many different visual words refer to the same visual meaning. By visual polysemy, we refer to the general fact that most visual words have more than one distinct meaning. To eliminate visual synonymy, we present an extended similarity function that implicitly expands the query's visual words. To eliminate visual polysemy, we use visual patterns and show that the most efficient way of using them is to merge the visual word vector with the visual pattern vector and compute the similarity score with the cosine function. In addition, we observe that duplicate visual words are highly likely to occur in adjacent areas. Therefore, we modify the traditional Apriori algorithm to mine quantitative patterns, defined as patterns containing duplicate items. Experiments show that quantitative patterns improve mean average precision (MAP) significantly.
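As a rough illustration of two ideas named in the abstract, the following Python sketch scores two images by the cosine of a merged visual word and visual pattern vector, and counts the support of a quantitative pattern, that is, a pattern in which the same visual word may appear more than once. It is only a minimal sketch under stated assumptions, not the authors' implementation: the function names (`merged_vector`, `cosine`, `support`), the use of raw counts as weights, and the representation of local regions as multisets of word ids are all hypothetical choices made for illustration. The abstract does not specify the extended similarity function or the details of the modified Apriori algorithm, so neither is reproduced here.

```python
import math
from collections import Counter


def merged_vector(words, patterns):
    """Merge visual word counts and visual pattern counts into one sparse vector.

    Keys are tagged ("w", id) or ("p", id) so that a word id can never
    collide with a pattern id. Raw counts as weights are an assumption.
    """
    vec = {("w", k): float(c) for k, c in Counter(words).items()}
    vec.update({("p", k): float(c) for k, c in Counter(patterns).items()})
    return vec


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * v.get(k, 0.0) for k, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def contains(transaction, pattern):
    """Multiset containment: every word occurs at least as often as the pattern requires."""
    return all(transaction[word] >= count for word, count in pattern.items())


def support(transactions, pattern):
    """Number of local regions (transactions) that contain the quantitative pattern."""
    return sum(contains(t, pattern) for t in transactions)


# Hypothetical data: each transaction is the multiset of visual word ids in one region.
regions = [Counter([3, 3, 7]), Counter([3, 7]), Counter([3, 3])]
double_three = Counter({3: 2})  # the word 3 occurring twice in the same region
print(support(regions, double_three))

query = merged_vector(words=[1, 1, 2], patterns=["p1"])
candidate = merged_vector(words=[1, 2, 2], patterns=["p1"])
print(cosine(query, candidate))
```

Representing a region as a `Counter` keeps duplicate words visible to the support count, which a plain set-based itemset would discard.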

Keywords: near-duplicate image retrieval; bag-of-words (BoW) model; visual synonymy; visual polysemy; extended similarity function; query expansion; visual pattern

Cite this article

Manni DUAN, Xiuqing WU. Visual polysemy and synonymy: toward near-duplicate image retrieval[J]. Frontiers of Electrical and Electronic Engineering, 2010, 5(4): 419-429. DOI: 10.1007/s11460-010-0099-6

