A novel unsupervised approach for multilevel image clustering from unordered image collection

Lai KANG; Lingda WU; Yee-Hong YANG

doi:10.1007/s11704-013-1266-8

Front. Comput. Sci., 2013, Vol. 7, Issue 1: 69–82. DOI: 10.1007/s11704-013-1266-8

RESEARCH ARTICLE

Abstract

This paper proposes a novel unsupervised approach to automatically constructing multilevel image clusters from unordered images. The whole input image collection is represented as an imaging sample space (ISS) consisting of globally indexed image features extracted by a new, efficient multi-view image feature matching method. By drawing an analogy between image capture and observation of the ISS, each image is represented as a binary sequence in which each bit indicates the visibility of the corresponding feature. Using image popularity and dissimilarity measures inspired by information theory, we show that image content and the distance between images can be described quantitatively, and these measures guide the automatic organization of the input image collection into multilevel clusters. The effectiveness and efficiency of the proposed approach are demonstrated on three real image collections, with promising results in both qualitative and quantitative evaluations.
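The binary-sequence representation described above can be illustrated with a minimal sketch. The abstract does not give the paper's actual popularity or dissimilarity definitions, so the two measures below are stand-ins chosen for illustration only: a sum of self-information over visible features, and the Jaccard distance between visibility sequences. All function names are hypothetical.

```python
from math import log2

def visibility_bits(image_features, num_features):
    """Binary sequence for one image: bit k is 1 if global feature k is visible."""
    return [1 if k in image_features else 0 for k in range(num_features)]

def feature_probabilities(bit_rows):
    """Fraction of images in which each globally indexed feature is visible."""
    n = len(bit_rows)
    return [sum(column) / n for column in zip(*bit_rows)]

def information_content(bits, probs):
    """ASSUMPTION: illustrative content measure, not the paper's popularity measure.
    Sums the self-information -log2(p_k) of every visible feature."""
    return sum(-log2(p) for b, p in zip(bits, probs) if b)

def dissimilarity(bits_a, bits_b):
    """ASSUMPTION: Jaccard distance as a stand-in for the paper's dissimilarity."""
    both = sum(a & b for a, b in zip(bits_a, bits_b))
    either = sum(a | b for a, b in zip(bits_a, bits_b))
    return 1.0 - both / either if either else 0.0

# Toy collection: three images observing five globally indexed features.
observed = [{0, 1, 2}, {1, 2, 3}, {4}]
rows = [visibility_bits(f, 5) for f in observed]
probs = feature_probabilities(rows)
```

In this toy collection, the first two images share two of their four jointly visible features, so `dissimilarity(rows[0], rows[1])` is 0.5, while the third image shares nothing with the first and sits at distance 1.0. A pairwise distance of this kind is what a hierarchical clustering step would consume to build multilevel clusters.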

Keywords

multilevel image clustering / imaging sample space (ISS) / unordered image collection

Cite this article

Lai KANG, Lingda WU, Yee-Hong YANG. A novel unsupervised approach for multilevel image clustering from unordered image collection. Front Comput Sci, 2013, 7(1): 69–82. https://doi.org/10.1007/s11704-013-1266-8
