Visual understanding by mining social media: recent advances and challenges
Xueming WANG, Zechao LI, Jinhui TANG
With the rapid growth of social websites, the volume of social media, including images and videos, has increased dramatically, and visual understanding has attracted great interest in areas such as multimedia, computer vision, and pattern recognition. Valuable auxiliary resources available on social websites, such as user-provided tags, aid visual understanding tasks. Consequently, many methods have been proposed to exploit these auxiliary resources for tag refinement, image retrieval, and media summarization. This work presents a comprehensive survey of recent advances in visual understanding by mining social media and discusses their merits and limitations. We then analyze the difficulties and challenges of visual understanding and outline several possible future research directions.
social tag / visual understanding / visual representation / tag refinement / image retrieval / summarization