Advance on large scale near-duplicate video retrieval
Ling SHEN, Richang HONG, Yanbin HAO
Emerging Internet services and applications attract a growing number of users to diverse video-related activities, such as video searching, downloading, and sharing. These routine operations have led to explosive growth in online video volume and inevitably give rise to massive amounts of near-duplicate content. Near-duplicate video retrieval (NDVR) has long been an active research topic. The primary purpose of this paper is to present a comprehensive survey and an updated review of advances in large-scale NDVR to provide guidance for researchers. Specifically, we summarize and compare the definitions of near-duplicate videos (NDVs) in the literature, analyze the relationship between NDVR and its related research topics theoretically, describe its generic framework in detail, and investigate the existing state-of-the-art NDVR systems. Finally, we present the development trends and research directions of this topic.
near-duplicate videos / video retrieval / feature representation / video signature / indexing / similarity measurement