Joint salient object detection and existence prediction
Huaizu JIANG, Ming-Ming CHENG, Shi-Jie LI, Ali BORJI, Jingdong WANG
Recent advances in supervised salient object detection modeling have resulted in significant performance improvements on benchmark datasets. However, most existing salient object detection models assume that at least one salient object exists in the input image. This assumption often leads to poor saliency maps on background images that contain no salient object at all, so handling those cases can reduce the false positive rate of a model. In this paper, we propose a supervised learning approach for jointly addressing the salient object detection and existence prediction problems. Given a set of background-only images and images with salient objects, as well as their salient object annotations, we adopt the structural SVM framework and formulate the two problems jointly in a single integrated objective function: saliency labels of superpixels are involved in a classification term conditioned on the salient object existence variable, which in turn depends on both global image and regional saliency features and on the saliency label assignments. The loss function also considers both image-level and region-level misclassifications. Extensive evaluation on benchmark datasets validates the effectiveness of our proposed joint approach compared to the baseline and state-of-the-art models.
salient object detection / existence prediction / joint inference / saliency detection
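To make the idea of a loss that counts both image-level and region-level mistakes concrete, the following is a minimal sketch rather than the formulation used in the paper. The function name `joint_loss`, the `image_weight` parameter, the additive combination, and the normalization by the number of superpixels are all assumptions chosen for illustration; the abstract states only that both kinds of misclassification are considered.

```python
from typing import Sequence


def joint_loss(
    e_true: int,
    e_pred: int,
    y_true: Sequence[int],
    y_pred: Sequence[int],
    image_weight: float = 1.0,  # hypothetical trade-off weight, not from the paper
) -> float:
    """Illustrative joint loss over an existence label and superpixel labels.

    e_true / e_pred: 1 if a salient object exists in the image, else 0.
    y_true / y_pred: per-superpixel saliency labels (1 = salient, 0 = background).
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")

    # Image-level term: 0/1 error on the existence prediction.
    image_term = float(e_true != e_pred)

    # Region-level term: fraction of superpixels with a wrong saliency label
    # (a normalized Hamming loss; the normalization is an assumption).
    if len(y_true) == 0:
        region_term = 0.0
    else:
        wrong = sum(1 for a, b in zip(y_true, y_pred) if a != b)
        region_term = wrong / len(y_true)

    return image_weight * image_term + region_term


if __name__ == "__main__":
    # Image with a salient object, existence predicted correctly,
    # one of four superpixels mislabeled.
    print(joint_loss(1, 1, [1, 0, 0, 1], [1, 0, 1, 1]))
    # Background-only image wrongly predicted to contain a salient object,
    # one of two superpixels mislabeled.
    print(joint_loss(0, 1, [0, 0], [0, 1]))
```

In a structural SVM setting, a loss of this kind would typically act as the margin-rescaling term during loss-augmented inference; how the paper couples it with the existence variable is not specified in the abstract.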