Frontiers of Computer Science

Front. Comput. Sci.    2017, Vol. 11 Issue (4) : 649-660     DOI: 10.1007/s11704-016-5558-7
E-GrabCut: an economic method of iterative video object extraction
Le DONG, Ning FENG, Mengdie MAO, Ling HE, Jingjing WANG
School of Computer Science and Engineering, University of Electronic Science and Technology of China, Chengdu 611731, China

Efficient interactive foreground/background segmentation in video is of great practical importance in video editing. This paper proposes an interactive and unsupervised video object segmentation algorithm, named E-GrabCut, that aims to achieve both the segmentation quality and the time efficiency demanded in the field. The proposed algorithm has three features. First, we develop a powerful, non-iterative version of the optimization process for each frame. Second, more user interaction in the first frame is used to improve the Gaussian Mixture Model (GMM). Third, a robust algorithm for segmenting the following frames is developed by reusing the previous GMM. Extensive experiments demonstrate that our method outperforms the state-of-the-art video segmentation algorithm in the combination of time efficiency and segmentation quality.
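The idea of carrying a colour model from one frame to the next can be illustrated with a deliberately simplified sketch. It is not the E-GrabCut implementation: it assumes grayscale pixels stored in flat lists, one Gaussian per class instead of a full GMM, and per-pixel maximum-likelihood labelling with no graph-cut smoothness term. The function names (`fit_gaussian`, `label_frame`, `segment_video`) are hypothetical, and the per-frame refit is an assumption, since the abstract does not say how the reused model is updated.

```python
import math


def fit_gaussian(values):
    """Return (mean, variance) of a non-empty list of intensities."""
    n = len(values)
    mean = sum(values) / n
    var = sum((v - mean) ** 2 for v in values) / n
    return mean, max(var, 1e-6)  # floor the variance to avoid division by zero


def neg_log_likelihood(value, model):
    """Data cost of assigning `value` to the Gaussian `model`."""
    mean, var = model
    return 0.5 * math.log(2 * math.pi * var) + (value - mean) ** 2 / (2 * var)


def label_frame(frame, fg_model, bg_model):
    """Label each pixel 1 (foreground) or 0 (background) by the lower data cost."""
    return [
        1 if neg_log_likelihood(p, fg_model) < neg_log_likelihood(p, bg_model) else 0
        for p in frame
    ]


def segment_video(frames, first_frame_labels):
    """Segment every frame, reusing the models from the previous frame.

    `first_frame_labels` stands in for the user interaction on frame 0 and
    must mark at least one foreground (1) and one background (0) pixel.
    """
    fg = fit_gaussian([p for p, l in zip(frames[0], first_frame_labels) if l == 1])
    bg = fit_gaussian([p for p, l in zip(frames[0], first_frame_labels) if l == 0])
    masks = []
    for frame in frames:
        mask = label_frame(frame, fg, bg)  # single pass, no inner iteration
        masks.append(mask)
        # Assumed update step: refit on this frame's result for the next frame.
        fg_pixels = [p for p, m in zip(frame, mask) if m == 1]
        bg_pixels = [p for p, m in zip(frame, mask) if m == 0]
        if fg_pixels:
            fg = fit_gaussian(fg_pixels)
        if bg_pixels:
            bg = fit_gaussian(bg_pixels)
    return masks
```

Calling `segment_video(frames, labels)` with a list of equally sized frames and labels for the first frame returns one binary mask per frame. A closer approximation of the method would replace `label_frame` with a graph-cut optimization over a GMM data term.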

Keywords: interactive video object extraction, video segmentation, GrabCut, GMM
Corresponding Authors: Le DONG   
Just Accepted Date: 05 July 2016   Online First Date: 07 June 2017    Issue Date: 26 July 2017
Cite this article:
Le DONG, Ning FENG, Mengdie MAO, et al. E-GrabCut: an economic method of iterative video object extraction[J]. Front. Comput. Sci., 2017, 11(4): 649-660.