A maximum margin clustering algorithm based on indefinite kernels

Hui XUE; Sen LI; Xiaohong CHEN; Yunyun WANG

Front. Comput. Sci., 2019, Vol. 13, Issue 4: 813-827. DOI: 10.1007/s11704-018-7402-8

RESEARCH ARTICLE



Abstract

Indefinite kernels have attracted increasing attention in machine learning because they apply to a wider range of problems than the usual positive definite kernels. However, research on indefinite kernel clustering is relatively scarce, and existing clustering methods are mainly designed for positive definite kernels, so they cannot handle indefinite kernel scenarios. In this paper, we propose a novel indefinite kernel clustering algorithm, termed indefinite kernel maximum margin clustering (IKMMC), based on the state-of-the-art maximum margin clustering (MMC) model. IKMMC seeks a proxy positive definite kernel that approximates the original indefinite one, and embeds a new F-norm regularizer in the objective function to measure the discrepancy between the two kernels. The resulting objective is optimized by an iterative approach. Concretely, at each iteration, given a set of initial class labels, IKMMC first transforms the clustering problem into a classification problem, which is solved by an indefinite kernel support vector machine (IKSVM) with an extra class balance constraint. The predicted labels are then used as the input class labels for the next iteration, until the prediction error rate falls below a prespecified tolerance. Finally, IKMMC takes the predicted labels from the last iteration as the cluster indices. Moreover, we extend IKMMC from binary clustering problems to more complex multi-class scenarios. Experimental results show the superiority of our algorithms.
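As an illustration of the proxy-kernel idea in the abstract, the following is a minimal sketch and not the authors' implementation. It assumes NumPy and uses hypothetical helper names (`nearest_psd_proxy`, `f_norm_gap`, `prediction_error_rate`). It shows a standard standalone building block: for a fixed symmetric indefinite matrix, the closest positive semidefinite matrix in Frobenius norm is obtained by clipping negative eigenvalues to zero. According to the abstract, IKMMC embeds the F-norm term in its clustering objective and optimizes it iteratively, so its proxy kernel need not coincide with this fixed projection. The last helper is one assumed reading of the stopping rule, namely the fraction of labels that differ between the input and predicted labels.

```python
import numpy as np


def nearest_psd_proxy(K0):
    """Return the PSD matrix closest to symmetric K0 in Frobenius norm.

    Assumption: this spectrum-clipping projection only illustrates the
    "proxy kernel" idea; it is not the optimization used by IKMMC.
    """
    K0 = (K0 + K0.T) / 2.0                  # enforce exact symmetry
    eigvals, eigvecs = np.linalg.eigh(K0)   # eigenvalues in ascending order
    clipped = np.clip(eigvals, 0.0, None)   # drop the negative spectrum
    return (eigvecs * clipped) @ eigvecs.T  # V diag(clipped) V^T


def f_norm_gap(K, K0):
    """Frobenius-norm distance between the proxy K and the original K0."""
    return np.linalg.norm(K - K0, ord="fro")


def prediction_error_rate(y_in, y_pred):
    """Fraction of points whose predicted label differs from the input label."""
    return float(np.mean(np.asarray(y_in) != np.asarray(y_pred)))


# Toy indefinite "kernel" matrix with eigenvalues -1 and 3.
K0 = np.array([[1.0, 2.0], [2.0, 1.0]])
K = nearest_psd_proxy(K0)
print("proxy kernel:\n", K)
print("F-norm gap:", f_norm_gap(K, K0))
print("min eigenvalue of proxy:", np.linalg.eigvalsh(K).min())
```

For this toy matrix the proxy has every entry equal to 1.5 and the F-norm gap is 1, the magnitude of the clipped eigenvalue, up to floating-point rounding.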

Keywords

indefinite kernel / maximum margin clustering / support vector machine / kernel method

Cite this article


Hui XUE, Sen LI, Xiaohong CHEN, Yunyun WANG. A maximum margin clustering algorithm based on indefinite kernels. Front. Comput. Sci., 2019, 13(4): 813-827. DOI: 10.1007/s11704-018-7402-8

