A maximum margin clustering algorithm based on indefinite kernels
Hui XUE, Sen LI, Xiaohong CHEN, Yunyun WANG
Indefinite kernels have attracted increasing attention in machine learning because they apply to a wider range of problems than the usual positive definite kernels. However, research on indefinite kernel clustering is relatively scarce, and existing clustering methods are mainly designed for positive definite kernels, so they cannot handle indefinite kernel scenarios. In this paper, we propose a novel indefinite kernel clustering algorithm, termed indefinite kernel maximum margin clustering (IKMMC), based on the state-of-the-art maximum margin clustering (MMC) model. IKMMC seeks a proxy positive definite kernel that approximates the original indefinite one, and therefore embeds a new F-norm regularizer in the objective function to measure the difference between the two kernels; the resulting problem is optimized by an iterative approach. Concretely, at each iteration, given a set of initial class labels, IKMMC first transforms the clustering problem into a classification problem, which is solved by an indefinite kernel support vector machine (IKSVM) with an extra class balance constraint. The predicted labels are then used as the input class labels for the next iteration, until the prediction error rate falls below a prespecified tolerance. Finally, IKMMC takes the predicted labels from the last iteration as the cluster assignments. Moreover, we extend IKMMC from binary clustering problems to more complex multi-class scenarios. Experimental results show the superiority of our algorithms.
indefinite kernel / maximum margin clustering / support vector machine / kernel method
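The two ideas in the abstract, a positive definite proxy for an indefinite kernel and label refinement that alternates between training and re-predicting, can be illustrated with a minimal sketch. This is not the IKMMC algorithm itself. The proxy here is obtained by clipping negative eigenvalues (the closest positive semidefinite matrix in Frobenius norm) rather than being learned jointly with the classifier. The classifier is a regularized least-squares stand-in for IKSVM, and a median threshold stands in for the class balance constraint. The function names and the parameters `reg`, `tol`, and `max_iter` are hypothetical.

```python
import numpy as np


def psd_proxy(K):
    """Closest PSD matrix to K in Frobenius norm, via eigenvalue clipping."""
    K = (K + K.T) / 2.0                      # enforce symmetry
    w, V = np.linalg.eigh(K)                 # eigen-decomposition of symmetric K
    return (V * np.clip(w, 0.0, None)) @ V.T # drop the negative spectrum


def fnorm_gap(K, K_proxy):
    """Squared Frobenius-norm difference between the two kernels."""
    return np.linalg.norm(K - K_proxy, ord="fro") ** 2


def refine_labels(K_proxy, y0, reg=1.0, tol=0.01, max_iter=50):
    """Alternate between fitting a classifier and re-predicting labels.

    Assumption: regularized least squares replaces IKSVM, and thresholding
    the scores at their median replaces the class balance constraint.
    """
    y = np.asarray(y0, dtype=float)
    n = len(y)
    for _ in range(max_iter):
        # Fit on the current labels (stand-in for the IKSVM step).
        alpha = np.linalg.solve(K_proxy + reg * np.eye(n), y)
        scores = K_proxy @ alpha
        # Re-predict; the median split keeps the two clusters balanced.
        y_new = np.where(scores > np.median(scores), 1.0, -1.0)
        changed = np.mean(y_new != y)        # fraction of labels that flipped
        y = y_new
        if changed < tol:                    # stop once predictions are stable
            break
    return y
```

For example, a sigmoid kernel matrix such as `np.tanh(0.5 * X @ X.T - 1.0)` can be indefinite; passing it through `psd_proxy` and then `refine_labels` with some initial labeling yields a vector of +1/-1 cluster indices.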