A primal perspective for indefinite kernel SVM problem

Hui XUE, Haiming XU, Xiaohong CHEN, Yunyun WANG

Front. Comput. Sci., 2020, Vol. 14, Issue 2: 349–363. DOI: 10.1007/s11704-018-8148-z
RESEARCH ARTICLE


Abstract

The indefinite kernel support vector machine (IKSVM) has recently attracted increasing attention in machine learning. Since IKSVM is essentially a non-convex problem, existing algorithms either modify the spectrum of the indefinite kernel directly, at the risk of losing valuable information, or solve the dual form of IKSVM, which suffers from a duality gap. In this paper, we propose a primal perspective for solving the problem. That is, we focus directly on the primal form of IKSVM and present a novel algorithm, termed IKSVM-DC, for binary and multi-class classification. Concretely, according to the characteristics of the spectrum of the indefinite kernel matrix, IKSVM-DC decomposes the primal objective into the difference of two convex functions, yielding a difference of convex functions (DC) program. To accelerate convergence, IKSVM-DC combines the classical DC algorithm with a line search step along the descent direction at each iteration. Furthermore, we construct a multi-class IKSVM model that can classify multiple classes in a unified form. A theoretical analysis is then presented to show that IKSVM-DC converges to a local minimum. Finally, we conduct experiments on both binary and multi-class datasets; the results show that IKSVM-DC is superior to other state-of-the-art IKSVM algorithms.
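The DC decomposition at the heart of this approach can be illustrated with a minimal sketch (synthetic data and a hypothetical tanh similarity, not the authors' implementation): the indefinite kernel matrix is split via its eigendecomposition into two positive semidefinite parts, K = K₊ − K₋, so the kernel term of the primal objective becomes a difference of two convex quadratics.

```python
import numpy as np

# Synthetic data and a tanh similarity matrix; sigmoid-type kernels
# with a negative offset are typically non-positive-semidefinite.
rng = np.random.default_rng(0)
X = rng.standard_normal((20, 3))
K = np.tanh(X @ X.T - 1.0)

# Spectral split K = K_plus - K_minus with both parts PSD, which is
# what lets the primal objective be written as a DC program.
w, V = np.linalg.eigh(K)
K_plus = (V * np.maximum(w, 0.0)) @ V.T    # keeps non-negative eigenvalues
K_minus = (V * np.maximum(-w, 0.0)) @ V.T  # flips the negative eigenvalues

# Both parts are PSD and reconstruct K exactly.
assert np.min(np.linalg.eigvalsh(K_plus)) >= -1e-10
assert np.min(np.linalg.eigvalsh(K_minus)) >= -1e-10
assert np.allclose(K_plus - K_minus, K)
```

With this split, a quadratic term αᵀKα equals the convex term αᵀK₊α minus the convex term αᵀK₋α, which is the structure the DC algorithm exploits.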

Keywords

indefinite kernel / support vector machine / multi-class classification / non-convex optimization
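The line-search acceleration mentioned in the abstract follows a generic pattern: each DC algorithm (DCA) iteration linearizes the concave part and solves a convex subproblem, and a backtracking search along the resulting descent direction can further decrease the objective. A one-dimensional toy sketch (an illustrative objective, not the paper's SVM subproblem):

```python
import numpy as np

def f(x):
    # DC objective f = g - h with g(x) = x**4 + 1 and h(x) = 2*x**2,
    # both convex; f(x) = (x**2 - 1)**2 has minima at x = +-1.
    return (x**2 - 1.0)**2

def dca_step(x):
    # Linearize h at x (h'(x) = 4x); the convex subproblem
    # min_y y**4 + 1 - 4*x*y has the closed form y = cbrt(x).
    return np.cbrt(x)

def dca_with_line_search(x0, iters=50, beta=0.5):
    x = x0
    for _ in range(iters):
        y = dca_step(x)
        d = y - x          # descent direction given by the DCA point
        t = 2.0            # try to extrapolate past the DCA point
        while f(y + t * d) > f(y) and t > 1e-8:
            t *= beta      # backtrack until the extra step helps
        x = y + t * d if f(y + t * d) < f(y) else y
    return x

x_star = dca_with_line_search(0.3)  # converges to a local minimum near 1
```

The plain DCA update alone already decreases f monotonically; the extra search along d only accepts a step if it strictly improves on the DCA point, mirroring the acceleration strategy described above.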

Cite this article

Hui XUE, Haiming XU, Xiaohong CHEN, Yunyun WANG. A primal perspective for indefinite kernel SVM problem. Front. Comput. Sci., 2020, 14(2): 349–363. https://doi.org/10.1007/s11704-018-8148-z


RIGHTS & PERMISSIONS

2019 Higher Education Press and Springer-Verlag GmbH Germany, part of Springer Nature