Constrained clustering with weak label prior
Jing ZHANG , Ruidong FAN , Hong TAO , Jiacheng JIANG , Chenping HOU
Front. Comput. Sci. ›› 2024, Vol. 18 ›› Issue (3) : 183338
Clustering is widely exploited in data mining. It has been proved that embedding weak label prior into clustering is effective to promote its performance. Previous researches mainly focus on only one type of prior. However, in many real scenarios, two kinds of weak label prior information, e.g., pairwise constraints and cluster ratio, are easily obtained or already available. How to incorporate them to improve clustering performance is important but rarely studied. We propose a novel constrained Clustering with Weak Label Prior method (CWLP), which is an integrated framework. Within the unified spectral clustering model, the pairwise constraints are employed as a regularizer in spectral embedding and label proportion is added as a constraint in spectral rotation. To approximate a variant of the embedding matrix more precisely, we replace a cluster indicator matrix with its scaled version. Instead of fixing an initial similarity matrix, we propose a new similarity matrix that is more suitable for deriving clustering results. Except for the theoretical convergence and computational complexity analyses, we validate the effectiveness of CWLP through several benchmark datasets, together with its ability to discriminate suspected breast cancer patients from healthy controls. The experimental evaluation illustrates the superiority of our proposed approach.
clustering / weak label prior / cluster ratio / pairwise constraints
| [1] |
|
| [2] |
Voss J, Belkin M, Rademacher L. The hidden convexity of spectral clustering. In: Proceedings of the 30th AAAI Conference on Artificial Intelligence. 2016, 2108−2114 |
| [3] |
|
| [4] |
|
| [5] |
|
| [6] |
|
| [7] |
|
| [8] |
Liu H, Liu T, Wu J, Tao D, Fu Y. Spectral ensemble clustering. In: Proceedings of the 21st ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. 2015, 715−724 |
| [9] |
|
| [10] |
|
| [11] |
|
| [12] |
Levin M S. Towards hierarchical clustering. In: Proceedings of the 2nd international conference on Computer Science: theory and applications. 2007, 205−215 |
| [13] |
|
| [14] |
|
| [15] |
|
| [16] |
|
| [17] |
|
| [18] |
|
| [19] |
|
| [20] |
Parmar M D, Pang W, Hao D, Jiang J, Liupu W, Wang L, Zhou Y. FREDPC: a feasible residual error-based density peak clustering algorithm with the fragment merging strategy. IEEE Access, 2019, 7: 89789−89804 |
| [21] |
|
| [22] |
|
| [23] |
|
| [24] |
|
| [25] |
|
| [26] |
Nie F, Zhang H, Wang R, Li X. Semi-supervised clustering via pairwise constrained optimal graph. In: Proceedings of the 29th International Joint Conference on Artificial Intelligence. 2021, 437 |
| [27] |
|
| [28] |
|
| [29] |
|
| [30] |
|
| [31] |
|
| [32] |
Poyiadzi R, Santos-Rodríguez R, Twomey N. Label propagation for learning with label proportions. In: Proceedings of the 28th IEEE International Workshop on Machine Learning for Signal Processing. 2018, 1−6 |
| [33] |
|
| [34] |
Ng A, Jordan M, Weiss Y. On spectral clustering: Analysis and an algorithm. In: Proceedings of the 14th International Conference on Neural Information Processing Systems. 2011 |
| [35] |
Zhang X, You Q. An improved spectral clustering algorithm based on random walk. Frontiers of Computer Science in China, 2011, 5(3): 268–278 |
| [36] |
Yu S X, Shi J. Multiclass spectral clustering. In: Proceedings of the 9th IEEE International Conference on Computer Vision. 2003, 313−313 |
| [37] |
|
| [38] |
Huang J, Nie F, Huang H. Spectral rotation versus k-means in spectral clustering. In: Proceedings of the 27th AAAI Conference on Artificial Intelligence. 2013, 431−437 |
| [39] |
Nie F P, Wang X Q, Jordan M, Huang H. The constrained laplacian rank algorithm for graph-based clustering. In: Proceedings of the 37th AAAI Conference on Artificial Intelligence. 2016, 1969−1976 |
| [40] |
|
| [41] |
|
| [42] |
|
| [43] |
|
| [44] |
MacQueen J. Some methods for classification and analysis of multivariate observations. In: Proceedings of the 5th Berkeley Symposium on Mathematical Statistics and Probability. 1967, 281−297 |
| [45] |
|
| [46] |
|
| [47] |
|
| [48] |
|
| [49] |
|
Higher Education Press
Supplementary files
/
| 〈 |
|
〉 |