Abstract
Outliers in the data can severely disturb the estimation of probability density parameters and thereby degrade clustering accuracy. To achieve robust cluster analysis, a method based on the diversity self-paced t-mixture model (DSP-TMM) is proposed. The model first adopts the t-distribution, whose tail is easily controllable, as its sub-model. On this basis, it uses the entropy-penalized expectation conditional maximization (ECM) algorithm as a pre-clustering step to estimate the initial parameters. The model then introduces the l2,1-norm as a self-paced regularization term and develops a new ECM optimization algorithm in order to select high-confidence samples from each component during training. Finally, experimental results on several real-world datasets in different noise environments show that the diversity self-paced t-mixture model outperforms state-of-the-art clustering methods. The work provides significant guidance for the construction of robust mixture distribution models.
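The two ingredients named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. The function `t_weights` computes the standard E-step scale weights of a multivariate t component, which shrink toward zero for samples far from the component centre. The function `diverse_self_paced_select` applies the rank-based threshold rule known from the literature on self-paced learning with diversity (a negative l2,1-norm regularizer); using exactly this rule, treating mixture components as the groups, and the names `lam` and `gamma` are all assumptions made here for illustration.

```python
import numpy as np


def t_weights(X, mu, sigma, nu):
    """Scale weights u_i = (nu + d) / (nu + delta_i) for one t component.

    delta_i is the squared Mahalanobis distance of sample i to the
    component centre, so distant samples (outliers) get small weights.
    """
    X = np.asarray(X, dtype=float)
    d = X.shape[1]
    diff = X - np.asarray(mu, dtype=float)
    delta = np.einsum("ij,jk,ik->i", diff, np.linalg.inv(sigma), diff)
    return (nu + d) / (nu + delta)


def diverse_self_paced_select(losses, groups, lam, gamma):
    """Select low-loss samples while spreading the choice across groups.

    Assumed rule (hypothetical for this paper): within each group, sort
    samples by loss and keep the sample of rank i (1-based) if
    loss < lam + gamma / (sqrt(i) + sqrt(i - 1)).
    The threshold decreases with rank, so each group contributes its
    most confident samples before any group dominates the selection.
    """
    losses = np.asarray(losses, dtype=float)
    groups = np.asarray(groups)
    selected = np.zeros(losses.shape[0], dtype=bool)
    for g in np.unique(groups):
        idx = np.flatnonzero(groups == g)
        order = idx[np.argsort(losses[idx])]  # indices sorted by loss
        ranks = np.arange(1, order.size + 1)
        thresholds = lam + gamma / (np.sqrt(ranks) + np.sqrt(ranks - 1))
        selected[order] = losses[order] < thresholds
    return selected


if __name__ == "__main__":
    X = np.array([[0.0, 0.0], [0.1, 0.0], [10.0, 10.0]])
    print(t_weights(X, mu=np.zeros(2), sigma=np.eye(2), nu=3.0))
    print(diverse_self_paced_select(
        losses=[0.1, 0.2, 0.9, 0.3, 5.0],
        groups=[0, 0, 0, 1, 1],
        lam=0.25,
        gamma=0.5,
    ))
```

In this sketch, raising `lam` admits harder samples overall, while raising `gamma` favours taking samples from more groups; both would typically be increased over training iterations.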
Keywords
cluster analysis / Gaussian mixture model / t-distribution mixture model / self-paced learning / initialization
Cite this article
DSP-TMM: A Robust Cluster Analysis Method Based on Diversity Self-Paced T-Mixture Model.
Journal of Beijing Institute of Technology, 2020, 29(4): 531-543. DOI: 10.15918/j.jbit1004-0579.20070