Application of two-order difference to gap statistic
Shihong Yue, Xiuxiu Wang, Miaomiao Wei
Transactions of Tianjin University, 2008, Vol. 14, Issue 3: 217-221.
The Gap statistic is a well-known index of clustering validity, but its implementation is hard to understand and its result is hard to determine accurately. A direct method is presented to improve the performance of the Gap statistic: the two-order (second-order) difference of the within-cluster dispersion replaces the constructed null reference distribution used in the Gap statistic. The Gap statistic is thereby reformulated so that it is easier to implement, and its uncertainty in applications is reduced. The limitations of the Gap statistic are also analyzed through two typical examples, which show that it is difficult to apply to datasets containing strongly overlapping or uneven-density clusters. Experiments verify the usefulness of the proposed method.
clustering validity / Gap statistic / data structure
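
The abstract does not give the exact formula, so the following is only a minimal sketch of how a second-order difference of within-cluster dispersion might be used to pick the number of clusters. It assumes the central difference D(k) = W(k-1) - 2W(k) + W(k+1) and assumes that the k with the largest D(k) is chosen; both are illustrative assumptions, not the paper's stated definition. The function names and the dispersion values are hypothetical.

```python
def second_order_difference(w):
    """Map each interior k (1-based) to W(k-1) - 2*W(k) + W(k+1).

    `w` is a list of within-cluster dispersions where w[0] is W(1),
    w[1] is W(2), and so on. The central-difference form is an
    assumption; the abstract does not specify it.
    """
    # For 1-based k: W(k-1) = w[k-2], W(k) = w[k-1], W(k+1) = w[k].
    return {k: w[k - 2] - 2 * w[k - 1] + w[k] for k in range(2, len(w))}


def estimate_k(w):
    """Return the k with the largest second-order difference (assumed rule)."""
    diffs = second_order_difference(w)
    if not diffs:
        raise ValueError("need dispersions for at least three values of k")
    return max(diffs, key=diffs.get)


# Hypothetical dispersions for k = 1..6 with a sharp bend at k = 3.
dispersions = [100.0, 60.0, 20.0, 17.0, 15.0, 14.0]
print(second_order_difference(dispersions))
print(estimate_k(dispersions))
```

Under these assumptions the estimate needs only the dispersion curve itself, with no reference datasets to sample, which is the simplification the abstract attributes to the method. In practice the dispersions would come from running a clustering algorithm once for each candidate k.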