Approximate Confidence Intervals for the Number of Clusters
Authors: Roger Peck, Lloyd Fisher, John Van Ness
Journal: Journal of the American Statistical Association (Taylor & Francis, available online 1989)
Volume/Issue: Volume 84, Issue 405
Pages: 184-191
ISSN: 0162-1459
Year: 1989
DOI: 10.1080/01621459.1989.10478754
Publisher: Taylor & Francis Group
Keywords: Bootstrap confidence interval; Cluster analysis; K-means clusterings; Simulation study; Strong consistency
Data source: Taylor
Abstract:
We consider clustering for the purpose of data reduction. Similar objects are grouped together in clusters so that one can then work with the few cluster descriptors instead of the many data points. The quality of any given clustering is measured by a loss function that takes into account both the parsimony of the clustering and the loss of information due to clustering. An optimal clustering can be obtained by minimizing the theoretical loss function. It is shown that a sample version of the loss function and optimal clustering converge strongly to their theoretical counterparts as the sample size tends to infinity. We then develop a bootstrap-based procedure for obtaining approximate confidence bounds on the number of clusters in the “best” clustering. The effectiveness of this procedure is evaluated in a simulation study. An application is presented.
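The idea in the abstract can be illustrated with a minimal sketch. It is not the authors' procedure: the abstract does not give the form of the loss function or the details of the bootstrap, so everything below is an assumption made for illustration. The loss is taken to be the mean within-cluster squared distance plus a hypothetical per-cluster `penalty`, the clustering is a basic Lloyd's k-means with a deterministic farthest-point start, and the bounds are a simple percentile interval over the bootstrap replicates. The function names `kmeans_loss`, `best_k`, and `bootstrap_k_interval` are likewise hypothetical.

```python
import numpy as np


def kmeans_loss(X, k, n_iter=50):
    """Mean within-cluster squared distance after a basic Lloyd's k-means."""
    # Farthest-point initialization: deterministic, and an assumption of
    # this sketch rather than anything stated in the abstract.
    centers = [X[0]]
    for _ in range(1, k):
        d = np.min([((X - c) ** 2).sum(axis=1) for c in centers], axis=0)
        centers.append(X[np.argmax(d)])
    centers = np.array(centers, dtype=float)

    for _ in range(n_iter):
        dist = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = dist.argmin(axis=1)
        new_centers = centers.copy()
        for j in range(k):
            members = X[labels == j]
            if len(members) > 0:  # keep the old center if a cluster empties
                new_centers[j] = members.mean(axis=0)
        if np.allclose(new_centers, centers):
            break
        centers = new_centers

    dist = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    return dist.min(axis=1).mean()


def best_k(X, k_max, penalty):
    """Number of clusters minimizing the assumed loss: information loss + penalty * k."""
    losses = [kmeans_loss(X, k) + penalty * k for k in range(1, k_max + 1)]
    return int(np.argmin(losses)) + 1


def bootstrap_k_interval(X, k_max=5, penalty=1.0, n_boot=50, alpha=0.10, seed=0):
    """Percentile bounds on the minimizing k over bootstrap resamples of the rows of X."""
    rng = np.random.default_rng(seed)
    n = len(X)
    ks = np.sort([
        best_k(X[rng.integers(0, n, size=n)], k_max, penalty)
        for _ in range(n_boot)
    ])
    lo = ks[int(np.floor(alpha / 2 * n_boot))]
    hi = ks[int(np.ceil((1 - alpha / 2) * n_boot)) - 1]
    return int(lo), int(hi)


# Synthetic data (made up for this example): three well-separated groups in 2D.
rng = np.random.default_rng(1)
means = np.array([[0.0, 0.0], [10.0, 0.0], [0.0, 10.0]])
X = np.vstack([m + 0.5 * rng.standard_normal((60, 2)) for m in means])

k_hat = best_k(X, k_max=5, penalty=1.0)
lower, upper = bootstrap_k_interval(X)
print(f"estimated k = {k_hat}, approximate 90% bounds = [{lower}, {upper}]")
```

The `penalty` term plays the role of the parsimony component of the loss: a larger value favors fewer clusters, a smaller one favors more. On overlapping or noisy data the bootstrap replicates would be expected to disagree more often, giving a wider interval.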