Title
Finding Multiple Stable Clusterings
Publication Date
6-1-2017
Document Type
Article
Abstract
Multi-clustering, which tries to find multiple independent ways to partition a data set into groups, has many applications, such as customer relationship management, bioinformatics, and healthcare informatics. This paper addresses two fundamental questions in multi-clustering: how to model the quality of clusterings, and how to find multiple stable clusterings (MSC). We introduce to multi-clustering the notion of clustering stability based on the Laplacian eigengap, which was originally used by the regularized spectral learning method for similarity matrix learning. We mathematically prove that the larger the eigengap, the more stable the clustering. Furthermore, we propose a novel multi-clustering method, MSC. One advantage of our method compared with state-of-the-art multi-clustering methods is that it provides users with a feature subspace for understanding each clustering solution. Another advantage is that MSC does not require users to specify the number of clusters or the number of alternative clusterings, which is usually difficult for users to do without guidance. Our method can heuristically estimate the number of stable clusterings in a data set. We also discuss a practical way to make MSC applicable to large-scale data. We report an extensive empirical study that demonstrates the effectiveness of our method.
Publication Title
Knowledge and Information Systems
Volume
51
Issue
3
First Page
991
Last Page
1021
DOI
10.1007/s10115-016-0998-9
Publisher Policy
Pre-print, post-print (12-month embargo)
Recommended Citation
Hu, Juhua; Qian, Qi; Pei, Jian; Jin, Rong; and Zhu, Shenghuo, "Finding Multiple Stable Clusterings" (2017). School of Engineering and Technology Publications. 202.
https://digitalcommons.tacoma.uw.edu/tech_pub/202