Finding Multiple Stable Clusterings
Multi-clustering, which tries to find multiple independent ways to partition a data set into groups, has enjoyed many applications, such as customer relationship management, bioinformatics and healthcare informatics. This paper addresses two fundamental questions in multi-clustering: how to model the quality of clusterings and how to find multiple stable clusterings. We introduce to multi-clustering the notion of clustering stability based on Laplacian eigengap, which was originally used in the regularized spectral learning method for similarity matrix learning. We mathematically prove that the larger the eigengap, the more stable the clustering. Consequently, we propose a novel multi-clustering method MSC (for Multiple Stable Clustering). An advantage of our method comparing to the existing multi-clustering methods is that our method does not need any parameter about the number of alternative clusterings in the data set. Our method can heuristically estimate the number of meaningful clusterings in a data set, which is infeasible in the existing multi-clustering methods. We report an empirical study that clearly demonstrates the effectiveness of our method.
2015 IEEE International Conference on Data Mining
Hu, Juhua; Qian, Q.; Pei, J.; Jin, R.; and Zhu, S., "Finding Multiple Stable Clusterings" (2015). Institute of Technology Publications. 204.