TY - GEN
T1 - Bayesian cluster ensembles
AU - Wang, Hongjun
AU - Shan, Hanhuai
AU - Banerjee, Arindam
PY - 2009
Y1 - 2009
N2 - Cluster ensembles provide a framework for combining multiple base clusterings of a dataset to generate a stable and robust consensus clustering. There are important variants of the basic cluster ensemble problem, notably including cluster ensembles with missing values, as well as row-distributed or column-distributed cluster ensembles. Existing cluster ensemble algorithms are applicable only to a small subset of these variants. In this paper, we propose Bayesian Cluster Ensembles (BCE), which is a mixed-membership model for learning cluster ensembles, and is applicable to all the primary variants of the problem. We propose two methods, respectively based on variational approximation and Gibbs sampling, for learning a Bayesian cluster ensemble. We compare BCE extensively with several other cluster ensemble algorithms, and demonstrate that BCE is not only versatile in terms of its applicability, but also outperforms the other algorithms in terms of stability and accuracy.
AB - Cluster ensembles provide a framework for combining multiple base clusterings of a dataset to generate a stable and robust consensus clustering. There are important variants of the basic cluster ensemble problem, notably including cluster ensembles with missing values, as well as row-distributed or column-distributed cluster ensembles. Existing cluster ensemble algorithms are applicable only to a small subset of these variants. In this paper, we propose Bayesian Cluster Ensembles (BCE), which is a mixed-membership model for learning cluster ensembles, and is applicable to all the primary variants of the problem. We propose two methods, respectively based on variational approximation and Gibbs sampling, for learning a Bayesian cluster ensemble. We compare BCE extensively with several other cluster ensemble algorithms, and demonstrate that BCE is not only versatile in terms of its applicability, but also outperforms the other algorithms in terms of stability and accuracy.
UR - http://www.scopus.com/inward/record.url?scp=72849137910&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=72849137910&partnerID=8YFLogxK
M3 - Conference contribution
AN - SCOPUS:72849137910
SN - 9781615671090
T3 - Society for Industrial and Applied Mathematics - 9th SIAM International Conference on Data Mining 2009, Proceedings in Applied Mathematics
SP - 208
EP - 219
BT - Society for Industrial and Applied Mathematics - 9th SIAM International Conference on Data Mining 2009, Proceedings in Applied Mathematics 133
T2 - 9th SIAM International Conference on Data Mining 2009, SDM 2009
Y2 - 30 April 2009 through 2 May 2009
ER -