Clustering with Bregman divergences

Arindam Banerjee; Srujana Merugu; Inderjit Dhillon; Joydeep Ghosh

Computer Science and Engineering

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

74 Scopus citations

Abstract

A wide variety of distortion functions are used for clustering, e.g., squared Euclidean distance, Mahalanobis distance and relative entropy. In this paper, we propose and analyze parametric hard and soft clustering algorithms based on a large class of distortion functions known as Bregman divergences. The proposed algorithms unify centroid-based parametric clustering approaches, such as classical kmeans and information-theoretic clustering, which arise by special choices of the Bregman divergence. The algorithms maintain the simplicity and scalability of the classical kmeans algorithm, while generalizing the basic idea to a very large class of clustering loss functions. There are two main contributions in this paper. First, we pose the hard clustering problem in terms of minimizing the loss in Bregman information, a quantity motivated by rate-distortion theory, and present an algorithm to minimize this loss. Secondly, we show an explicit bijection between Bregman divergences and exponential families. The bijection enables the development of an alternative interpretation of an efficient EM scheme for learning models involving mixtures of exponential distributions. This leads to a simple soft clustering algorithm for all Bregman divergences.
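The hard clustering scheme described in the abstract can be illustrated with a small sketch. The code below is not the authors' implementation; it is a minimal k-means-style loop in plain Python, assuming the well-known property that the arithmetic mean of a cluster minimizes the total Bregman divergence from its points to a single representative. The names bregman_hard_clustering, squared_euclidean, generalized_kl and the init parameter are hypothetical and chosen only for illustration.

```python
import math
import random
from typing import Callable, List, Optional, Sequence, Tuple

Point = Sequence[float]
Divergence = Callable[[Point, Point], float]


def squared_euclidean(x: Point, y: Point) -> float:
    """Bregman divergence generated by phi(x) = ||x||^2."""
    return sum((a - b) ** 2 for a, b in zip(x, y))


def generalized_kl(x: Point, y: Point) -> float:
    """Generalized KL (I-divergence); assumes strictly positive entries.

    Equals relative entropy when both vectors sum to 1.
    """
    return sum(a * math.log(a / b) - a + b for a, b in zip(x, y))


def mean(points: Sequence[Point]) -> List[float]:
    """Coordinate-wise arithmetic mean of a non-empty list of points."""
    n = len(points)
    return [sum(col) / n for col in zip(*points)]


def bregman_hard_clustering(
    points: Sequence[Point],
    k: int,
    divergence: Divergence,
    init: Optional[Sequence[Point]] = None,  # hypothetical: explicit start centroids
    iters: int = 100,
    seed: int = 0,
) -> Tuple[List[int], List[List[float]]]:
    """Alternate nearest-centroid assignment and mean re-estimation."""
    if init is not None:
        centroids = [list(c) for c in init]
        k = len(centroids)
    else:
        rng = random.Random(seed)
        centroids = [list(p) for p in rng.sample(list(points), k)]
    labels = [-1] * len(points)
    for _ in range(iters):
        # Assignment step: the data point is the FIRST argument of the divergence.
        new_labels = [
            min(range(k), key=lambda j: divergence(x, centroids[j]))
            for x in points
        ]
        if new_labels == labels:
            break  # assignments are stable, so the loss cannot decrease further
        labels = new_labels
        # Re-estimation step: the mean is used for every choice of divergence.
        for j in range(k):
            members = [x for x, lab in zip(points, labels) if lab == j]
            if members:  # keep the old centroid if a cluster becomes empty
                centroids[j] = mean(members)
    return labels, centroids
```

Passing squared_euclidean gives the classical k-means iteration, while generalized_kl on probability vectors gives a simple form of information-theoretic clustering; only the assignment step changes between the two.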

Original language	English (US)
Title of host publication	Proceedings of the Fourth SIAM International Conference on Data Mining
Editors	M.W. Berry, U. Dayal, C. Kamath, D. Skillicorn
Pages	234-245
Number of pages	12
State	Published - Jan 1 2004
Event	Proceedings of the Fourth SIAM International Conference on Data Mining - Lake Buena Vista, FL, United States; Duration: Apr 22 2004 → Apr 24 2004

Cite this

@inproceedings{025b567c2f41461f8fd60ee454e5dcdd,
  title = "Clustering with Bregman divergences",
  abstract = "A wide variety of distortion functions are used for clustering, e.g., squared Euclidean distance, Mahalanobis distance and relative entropy. In this paper, we propose and analyze parametric hard and soft clustering algorithms based on a large class of distortion functions known as Bregman divergences. The proposed algorithms unify centroid-based parametric clustering approaches, such as classical kmeans and information-theoretic clustering, which arise by special choices of the Bregman divergence. The algorithms maintain the simplicity and scalability of the classical kmeans algorithm, while generalizing the basic idea to a very large class of clustering loss functions. There are two main contributions in this paper. First, we pose the hard clustering problem in terms of minimizing the loss in Bregman information, a quantity motivated by rate-distortion theory, and present an algorithm to minimize this loss. Secondly, we show an explicit bijection between Bregman divergences and exponential families. The bijection enables the development of an alternative interpretation of an efficient EM scheme for learning models involving mixtures of exponential distributions. This leads to a simple soft clustering algorithm for all Bregman divergences.",
  author = "Arindam Banerjee and Srujana Merugu and Inderjit Dhillon and Joydeep Ghosh",
  year = "2004",
  month = jan,
  day = "1",
  language = "English (US)",
  pages = "234--245",
  editor = "M.W. Berry and U. Dayal and C. Kamath and D. Skillicorn",
  booktitle = "Proceedings of the Fourth SIAM International Conference on Data Mining",
  note = "Proceedings of the Fourth SIAM International Conference on Data Mining ; Conference date: 22-04-2004 Through 24-04-2004",
}

TY - GEN
T1 - Clustering with Bregman divergences
AU - Banerjee, Arindam
AU - Merugu, Srujana
AU - Dhillon, Inderjit
AU - Ghosh, Joydeep
PY - 2004/1/1
Y1 - 2004/1/1
N2 - A wide variety of distortion functions are used for clustering, e.g., squared Euclidean distance, Mahalanobis distance and relative entropy. In this paper, we propose and analyze parametric hard and soft clustering algorithms based on a large class of distortion functions known as Bregman divergences. The proposed algorithms unify centroid-based parametric clustering approaches, such as classical kmeans and information-theoretic clustering, which arise by special choices of the Bregman divergence. The algorithms maintain the simplicity and scalability of the classical kmeans algorithm, while generalizing the basic idea to a very large class of clustering loss functions. There are two main contributions in this paper. First, we pose the hard clustering problem in terms of minimizing the loss in Bregman information, a quantity motivated by rate-distortion theory, and present an algorithm to minimize this loss. Secondly, we show an explicit bijection between Bregman divergences and exponential families. The bijection enables the development of an alternative interpretation of an efficient EM scheme for learning models involving mixtures of exponential distributions. This leads to a simple soft clustering algorithm for all Bregman divergences.
UR - http://www.scopus.com/inward/record.url?scp=2942624165&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=2942624165&partnerID=8YFLogxK
M3 - Conference contribution
AN - SCOPUS:2942624165
SP - 234
EP - 245
BT - Proceedings of the Fourth SIAM International Conference on Data Mining
A2 - Berry, M.W.
A2 - Dayal, U.
A2 - Kamath, C.
A2 - Skillicorn, D.
T2 - Proceedings of the Fourth SIAM International Conference on Data Mining
Y2 - 22 April 2004 through 24 April 2004
ER -
