Towards K-means-friendly spaces: Simultaneous deep learning and clustering

Bo Yang; Xiao Fu; Nicholas D. Sidiropoulos; Mingyi Hong

Electrical and Computer Engineering

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

143 Scopus citations

Abstract

Most learning approaches treat dimensionality reduction (DR) and clustering separately (i.e., sequentially), but recent research has shown that optimizing the two tasks jointly can substantially improve the performance of both. The premise behind the latter genre is that the data samples are obtained via linear transformation of latent representations that are easy to cluster; but in practice, the transformation from the latent space to the data can be more complicated. In this work, we assume that this transformation is an unknown and possibly nonlinear function. To recover the 'clustering-friendly' latent representations and to better cluster the data, we propose a joint DR and K-means clustering approach in which DR is accomplished via learning a deep neural network (DNN). The motivation is to keep the advantages of jointly optimizing the two tasks, while exploiting the deep neural network's ability to approximate any nonlinear function. This way, the proposed approach can work well for a broad class of generative models. Towards this end, we carefully design the DNN structure and the associated joint optimization criterion, and propose an effective and scalable algorithm to handle the formulated optimization problem. Experiments using different real datasets are employed to showcase the effectiveness of the proposed approach.
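The abstract does not state the joint optimization criterion, so the following is only a minimal sketch of the general idea under stated assumptions: a reconstruction term for the network plus a K-means term on the latent representations, with cluster assignments and centroids updated in alternation. The one-layer tanh encoder, the linear decoder, the weight `lam`, and all function names are hypothetical choices made for illustration, not the authors' architecture or algorithm. Gradient updates of the network weights are omitted.

```python
import numpy as np

def encode(X, W, b):
    """Hypothetical one-layer nonlinear encoder: z = tanh(xW + b)."""
    return np.tanh(X @ W + b)

def decode(Z, V, c):
    """Hypothetical linear decoder: x_hat = zV + c."""
    return Z @ V + c

def assign_clusters(Z, centroids):
    """K-means assignment step: nearest centroid for each latent point."""
    d2 = ((Z[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    return d2.argmin(axis=1)

def update_centroids(Z, labels, centroids):
    """K-means centroid step: mean of the latent points assigned to each cluster.
    An empty cluster keeps its previous centroid."""
    new = centroids.copy()
    for k in range(centroids.shape[0]):
        members = Z[labels == k]
        if len(members) > 0:
            new[k] = members.mean(axis=0)
    return new

def joint_loss(X, X_hat, Z, centroids, labels, lam=0.5):
    """Assumed joint criterion: reconstruction error + lam * K-means cost in latent space."""
    recon = ((X - X_hat) ** 2).sum(axis=1).mean()
    kmeans = ((Z - centroids[labels]) ** 2).sum(axis=1).mean()
    return recon + lam * kmeans

# Toy data: two Gaussian blobs in 4 dimensions (synthetic, for illustration only).
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(-2.0, 0.5, (50, 4)), rng.normal(2.0, 0.5, (50, 4))])

# Randomly initialised (untrained) encoder and decoder weights.
W, b = rng.normal(0.0, 0.3, (4, 2)), np.zeros(2)
V, c = rng.normal(0.0, 0.3, (2, 4)), np.zeros(4)

Z = encode(X, W, b)
X_hat = decode(Z, V, c)

# Initialise centroids from two latent points, then run one alternating update.
centroids = Z[rng.choice(len(Z), size=2, replace=False)]
labels = assign_clusters(Z, centroids)
loss_before = joint_loss(X, X_hat, Z, centroids, labels)

centroids = update_centroids(Z, labels, centroids)
labels = assign_clusters(Z, centroids)
loss_after = joint_loss(X, X_hat, Z, centroids, labels)
```

With the network held fixed, the centroid and assignment steps cannot increase the K-means term, so `loss_after` is never larger than `loss_before`. In a full joint scheme, these steps would be interleaved with gradient updates of the encoder and decoder weights on the same criterion.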

Original language	English (US)
Title of host publication	34th International Conference on Machine Learning, ICML 2017
Publisher	International Machine Learning Society (IMLS)
Pages	5888-5901
Number of pages	14
ISBN (Electronic)	9781510855144
State	Published - 2017
Event	34th International Conference on Machine Learning, ICML 2017 - Sydney, Australia Duration: Aug 6 2017 → Aug 11 2017

Publication series

Name	34th International Conference on Machine Learning, ICML 2017
Volume	8

Other

Other	34th International Conference on Machine Learning, ICML 2017
Country/Territory	Australia
City	Sydney
Period	8/6/17 → 8/11/17

Bibliographical note

Funding Information:
This work is supported by National Science Foundation under Projects NSF IIS-1447788, NSF ECCS-1608961, and NSF CCF-1526078. The GPU used in this work was kindly donated by NVIDIA.

Publisher Copyright:
© Copyright 2017 by the author(s).

Cite this

Towards K-means-friendly spaces: Simultaneous deep learning and clustering. / Yang, Bo; Fu, Xiao; Sidiropoulos, Nicholas D. et al.
34th International Conference on Machine Learning, ICML 2017. International Machine Learning Society (IMLS), 2017. p. 5888-5901 (34th International Conference on Machine Learning, ICML 2017; Vol. 8).

Yang, B, Fu, X, Sidiropoulos, ND & Hong, M 2017, Towards K-means-friendly spaces: Simultaneous deep learning and clustering. in 34th International Conference on Machine Learning, ICML 2017. 34th International Conference on Machine Learning, ICML 2017, vol. 8, International Machine Learning Society (IMLS), pp. 5888-5901, 34th International Conference on Machine Learning, ICML 2017, Sydney, Australia, 8/6/17.

@inproceedings{7d6b9533eaaa4b74bcdc06d25e5b2178,

title = "Towards K-means-friendly spaces: Simultaneous deep learning and clustering",

abstract = "Most learning approaches treat dimensionality reduction (DR) and clustering separately (i.e., sequentially), but recent research has shown that optimizing the two tasks jointly can substantially improve the performance of both. The premise behind the latter genre is that the data samples are obtained via linear transformation of latent representations that are easy to cluster; but in practice, the transformation from the latent space to the data can be more complicated. In this work, we assume that this transformation is an unknown and possibly nonlinear function. To recover the 'clustering-friendly' latent representations and to better cluster the data, we propose a joint DR and K-means clustering approach in which DR is accomplished via learning a deep neural network (DNN). The motivation is to keep the advantages of jointly optimizing the two tasks, while exploiting the deep neural network's ability to approximate any nonlinear function. This way, the proposed approach can work well for a broad class of generative models. Towards this end, we carefully design the DNN structure and the associated joint optimization criterion, and propose an effective and scalable algorithm to handle the formulated optimization problem. Experiments using different real datasets are employed to showcase the effectiveness of the proposed approach.",

author = "Bo Yang and Xiao Fu and Sidiropoulos, {Nicholas D.} and Mingyi Hong",

note = "Funding Information: This work is supported by National Science Foundation under Projects NSF IIS-1447788, NSF ECCS-1608961, and NSF CCF-1526078. The GPU used in this work was kindly donated by NVIDIA. Publisher Copyright: {\textcopyright} Copyright 2017 by the author(s).; 34th International Conference on Machine Learning, ICML 2017 ; Conference date: 06-08-2017 Through 11-08-2017",

year = "2017",

language = "English (US)",

series = "34th International Conference on Machine Learning, ICML 2017",

publisher = "International Machine Learning Society (IMLS)",

pages = "5888--5901",

booktitle = "34th International Conference on Machine Learning, ICML 2017",

}

TY - GEN

T1 - Towards K-means-friendly spaces

T2 - 34th International Conference on Machine Learning, ICML 2017

AU - Yang, Bo

AU - Fu, Xiao

AU - Sidiropoulos, Nicholas D.

AU - Hong, Mingyi

N1 - Funding Information: This work is supported by National Science Foundation under Projects NSF IIS-1447788, NSF ECCS-1608961, and NSF CCF-1526078. The GPU used in this work was kindly donated by NVIDIA. Publisher Copyright: © Copyright 2017 by the author(s).

PY - 2017

Y1 - 2017

N2 - Most learning approaches treat dimensionality reduction (DR) and clustering separately (i.e., sequentially), but recent research has shown that optimizing the two tasks jointly can substantially improve the performance of both. The premise behind the latter genre is that the data samples are obtained via linear transformation of latent representations that are easy to cluster; but in practice, the transformation from the latent space to the data can be more complicated. In this work, we assume that this transformation is an unknown and possibly nonlinear function. To recover the 'clustering-friendly' latent representations and to better cluster the data, we propose a joint DR and K-means clustering approach in which DR is accomplished via learning a deep neural network (DNN). The motivation is to keep the advantages of jointly optimizing the two tasks, while exploiting the deep neural network's ability to approximate any nonlinear function. This way, the proposed approach can work well for a broad class of generative models. Towards this end, we carefully design the DNN structure and the associated joint optimization criterion, and propose an effective and scalable algorithm to handle the formulated optimization problem. Experiments using different real datasets are employed to showcase the effectiveness of the proposed approach.

UR - http://www.scopus.com/inward/record.url?scp=85048148613&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85048148613&partnerID=8YFLogxK

M3 - Conference contribution

AN - SCOPUS:85048148613

T3 - 34th International Conference on Machine Learning, ICML 2017

SP - 5888

EP - 5901

BT - 34th International Conference on Machine Learning, ICML 2017

PB - International Machine Learning Society (IMLS)

Y2 - 6 August 2017 through 11 August 2017

ER -
