Abstract
As the volume and dimensionality of data increase, new efficient processing tools have become a necessity. The present paper introduces a novel dimensionality reduction approach for fast and efficient clustering of high-dimensional data. The new methods extend random sampling and consensus (RANSAC) arguments, originally developed for robust regression tasks in computer vision, to the dimensionality reduction problem. The advocated random sketching and validation K-means (SkeVa K-means) and Divergence SkeVa algorithms can achieve high performance, with the latter incurring a lower computational footprint than the former. Extensive numerical tests on synthetic and real datasets highlight the potential of the proposed algorithms and demonstrate their competitive performance relative to state-of-the-art random projection alternatives.
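The sketch-and-validate idea can be illustrated with a minimal Python example. This is only a rough sketch under stated assumptions, not the SkeVa K-means or Divergence SkeVa algorithm itself, whose details are not given in this abstract. It assumes that a "sketch" is a random subset of feature indices, that validation scores a clustering by its within-cluster cost on a separate, fixed random subset of features, and that centers are seeded with a farthest-point heuristic. All function and parameter names (`kmeans`, `within_cluster_cost`, `sketch_and_validate`, `sketch_dim`, `val_dim`, `trials`) are hypothetical.

```python
import random

def sq_dist(a, b, dims):
    """Squared Euclidean distance restricted to the feature indices in dims."""
    return sum((a[d] - b[d]) ** 2 for d in dims)

def kmeans(points, k, dims, rng, iters=10):
    """Lloyd iterations that only look at the features in dims.
    Seeding uses a farthest-point heuristic (an assumption of this sketch)."""
    centers = [list(rng.choice(points))]
    while len(centers) < k:
        farthest = max(points, key=lambda p: min(sq_dist(p, c, dims) for c in centers))
        centers.append(list(farthest))
    for _ in range(iters):
        # Assignment step: nearest center, measured on the sketched features.
        labels = [min(range(k), key=lambda j: sq_dist(p, centers[j], dims)) for p in points]
        # Update step: recompute each non-empty cluster's mean over all features.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                centers[j] = [sum(col) / len(members) for col in zip(*members)]
    return labels  # labels from the final assignment step

def within_cluster_cost(points, labels, k, dims):
    """Within-cluster sum of squares, evaluated on the features in dims."""
    cost = 0.0
    for j in range(k):
        members = [p for p, lab in zip(points, labels) if lab == j]
        for d in dims:
            if members:
                mean = sum(m[d] for m in members) / len(members)
                cost += sum((m[d] - mean) ** 2 for m in members)
    return cost

def sketch_and_validate(points, k, sketch_dim, val_dim, trials, seed=0):
    """RANSAC-style loop: cluster on a random sketch of the features,
    score on a fixed validation subset, keep the best-scoring clustering."""
    rng = random.Random(seed)
    n_features = len(points[0])
    val = rng.sample(range(n_features), val_dim)  # fixed validation features
    best_labels, best_cost = None, float("inf")
    for _ in range(trials):
        sketch = rng.sample(range(n_features), sketch_dim)
        labels = kmeans(points, k, sketch, rng)
        cost = within_cluster_cost(points, labels, k, val)
        if cost < best_cost:
            best_labels, best_cost = labels, cost
    return best_labels, best_cost

# Usage on synthetic data: two well-separated groups of 10 points, 12 features.
data_rng = random.Random(1)
points = [[base + data_rng.uniform(-0.5, 0.5) for _ in range(12)]
          for base in [0.0] * 10 + [5.0] * 10]
labels, cost = sketch_and_validate(points, k=2, sketch_dim=3, val_dim=4, trials=5)
```

Each trial clusters on only `sketch_dim` of the features, so the per-trial cost of the assignment step scales with the sketch size rather than with the full dimensionality. The validation step then plays the role that consensus plays in RANSAC.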
Original language | English (US) |
---|---|
Title of host publication | Conference Record of the 48th Asilomar Conference on Signals, Systems and Computers |
Editors | Michael B. Matthews |
Publisher | IEEE Computer Society |
Pages | 1046-1050 |
Number of pages | 5 |
ISBN (Electronic) | 9781479982974 |
State | Published - Apr 24 2015 |
Event | 48th Asilomar Conference on Signals, Systems and Computers, ACSSC 2015 - Pacific Grove, United States; Duration: Nov 2 2014 → Nov 5 2014 |
Publication series
Name | Conference Record - Asilomar Conference on Signals, Systems and Computers |
---|---|
Volume | 2015-April |
ISSN (Print) | 1058-6393 |
Other
Other | 48th Asilomar Conference on Signals, Systems and Computers, ACSSC 2015 |
---|---|
Country/Territory | United States |
City | Pacific Grove |
Period | 11/2/14 → 11/5/14 |
Bibliographical note
Publisher Copyright: © 2014 IEEE.
Keywords
- Clustering
- K-means
- big data
- feature selection
- high-dimensional data
- random sampling and consensus
- random sketching and validation