TY - JOUR
T1 - Spectral clustering based on local linear approximations
AU - Arias-Castro, Ery
AU - Chen, Guangliang
AU - Lerman, Gilad
N1 - Copyright 2012 Elsevier B.V., All rights reserved.
PY - 2011
Y1 - 2011
AB - In the context of clustering, we assume a generative model where each cluster is the result of sampling points in the neighborhood of an embedded smooth surface; the sample may be contaminated with outliers, which are modeled as points sampled in space away from the clusters. We consider a prototype for a higher-order spectral clustering method based on the residual from a local linear approximation. We obtain theoretical guarantees for this algorithm and show that, in terms of both separation and robustness to outliers, it outperforms the standard spectral clustering algorithm (based on pairwise distances) of Ng, Jordan and Weiss (NIPS'01). The optimal choice for some of the tuning parameters depends on the dimension and thickness of the clusters. We provide estimators that come close enough for our theoretical purposes. We also discuss the cases of clusters of mixed dimensions and of clusters that are generated from smoother surfaces. In our experiments, this algorithm is shown to outperform pairwise spectral clustering on both simulated and real data.
KW - Detection of clusters in point clouds
KW - Dimension estimation
KW - Higher-order affinities
KW - Local linear approximation
KW - Local polynomial approximation
KW - Nearest-neighbor search
KW - Spectral clustering
UR - http://www.scopus.com/inward/record.url?scp=84856032134&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84856032134&partnerID=8YFLogxK
U2 - 10.1214/11-EJS651
DO - 10.1214/11-EJS651
M3 - Article
AN - SCOPUS:84856032134
VL - 5
SP - 1537
EP - 1587
JO - Electronic Journal of Statistics
JF - Electronic Journal of Statistics
SN - 1935-7524
ER -