Spectral clustering based on local PCA

Ery Arias-Castro, Gilad Lerman, Teng Zhang

Research output: Contribution to journal › Article › peer-review

30 Scopus citations


We propose a spectral clustering method based on local principal components analysis (PCA). After performing local PCA in selected neighborhoods, the algorithm builds a nearest neighbor graph weighted according to a discrepancy between the principal subspaces in the neighborhoods, and then applies spectral clustering. As opposed to standard spectral methods based solely on pairwise distances between points, our algorithm is able to resolve intersections. We establish theoretical guarantees for simpler variants within a prototypical mathematical framework for multi-manifold clustering, and evaluate our algorithm on various simulated data sets.
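The pipeline described in the abstract (local PCA, a graph weighted by the discrepancy between principal subspaces, then spectral clustering) can be illustrated with a minimal NumPy sketch. This is not the authors' exact algorithm. The Gaussian form of the weights, the parameters `n_neighbors`, `dim`, `eps`, and `eta`, and all function names are assumptions made for illustration. For brevity the graph is dense rather than a nearest neighbor graph, and every point serves as a neighborhood center.

```python
import numpy as np


def local_projectors(X, n_neighbors, dim):
    """Projector onto the top-`dim` principal subspace of each point's neighborhood."""
    sq_dist = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=-1)
    neighbors = np.argsort(sq_dist, axis=1)[:, :n_neighbors]  # includes the point itself
    projectors = np.empty((len(X), X.shape[1], X.shape[1]))
    for i, nb in enumerate(neighbors):
        centered = X[nb] - X[nb].mean(axis=0)
        # Rows of vt are the principal directions of the neighborhood.
        _, _, vt = np.linalg.svd(centered, full_matrices=False)
        basis = vt[:dim].T
        projectors[i] = basis @ basis.T
    return projectors, sq_dist


def affinity_matrix(X, n_neighbors=5, dim=1, eps=0.3, eta=0.5):
    """Assumed weights: decay with Euclidean distance and with subspace discrepancy."""
    projectors, sq_dist = local_projectors(X, n_neighbors, dim)
    # Dense pairwise differences; memory grows as n^2 * D^2, so small inputs only.
    diff = projectors[:, None] - projectors[None, :]
    discrepancy = np.sqrt((diff ** 2).sum(axis=(-2, -1)))  # Frobenius norm
    W = np.exp(-sq_dist / eps ** 2) * np.exp(-(discrepancy / eta) ** 2)
    np.fill_diagonal(W, 0.0)
    return W, discrepancy


def spectral_labels(W, n_clusters, n_iter=50):
    """Normalized spectral embedding followed by a small k-means."""
    deg = W.sum(axis=1)
    d_inv_sqrt = 1.0 / np.sqrt(np.maximum(deg, 1e-12))
    S = d_inv_sqrt[:, None] * W * d_inv_sqrt[None, :]
    _, vecs = np.linalg.eigh(S)  # eigenvalues in ascending order
    U = vecs[:, -n_clusters:]
    U = U / np.maximum(np.linalg.norm(U, axis=1, keepdims=True), 1e-12)
    # Deterministic farthest-point initialization of the centers.
    centers = [U[0]]
    for _ in range(1, n_clusters):
        d = np.min([((U - c) ** 2).sum(axis=1) for c in centers], axis=0)
        centers.append(U[np.argmax(d)])
    centers = np.array(centers)
    for _ in range(n_iter):
        labels = ((U[:, None] - centers[None]) ** 2).sum(axis=-1).argmin(axis=1)
        for k in range(n_clusters):
            if np.any(labels == k):
                centers[k] = U[labels == k].mean(axis=0)
    return labels


# Two line segments crossing at the origin (a toy case of intersecting clusters).
t = np.linspace(-1, 1, 20)
X = np.vstack([np.column_stack([t, np.zeros_like(t)]),
               np.column_stack([np.zeros_like(t), t])])
W, discrepancy = affinity_matrix(X)
labels = spectral_labels(W, n_clusters=2)
```

Away from the crossing, neighborhoods lie on a single segment, so the discrepancy is near zero within a segment and near its maximum across segments. This is what lets the weights separate points that are close in Euclidean distance. Near the crossing the neighborhoods mix both segments, and the result there depends on the assumed parameter choices.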

Original language: English (US)
Pages (from-to): 1-57
Number of pages: 57
Journal: Journal of Machine Learning Research
State: Published - Mar 1 2017

Bibliographical note

Funding Information:
This work was partially supported by grants from the National Science Foundation (DMS 0915160, 0915064, 0956072, 1418386, 1513465). We would like to thank Jan Rataj for helpful discussion around Lemma 3 and Xu Wang for his sharp proofreading. We also gratefully acknowledge the comments, suggestions, and scrutiny of an anonymous referee. We would also like to acknowledge support from the Institute for Mathematics and its Applications (IMA). For one thing, the authors first learned about the research of Goldberg et al. (2009) there, at the Multi-Manifold Data Modeling and Applications workshop in the Fall of 2008, and this was the main inspiration for our paper. Also, part of our work was performed while TZ was a postdoctoral fellow at the IMA, and also while EAC and GL were visiting the IMA.

Publisher Copyright:
© 2017 Ery Arias-Castro, Gilad Lerman, and Teng Zhang.


Keywords

  • Intersecting clusters
  • Local principal component analysis
  • Multi-manifold clustering
  • Spectral clustering

