Abstract
In recent years, a considerable amount of work has been devoted to generalizing linear discriminant analysis to overcome its limitations in high-dimensional classification (Witten and Tibshirani, 2011; Cai and Liu, 2011; Mai et al., 2012; Fan et al., 2012). In this paper, we develop high-dimensional sparse semiparametric discriminant analysis (SSDA) that generalizes normal-theory discriminant analysis in two ways: it relaxes the Gaussian assumptions and it can handle ultra-high-dimensional classification problems. If the underlying Bayes rule is sparse, SSDA can estimate the Bayes rule and select the true features simultaneously with overwhelming probability, as long as the logarithm of the dimension grows more slowly than the cube root of the sample size. Simulated and real examples are used to demonstrate the finite-sample performance of SSDA. At the core of the theory is a new exponential concentration bound for semiparametric Gaussian copulas, which is of independent interest.
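The semiparametric Gaussian-copula model behind SSDA assumes each feature becomes Gaussian after some unknown monotone transformation. A minimal sketch of this idea replaces each feature by Winsorized normal scores, after which ordinary linear discriminant analysis can be applied to the transformed data. The function name and the truncation constant below are illustrative choices in the spirit of nonparanormal estimation, not the paper's exact sparse estimator:

```python
import numpy as np
from scipy.stats import norm, rankdata

def normal_score_transform(X):
    """Map each column of X to Winsorized normal scores, a generic
    estimator of the monotone marginal transforms in a Gaussian-copula
    model (illustrative; SSDA's actual procedure also imposes sparsity)."""
    n, p = X.shape
    # Truncation keeps Phi^{-1} finite at the extreme ranks; this
    # particular constant is a common nonparanormal-style choice.
    delta = 1.0 / (4.0 * n ** 0.25 * np.sqrt(np.pi * np.log(n)))
    Z = np.empty((n, p), dtype=float)
    for j in range(p):
        u = rankdata(X[:, j]) / (n + 1.0)   # empirical CDF values
        u = np.clip(u, delta, 1.0 - delta)  # Winsorize the tails
        Z[:, j] = norm.ppf(u)               # normal scores
    return Z
```

The transformed matrix `Z` has approximately standard-normal marginals regardless of the original (continuous) marginal distributions, so any linear discriminant routine can then be run on `Z` in place of `X`.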
Original language | English (US) |
---|---|
Pages (from-to) | 175-188 |
Number of pages | 14 |
Journal | Journal of Multivariate Analysis |
Volume | 135 |
DOIs | |
State | Published - Mar 1 2015 |
Bibliographical note
Publisher Copyright: © 2014 Elsevier Inc.
Keywords
- 62H30
- Gaussian copulas
- High-dimension asymptotics
- Linear discriminant analysis
- Semiparametric model