In recent years, several sparse linear discriminant analysis methods have been proposed for high-dimensional classification and variable selection. Most of these proposals focus on binary classification and are not directly applicable to multiclass classification problems. Some sparse discriminant analysis methods can handle multiclass classification problems, but their theoretical justifications remain unknown. In this paper, we propose a new multiclass sparse discriminant analysis method that estimates all discriminant directions simultaneously. We show that, when applied to the binary case, our proposal yields a classification direction that is equivalent to those attained by two successful binary sparse linear discriminant analysis methods, providing a unification of these seemingly unrelated proposals. Our method can be computed by an efficient algorithm that is implemented in the open R package msda, available from CRAN. We offer theoretical justification of our method by establishing a variable selection consistency result and finding rates of convergence under the ultrahigh dimensionality setting. We further demonstrate the empirical performance of our method with simulations and real data.
Bibliographical note
Funding Information:
The authors thank the Editor, an associate editor, and referees for their helpful comments and suggestions. Zou’s research is partially supported by NSF grant DMS-1505111. Mai’s research is partly supported by CIF-1617691, National Science Foundation.
© 2019 Institute of Statistical Science. All rights reserved.
- Discriminant analysis
- High dimensional data
- Multiclass classification
- Rates of convergence
- Variable selection