Sufficient dimension reduction (SDR) is a useful concept for exploratory analysis and data visualization in regression, especially when the number of covariates is large. Many SDR methods have been proposed for regression with a continuous response, where the central subspace (CS) is the target of estimation. Various conditions, such as the linearity condition and the constant covariance condition, are imposed so that these methods can estimate at least a portion of the CS. In this paper we study SDR for regression and discriminant analysis with a categorical response. Motivated by the exploratory analysis and data visualization aspects of SDR, we propose a new geometric framework that reformulates the SDR problem in terms of manifold optimization, and we introduce a new concept called the Maximum Separation Subspace (MASES). The MASES naturally preserves the “sufficiency” in SDR without imposing additional conditions on the predictor distribution, and it directly inspires a semi-parametric estimator. Numerical studies show that MASES outperforms competing SDR methods in specific settings.
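For orientation, the following is a minimal sketch of sliced inverse regression (SIR) with a categorical response, one of the classical SDR methods that relies on the linearity condition mentioned above. It is not the MASES estimator from the paper. The function name `sir_directions`, its arguments, and the use of NumPy are assumptions made for illustration only; each class of the response is treated as one slice.

```python
import numpy as np


def sir_directions(X, y, d):
    """Illustrative SIR for a categorical response (hypothetical helper, not MASES).

    X: (n, p) predictor matrix; y: length-n array of class labels;
    d: number of directions to return. Returns a (p, d) matrix whose
    columns are unit-norm estimated directions on the original X scale.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    n, p = X.shape

    # Standardize the predictors: Z = (X - mean) @ Sigma^{-1/2}.
    mu = X.mean(axis=0)
    sigma = np.cov(X, rowvar=False)
    evals, evecs = np.linalg.eigh(sigma)
    sigma_inv_sqrt = (evecs * evals ** -0.5) @ evecs.T
    Z = (X - mu) @ sigma_inv_sqrt

    # Weighted covariance of the class means of Z (one slice per class).
    M = np.zeros((p, p))
    for label in np.unique(y):
        mask = y == label
        class_mean = Z[mask].mean(axis=0)
        M += mask.mean() * np.outer(class_mean, class_mean)

    # Leading eigenvectors of M, mapped back to the original X scale.
    _, vecs = np.linalg.eigh(M)  # eigenvalues come back in ascending order
    top = vecs[:, ::-1][:, :d]
    B = sigma_inv_sqrt @ top
    return B / np.linalg.norm(B, axis=0)
```

With K classes this kernel matrix has rank at most K - 1, so SIR can recover at most K - 1 directions, and it only sees differences in class means. Limitations of this kind are part of what motivates alternative formulations for a categorical response.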
|Original language|English (US)|
|Journal|Journal of Machine Learning Research|
|State|Published - Jan 1 2020|
Bibliographical note
Funding Information:
We thank the Action Editor and three reviewers for constructive comments that have led to significant improvements of this paper. We acknowledge support for this project from the National Science Foundation: NSF grants DMS-1613154 (Zhang), CCF-1617691 and CCF-1908969 (Mai and Zhang), and DMS-1505111 (Zou). We thank Dr. Andreas Artemiou of Cardiff University for sending us the R code for the linear PSVM method, and Dr. Liping Zhu of Renmin University of China for his constructive comments and suggestions on the research.
© 2020 Xin Zhang, Qing Mai, Hui Zou. License: CC-BY 4.0, see https://creativecommons.org/licenses/by/4.0/. Attribution requirements are provided at http://jmlr.org/papers/v21/17-788.html.
- Categorical data analysis
- Hellinger distance
- Single index models
- Sliced inverse regression
- Sufficient dimension reduction