Abstract
We give a commentary on the challenges of big data for Statistics. We then narrow our discussion to one of those challenges: dimension reduction. This leads to consideration of one particular dimension reduction method, partial least-squares (PLS) regression, for prediction in big high-dimensional regressions where the sample size and the number of predictors are both large. We show that in some regression contexts single-component PLS predictions converge at the usual root-n rate as n, p → ∞ regardless of the relationship between the sample size n and number of predictors p. Asymptotically, PLS predictions then behave as regression predictions in the usual context where p is fixed and n → ∞. These results support the conjecture that PLS regression can be an effective method for prediction in big high-dimensional regressions.
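For readers unfamiliar with the method, the following is a minimal sketch of single-component PLS prediction in plain Python. It is an illustration only and is not taken from the paper: the function names `fit_pls1` and `predict_pls1` are hypothetical, and the sketch assumes a univariate response, centred but unscaled predictors, and the standard one-component construction in which the weight vector is proportional to the sample covariance between the predictors and the response.

```python
def fit_pls1(X, y):
    """Fit a one-component PLS regression (illustrative sketch).

    X is a list of n rows, each a list of p predictor values;
    y is a list of n responses. Returns (x_means, y_mean, beta).
    """
    n, p = len(X), len(X[0])

    # Centre predictors and response.
    x_means = [sum(row[j] for row in X) / n for j in range(p)]
    y_mean = sum(y) / n
    Xc = [[row[j] - x_means[j] for j in range(p)] for row in X]
    yc = [v - y_mean for v in y]

    # Weight vector: proportional to the sample covariance of each
    # predictor with the response (the scale cancels out below).
    w = [sum(Xc[i][j] * yc[i] for i in range(n)) for j in range(p)]

    # Scores: projection of each centred observation onto w.
    t = [sum(Xc[i][j] * w[j] for j in range(p)) for i in range(n)]

    tt = sum(v * v for v in t)
    if tt == 0:
        raise ValueError("predictors have zero sample covariance with y")

    # Least-squares regression of the centred response on the single score.
    q = sum(t[i] * yc[i] for i in range(n)) / tt

    # Implied coefficient vector on the original predictor scale.
    beta = [q * wj for wj in w]
    return x_means, y_mean, beta


def predict_pls1(model, x_new):
    """Predict the response at a new predictor vector x_new."""
    x_means, y_mean, beta = model
    return y_mean + sum(b * (x - m) for b, x, m in zip(beta, x_new, x_means))
```

With a single predictor this reduces to ordinary least squares, which gives a simple sanity check. Only one p-dimensional weight vector is estimated, so no p × p matrix is inverted and the fit can be computed even when p exceeds n.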
Original language | English (US) |
---|---|
Pages (from-to) | 62-78 |
Number of pages | 17 |
Journal | Canadian Journal of Statistics |
Volume | 46 |
Issue number | 1 |
State | Published - Mar 2018 |
Bibliographical note
Publisher Copyright: © 2017 Statistical Society of Canada
Keywords
- Abundant regressions
- MSC 2010: Primary 62J05; secondary 62F12
- data science
- dimension reduction
- sparse regressions