Principal components regression (PCR) is a well-known method for dimension reduction that often improves prediction over ordinary least squares. Conventional PCR retains the principal components with large variance and discards those with small variance. This can easily lead to poor prediction when the response variable is related to principal components with small variance. In this work, we propose a simple remedy, response-guided principal components regression (RgPCR), which selects principal components for regression based on both their variance and their goodness of fit to the response. RgPCR is easy to implement, requires no optimization, and works naturally for both low-dimensional and high-dimensional data. We derive a Cp-type statistic for selecting the tuning parameter in RgPCR. In our numerical experiments, RgPCR shows promising performance.
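The failure mode of conventional PCR described above can be illustrated with a minimal NumPy sketch. It is not the RgPCR procedure itself: the abstract does not state how variance and goodness of fit are combined, nor the form of the Cp-type statistic, so neither is reproduced here. The helper names `pc_summary` and `pcr_train_mse` and the simulated data are assumptions made purely for illustration. The sketch only contrasts ranking components by variance with ranking them by how much of the response each one explains.

```python
import numpy as np

def pc_summary(X, y):
    """Per-component variance and response sum of squares explained."""
    Xc, yc = X - X.mean(axis=0), y - y.mean()
    # Columns of U are unit-norm component scores, ordered by decreasing variance.
    U, s, _ = np.linalg.svd(Xc, full_matrices=False)
    variance = s ** 2 / (len(y) - 1)
    fit = (U.T @ yc) ** 2  # drop in residual sum of squares from each component
    return variance, fit, U

def pcr_train_mse(X, y, keep):
    """Training MSE of least squares on the components indexed by `keep`.

    Assumes the kept components have nonzero variance.
    """
    _, _, U = pc_summary(X, y)
    yc = y - y.mean()
    fitted = U[:, keep] @ (U[:, keep].T @ yc)  # projection onto kept components
    return float(np.mean((yc - fitted) ** 2))

# Simulated data (hypothetical): the response depends on the lowest-variance direction.
rng = np.random.default_rng(0)
n, p = 200, 5
scales = np.array([5.0, 4.0, 3.0, 2.0, 0.5])
X = rng.normal(size=(n, p)) * scales
y = 3.0 * X[:, 4] + rng.normal(scale=0.1, size=n)

variance, fit, _ = pc_summary(X, y)
top_by_variance = np.argsort(variance)[::-1][:1]  # what conventional PCR keeps
top_by_fit = np.argsort(fit)[::-1][:1]            # what a fit-based ranking keeps

mse_variance = pcr_train_mse(X, y, top_by_variance)
mse_fit = pcr_train_mse(X, y, top_by_fit)
print("kept by variance:", top_by_variance, "training MSE:", mse_variance)
print("kept by fit:     ", top_by_fit, "training MSE:", mse_fit)
```

In this simulated setting the variance-based ranking keeps the first component and misses the one that carries the signal, while the fit-based ranking recovers it. Ranking by fit alone uses the response for selection and can overfit, which is presumably why the paper combines it with variance and tunes the trade-off with a Cp-type criterion.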
Bibliographical note
Funding Information:
This work is supported in part by NSF DMS 1915‐842.
- penalized regression
- principal components