In this paper, we consider fitting a flexible and interpretable additive regression model in a data-rich setting. We wish to avoid pre-specifying the functional form of the conditional association between each covariate and the response, while still retaining interpretability of the fitted functions. A number of recent proposals for nonparametric additive modeling are data adaptive, in the sense that they adjust the flexibility of the functional fits to the data at hand. For instance, the sparse additive model adaptively determines which features should be included in the fitted model; the sparse partially linear additive model allows each included feature to take either a linear or a nonlinear functional form; and the fused lasso additive model and additive trend filtering allow the knots in each nonlinear function fit to be selected from the data. We combine the strengths of these proposals into a single approach that uses the data to determine which features to include in the model, whether to model each feature linearly or nonlinearly, and what form to use for the nonlinear functions. We establish connections between our approach and recent proposals from the literature, and we demonstrate its strengths in a simulation study.
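As a minimal sketch of the model class described above, the three data-driven choices can be written in generic additive-model notation. The symbols here (response $y_i$, covariates $x_{ij}$, intercept $\beta_0$, slope $\theta_j$, nonlinear component $g_j$) are assumed for illustration and are not taken from the paper itself; the block assumes the amsmath package for `cases` and `\text`.

```latex
\[
  y_i = \beta_0 + \sum_{j=1}^{p} f_j(x_{ij}) + \varepsilon_i,
  \qquad
  f_j(x) =
  \begin{cases}
    0          & \text{feature } j \text{ excluded from the model},\\
    \theta_j x & \text{feature } j \text{ modeled linearly},\\
    g_j(x)     & \text{feature } j \text{ modeled nonlinearly.}
  \end{cases}
\]
```

The abstract does not state the estimator or penalty that selects among these three cases, so none is shown here.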
Bibliographical note
Funding Information:
Daniela Witten was supported by the National Institutes of Health through grant DP5OD009145 and by the National Science Foundation through CAREER Award DMS-1252624. An Associate Editor and two reviewers provided helpful comments that led to improvements in this paper.
© 2018 John Wiley & Sons, Ltd.
- additive model
- feature selection
- nonparametric regression
- sparse model