Abstract
We investigate practical selection of hyper-parameters for support vector machine (SVM) regression (that is, the ε-insensitive zone and the regularization parameter C). The proposed methodology advocates analytic parameter selection directly from the training data, rather than the re-sampling approaches commonly used in SVM applications. In particular, we describe a new analytical prescription for setting the value of the insensitive zone ε as a function of training sample size. Good generalization performance of the proposed parameter selection is demonstrated empirically using several low- and high-dimensional regression problems. Further, we point out the importance of Vapnik's ε-insensitive loss for regression problems with finite samples. To this end, we compare the generalization performance of SVM regression (using the proposed selection of ε-values) with regression using 'least-modulus' loss (ε=0) and standard squared loss. These comparisons indicate superior generalization performance of SVM regression under sparse sample settings, for various types of additive noise.
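For readers unfamiliar with the three loss functions compared in the abstract, the following minimal sketch shows their standard textbook definitions. It is illustrative only and not taken from the paper: the function names and the sample values are assumptions, and the paper's analytical prescription for ε is not reproduced here.

```python
# Standard definitions of the three regression losses named in the abstract.
# Function names and sample values are illustrative assumptions.

def eps_insensitive_loss(y, f, eps):
    """Vapnik's epsilon-insensitive loss: zero inside the zone |y - f| <= eps."""
    return max(abs(y - f) - eps, 0.0)

def least_modulus_loss(y, f):
    """'Least-modulus' (absolute) loss, the special case eps = 0."""
    return abs(y - f)

def squared_loss(y, f):
    """Standard squared loss."""
    return (y - f) ** 2

if __name__ == "__main__":
    y_true = 1.0
    eps = 0.1  # hypothetical width of the insensitive zone
    for prediction in (1.05, 1.5):
        print(
            prediction,
            eps_insensitive_loss(y_true, prediction, eps),
            least_modulus_loss(y_true, prediction),
            squared_loss(y_true, prediction),
        )
```

A residual smaller than ε incurs no ε-insensitive penalty, while both the least-modulus and squared losses penalize every nonzero residual.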
Original language | English (US) |
---|---|
Pages (from-to) | 113-126 |
Number of pages | 14 |
Journal | Neural Networks |
Volume | 17 |
Issue number | 1 |
State | Published - Jan 2004 |
Bibliographical note
Funding Information: The authors thank Dr V. Vapnik for many useful discussions. This work was supported, in part, by NSF grant ECS-0099906.
Keywords
- Complexity control
- Loss function
- Parameter selection
- Prediction accuracy
- Support vector machine regression
- VC theory