Measuring the VC-dimension using optimized experimental design.

X. Shao; V. Cherkassky; W. Li

doi:10.1162/089976600300015222

Electrical and Computer Engineering

Research output: Contribution to journal › Article › peer-review

33 Scopus citations

Abstract

VC-dimension is the measure of model complexity (capacity) used in VC-theory. The knowledge of the VC-dimension of an estimator is necessary for rigorous complexity control using analytic VC generalization bounds. Unfortunately, it is not possible to obtain the analytic estimates of the VC-dimension in most cases. Hence, a recent proposal is to measure the VC-dimension of an estimator experimentally by fitting the theoretical formula to a set of experimental measurements of the frequency of errors on artificially generated data sets of varying sizes (Vapnik, Levin, & Le Cun, 1994). However, it may be difficult to obtain an accurate estimate of the VC-dimension due to the variability of random samples in the experimental procedure proposed by Vapnik et al. (1994). We address this problem by proposing an improved design procedure for specifying the measurement points (i.e., the sample size and the number of repeated experiments at a given sample size). Our approach leads to a nonuniform design structure as opposed to the uniform design structure used in the original article (Vapnik et al., 1994). Our simulation results show that the proposed optimized design structure leads to a more accurate estimation of the VC-dimension using the experimental procedure. The results also show that a more accurate estimation of VC-dimension leads to improved complexity control using analytic VC-generalization bounds and, hence, better prediction accuracy.
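The fitting step mentioned in the abstract can be illustrated with a minimal sketch. The record itself does not give the theoretical formula, so the function `phi` and the constants `A`, `B`, `K` below are assumptions, written as they are commonly quoted from Vapnik, Levin, & Le Cun (1994), and should be checked against that article. The helper name `estimate_vc_dimension`, the candidate grid, and the sample sizes are hypothetical. The "measurements" are synthetic and noise-free, generated from `phi` itself only to show that the fit recovers a known value. They are not experimental results, and the sketch does not implement the optimized design proposed by the authors.

```python
import math

# Constants as commonly quoted from Vapnik, Levin & Le Cun (1994).
# They are assumptions here, not values taken from this record.
A, B, K = 0.16, 1.2, 0.14928


def phi(tau):
    """Assumed theoretical curve for the error deviation, with tau = n / h."""
    if tau < 0.5:
        return 1.0
    log_term = math.log(2.0 * tau) + 1.0
    root = math.sqrt(1.0 + B * (tau - K) / log_term)
    return A * log_term / (tau - K) * (root + 1.0)


def estimate_vc_dimension(sample_sizes, deviations, candidates):
    """Return the candidate h whose curve phi(n / h) best fits the
    measured deviations in the least-squares sense (plain grid search)."""
    def squared_error(h):
        return sum((xi - phi(n / h)) ** 2
                   for n, xi in zip(sample_sizes, deviations))
    return min(candidates, key=squared_error)


# Synthetic, noise-free measurements generated from phi with a known h.
true_h = 20
sample_sizes = [10, 20, 40, 80, 160, 320]
deviations = [phi(n / true_h) for n in sample_sizes]

h_hat = estimate_vc_dimension(sample_sizes, deviations, range(1, 101))
print(h_hat)  # prints 20
```

With real measurements each deviation would be an average over repeated experiments at that sample size, and the choice of sample sizes and repetition counts is exactly what the article's design procedure addresses.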

Original language	English (US)
Pages (from-to)	1969-1986
Number of pages	18
Journal	Neural Computation
Volume	12
Issue number	8
DOIs	https://doi.org/10.1162/089976600300015222
State	Published - Aug 2000

Cite this

@article{fe3f9379d8ba4e78a1978b9b04098b63,

title = "Measuring the {VC}-dimension using optimized experimental design",

abstract = "VC-dimension is the measure of model complexity (capacity) used in VC-theory. The knowledge of the VC-dimension of an estimator is necessary for rigorous complexity control using analytic VC generalization bounds. Unfortunately, it is not possible to obtain the analytic estimates of the VC-dimension in most cases. Hence, a recent proposal is to measure the VC-dimension of an estimator experimentally by fitting the theoretical formula to a set of experimental measurements of the frequency of errors on artificially generated data sets of varying sizes (Vapnik, Levin, \& Le Cun, 1994). However, it may be difficult to obtain an accurate estimate of the VC-dimension due to the variability of random samples in the experimental procedure proposed by Vapnik et al. (1994). We address this problem by proposing an improved design procedure for specifying the measurement points (i.e., the sample size and the number of repeated experiments at a given sample size). Our approach leads to a nonuniform design structure as opposed to the uniform design structure used in the original article (Vapnik et al., 1994). Our simulation results show that the proposed optimized design structure leads to a more accurate estimation of the VC-dimension using the experimental procedure. The results also show that a more accurate estimation of VC-dimension leads to improved complexity control using analytic VC-generalization bounds and, hence, better prediction accuracy.",

author = "X. Shao and V. Cherkassky and W. Li",

year = "2000",

month = aug,

doi = "10.1162/089976600300015222",

language = "English (US)",

volume = "12",

pages = "1969--1986",

journal = "Neural Computation",

issn = "0899-7667",

publisher = "MIT Press Journals",

number = "8",

}

TY - JOUR

T1 - Measuring the VC-dimension using optimized experimental design.

AU - Shao, X.

AU - Cherkassky, V.

AU - Li, W.

PY - 2000/8

Y1 - 2000/8

AB - VC-dimension is the measure of model complexity (capacity) used in VC-theory. The knowledge of the VC-dimension of an estimator is necessary for rigorous complexity control using analytic VC generalization bounds. Unfortunately, it is not possible to obtain the analytic estimates of the VC-dimension in most cases. Hence, a recent proposal is to measure the VC-dimension of an estimator experimentally by fitting the theoretical formula to a set of experimental measurements of the frequency of errors on artificially generated data sets of varying sizes (Vapnik, Levin, & Le Cun, 1994). However, it may be difficult to obtain an accurate estimate of the VC-dimension due to the variability of random samples in the experimental procedure proposed by Vapnik et al. (1994). We address this problem by proposing an improved design procedure for specifying the measurement points (i.e., the sample size and the number of repeated experiments at a given sample size). Our approach leads to a nonuniform design structure as opposed to the uniform design structure used in the original article (Vapnik et al., 1994). Our simulation results show that the proposed optimized design structure leads to a more accurate estimation of the VC-dimension using the experimental procedure. The results also show that a more accurate estimation of VC-dimension leads to improved complexity control using analytic VC-generalization bounds and, hence, better prediction accuracy.

UR - http://www.scopus.com/inward/record.url?scp=0034241362&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=0034241362&partnerID=8YFLogxK

U2 - 10.1162/089976600300015222

DO - 10.1162/089976600300015222

M3 - Article

C2 - 10953247

AN - SCOPUS:0034241362

SN - 0899-7667

VL - 12

SP - 1969

EP - 1986

JO - Neural Computation

JF - Neural Computation

IS - 8

ER -