Oracle inequalities for multi-fold cross validation

Aad W. van der Vaart
Sandrine Dudoit, Division of Biostatistics and Department of Statistics, University of California, Berkeley
Mark J. van der Laan, Division of Biostatistics and Department of Statistics, University of California, Berkeley

Abstract

We consider choosing an estimator or model from a given class by cross validation consisting of holding a nonnegligible fraction of the observations out as a test set. We derive bounds that show that the risk of the resulting procedure is (up to a constant) smaller than the risk of an oracle plus an error which typically grows logarithmically with the number of estimators in the class. We extend the results to penalized cross validation in order to control unbounded loss functions. Applications include regression with squared and absolute deviation loss and classification under Tsybakov's condition.
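
The selection procedure described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation, and it rests on several assumptions: repeated random splits, squared error loss, and a small class of polynomial regression estimators. The names `cv_select` and `poly_fit`, the split count, the test fraction, and the simulated data are all hypothetical choices made for illustration only.

```python
import numpy as np

def cv_select(estimators, X, y, n_splits=5, test_fraction=0.2, seed=0):
    """Pick the estimator with the smallest cross-validated squared error risk.

    `estimators` is a list of functions fit(X_train, y_train) -> predict.
    A fixed, nonnegligible fraction of the data is held out in every split.
    """
    rng = np.random.default_rng(seed)
    n = len(y)
    n_test = max(1, int(round(test_fraction * n)))
    risks = np.zeros(len(estimators))
    for _ in range(n_splits):
        perm = rng.permutation(n)
        test, train = perm[:n_test], perm[n_test:]
        for k, fit in enumerate(estimators):
            predict = fit(X[train], y[train])        # fit on the training part
            residuals = y[test] - predict(X[test])   # evaluate on the held-out part
            risks[k] += np.mean(residuals ** 2)
    risks /= n_splits                                # average over the splits
    return int(np.argmin(risks)), risks

def poly_fit(degree):
    """Hypothetical candidate estimator: least-squares polynomial of a given degree."""
    def fit(X, y):
        coef = np.polyfit(X, y, degree)
        return lambda X_new: np.polyval(coef, X_new)
    return fit

# Simulated data (an assumption for this sketch): quadratic signal plus noise.
rng = np.random.default_rng(1)
X = rng.uniform(-1, 1, size=200)
y = 1.0 + 2.0 * X - 3.0 * X**2 + rng.normal(scale=0.1, size=200)

estimators = [poly_fit(d) for d in range(6)]
best, risks = cv_select(estimators, X, y)
print("selected degree:", best)
print("cross-validated risks:", np.round(risks, 4))
```

In this sketch the oracle would be the candidate with the smallest true risk; the cross-validated choice `best` is the data-driven substitute whose risk the paper's bounds compare against that benchmark.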

Suggested Citation

Aad W. van der Vaart, Sandrine Dudoit, and Mark J. van der Laan. "Oracle inequalities for multi-fold cross validation" Statistics & Decisions 24.3 (2006): 351-371.