
How many times should we repeat a K-fold CV?

Asked 9 years, 8 months ago Modified 9 years, 8 months ago Viewed 19k times

I came across this thread looking at the differences between bootstrapping and cross-validation - great answer and references, by the way. What I am wondering now is: if I were to perform repeated 10-fold CV, say to calculate a classifier's accuracy, how many times n should I repeat it?

Does n depend on the number of folds? On the sample size? Is there any rule for this?

(In my case, I have sample sizes as large as 5000, and if I choose anything larger than n = 20, my computer takes far too long to perform the calculation.)

cross-validation

edited Apr 13, 2017 at 12:44 by Community Bot · asked Jan 17, 2014 at 2:45 by Neodyme

2 Answers

The influencing factor is how stable your model is - or, more precisely, how stable the predictions of the surrogate models are.

If the models are completely stable, all surrogate models will yield the same prediction for the same test case. In that case, iterations/repetitions are not needed, and they don't yield any improvement.

As you can measure the stability of the predictions, here's what I'd do:
- Set up the whole procedure in a way that saves the results of each cross validation repetition/iteration, e.g. to hard disk.
- Start with a large number of iterations.
- After a few iterations are through, fetch the preliminary results and have a look at the stability/variation in the results for each run.
- Then decide how many further iterations you want to run to refine the results.
Of course you may decide to run, say, 5 iterations and then decide on the final number of
iterations you want to do.
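For concreteness, here is a minimal sketch of that monitoring loop using scikit-learn; the dataset, classifier, iteration budget, and stopping threshold are all placeholder assumptions, not values taken from this answer:

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import KFold, cross_val_score

# Placeholder data and model -- substitute your own.
X, y = make_classification(n_samples=5000, n_features=20, random_state=0)
clf = LogisticRegression(max_iter=1000)

per_rep_acc = []                          # one mean accuracy per CV repetition
for rep in range(100):                    # start with a generous budget
    cv = KFold(n_splits=10, shuffle=True, random_state=rep)
    scores = cross_val_score(clf, X, y, cv=cv, scoring="accuracy")
    per_rep_acc.append(scores.mean())
    # In practice, also dump `scores` to disk here so nothing is lost.

    if rep >= 5:                          # after a few repetitions, check stability
        sd = np.std(per_rep_acc, ddof=1)
        print(f"rep {rep:3d}  mean={np.mean(per_rep_acc):.4f}  sd={sd:.4f}")
        if sd < 0.002:                    # arbitrary illustrative threshold
            break
```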

(Side note: I typically use > ca. 1000 surrogate models, so a typical number of repetitions/iterations would be around 100 - 125.)

answered Jan 17, 2014 at 12:07 by cbeleites unhappy with SX

Ask a statistician any question and their answer will be some form of "it depends".

It depends. Apart from the type of model (good point, cbeleites!), the number of training set points and the number of predictors? If the model is for classification, a large class imbalance would cause me to increase the number of repetitions. Also, if I am resampling a feature selection procedure, I would bias myself towards more resamples.

For any resampling method used in this context, remember that (unlike classical
bootstrapping), you only need enough iterations to get a "precise enough" estimate of the
mean of the distribution. That is subjective but any answer will be.

Sticking with classification with two classes for a second, suppose you expect/hope the accuracy of the model to be about 0.80. Since the resampling process is sampling the accuracy estimate (say p), the standard error would be sqrt[p*(1-p)]/sqrt(B), where B is the number of resamples. For B = 10, the standard error of the accuracy is about 0.13, and with B = 100 it is about 0.04. You might use that formula as a rough guide for this particular case.
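A quick sketch of that arithmetic (the target precision at the end is an arbitrary illustration, not a recommendation from this answer):

```python
import math

def resampling_se(p, B):
    """Rough standard error of a resampled accuracy estimate."""
    return math.sqrt(p * (1 - p)) / math.sqrt(B)

p = 0.80
for B in (10, 100):
    print(f"B = {B:3d}: SE ~ {resampling_se(p, B):.3f}")
# B =  10: SE ~ 0.126
# B = 100: SE ~ 0.040

# Inverting the formula gives the B needed for a target precision:
target_se = 0.01                      # arbitrary target, for illustration
print(math.ceil(p * (1 - p) / target_se ** 2))   # 1600
```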

Also consider that, in this example, the variance of the accuracy is maximized the closer you get to 0.50, so an accurate model should need fewer replications, since its standard error should be lower than for models that are weak learners.

HTH,

Max

answered Jan 18, 2014 at 2:16 by topepo

I'd be extremely wary of applying any kind of standard error calculation in this context, because there are two sources of variance here (model instability + the finite set of test cases), and I think resampling validation will not get around the finite test set variance: consider cross validation. In each run, all test cases are tested exactly once. Thus the variance between the runs of iterated CV must be due to instability. You won't observe (nor reduce!) the variance due to the finite test set this way, but of course the result is still subject to it. – cbeleites unhappy with SX Jan 20, 2014 at 8:34
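A toy simulation can make that distinction visible; everything below (data generator, model, sizes) is an arbitrary assumption for illustration, not something from the comment. Repeating CV on one fixed dataset only exposes the instability component, while drawing fresh datasets of the same size also exposes the variance that repetition cannot remove:

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import KFold, cross_val_score

clf = LogisticRegression(max_iter=1000)

def cv_accuracy(X, y, seed):
    cv = KFold(n_splits=10, shuffle=True, random_state=seed)
    return cross_val_score(clf, X, y, cv=cv).mean()

# (a) One fixed dataset, many CV repetitions:
#     the spread reflects model instability only.
X, y = make_classification(n_samples=500, n_features=20, flip_y=0.2, random_state=0)
within = [cv_accuracy(X, y, seed) for seed in range(30)]

# (b) Many independent datasets of the same size, one CV each:
#     the spread also includes the finite-sample variance that
#     repeating CV on a single dataset can never show or reduce.
across = []
for s in range(30):
    Xs, ys = make_classification(n_samples=500, n_features=20, flip_y=0.2, random_state=s)
    across.append(cv_accuracy(Xs, ys, seed=s))

print("SD between CV repetitions (same data): %.4f" % np.std(within, ddof=1))
print("SD across independent datasets:        %.4f" % np.std(across, ddof=1))
```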
