Bootstrapping is a resampling method: we treat our sample as the population, draw many new samples with replacement from it, compute the statistic of interest in each new sample, and use that bootstrap distribution to estimate standard errors and confidence intervals—without assuming a particular distribution (e.g. normality).
Bootstrapping is especially helpful when:
| Situation | Why bootstrap helps |
|---|---|
| Non-normal data | You want a CI for the median, IQR, or another non-mean statistic where formulas are messy or assume normality. |
| Complex statistics | The statistic has no simple formula for its SE (e.g. ratio of medians, custom estimator). |
| Small samples | Large-sample (e.g. normal) approximations may be unreliable; the bootstrap uses the data’s shape. |
| Checking robustness | You can compare bootstrap CIs to model-based CIs to see if normality assumptions matter. |
You can use the bootstrap for the mean too; when the sample is reasonably large and not too skewed, it will usually agree with the usual normal-based CI. The real power is for anything else (median, trimmed mean, regression coefficients, etc.).
The idea: if we could repeat the study many times we’d get a distribution of the statistic; we can’t, so we simulate that by resampling from our one sample.
Below we use a tiny sample of \(n = 12\) values so you can see what “resample with replacement” means. The top panel shows the original sample (each value once). The bottom panel shows one bootstrap sample: we drew 12 times with replacement, so some values appear more than once and some not at all. Each dot is one draw; repeated values are stacked vertically so you can see multiplicity.
Top: original sample. Bottom: one bootstrap sample—same n, drawn with replacement; some values repeat, some are omitted.
In the bootstrap sample, the mean (and other statistics) will be slightly different from the original. Repeating this thousands of times gives the bootstrap distribution of the mean.
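A single resample with replacement can be sketched in a few lines of Python with NumPy. The 12 values below are hypothetical (the article's actual data are not listed); the point is only to show that a bootstrap sample has the same \(n\) but repeats some values and omits others.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sample of n = 12 values (stand-in for the article's data)
sample = np.array([2.1, 3.4, 3.7, 4.0, 4.2, 4.8, 5.1, 5.5, 6.0, 6.3, 7.2, 9.5])

# One bootstrap sample: draw n values with replacement from the original sample
boot_sample = rng.choice(sample, size=sample.size, replace=True)

print(sorted(boot_sample))                 # some values repeat, some are absent
print(boot_sample.mean(), sample.mean())   # the two means differ slightly
```

Running this a few times with different seeds shows that each bootstrap sample, and hence each bootstrap mean, comes out a little different.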
We now take \(B = 5000\) bootstrap samples from the same data and compute the mean of each. The histogram below is the bootstrap distribution of the sample mean. The vertical dashed line is the mean of the original sample; the shaded region is the 95% percentile interval (2.5th to 97.5th percentiles of the bootstrap distribution).
Bootstrap distribution of the sample mean (B = 5000). Dashed line: original sample mean. Green vertical lines: 95% percentile CI (2.5th–97.5th).
So from one sample we get an entire distribution of possible values of the mean. The spread of that distribution is the bootstrap standard error; the 2.5th and 97.5th percentiles give a 95% confidence interval that does not assume normality.
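The whole procedure for the mean fits in a short sketch (again with hypothetical data, since the article's 12 values are not listed): draw \(B = 5000\) bootstrap samples, take each sample's mean, then read off the spread and the 2.5th/97.5th percentiles.

```python
import numpy as np

rng = np.random.default_rng(42)

# Hypothetical stand-in for the article's 12 observations
sample = np.array([2.1, 3.4, 3.7, 4.0, 4.2, 4.8, 5.1, 5.5, 6.0, 6.3, 7.2, 9.5])

B = 5000
# Each row is one bootstrap sample of size n; take the mean of every row
boot_means = rng.choice(sample, size=(B, sample.size), replace=True).mean(axis=1)

boot_se = boot_means.std(ddof=1)                          # bootstrap standard error
ci_low, ci_high = np.percentile(boot_means, [2.5, 97.5])  # 95% percentile CI

print(f"bootstrap SE = {boot_se:.3f}")
print(f"95% percentile CI = ({ci_low:.2f}, {ci_high:.2f})")
```

Nothing in this computation assumes a normal distribution; the interval comes straight from the empirical quantiles of the 5000 bootstrap means.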
Formulas for the standard error of the median are awkward and often assume normality. The bootstrap treats the median like any other statistic: resample, compute the median, repeat. Below we show the bootstrap distribution of the median for the same 12 observations. Again, the dashed line is the original sample median; the shaded region is the 95% percentile interval.
Bootstrap distribution of the sample median. Green vertical lines: 95% percentile CI. No normality assumption.
The bootstrap distribution of the median is discrete (with only a few possible values when \(n\) is small); the percentile interval still gives a plausible range. For larger \(n\), the bootstrap distribution of the median becomes smoother.
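Switching the statistic from the mean to the median changes exactly one line of the sketch above, which is the point: the bootstrap treats any statistic the same way. The data are again hypothetical placeholders.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical stand-in for the article's 12 observations
sample = np.array([2.1, 3.4, 3.7, 4.0, 4.2, 4.8, 5.1, 5.5, 6.0, 6.3, 7.2, 9.5])

B = 5000
# Same resampling as before, but compute each row's median instead of its mean
boot_medians = np.median(
    rng.choice(sample, size=(B, sample.size), replace=True), axis=1
)
ci = np.percentile(boot_medians, [2.5, 97.5])

# With n = 12 the bootstrap medians take only a limited set of distinct values,
# which is the discreteness visible in the histogram
print(f"distinct median values: {np.unique(boot_medians).size}")
print(f"95% percentile CI = ({ci[0]:.2f}, {ci[1]:.2f})")
```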
From the same bootstrap distribution you can build different 95% confidence intervals:
| Type | How it’s built | When it’s useful |
|---|---|---|
| Percentile | Take the 2.5th and 97.5th percentiles of the bootstrap distribution. | Default choice; easy to explain and often works well. |
| Normal approximation | Mean of bootstrap distribution ± 1.96 × (bootstrap SE). | When the bootstrap distribution is roughly symmetric; then similar to percentile. |
| BCa (bias-corrected and accelerated) | Adjusts the percentiles for bias (bootstrap distribution shifted from original) and skewness (acceleration). | When the statistic is biased or the bootstrap distribution is skewed; gives better coverage in theory. |
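The percentile and normal-approximation intervals from the table can be computed side by side from one bootstrap distribution. This is a minimal sketch with hypothetical data; BCa is omitted because its bias and acceleration corrections take more machinery than fits here.

```python
import numpy as np

rng = np.random.default_rng(7)

# Hypothetical stand-in for the article's 12 observations
sample = np.array([2.1, 3.4, 3.7, 4.0, 4.2, 4.8, 5.1, 5.5, 6.0, 6.3, 7.2, 9.5])

B = 5000
boot_means = rng.choice(sample, size=(B, sample.size), replace=True).mean(axis=1)

# Percentile interval: read off the 2.5th and 97.5th percentiles
pct = np.percentile(boot_means, [2.5, 97.5])

# Normal approximation: mean of bootstrap distribution +/- 1.96 x bootstrap SE
se = boot_means.std(ddof=1)
norm = boot_means.mean() + np.array([-1.96, 1.96]) * se

print(f"percentile: ({pct[0]:.2f}, {pct[1]:.2f})")
print(f"normal:     ({norm[0]:.2f}, {norm[1]:.2f})")
```

When the bootstrap distribution is roughly symmetric, as here, the two intervals nearly coincide; skewness is where they diverge and BCa earns its keep.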
In R, the boot package (functions boot() and boot.ci()) can compute percentile, normal, and BCa intervals. For teaching, the percentile method is the one we used in the figures above.
Conceptually, bootstrapping does this: treat the sample as the population; draw \(B\) samples of size \(n\) with replacement; compute the statistic in each; and summarise those \(B\) values as the bootstrap distribution, from which the SE and CI are read off.
The next figure summarises the flow: from original sample to many bootstrap samples to one bootstrap distribution (here, of the mean), with the 95% percentile CI marked.
From one sample to bootstrap distribution and 95% CI. Green vertical lines: 2.5th and 97.5th percentiles (percentile interval).