The p-value is the probability of seeing a difference at least as large as the one we observed if the two groups were actually the same, i.e. under the null hypothesis. A small p-value means the data would be unusual under “no difference,” so we treat it as evidence that the groups differ. A large p-value means the observed difference could easily have arisen by chance, so we lack strong evidence against “no difference”; it does not show that the groups are the same.
Suppose we compare a continuous outcome (e.g. a lab value, a score) between Group A and Group B. The null hypothesis is that the two groups have the same distribution: same mean, same spread. Under that hypothesis, any difference we see in our sample is just random variation.
The p-value answers: If the two groups were really the same, how often would we get a difference at least as big as the one we got?
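One way to make this question concrete is a permutation test: if the groups really were the same, the group labels would be interchangeable, so we can shuffle them many times and count how often the shuffled difference is at least as big as the observed one. The sketch below is only an illustration of that idea, not necessarily the test used elsewhere in this document; the function name `permutation_p_value` and the data are hypothetical.

```python
import random

def permutation_p_value(group_a, group_b, n_permutations=5000, seed=0):
    """Two-sided permutation p-value for a difference in means."""
    rng = random.Random(seed)
    n_a = len(group_a)
    n_b = len(group_b)
    observed = abs(sum(group_a) / n_a - sum(group_b) / n_b)

    pooled = list(group_a) + list(group_b)
    at_least_as_large = 0
    for _ in range(n_permutations):
        # Under the null hypothesis the labels carry no information,
        # so any reshuffling of the pooled values is equally plausible.
        rng.shuffle(pooled)
        diff = abs(sum(pooled[:n_a]) / n_a - sum(pooled[n_a:]) / n_b)
        if diff >= observed:
            at_least_as_large += 1

    # Adding 1 to both counts keeps the estimate from being exactly 0.
    return (at_least_as_large + 1) / (n_permutations + 1)

# Hypothetical data: two small groups drawn with different true means.
data_rng = random.Random(42)
group_a = [data_rng.gauss(10.0, 2.0) for _ in range(30)]
group_b = [data_rng.gauss(12.0, 2.0) for _ in range(30)]
print(f"permutation p-value: {permutation_p_value(group_a, group_b):.4f}")
```

The returned fraction is a direct estimate of "how often would we get a difference at least as big as the one we got".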
A density plot shows the distribution of values in each group. Where the two curves sit and how much they overlap give an intuitive picture of how different the groups are.
| Less overlap | More overlap |
|---|---|
| Groups look clearly different | Groups look similar |
| Difference in means is large relative to spread | Difference in means is small or spread is large |
| We’d expect a smaller p-value | We’d expect a larger p-value |
So: more overlap → harder to claim “they’re different” → higher p-value. Less overlap → easier to claim “they’re different” → lower p-value.
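To put rough numbers on "overlap", the sketch below assumes each group is normally distributed and uses `statistics.NormalDist.overlap` from the Python standard library, which returns the shared area under two normal density curves (0 means no overlap, 1 means identical curves). The means and standard deviations are made-up values chosen only for illustration.

```python
from statistics import NormalDist

# Hypothetical groups, written as NormalDist(mean, standard deviation).
# Large gap in means relative to spread: curves barely overlap.
little_overlap = NormalDist(0.0, 1.0).overlap(NormalDist(3.0, 1.0))

# Small gap in means with the same spread: curves overlap a lot.
much_overlap = NormalDist(0.0, 1.0).overlap(NormalDist(0.5, 1.0))

# Same large gap in means, but three times the spread: overlap grows again.
wide_spread = NormalDist(0.0, 3.0).overlap(NormalDist(3.0, 3.0))

print(f"large gap, small spread: {little_overlap:.2f}")
print(f"small gap, small spread: {much_overlap:.2f}")
print(f"large gap, large spread: {wide_spread:.2f}")
```

The third case shows why the table speaks of the difference in means relative to spread: the same gap in means produces much more overlap once the spread is larger.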
Below we simulate two groups (e.g. a treatment and control) with the same sample size but different true means and spread:
Scenario 1: Little overlap between groups. We expect a small p-value.
Scenario 2: Substantial overlap between groups. We expect a larger p-value.
P-values for our two scenarios:
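As a rough sketch of how such p-values could be computed, the code below simulates two scenarios and tests the difference in means with a Welch-style statistic. The true means, standard deviations, sample size, and seeds are assumptions for illustration, not the values behind the figures above, and the p-value uses a normal approximation rather than the t distribution to stay within the Python standard library.

```python
import math
import random
from statistics import mean, variance

def simulate(mean_a, mean_b, sd, n, seed):
    """Draw two groups of size n from normal distributions with a common sd."""
    rng = random.Random(seed)
    a = [rng.gauss(mean_a, sd) for _ in range(n)]
    b = [rng.gauss(mean_b, sd) for _ in range(n)]
    return a, b

def approx_p_value(a, b):
    """Two-sided p-value for a difference in means (normal approximation)."""
    # Standard error of the difference, allowing unequal variances.
    se = math.sqrt(variance(a) / len(a) + variance(b) / len(b))
    z = (mean(a) - mean(b)) / se
    # erfc(|z| / sqrt(2)) equals 2 * (1 - Phi(|z|)) and stays accurate for large |z|.
    return math.erfc(abs(z) / math.sqrt(2))

# Hypothetical scenarios: same sample size, different true means and spread.
scenario_1 = simulate(mean_a=0.0, mean_b=2.0, sd=1.0, n=50, seed=1)  # little overlap
scenario_2 = simulate(mean_a=0.0, mean_b=0.2, sd=2.0, n=50, seed=2)  # much overlap

p1 = approx_p_value(*scenario_1)
p2 = approx_p_value(*scenario_2)
print(f"Scenario 1 p-value: {p1:.3g}")
print(f"Scenario 2 p-value: {p2:.3g}")
```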
So the intuition holds: more overlap in the densities → higher p-value; less overlap → lower p-value.
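A 95% confidence interval (95% CI) for the difference between two means is an interval built by a procedure that, in repeated sampling, would contain the true difference about 95% of the time. If the interval excludes 0, the difference is statistically significant at the 5% level, which corresponds to p < 0.05 from the matching test.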
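The "repeated sampling" reading can be checked by simulation: draw many pairs of samples with a known true difference, build a CI each time, and count how often the interval contains the truth. The sketch below uses the normal-approximation interval (difference ± 1.96 × standard error); the true difference, spread, sample size, and number of repeats are hypothetical choices.

```python
import math
import random
from statistics import mean, variance

Z_95 = 1.96  # normal-approximation critical value for a 95% interval

def ci_for_difference(a, b):
    """Approximate 95% CI for mean(b) - mean(a)."""
    diff = mean(b) - mean(a)
    se = math.sqrt(variance(a) / len(a) + variance(b) / len(b))
    return diff - Z_95 * se, diff + Z_95 * se

def coverage(true_diff=1.0, sd=2.0, n=50, repeats=2000, seed=0):
    """Fraction of simulated CIs that contain the true difference."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(repeats):
        a = [rng.gauss(0.0, sd) for _ in range(n)]
        b = [rng.gauss(true_diff, sd) for _ in range(n)]
        low, high = ci_for_difference(a, b)
        if low <= true_diff <= high:
            hits += 1
    return hits / repeats

estimated_coverage = coverage()
print(f"share of intervals containing the true difference: {estimated_coverage:.3f}")
```

The share should land close to 0.95. With small samples a t-based critical value would be more appropriate than 1.96.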
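We can also think in terms of separate 95% CIs for each group mean. Rough rule of thumb: if the two CIs do not overlap, the difference is usually statistically significant (small p-value); if they overlap substantially, the p-value is usually larger.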
Important: CI overlap is not exactly the same as the p-value. Two CIs can overlap a bit and still have p < 0.05. So use overlap as an intuitive guide, and rely on the actual p-value (or the CI for the difference) for decisions.
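A small worked example shows why the two are not equivalent. The per-group CIs each use that group's own standard error, while the test uses the standard error of the difference, which is the square root of the sum of squares and therefore smaller than the sum of the two. The summary statistics below are invented for illustration, and the p-value again relies on a normal approximation.

```python
import math

Z_95 = 1.96  # normal-approximation critical value for a 95% interval

# Hypothetical summary statistics: each group's mean and standard error.
mean_a, se_a = 0.0, 1.0
mean_b, se_b = 3.2, 1.0

# Separate 95% CIs for each group mean.
ci_a = (mean_a - Z_95 * se_a, mean_a + Z_95 * se_a)
ci_b = (mean_b - Z_95 * se_b, mean_b + Z_95 * se_b)

# With mean_a < mean_b, the intervals overlap when A's upper end reaches B's lower end.
cis_overlap = ci_a[1] >= ci_b[0]

# Test of the difference: uses sqrt(se_a^2 + se_b^2), not se_a + se_b.
se_diff = math.sqrt(se_a ** 2 + se_b ** 2)
z = (mean_b - mean_a) / se_diff
p_value = math.erfc(abs(z) / math.sqrt(2))  # two-sided

print(f"CI for A: ({ci_a[0]:.2f}, {ci_a[1]:.2f})")
print(f"CI for B: ({ci_b[0]:.2f}, {ci_b[1]:.2f})")
print(f"CIs overlap: {cis_overlap}, p-value below 0.05: {p_value < 0.05}")
```

Here the intervals overlap, yet the test of the difference still falls below 0.05.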
Below we show the group means and 95% CIs for each group in the same two scenarios. Notice: less overlap in the densities goes with CIs that don’t overlap; more overlap in the densities goes with CIs that overlap.
Scenario 1: 95% CIs for each group mean. Little overlap in CIs → small p-value.
Scenario 2: 95% CIs for each group mean. Substantial overlap in CIs → larger p-value.
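The per-group intervals in plots like these could be computed along the following lines. This is a sketch under assumptions: the scenario parameters and seeds are hypothetical rather than the ones behind the figures, and the intervals use the normal-approximation multiplier 1.96 instead of a t quantile.

```python
import math
import random
from statistics import mean, stdev

Z_95 = 1.96  # normal-approximation critical value for a 95% interval

def mean_ci(values):
    """Approximate 95% CI for the mean of one group: (low, high)."""
    centre = mean(values)
    half_width = Z_95 * stdev(values) / math.sqrt(len(values))
    return centre - half_width, centre + half_width

def intervals_overlap(ci_1, ci_2):
    """True when two (low, high) intervals share at least one point."""
    return ci_1[0] <= ci_2[1] and ci_2[0] <= ci_1[1]

def scenario_cis(mean_a, mean_b, sd, n, seed):
    """Simulate two groups and return the 95% CI for each group mean."""
    rng = random.Random(seed)
    a = [rng.gauss(mean_a, sd) for _ in range(n)]
    b = [rng.gauss(mean_b, sd) for _ in range(n)]
    return mean_ci(a), mean_ci(b)

# Hypothetical scenarios mirroring "little overlap" and "substantial overlap".
ci_a1, ci_b1 = scenario_cis(mean_a=0.0, mean_b=2.0, sd=1.0, n=50, seed=1)
ci_a2, ci_b2 = scenario_cis(mean_a=0.0, mean_b=0.2, sd=2.0, n=50, seed=2)

print("Scenario 1 CIs overlap:", intervals_overlap(ci_a1, ci_b1))
print("Scenario 2 CIs overlap:", intervals_overlap(ci_a2, ci_b2))
```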
Summary
| Concept | Less overlap (Scenario 1) | More overlap (Scenario 2) |
|---|---|---|
| Density curves | Clearly separated | Overlap a lot |
| 95% CIs for each group | Little or no overlap | Overlap substantially |
| 95% CI for the difference | Does not include 0 | Includes 0 |
| P-value | Small (e.g. < 0.05) | Larger (e.g. ≥ 0.05) |