What is the central limit theorem in simple terms?

The central limit theorem says that if you take many independent samples from a population and calculate the average of each sample, those averages will follow an approximately bell-shaped (normal) distribution - even if the original data is not bell-shaped. This holds when the population has a finite variance and the sample size is large enough (a common rule of thumb is 30 or more).

Why is the central limit theorem important?

It underpins many standard statistical methods. Because sample means are approximately normal, we can apply normal-distribution-based tools like confidence intervals and hypothesis tests for the mean to data from non-normal populations, as long as the observations are independent and the sample sizes are adequate.

How large does the sample need to be for the CLT to work?

A common rule of thumb is n = 30 or more. However, if the population is already close to normal, smaller samples work fine. If the population is strongly skewed or heavy-tailed, you may need larger samples (50 or more, sometimes many more) before the normal approximation becomes accurate.
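The effect of skew on the required sample size can be checked with a small simulation. The sketch below is illustrative only: it assumes NumPy is available, uses an exponential population as a stand-in for "very skewed" data, and the helper names `skewness` and `skew_of_sample_means` are made up for this example. It measures how lopsided the distribution of sample means still is at several sample sizes (a value of 0 means perfectly symmetric).

```python
import numpy as np

def skewness(values):
    """Sample skewness: the mean cubed z-score (0 for a symmetric distribution)."""
    values = np.asarray(values, dtype=float)
    z = (values - values.mean()) / values.std()
    return float(np.mean(z ** 3))

def skew_of_sample_means(n, trials=20_000, seed=0):
    """Draw `trials` samples of size n from an exponential population
    and return the skewness of the resulting sample means."""
    rng = np.random.default_rng(seed)
    samples = rng.exponential(scale=1.0, size=(trials, n))
    return skewness(samples.mean(axis=1))

# Skewness of the sample means shrinks as n grows, but only slowly.
for n in (5, 30, 200):
    print(n, round(skew_of_sample_means(n), 2))
```

For an exponential population, the skewness of the sample mean is 2/sqrt(n) in theory, so the printed values should be near 0.89, 0.37, and 0.14. Some asymmetry is still visible at n = 30, which is why the rule of thumb is only a starting point for skewed data.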

What Is the Central Limit Theorem?

Definition

The central limit theorem (CLT) states that the distribution of sample means approaches a normal distribution as the sample size increases, regardless of the shape of the original population distribution. This holds as long as the observations are independent, come from the same population, and that population has a finite variance.
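For readers who prefer symbols, the classical form of the theorem can be written as below. The notation is introduced here for illustration and is not part of the original lesson: X_1, ..., X_n are independent draws from a population with mean \mu and finite variance \sigma^2, \bar{X}_n is their average, and the arrow denotes convergence in distribution (the `\xrightarrow` command assumes the amsmath package).

```latex
\[
\frac{\bar{X}_n - \mu}{\sigma / \sqrt{n}}
\;\xrightarrow{d}\;
\mathcal{N}(0, 1)
\quad \text{as } n \to \infty .
\]
```

Informally, for large n the sample mean behaves roughly like a normal variable with mean \mu and standard deviation \sigma / \sqrt{n}.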

How It Works

No matter what the original data looks like - skewed, uniform, bimodal - the averages of repeated, sufficiently large samples will be approximately bell-shaped.

Example

Rolling a single die gives a flat (uniform) distribution - each number from 1 to 6 is equally likely.

But if you roll 30 dice and record the average, then repeat this 1,000 times, the distribution of those averages will be bell-shaped, centered around 3.5.

The more dice you average in each trial, the closer the distribution of averages gets to a normal curve.
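The dice experiment is easy to reproduce. The following is a minimal sketch, assuming NumPy is installed; the function name `dice_averages` and the fixed random seed are choices made for this illustration. It rolls 30 fair dice 1,000 times, records each average, and prints a rough text histogram.

```python
import numpy as np

def dice_averages(dice_per_trial=30, trials=1_000, seed=0):
    """Roll `dice_per_trial` fair dice `trials` times; return each trial's average."""
    rng = np.random.default_rng(seed)
    # integers(1, 7) draws from 1..6 because the upper bound is exclusive.
    rolls = rng.integers(1, 7, size=(trials, dice_per_trial))
    return rolls.mean(axis=1)

averages = dice_averages()
print("mean of averages:", averages.mean())
print("std of averages: ", averages.std())

# Rough text histogram: one '#' per 5 trials falling in each bin.
counts, edges = np.histogram(averages, bins=12)
for count, left in zip(counts, edges[:-1]):
    print(f"{left:5.2f} | {'#' * (int(count) // 5)}")
```

The mean of the averages should land close to 3.5, and their standard deviation close to 0.31 (the single-die standard deviation of about 1.71 divided by the square root of 30). The histogram should show a single hump that tapers on both sides.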

Why It Matters

The central limit theorem is arguably the most important theorem in statistics. It justifies the use of confidence intervals, hypothesis tests, and many other methods that assume normality. Without the CLT, these tools would be justified only for data that is already normally distributed, which is rare in the real world.
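One way to see this in practice is to build normal-based confidence intervals from clearly non-normal data and count how often they capture the true mean. The sketch below is an illustration under stated assumptions: NumPy is available, the population is exponential with a known mean of 1.0, and the function name `ci_coverage` is hypothetical. It uses the familiar "mean plus or minus 1.96 standard errors" interval.

```python
import numpy as np

def ci_coverage(n=50, trials=2_000, true_mean=1.0, seed=0):
    """Fraction of normal-based 95% intervals that contain the true mean
    when the data come from a skewed (exponential) population."""
    rng = np.random.default_rng(seed)
    samples = rng.exponential(scale=true_mean, size=(trials, n))
    means = samples.mean(axis=1)
    # Standard error of each sample mean, using the sample standard deviation.
    std_errors = samples.std(axis=1, ddof=1) / np.sqrt(n)
    lower = means - 1.96 * std_errors
    upper = means + 1.96 * std_errors
    return float(np.mean((lower <= true_mean) & (true_mean <= upper)))

print("coverage:", ci_coverage())
```

Even though no individual observation is normally distributed, the coverage should come out near the nominal 95%. With a skewed population and a moderate sample size it tends to fall slightly short, and it should move closer to 95% as n increases.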

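The CLT also helps explain why averages are more reliable than individual measurements. The standard deviation of the sample mean (the standard error) equals the population standard deviation divided by the square root of n, so it shrinks in proportion to 1/square root of n. Quadrupling the sample size halves the standard error, which is why larger studies produce more precise estimates.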
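The 1/square root of n scaling can be checked numerically. This is a small sketch rather than a definitive demonstration: it assumes NumPy, reuses fair dice as the population, and the function name `spread_of_means` is invented for the example. It compares the simulated spread of sample means with the theoretical value at three sample sizes.

```python
import numpy as np

def spread_of_means(n, trials=20_000, seed=0):
    """Standard deviation of the averages of n fair dice over many trials."""
    rng = np.random.default_rng(seed)
    rolls = rng.integers(1, 7, size=(trials, n))  # values 1..6
    return float(rolls.mean(axis=1).std())

# Standard deviation of a single fair die: sqrt(35/12), about 1.71.
sigma = np.sqrt(35 / 12)

for n in (4, 16, 64):
    simulated = spread_of_means(n)
    theoretical = sigma / np.sqrt(n)
    print(f"n={n:3d}  simulated={simulated:.3f}  theoretical={theoretical:.3f}")
```

Each fourfold increase in n should cut the spread roughly in half, matching the theoretical column.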

Key Takeaway

The central limit theorem says that averages of independent observations are approximately normal for large enough samples, provided the population has a finite variance. This is why many statistical methods work across a wide range of shapes of the original data.
