Everyone Makes These Mistakes
Statistical mistakes aren't limited to students or beginners. Journalists, politicians, business executives, and even some scientists make them regularly. These errors often aren't intentional. They come from shortcuts in thinking that feel logical but lead us astray.
Learning to recognize these mistakes protects you in two ways: you'll catch errors when other people make them, and you'll avoid making them yourself.
Mistake 1: Confusing Correlation with Causation
This is the single most common statistical error, and it's everywhere. When two things tend to happen together, it's tempting to assume one causes the other. But correlation (two things moving together) is not the same as causation (one thing making the other happen).
There's a strong statistical correlation between ice cream sales and shark attacks. When ice cream sales go up, so do shark attacks. Does ice cream attract sharks? Of course not. Both increase during summer because more people go to the beach in warm weather. The warm weather is the hidden factor driving both.
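The ice cream example can be sketched in a few lines of Python. The numbers below are invented purely for illustration: both series are driven by a shared hidden factor (temperature), and neither causes the other, yet they end up strongly correlated.

```python
import random

random.seed(42)

# Hypothetical daily data: temperature drives both series.
temps = [random.uniform(0, 35) for _ in range(365)]             # daily temperature
ice_cream = [10 * t + random.gauss(0, 20) for t in temps]       # sales rise with heat
shark_bites = [0.1 * t + random.gauss(0, 0.5) for t in temps]   # beach crowds rise too

def corr(xs, ys):
    """Pearson correlation coefficient."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (sx * sy)

# Strongly positive, even though neither series causes the other.
print(corr(ice_cream, shark_bites))
```

The correlation comes out high because temperature appears in both formulas; remove the shared driver and the correlation vanishes.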
Real-world consequences of this mistake are serious. For years, studies showed that people who took vitamin supplements tended to be healthier. Many people concluded that supplements cause better health. But later, more carefully designed experiments found that the supplements themselves provided little benefit. The people who took them were simply more health-conscious overall: they also exercised more, ate better, and visited their doctors regularly.
Mistake 2: Cherry-Picking Data
Cherry-picking means selecting only the data points that support your argument and ignoring the ones that don't. It's like a student showing their parents only the tests they did well on.
This happens frequently in business and politics. A company might report "revenue grew every quarter this year" while omitting that profits fell. A politician might say "crime dropped 15% since I took office" by choosing a starting date that was an unusual spike.
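The politician's trick can be shown with a toy calculation. These yearly incident counts are made up: the point is that picking a spike year as the baseline manufactures a dramatic drop out of essentially flat data.

```python
# Hypothetical yearly incident counts; 2019 is an unusual spike.
crime = {2017: 980, 2018: 1010, 2019: 1200, 2020: 1020, 2021: 1015}

def pct_change(old, new):
    return (new - old) / old * 100

# Start at the spike year: an impressive-looking 15% drop.
print(f"{pct_change(crime[2019], crime[2021]):.0f}%")
# Start one year earlier: almost no change at all.
print(f"{pct_change(crime[2018], crime[2021]):.1f}%")
```

Same dataset, two honest-sounding claims; only the full time series reveals which framing is fair.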
The antidote to cherry-picking is asking for the full picture. What does the complete dataset look like? What time period covers the whole story? Are there data points being conveniently left out?
Mistake 3: Small Sample Sizes
Small groups produce unreliable results. If you flip a coin ten times and get seven heads, you might think the coin is rigged. But if you flip it 10,000 times, you'll almost certainly get close to 50% heads. Small samples are noisy. They bounce around and can give extreme results just by chance.
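The coin-flip claim is easy to check yourself with a short simulation: repeated 10-flip experiments swing widely, while a single 10,000-flip experiment lands close to 50%.

```python
import random

random.seed(1)

def heads_fraction(flips):
    """Simulate fair-coin flips and return the fraction that came up heads."""
    return sum(random.random() < 0.5 for _ in range(flips)) / flips

small = [heads_fraction(10) for _ in range(5)]   # five 10-flip experiments
large = heads_fraction(10_000)                   # one big experiment

print(small)   # varies widely from run to run
print(large)   # very close to 0.5
```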
A news article reports: "Study finds that people who eat walnuts have better memory." You check the study and discover it involved 18 participants over two weeks. With such a small group, a couple of naturally sharp-minded people ending up in the walnut group by chance could explain the entire result. Compare this to a study of 2,000 people over two years, and the findings carry much more weight.
Be especially cautious with statistics about small groups. "The best-performing school in the state" might be a tiny school where a few gifted students pull up the average. Year to year, small schools often swing between top and bottom rankings just because of natural variation.
Mistake 4: Ignoring Base Rates
The base rate is how common something is in the general population. Ignoring it leads to wildly wrong conclusions, especially when dealing with rare events.
Imagine a medical test that's 99% accurate, meaning it gives the correct result 99% of the time whether or not you have the disease, for a rare condition that affects 1 in 10,000 people. If you test positive, what are the chances you actually have the disease? Most people guess 99%. The real answer is about 1%. Here's why: out of 10,000 people tested, the test correctly identifies the 1 person who has the disease. But it also gives false positives to about 100 healthy people (1% of 9,999). So out of roughly 101 positive results, only 1 person actually has the disease.
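The arithmetic in that paragraph can be worked through directly. This is the same calculation with the example's numbers plugged in, not a general-purpose screening tool:

```python
# The numbers from the example: a 99%-accurate test, a 1-in-10,000 disease.
population = 10_000
prevalence = 1 / 10_000
sensitivity = 0.99   # chance a sick person tests positive
specificity = 0.99   # chance a healthy person tests negative

sick = population * prevalence                 # 1 person
healthy = population - sick                    # 9,999 people
true_positives = sick * sensitivity            # about 1
false_positives = healthy * (1 - specificity)  # about 100

p_sick_given_positive = true_positives / (true_positives + false_positives)
print(f"{p_sick_given_positive:.1%}")   # about 1.0%
```

This is Bayes' rule in disguise: the flood of false positives from the huge healthy majority swamps the single true positive.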
This isn't just a math puzzle. It has real implications for medical screening, criminal justice, and security systems. Whenever a test or claim involves something rare, always consider the base rate.
Mistake 5: Percentage Points vs. Percentages
This is a subtle but important distinction that trips up even experienced professionals. A "percentage point" change and a "percent" change are very different things.
Suppose an interest rate rises from 2% to 3%. You could describe this two ways. "The rate increased by 1 percentage point" (from 2% to 3%). Or "The rate increased by 50%" (because 1 is 50% of 2). Both statements are true, but they give completely different impressions. A politician wanting to downplay the change says "just one percentage point." An opponent wanting to dramatize it says "a 50% increase." Same data, different framing.
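Both framings fall out of the same two numbers, as a quick calculation shows:

```python
old_rate, new_rate = 2.0, 3.0   # interest rate, in percent

point_change = new_rate - old_rate                        # absolute difference
relative_change = (new_rate - old_rate) / old_rate * 100  # relative difference

print(f"{point_change:.0f} percentage point(s)")   # 1 percentage point
print(f"{relative_change:.0f}% increase")          # 50% increase
```

Both printed statements are mathematically true; which one gets quoted depends on who is doing the framing.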
When you hear a claim using percentages, pause and ask: percentage of what? Is it percentage points (an absolute difference) or a percentage change (a relative difference)?
Mistake 6: Averages That Hide the Story
An average can paint a misleading picture when the underlying data is unevenly spread. If nine people in a room earn $50,000 a year and one person earns $5 million, the average income is $545,000. That number describes nobody in the room accurately.
When someone reports "the average," ask which average they mean (mean, median, or mode), and whether the data might be skewed by extreme values. For income, home prices, and many other real-world measurements, the median (middle value) is usually more informative than the mean (arithmetic average).
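The room from the example makes the mean-versus-median gap concrete:

```python
# Nine people earning $50,000 and one earning $5,000,000.
incomes = [50_000] * 9 + [5_000_000]

def mean(xs):
    return sum(xs) / len(xs)

def median(xs):
    s = sorted(xs)
    n = len(s)
    mid = n // 2
    return s[mid] if n % 2 else (s[mid - 1] + s[mid]) / 2

print(mean(incomes))     # 545,000 -- describes nobody in the room
print(median(incomes))   # 50,000 -- what a typical person actually earns
```

One extreme value drags the mean far from everyone, while the median stays put; that robustness is why it's preferred for skewed data like incomes and home prices.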
Spotting These Mistakes in the Wild
You now have a mental toolkit for catching the most common statistical errors. Here's a quick reference:
- Two things happening together doesn't mean one causes the other.
- Look for the data that's missing, not just the data that's shown.
- Be skeptical of findings from very small studies.
- When something is rare, positive results are often wrong.
- Check whether "percent" means percentage points or a relative change.
- Ask which kind of average is being used and whether extremes might distort it.
Statistical mistakes are easy to make and easy to miss. The most important ones to watch for are confusing correlation with causation, cherry-picking data that supports a conclusion, drawing big conclusions from small samples, ignoring how rare something is, mixing up percentage points with percentages, and trusting an average without asking whether extreme values distort it. You don't need to be a math expert to catch these. Just slow down and ask a few critical questions before accepting a claim.