Do Two Things Move Together?
In everyday life, we constantly notice patterns. Taller people tend to weigh more. Students who study more hours tend to get better grades. Cities with more police tend to have more crime. (Wait - does that last one mean police cause crime?)
Correlation is how statisticians measure and describe these relationships. It tells you whether two things tend to move together, and how strongly. But as that third example hints, it doesn't tell you why.
What Is Correlation?
Correlation measures the strength and direction of a linear relationship between two variables. When one variable goes up, does the other tend to go up too? Go down? Or is there no consistent pattern?
The most common measure is the correlation coefficient, usually written as r. It's a single number between -1 and +1.
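The text above does not spell out how r is calculated. Assuming r here means the Pearson correlation coefficient (the usual meaning of "the correlation coefficient"), it can be written as follows, where x_i and y_i are the paired observations and x-bar and y-bar are their averages:

```latex
r = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}
         {\sqrt{\sum_{i=1}^{n} (x_i - \bar{x})^2}\;\sqrt{\sum_{i=1}^{n} (y_i - \bar{y})^2}}
```

The top of the fraction is positive when the two variables tend to be above (or below) their averages at the same time, and negative when one tends to be above while the other is below. Dividing by the bottom rescales the result so it always lands between -1 and +1.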
Reading the Correlation Coefficient
- r = +1: Perfect positive correlation. As one variable increases, the other increases by a perfectly predictable amount. Every point falls exactly on an upward line.
- r = -1: Perfect negative correlation. As one goes up, the other goes down in a perfectly predictable way.
- r = 0: No linear relationship at all. Knowing one variable does not help you predict the other with a straight line, although a curved relationship could still exist.
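To make the two perfect cases concrete, here is a minimal sketch that computes r by hand for points lying exactly on a line. It assumes r is the Pearson coefficient; the function name pearson_r and the numbers are made up for illustration.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Numerator: sum of products of deviations from the means.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Denominator pieces: sums of squared deviations.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

xs = [1, 2, 3, 4, 5]
rising = [2 * x + 1 for x in xs]    # every point on an upward line
falling = [10 - 3 * x for x in xs]  # every point on a downward line

r_up = pearson_r(xs, rising)
r_down = pearson_r(xs, falling)
print(round(r_up, 6), round(r_down, 6))
```

The steepness of the line does not matter here. Any upward straight line gives +1 and any downward straight line gives -1, because r measures how tightly the points follow a line, not how steep that line is.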
In practice, you'll almost never see exactly +1, -1, or 0. Real data is messy. Here's a rough guide:
- 0.7 to 1.0 (or -0.7 to -1.0): Strong relationship
- 0.4 to 0.7 (or -0.4 to -0.7): Moderate relationship
- 0.1 to 0.4 (or -0.1 to -0.4): Weak relationship
- 0.0 to 0.1 (or 0.0 to -0.1): Essentially no relationship
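The rough guide above can be turned into a small lookup. This is only a sketch: the function name describe_strength is hypothetical, and because the ranges in the guide share their endpoints, it assumes a value sitting exactly on a boundary (such as 0.7) belongs to the stronger category.

```python
def describe_strength(r):
    """Map a correlation coefficient to the rough labels in the guide above."""
    if not -1.0 <= r <= 1.0:
        raise ValueError("r must be between -1 and +1")
    size = abs(r)  # strength ignores the sign; the sign only gives direction
    if size >= 0.7:
        return "strong"
    if size >= 0.4:
        return "moderate"
    if size >= 0.1:
        return "weak"
    return "essentially none"

for value in (0.85, -0.5, 0.2, -0.05):
    direction = "positive" if value > 0 else "negative"
    print(value, direction, describe_strength(value))
```

These cutoffs are a convention, not a law. Different fields draw the lines in different places, so treat the labels as a starting point.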
Example: height and weight. Among adults, height and weight have a positive correlation of roughly r = 0.5 to 0.7. Taller people tend to weigh more, but there's plenty of variation. A 5'6" person might weigh more than a 6'0" person.
The correlation is positive (both go up together) and moderate to strong (the pattern is noticeable but not perfect).
Positive vs. Negative Correlation
Positive correlation means both variables move in the same direction. When one goes up, the other tends to go up. When one drops, the other tends to drop.
- Hours studied and exam scores (more study, higher scores)
- Temperature and ice cream sales (hotter days, more ice cream sold)
- Experience and salary (more years working, higher pay - generally)
Negative correlation means they move in opposite directions. When one goes up, the other tends to go down.
- Exercise and resting heart rate (more exercise, lower heart rate)
- Price and demand (higher price, fewer people buy)
- Absences and grades (more missed classes, lower grades)
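The sign of r is what separates the two lists above. The sketch below uses small invented data sets (not real measurements) for one example from each list, and assumes r is the Pearson coefficient.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

# Invented numbers for illustration only.
hours_studied = [1, 2, 3, 4, 5, 6]
exam_scores = [52, 60, 61, 70, 74, 85]   # tends to rise with study time

absences = [0, 1, 2, 4, 6, 8]
grades = [92, 88, 85, 80, 70, 65]        # tends to fall as absences rise

r_pos = pearson_r(hours_studied, exam_scores)
r_neg = pearson_r(absences, grades)
print(f"study vs. scores: r = {r_pos:.2f}")
print(f"absences vs. grades: r = {r_neg:.2f}")
```

The first value comes out positive and the second negative. Both are close to the ends of the scale only because the invented data is tidy; real data would be noisier.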
Correlation Does Not Mean Causation
This is one of the most important rules in statistics, and it comes up constantly with correlation. Just because two things are correlated does not mean one causes the other.
Ice cream sales and drowning deaths are positively correlated. When ice cream sales go up, drowning deaths also go up. Does ice cream cause drowning?
Of course not. Both are caused by a third variable: hot weather. When it's hot, more people buy ice cream AND more people go swimming (leading to more drowning incidents). The ice cream and drowning are related, but neither causes the other.
This is called a confounding variable - a hidden factor that influences both things you're measuring.
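A small simulation can show a confounder at work. In this sketch, temperature drives both ice cream sales and drownings, and the two never influence each other. All the numbers (temperature range, slopes, noise levels) are invented assumptions, not real statistics.

```python
import random
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

random.seed(0)  # fixed seed so the run is repeatable

# Hypothetical model: each quantity depends on temperature plus its own noise.
temps = [random.uniform(10, 35) for _ in range(5000)]
ice_cream = [10 * t + random.gauss(0, 20) for t in temps]
drownings = [0.2 * t + random.gauss(0, 1) for t in temps]

r_all = pearson_r(ice_cream, drownings)

# Hold the confounder roughly fixed: keep only days between 24 and 25 degrees.
band = [(i, d) for t, i, d in zip(temps, ice_cream, drownings) if 24 <= t < 25]
r_band = pearson_r([i for i, _ in band], [d for _, d in band])

print(f"all days: r = {r_all:.2f}")
print(f"days at about the same temperature: r = {r_band:.2f}")
```

Across all days the two series are clearly correlated. Among days with nearly the same temperature, the correlation mostly disappears, which is what you would expect if temperature were the only link between them.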
There are several reasons two things can be correlated without one causing the other:
- A third variable causes both. (Hot weather causes both ice cream sales and swimming.)
- Reverse causation. Maybe A doesn't cause B - instead, B causes A. Cities with more crime might hire more police, not the other way around.
- Pure coincidence. If you compare enough pairs of unrelated variables, some of them will line up by chance. The number of films Nicolas Cage appeared in correlates with swimming pool drownings, and that is clearly meaningless.
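The coincidence case is easy to reproduce. The sketch below generates one random series and compares it against 200 other random series that have nothing to do with it, then reports the strongest correlation found. The series length and the number of candidates are arbitrary choices for illustration.

```python
import random
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

random.seed(1)  # fixed seed so the run is repeatable
n_points = 10   # e.g. ten yearly values

# One random "target" series and 200 unrelated random candidates.
target = [random.gauss(0, 1) for _ in range(n_points)]
candidates = [[random.gauss(0, 1) for _ in range(n_points)] for _ in range(200)]

# Search for the candidate that happens to match the target best.
best = max(abs(pearson_r(target, c)) for c in candidates)
print(f"strongest correlation found by chance: |r| = {best:.2f}")
```

Every series here is pure noise, yet searching through enough of them turns up at least one that looks moderately or strongly related. Short series make this even easier, which is why a striking correlation between a handful of data points deserves extra suspicion.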
What Correlation Misses
The correlation coefficient only measures linear (straight-line) relationships. If the relationship between two variables is curved, the correlation coefficient can be misleading.
For example, stress and performance have a curved relationship: a little stress improves performance, but too much stress hurts it. The correlation coefficient can come out close to 0, which seems to suggest no relationship, when there clearly is one - it's just not a straight line.
This is why it's always a good idea to plot your data before relying on a single number.
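Here is a minimal sketch of that stress and performance situation, using an idealized, perfectly symmetric curve. The numbers are invented, and the exact shape (a simple upside-down U peaking at stress level 5) is an assumption chosen to make the effect obvious.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

# Invented data: performance peaks at a stress level of 5, then falls off.
stress = list(range(11))                        # 0, 1, ..., 10
performance = [25 - (s - 5) ** 2 for s in stress]

r_overall = pearson_r(stress, performance)
r_low_stress = pearson_r(stress[:6], performance[:6])   # rising half
r_high_stress = pearson_r(stress[5:], performance[5:])  # falling half

print(f"whole range:      r = {r_overall:.2f}")
print(f"low-stress half:  r = {r_low_stress:.2f}")
print(f"high-stress half: r = {r_high_stress:.2f}")
```

Performance is completely determined by stress in this toy data, yet r over the whole range is zero because the rising half and the falling half cancel out. Each half on its own shows a strong correlation, one positive and one negative. A scatter plot would reveal the curve immediately.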
Correlation in Everyday Life
You encounter correlations all the time, often without realizing it:
- Your doctor might note that your cholesterol level correlates with heart disease risk.
- A business might find that customer satisfaction scores correlate with repeat purchases.
- A school might discover that attendance correlates with graduation rates.
In each case, the correlation is useful information - but you need to investigate further before concluding one thing causes the other.
Correlation measures whether two things tend to move together (positive correlation) or in opposite directions (negative correlation), on a scale from -1 to +1. It's a powerful tool for spotting patterns, but it has a crucial limitation: correlation does not prove causation. Two things can be correlated because of a hidden third factor, reverse causation, or pure coincidence. Always ask "why" before jumping to conclusions about what causes what.