Patterns in Randomness
Flip a coin once, and the result feels completely random. Flip it 1,000 times, and a pattern appears: roughly half will be heads. Roll a die once, and anything could happen. Roll it 10,000 times, and each number shows up about equally.
A probability distribution describes these patterns. It tells you all the possible outcomes of a random event and how likely each one is. Think of it as a complete map of chance - instead of asking about one specific outcome, you can see the whole picture at once.
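The way a pattern emerges from many repetitions can be seen in a small simulation. The sketch below is only an illustration: it assumes a perfectly fair coin, uses Python's pseudo-random generator in place of real flips, and `heads_fraction` is a name made up for this example.

```python
import random

def heads_fraction(num_flips, seed=0):
    """Simulate fair coin flips and return the fraction that came up heads."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    heads = sum(rng.random() < 0.5 for _ in range(num_flips))
    return heads / num_flips

# With few flips the fraction can land far from 0.5; with many it settles close to 0.5.
for n in (10, 1_000, 100_000):
    print(n, heads_fraction(n))
```

The exact fractions depend on the seed, but the one for the largest number of flips should sit very close to one half.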
What Is a Probability Distribution?
A probability distribution answers the question: "If I were to repeat this random event many times, what would the results look like?"
It can be shown as a table, a formula, or - most commonly - a graph. The graph shows possible outcomes along the bottom and their probabilities up the side.
Roll two dice and add the numbers. The possible totals range from 2 to 12. But they are NOT all equally likely:
- A total of 2 can only happen one way: 1+1. Probability: 1/36.
- A total of 7 can happen six ways: 1+6, 2+5, 3+4, 4+3, 5+2, 6+1. Probability: 6/36.
- A total of 12 can only happen one way: 6+6. Probability: 1/36.
If you graphed this, you would see a triangle shape - low at the edges (2 and 12), highest in the middle (7). That graph IS the probability distribution for the sum of two dice.
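The counts above can be reproduced by listing all 36 equally likely rolls. This is a minimal sketch assuming two fair six-sided dice; the function name `two_dice_ways` is chosen here for illustration.

```python
from collections import Counter
from itertools import product

def two_dice_ways():
    """Count how many of the 36 equally likely rolls produce each total."""
    return Counter(a + b for a, b in product(range(1, 7), repeat=2))

ways = two_dice_ways()
for total in sorted(ways):
    # One '#' per way of rolling the total gives a rough text version of the graph.
    print(f"{total:2d}  {'#' * ways[total]:<6}  {ways[total]}/36")
```

The rows of `#` characters trace out the triangle: one mark for 2 and 12, six marks for 7.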
Two Types of Distributions
Distributions come in two flavors, depending on the kind of data:
Discrete Distributions
These deal with countable outcomes. How many heads in 10 coin flips? How many customers visit a store today? How many defective items in a shipment? The outcomes are specific numbers (0, 1, 2, 3...) with gaps between them.
Continuous Distributions
These deal with measurable outcomes that can take any value in a range. A person's height could be 170.0 cm, 170.1 cm, 170.15 cm - any value is possible. Temperature, time, and weight are all continuous. Instead of asking "what is the probability of being exactly 170.0 cm tall?" (which is effectively zero for continuous data, because there are infinitely many possible values), we ask about ranges: "What is the probability of being between 165 and 175 cm?"
The Normal Distribution: The Famous Bell Curve
Of all probability distributions, the normal distribution - also called the bell curve - is by far the most important. When you graph it, it forms a smooth, symmetrical shape that looks like a bell: tall in the middle and tapering off equally on both sides.
The bell curve is defined by just two numbers:
- The mean (average): This is the center of the bell - the peak. It tells you where most values cluster.
- The standard deviation: This measures how spread out the values are. A small standard deviation means the bell is tall and narrow (values are tightly packed). A large standard deviation means the bell is short and wide (values are more spread out).
Adult women in the US have an average height of about 162 cm (5'4"), with a standard deviation of about 7 cm. This means:
- Most women (about 68%) are within one standard deviation of the mean: between 155 and 169 cm (about 5'1" to 5'7").
- Almost all women (about 95%) are within two standard deviations: between 148 and 176 cm (about 4'10" to 5'9").
- Being shorter than 141 cm or taller than 183 cm is very rare - less than 0.3% of the population.
This is why clothing stores stock the most inventory in medium sizes and less in the extremes. The bell curve tells them where most customers fall.
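The height figures above can be checked with a few lines of Python. This is a minimal sketch, assuming heights follow an exact normal distribution with the stated mean of 162 cm and standard deviation of 7 cm (real data only approximates this). `NormalDist` comes from Python's standard `statistics` module; `prob_between` is a helper name introduced here.

```python
from statistics import NormalDist

# Assumption: an exact bell curve with the mean and standard deviation from the text.
heights = NormalDist(mu=162, sigma=7)

def prob_between(dist, low, high):
    """Probability that a value from dist falls between low and high."""
    return dist.cdf(high) - dist.cdf(low)

print(prob_between(heights, 155, 169))      # about 0.68 (within 1 standard deviation)
print(prob_between(heights, 148, 176))      # about 0.95 (within 2 standard deviations)
print(1 - prob_between(heights, 141, 183))  # about 0.003 (beyond 3 standard deviations)

# A range question of the kind continuous distributions are built for:
print(prob_between(heights, 165, 175))
```

The last line answers "what is the probability of being between 165 and 175 cm?" under the same assumptions.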
The 68-95-99.7 Rule
One of the most useful facts about the normal distribution is the 68-95-99.7 rule (sometimes called the "empirical rule"). For any bell curve:
- 68% of values fall within 1 standard deviation of the mean.
- 95% of values fall within 2 standard deviations of the mean.
- 99.7% of values fall within 3 standard deviations of the mean.
This rule gives you a quick way to judge whether a value is typical or unusual. If something falls more than 2 standard deviations from the mean, it is in the outer 5% - quite rare. More than 3 standard deviations? Extremely rare.
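The three percentages do not depend on which bell curve you pick. The sketch below computes them for the standard bell curve (mean 0, standard deviation 1) and for an arbitrary second one; `share_within` is a name invented for this example, and the mean of 50 and standard deviation of 4 are arbitrary illustrative values.

```python
from statistics import NormalDist

def share_within(k, dist=NormalDist()):
    """Fraction of a normal distribution within k standard deviations of its mean."""
    low = dist.mean - k * dist.stdev
    high = dist.mean + k * dist.stdev
    return dist.cdf(high) - dist.cdf(low)

for k in (1, 2, 3):
    print(k, round(share_within(k), 4))  # 0.6827, 0.9545, 0.9973

# A different mean and spread (arbitrary values) gives the same shares.
print(round(share_within(2, NormalDist(mu=50, sigma=4)), 4))  # 0.9545
```

The familiar 68, 95, and 99.7 are these values rounded.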
A standardized test has a mean score of 500 and a standard deviation of 100. Using the 68-95-99.7 rule:
- About 68% of test-takers score between 400 and 600.
- About 95% score between 300 and 700.
- About 99.7% score between 200 and 800.
If someone scores 750, they are 2.5 standard deviations above the mean, which places them in roughly the top 1% of test-takers. (A score of 700, exactly 2 standard deviations above the mean, would be in about the top 2.5%.) That single number tells you a lot, thanks to the bell curve.
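Where a particular score sits can be worked out directly. This is a sketch assuming the scores follow an exact normal distribution with mean 500 and standard deviation 100; `z_score` and `share_above` are helper names introduced for illustration.

```python
from statistics import NormalDist

# Assumption: test scores form an exact bell curve with these parameters.
scores = NormalDist(mu=500, sigma=100)

def z_score(value, dist):
    """How many standard deviations value lies above (+) or below (-) the mean."""
    return (value - dist.mean) / dist.stdev

def share_above(value, dist):
    """Fraction of the distribution that lies above value."""
    return 1 - dist.cdf(value)

print(z_score(750, scores))      # 2.5
print(share_above(750, scores))  # roughly 0.006
print(share_above(700, scores))  # roughly 0.023
```

Under these assumptions, fewer than 1 in 100 test-takers score above 750, while about 2.3% score above 700.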
Why Is the Bell Curve Everywhere?
Here is the remarkable thing: the bell curve shows up in an astonishing number of real-world situations. Heights, blood pressure, test scores, measurement errors, daily temperatures, the weight of apples from an orchard - all tend to follow a bell curve. Why?
The answer comes from a deep mathematical result called the Central Limit Theorem. In simple terms, it says:
When you add up many small, independent, random effects, the total tends to form a bell curve - largely regardless of what the individual effects look like, as long as no single effect dominates the rest.
A person's height, for example, is influenced by hundreds of genetic and environmental factors, each contributing a small amount. Add them all up, and you get a bell curve. Test scores depend on knowledge, preparation, focus, test difficulty, and luck - many small factors that combine into a bell-shaped distribution.
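The theorem can be illustrated with a simulation. In the sketch below each "small effect" is a random number spread evenly between 0 and 1, which on its own looks nothing like a bell. The count of 30 effects per total, the 10,000 totals, and the function name `simulate_totals` are all arbitrary choices made for this example.

```python
import random
import statistics

def simulate_totals(num_totals=10_000, effects_per_total=30, seed=1):
    """Each total is the sum of many small, independent, evenly spread random effects."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    return [sum(rng.random() for _ in range(effects_per_total))
            for _ in range(num_totals)]

totals = simulate_totals()
mean = statistics.fmean(totals)
sd = statistics.pstdev(totals)

# For a bell curve, about 68% of values lie within one standard deviation of the mean.
within_one_sd = sum(abs(t - mean) <= sd for t in totals) / len(totals)
print(round(mean, 2), round(sd, 2), round(within_one_sd, 3))
```

The share of totals within one standard deviation should come out near 0.68, the figure expected for a bell curve, even though the individual effects are flat rather than bell-shaped.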
A factory makes bolts that are supposed to be exactly 10 cm long. In reality, each bolt is slightly different due to tiny variations in the metal, the machine, temperature, and other factors. If the factory measures 10,000 bolts, the lengths will form a bell curve centered around 10 cm, with most bolts very close to that target and a few outliers on either side.
Quality control teams use this: if a bolt is more than 3 standard deviations from the mean, something has likely gone wrong with the machine.
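A check of this kind might look like the sketch below. The 10 cm target comes from the example above, but the standard deviation of 0.02 cm is a made-up value for illustration, and `is_suspect` is a hypothetical helper, not a standard quality-control function.

```python
def is_suspect(length_cm, mean_cm=10.0, sd_cm=0.02, limit=3.0):
    """Flag a bolt whose length is more than `limit` standard deviations from the mean.

    sd_cm=0.02 is an assumed, illustrative value; a real factory would
    estimate it from its own measurements.
    """
    z = abs(length_cm - mean_cm) / sd_cm
    return z > limit

for length in (10.01, 10.03, 10.07):
    print(length, is_suspect(length))
```

With these assumed numbers, a 10.07 cm bolt is 3.5 standard deviations from the target and gets flagged, while 10.01 cm and 10.03 cm bolts pass.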
Other Important Distributions
The bell curve is the most famous, but it is not the only distribution. Here are a few others you might encounter:
The Uniform Distribution
Every outcome is equally likely. A fair die has a uniform distribution: each face has a 1/6 chance. If you graph it, every bar is the same height, so the top forms a flat line - no peaks, no valleys.
The Skewed Distribution
Not everything is symmetrical. Income distribution, for example, is right-skewed: most people earn a moderate amount, but a small number earn enormously more. The "tail" stretches far to the right. This is why the median income is often a better measure than the mean - the extreme high earners pull the average upward.
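A tiny example shows how far a few extreme values can pull the mean. The ten incomes below are invented purely for illustration and do not represent any real population.

```python
import statistics

# Made-up annual incomes: nine moderate earners and one very high earner.
incomes = [28_000, 32_000, 35_000, 38_000, 41_000,
           45_000, 52_000, 60_000, 75_000, 1_200_000]

mean_income = statistics.mean(incomes)
median_income = statistics.median(incomes)

print(mean_income)    # 160600
print(median_income)  # 43000.0
```

In this invented group, nine of the ten people earn less than half the mean, while the median sits in the middle of the typical earners.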
The Binomial Distribution
This describes the number of successes in a fixed number of independent yes/no trials, each with the same chance of success. How many heads in 20 coin flips? How many of 100 customers will buy something? The binomial distribution gives the probability for each possible count. Interestingly, when the number of trials is large enough (and the chance of success is not too close to 0 or 1), the binomial distribution starts to look like a bell curve.
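The probability of each count follows from the standard binomial formula: the number of ways to choose which trials succeed, times the probability of any one such arrangement. The sketch below applies it to 20 flips of a fair coin; `binomial_pmf` is a name chosen for this example.

```python
from math import comb

def binomial_pmf(k, n, p):
    """Probability of exactly k successes in n independent trials with success chance p."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

# Distribution of the number of heads in 20 flips of a fair coin.
probs = [binomial_pmf(k, 20, 0.5) for k in range(21)]
for k, prob in enumerate(probs):
    # One '#' per percentage point gives a rough text histogram.
    print(f"{k:2d}  {prob:.4f}  {'#' * round(prob * 100)}")
```

The histogram peaks at 10 heads (probability about 0.176) and falls away symmetrically on both sides, already resembling a bell.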
What Distributions Tell Us in Practice
Understanding distributions is not just academic. They have direct, practical value:
- Spotting unusual events. If a measurement falls far outside the expected distribution, something noteworthy may be happening. A factory bolt that is way too long, a student score that is far from the mean, a stock price that moved far more than expected - distributions help you spot these.
- Making predictions. If you know a distribution, you can estimate the probability of future outcomes. Insurance companies use distributions to set premiums. Weather services use them to forecast temperatures.
- Setting standards. "Normal" ranges for blood pressure, cholesterol, and other health measures are based on the distribution of values in healthy populations. If your measurement falls outside the "normal" range, it means you are in the tails of the distribution.
A pediatrician tells parents their child is in the "75th percentile" for height. This means the child is taller than 75% of children the same age. The doctor knows this because they have the height distribution for children - a bell curve - and can see exactly where any individual child falls on it.
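Converting between heights and percentiles can be sketched as follows. The mean of 110 cm and standard deviation of 5 cm are hypothetical values for some unspecified age group, not real growth-chart data, and real charts are not exact bell curves; `percentile_of` is a helper name introduced here.

```python
from statistics import NormalDist

# Hypothetical reference distribution for children of one age (not real data).
reference = NormalDist(mu=110, sigma=5)

def percentile_of(height_cm, dist):
    """Percentage of the reference distribution that lies below the given height."""
    return 100 * dist.cdf(height_cm)

# Height that marks the 75th percentile under these assumptions.
height_75th = reference.inv_cdf(0.75)
print(round(height_75th, 1))

# Going the other way: from a height back to its percentile.
print(round(percentile_of(height_75th, reference)))  # 75
```

The first call goes from a percentile to a height, the second from a height back to a percentile; a growth chart supports both lookups in the same way.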
Distributions and Everyday Decisions
You interact with probability distributions more often than you might think:
- When a package says "delivery in 3-5 business days," the company is describing the middle of a distribution. Most packages arrive in that window, but some arrive earlier and some later.
- When a recipe says "bake for 25-30 minutes," the actual time depends on your oven, the pan, the altitude - many small factors. The range reflects a distribution of possible baking times.
- When a commute "usually takes 20 minutes," that is the peak of a distribution. Some days it takes 15, some days 40, and the distribution shows how likely each travel time is.
A probability distribution maps out all possible outcomes and their likelihoods. The normal distribution (bell curve) is the best known, defined by its mean and standard deviation. Thanks to the 68-95-99.7 rule, you can quickly tell whether a value is typical or unusual. The bell curve appears so often because many real-world outcomes result from the combination of many small, random factors. Understanding distributions gives you a powerful lens for interpreting data, spotting outliers, and making informed predictions in everyday life.