When flipping a coin, getting 10 consecutive tails might seem rare. However, after 10 tosses, the chance of tails on the next flip remains 50 percent.
Statistics blends mathematics and probability. The goal of statistics is to analyze observable real-world phenomena, such as the height of oak trees or the effectiveness of a vaccine in preventing disease, without needing to measure every oak tree or vaccinate every person before evaluating the vaccine's efficacy.
Since probability deals with events that are driven by chance, we must acknowledge that no matter how carefully we measure with statistics, we can never capture the entire picture of the process.
Why Do We Use Statistics?
Imagine flipping a coin four times and getting three heads and one tail. Without statistics, one might conclude that the probability of getting heads is 75 percent. For a fair coin, however, the true probability of heads is 50 percent on every flip. With 40 coin flips, the share of heads would likely land closer to an even 1:1 ratio, and statistics would reveal this trend.
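A quick simulation can make the coin example concrete. The sketch below is illustrative only: it assumes a fair coin modeled with Python's random module, and the helper name proportion_heads is made up for this example. It also computes the exact chance of seeing exactly three heads in four fair flips, which is 4 of the 16 equally likely sequences.

```python
import math
import random

def proportion_heads(num_flips, rng):
    """Simulate num_flips tosses of a fair coin and return the share of heads."""
    heads = sum(rng.random() < 0.5 for _ in range(num_flips))
    return heads / num_flips

# Exact probability of exactly 3 heads in 4 fair flips: C(4, 3) / 2**4 = 0.25
prob_three_of_four = math.comb(4, 3) / 2**4

rng = random.Random(0)  # fixed seed (arbitrary choice) so repeated runs match
for n in (4, 40, 400, 4000):
    print(f"{n:>5} flips: share of heads = {proportion_heads(n, rng):.3f}")
print(f"P(exactly 3 heads in 4 flips) = {prob_three_of_four}")
```

Small runs can stray far from 0.5, while larger runs tend to sit closer to it. A 3-to-1 split in four flips happens a quarter of the time with a fair coin, so on its own it is weak evidence that the coin is biased.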
"A significant portion of statistics is about making inferences from a sample — the actual data points — to characteristics of the entire population — all potential data points," explains John Drake, a research professor at the Center for the Ecology of Infectious Diseases at the University of Georgia, via email. "For example, if we're interested in the height of oak trees, we can't measure every oak tree in the world, but we can take a sample. We can compute the average height from our sample, but this average may not match the average height of all oak trees."
Confidence Intervals
Since it's impractical to measure every oak tree globally, statisticians estimate a range of possible heights based on probability and available data. This estimated range, known as a confidence interval, consists of two numbers: one likely smaller and the other likely larger than the true value. The true value is likely somewhere in between these two numbers.
"A '95 percent confidence interval' indicates that, out of 100 times the interval is constructed in this way, it will contain the true value 95 times," explains Drake. "If we measured samples of oak trees 100 times, 95 of those intervals would encompass the population mean, or the average height of all oak trees. Therefore, a confidence interval measures the precision of an estimate. The estimate becomes more accurate as more data is collected, which is why confidence intervals shrink as more data is gathered."
A confidence interval helps determine how reliable an estimate is. With only four coin flips, our 75 percent estimate has a wide confidence interval because the sample size is small. If we flipped the coin 40 times, the confidence interval would narrow considerably.
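To see how sample size affects the width, here is a minimal sketch that computes a 95 percent interval for a proportion. It uses the Wilson score interval, which is one common method among several and is an assumption of this example, not something the article specifies. The function name wilson_interval and the 30-heads-in-40-flips case are hypothetical, chosen so that both samples show the same 75 percent share of heads.

```python
import math
from statistics import NormalDist

def wilson_interval(successes, trials, confidence=0.95):
    """Wilson score interval for a proportion (one common method among several)."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)  # about 1.96 for 95 percent
    p_hat = successes / trials
    denom = 1 + z**2 / trials
    center = (p_hat + z**2 / (2 * trials)) / denom
    half_width = (
        z * math.sqrt(p_hat * (1 - p_hat) / trials + z**2 / (4 * trials**2)) / denom
    )
    return center - half_width, center + half_width

# Same observed share of heads (75 percent), two sample sizes.
for heads, flips in ((3, 4), (30, 40)):
    low, high = wilson_interval(heads, flips)
    print(f"{heads}/{flips} heads: 95% interval = ({low:.3f}, {high:.3f}), "
          f"width = {high - low:.3f}")
```

With four flips the interval is wide enough to include 50 percent, so the data cannot rule out a fair coin. The 40-flip interval is much narrower, which is the shrinking effect described above.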
The true meaning of a confidence interval lies in the repetition of an experiment. For the four coin flips, a 95 percent confidence level means that if we repeated the four-flip experiment 100 times and constructed an interval each time, roughly 95 of those intervals would contain the true probability of getting heads. It is the interval that changes from one experiment to the next, not the true probability.
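The repetition idea can be checked by simulation. The sketch below is a rough illustration under stated assumptions: a fair coin, 40 flips per experiment, and the Wilson score interval as the (assumed) construction method. The names wilson_interval and coverage are hypothetical helpers written for this example.

```python
import math
import random

Z_95 = 1.96  # approximate standard normal quantile for a 95 percent level

def wilson_interval(successes, trials, z=Z_95):
    """Wilson score interval for a proportion."""
    p_hat = successes / trials
    denom = 1 + z**2 / trials
    center = (p_hat + z**2 / (2 * trials)) / denom
    half_width = (
        z * math.sqrt(p_hat * (1 - p_hat) / trials + z**2 / (4 * trials**2)) / denom
    )
    return center - half_width, center + half_width

def coverage(true_p, flips_per_experiment, num_experiments, rng):
    """Share of repeated experiments whose interval contains true_p."""
    hits = 0
    for _ in range(num_experiments):
        heads = sum(rng.random() < true_p for _ in range(flips_per_experiment))
        low, high = wilson_interval(heads, flips_per_experiment)
        hits += low <= true_p <= high  # True counts as 1
    return hits / num_experiments

rng = random.Random(0)  # fixed seed (arbitrary choice) for repeatable output
print("100 experiments:   ", coverage(0.5, 40, 100, rng))
print("10,000 experiments:", coverage(0.5, 40, 10_000, rng))
```

With 100 repetitions, the number of intervals that capture the true value will hover near 95 but will not be exactly 95 every time; the 95 percent figure describes the long run. For very small samples such as four flips, approximate methods like this one can cover the true value noticeably more or less often than the nominal level.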
The Boundaries of Statistics
Statistics has its limitations. A well-designed study is essential — statistics can only answer the questions you pose.
Imagine you're examining the effectiveness of a vaccine, but your study doesn't include children. You can create a confidence interval based on your data, but it won't reveal how well the vaccine works for children.
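One way to avoid leaving a group out is to sample each group in proportion to its size, an approach often called proportional stratified random sampling. The sketch below is only an illustration: the population, its 2,000-to-8,000 split between children and adults, and the function name stratified_sample are all made up for this example and do not describe any real trial.

```python
import random
from collections import defaultdict

def stratified_sample(population, stratum_of, sample_size, rng):
    """Draw a random sample whose strata mirror their share of the population."""
    strata = defaultdict(list)
    for member in population:
        strata[stratum_of(member)].append(member)
    sample = []
    for members in strata.values():
        # Proportional allocation; rounding can shift the total by a member or two.
        quota = round(sample_size * len(members) / len(population))
        sample.extend(rng.sample(members, quota))
    return sample

# Hypothetical population: 2,000 children and 8,000 adults (made-up numbers).
population = [("child", i) for i in range(2000)] + [("adult", i) for i in range(8000)]
rng = random.Random(0)  # fixed seed (arbitrary choice) for repeatable output
trial = stratified_sample(population, lambda person: person[0], 1000, rng)
children = sum(1 for group, _ in trial if group == "child")
print(len(trial), children)
```

Here the 1,000-person sample contains 200 children, matching their 20 percent share of the hypothetical population. A sample drawn only from the adults could still produce a confidence interval, but that interval would say nothing about children.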
"For the sample to be valid, it must not only be large enough, but also representative," says Drake. "This typically means having a random or stratified random sample. If the 1,000 participants in your vaccine trial reflect the larger population, it's reasonable to conclude the true efficacy of the vaccine falls within the confidence interval. However, if the sample doesn't include children, there's no statistical support to generalize to that group of people."
Florence Nightingale is regarded as one of the greatest statisticians of all time, utilizing the methods she developed to save countless lives during the Crimean War.
