Image by Sreejith via Fingent.com
Poll results are among the most common numbers people come across in the media. We tend to take them at face value, and most of us don’t really understand or appreciate the statistical certainty attached to the numbers. What’s more, the information that supports the quality of these numbers is often found in small print or buried near the end of an article.
Picture by CBS News
Take the above poll. Where is the information about how many people were surveyed? It turns out that 1,003 people were surveyed and that the margin of error was plus or minus four percent. What is also missing, however, is another key number. The fraction 19/20 should have appeared somewhere to let readers know that the level of confidence in the poll results is 95%, meaning that if you repeated the poll 20 times, you would expect the results to land within the margin of error of the true value about 19 times. In other words, there is only about a 5% chance that a given poll misses the true value by more than its stated margin of error. This is a pretty standard level of confidence and demonstrates the true predictive power of statistics.
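To make the 19/20 idea concrete, here is a minimal Python sketch. It assumes a simple random sample and uses the textbook normal-approximation formula for a 95% margin of error; the function names, the 50% "true" share, and the number of simulated polls are all hypothetical choices made for illustration. Under these idealized assumptions, the formula gives a smaller margin for 1,003 respondents than the plus or minus four percent quoted above. Published margins may be larger because of rounding or adjustments for survey design, which this sketch does not model.

```python
import math
import random


def margin_of_error(sample_size, share=0.5, z=1.96):
    """Textbook 95% margin of error for a proportion (simple random sample).

    share=0.5 is the worst case; z=1.96 corresponds to 95% confidence.
    """
    return z * math.sqrt(share * (1 - share) / sample_size)


def simulate_coverage(true_share, sample_size, num_polls, seed=0):
    """Repeat a hypothetical poll many times and return the fraction of
    polls whose result lands within the margin of error of the true share."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    moe = margin_of_error(sample_size)
    hits = 0
    for _ in range(num_polls):
        # Each respondent says "yes" with probability true_share.
        yes = sum(rng.random() < true_share for _ in range(sample_size))
        if abs(yes / sample_size - true_share) <= moe:
            hits += 1
    return hits / num_polls


# Margin of error for 1,003 respondents under these assumptions.
print(f"Margin of error: {margin_of_error(1003):.1%}")  # about 3.1%

# Share of 1,000 simulated polls that land within the margin of error.
coverage = simulate_coverage(true_share=0.5, sample_size=1003, num_polls=1000)
print(f"Polls within the margin of error: {coverage:.1%}")
```

The second number should come out close to 95%, which is the "19 times out of 20" that pollsters refer to.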
Ironically, the poll regarding weather forecasts employs similar statistical/probability models to those used by meteorologists in actually forecasting the weather. They look at all the available weather factors and compare historical data with current information to make a prediction. Did you ever notice that something like the probability of precipitation is always a multiple of ten, like 30%, 50%, 80%, etc.?
Meteorologists round their number so that it appears less specific. They report 70%, for example, when the actual statistical number, with its confidence interval, is probably something like 68%. The problem with reporting a number as specific as 68% is that many people, who are already weather forecast skeptics, will only become more suspicious of the prediction.
This all goes back to generally low statistical literacy among the wider public and a lack of awareness of the rigor involved in obtaining these numbers. Right now, particularly among the bottom 35% of poll respondents, the idea that weather people throw darts at numbers isn’t too far from the realm of possibility.
That is why it’s important to understand all the data that is constantly being thrown our way!
Sometimes, when that data is overwhelming, we can enlist the power of data visualization to help clarify what is being presented to us.
Below is a visual of both the popularity and actual scientific efficacy of vitamin and herbal supplements. The size of each circle represents a supplement’s popularity, and its height in the picture represents how effective it actually is. The visual helps us to quickly organize information, so we can make more informed choices using the data.
This image is from a great TED X Talk on the Power of Data Visualization by David McCandless.
One of the best books for understanding the power of probability and statistics is Struck By Lightning by Jeffrey S. Rosenthal, a math professor at the University of Toronto–my alma mater!
Picture by Amazon
The book helps people understand whether events and incidents, such as crime rates, really are a cause for concern or fall within expected probability outcomes. It does this by examining the data over longer periods of time and perhaps identifying outliers.
It is no surprise, then, that there is a big push in math education for more courses focused on data science/statistics/probability, not just for everyday practicality, but also to illuminate how and where these numbers come from. Numbers and math are indeed everywhere, and just as we need to get used to more and more information coming at us, we also need to be able to assess the credibility of a claim or inference. Beyond that, it reminds us once again just how important it is that all students have as much access to high-quality math education as possible!