Andrew Flowers (@andrewflowers) is a freelance data journalist and a former quantitative editor at FiveThirtyEight.com. He writes about economics, trade, welfare, sports and more.
Full Transcript
John Bailer: Hello, and I’d like to welcome you to today’s Stats and Short Stories episode. Stats and Short Stories is a partnership between Miami University and the American Statistical Association. Today’s guest is Andrew Flowers. Andrew is a freelance data journalist and a former quantitative editor at FiveThirtyEight.com. I’m John Bailer, chair of the Department of Statistics at Miami University, and I’m joined by my colleague Richard Campbell, chair of the Department of Media, Journalism and Film. We’re delighted to be speaking with Andrew on our short episode today. Welcome, Andrew.
Andrew Flowers: Great to be here.
Richard Campbell: So Andrew, I have a question, just a general one, about our past election. I remember how shocked my daughter was, because she thought that Hillary Clinton had a 70% chance of winning toward the end and Trump had a 30% chance. A lot of pollsters, and I think statisticians, took a beating after that election, partly because I think people didn’t understand the numbers. So one of the things I wanted to ask you is: what doesn’t the general public understand about numbers and probability in a story as big as that last election was?
Flowers: I do think the general public takes numbers tied to political stories differently than it does numbers in other domains. But first I should say that the election model at FiveThirtyEight is really to be credited to Nate Silver, my former boss, and the politics team there. They’re the experts on this; I’m not. That said, as a close colleague of theirs, an avid consumer of their work, and someone who can speak to broader principles of data journalism, I’ll repeat what I mentioned: I think the public takes probabilities with elections a lot differently than it does other types of probabilities, for example in sports. A classic example from recent times: in January, this Alabama football team played for the National Championship, and they were thought to be historically great. According to our numbers they were, and by our numbers I mean FiveThirtyEight’s probabilities that they would win against Clemson in the National Championship game. Well, they lost; it was an upset. I think sports fans get that. They get that this is unlikely to happen, but it does. Alabama was favored roughly 70-30 over Clemson, and that’s similar to what, I think, Clinton’s odds were over Trump according to the FiveThirtyEight model on the eve of the election. Now, the short answer to why the public doesn’t get political probabilities is that 30% odds happen 30% of the time, and if your model is well calibrated, as Nate and others have argued the FiveThirtyEight model is, then having Trump win was not a six-sigma event.
But to peel back the onion one or two more layers, I think there’s an emotional investment in politics that you don’t find in sports. There’s also a repeated experience of upsets and predictions gone awry in sports that you don’t have with elections, which only happen every few years. Given that emotional aspect and that disconnect of experience, I think it makes sense that most news consumers would have a harder time digesting political probabilities. The other thing I’ll say is that FiveThirtyEight wasn’t the only election prediction model this past year, or in past years, but in this past election there were other models that had Clinton’s chances in the 90s, and I think that contributed to the frustration. Even though FiveThirtyEight was being beaten up in the weeks ahead of the election for being too generous in its odds of a Trump victory, the confusion is in part attributable to these other outlets, which were overconfident in the construction of their election models. Nate’s written about this a lot, but to my understanding a lot of election modelers didn’t take into account the state-to-state correlations in polls and polling errors. So if Trump was going to beat his polls against Clinton in Pennsylvania, for example, because for whatever reason there were polling errors there, then given the demographics of Pennsylvania it was likely that he was also going to beat his polls in Wisconsin, Michigan and so on. That’s essentially what happened: the reason the FiveThirtyEight model was more conservative is that it took those state-to-state polling error correlations into account, and other models didn’t; they were less conservative in their approach. But the deeper issue, and this is the core of it, is something Nate and others have written a lot about, which is polling.
Polling is not a perfect predictor; we know that. And by some measures the 2016 election polls were off by less than the 2012 election polls. So to wrap it all together: I think political junkies and news consumers have an emotional investment in politics that, in some sense, they don’t have with sports, and that makes upsets harder to digest. Secondly, election prediction and election modeling are still in their early stages, and some of these overconfident models really got burned, and burned their readers, by not having a statistically rigorous way of dealing with polling data.
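The point about state-to-state polling error correlations can be illustrated with a small simulation. This is only a sketch under stated assumptions, not the FiveThirtyEight model or any outlet’s actual method: the five states, the 2-point poll leads, the 3-point total polling error, and the function name `simulate_win_probability` are all hypothetical. It compares a model that treats each state’s polling error as independent with one in which most of the error is shared across states.

```python
import random


def simulate_win_probability(leads, shared_sd, state_sd, needed,
                             trials=20000, seed=0):
    """Estimate the chance that the poll leader carries at least `needed` states.

    Each state's actual margin is its poll lead plus a shared error (the same
    draw for every state in a trial) plus an independent state-specific error.
    """
    rng = random.Random(seed)
    wins = 0
    for _ in range(trials):
        shared_error = rng.gauss(0.0, shared_sd)  # one miss that hits every state
        carried = 0
        for lead in leads:
            actual_margin = lead + shared_error + rng.gauss(0.0, state_sd)
            if actual_margin > 0:
                carried += 1
        if carried >= needed:
            wins += 1
    return wins / trials


# Hypothetical inputs: five demographically similar states, each with a
# 2-point poll lead, and a total polling error of 3 points per state.
leads = [2.0] * 5
TOTAL_SD = 3.0

# Model A (assumption): polling errors are independent from state to state.
independent = simulate_win_probability(
    leads, shared_sd=0.0, state_sd=TOTAL_SD, needed=3)

# Model B (assumption): most of the error is shared across states. The
# state-specific part is sized so each state's total error stays at TOTAL_SD.
shared_sd = 2.6
state_sd = (TOTAL_SD ** 2 - shared_sd ** 2) ** 0.5
correlated = simulate_win_probability(
    leads, shared_sd=shared_sd, state_sd=state_sd, needed=3)

print(f"Leader wins 3+ of 5 states, independent errors: {independent:.2f}")
print(f"Leader wins 3+ of 5 states, correlated errors:  {correlated:.2f}")
```

Each state is equally uncertain in both versions; only the correlation differs. With independent errors, a miss in one state tends to be offset elsewhere, so the leader looks safer. With a shared error, one miss moves every state at once, so the underdog’s chances come out noticeably higher.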
Campbell: Thank you.
Bailer: Thank you so much, Andrew. It has been our pleasure to have Andrew Flowers join us on Stats and Short Stories. Stats and Stories is a partnership between Miami University’s Departments of Statistics and Media, Journalism and Film and the American Statistical Association. Stay tuned and keep following us on Twitter or iTunes. If you’d like to share your thoughts on our program, send your email to firstname.lastname@example.org, and be sure to listen for future episodes, where we discuss the statistics behind the stories and the stories behind the statistics.