The Rule Of Numbers And The Role Of The Press: What Should The Journalist Do In The Age Of Big Data?  | Stats + Stories Episode 16 / by Stats Stories

Trevor Butterworth is Director of Sense About Science USA, which advocates for evidence and transparency in science and technology in the public interest. He is also editor of, a collaboration between the American Statistical Association and Sense About Science USA that promotes statistical literacy in the news media. He has been a journalist in the US for over 15 years, writing about data and statistics and how they are interpreted in our so-called "knowledge economy," especially in relation to risk and regulation. He’s written for the New Yorker online, Harvard Business Review, The Financial Times, The Wall Street Journal, and many other publications.


Bob Long: You may have heard headlines in some American media about how sugary drinks are fattening the world, or the claim that America leads the world in soda related deaths. Or maybe you’ve seen charts displayed by members of Congress who want to defund Planned Parenthood; charts that seem to show abortions are on the rise while the agency is doing less cancer screening or other preventative services. You may wonder about the accuracy of these claims or what studies were used to reach these conclusions. Welcome to Stats and Stories; it’s a program where we explore the statistics behind the stories and the stories behind the statistics. Our focus today is becoming a science journalist and the importance of understanding the scientific data on many vital topics, from sugary drinks to climate change. Before we talk with our special guest, we asked Stats and Stories reporter Austin Fast to help us understand data journalism.

Austin Fast: A century-old bit of advice from newspaper publisher Joseph Pulitzer suggests you want statistics to tell you the truth; you can find truth there if you know how to get at it. So, how can we get at the truth? Today, more than ever, journalists are turning to a set of skills known as data journalism or computer-assisted reporting. Liz Lucas is director of the database library at NICAR, the National Institute for Computer-Assisted Reporting at the University of Missouri. She says data has become something that any good journalist simply cannot ignore.

Liz Lucas: Data really allows you to say well how much of a problem is this, how long has this been a problem. Those are questions that really help a story that are very hard to get without data.

Fast: Data allows journalists to add context and scope to better explain issues in their stories. NICAR holds boot camps three times a year in addition to newsroom trainings to show reporters from all stages in their careers that you don’t have to be a programming genius to be a data journalist.

Lucas: The most important thing that we emphasize is what we call the data state of mind. It’s not a specific piece of software, it’s not anything like that, it’s knowing, as a reporter how to use data in your reporting. And that really entails thinking about what data might exist, and thinking about how you can get it. So that data state of mind is crucial, if you don’t have that, even if you run across data and you don’t know how to think about how to make it work or think about how to incorporate it into your reporting you’re going to get stuck even if you’re a wiz at Excel.

Fast: One survivor of NICAR’s boot camps is Joanna Lynn; she’s a data reporter for the Center for Investigative Reporting in California’s San Francisco Bay Area. Lynn says data journalism doesn’t necessarily require sophisticated algorithms or calculus like you might think; most of the time it’s just your basic elementary school arithmetic that can organize information and provide a structure to find patterns and outliers. One of Lynn’s first data stories looked at schools in California that had been promised funding for infrastructure projects, but none of the cash was getting doled out.

Lynn: It was overwhelming, there was just a ton of information and it was all sort of separated into different places, and I think if I hadn’t had the ability to look at a spreadsheet and kind of combine this information and clean it up a little bit and sort it and filter it I wouldn’t have known how to approach it at all.

Fast: By combining the dozens of entries for various schools Lynn found which schools had the most at stake. She then used that information to guide her next steps in chasing down the story.

Lynn: It can help you identify your sources and zero in on the examples that are relevant for you. Organizing the information that way allowed me to really target where I was going to spend my time and who I should focus on rather than just staring at this list and picking at random, or picking ones that were nearby.
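
Editor's note: the combine, filter and sort workflow Lynn describes can be sketched in a few lines of Python. This is only an illustration of the general idea, not her actual analysis; the school names, dollar amounts and the `unpaid_by_school` helper are all invented for the example.

```python
from collections import defaultdict

# Hypothetical records: (school, promised_dollars, paid_dollars).
# Names and amounts are invented for illustration only.
records = [
    ("Adams Elementary", 1_200_000, 0),
    ("Baker High", 800_000, 0),
    ("Adams Elementary", 2_500_000, 0),
    ("Cedar Middle", 400_000, 400_000),
    ("Baker High", 1_100_000, 0),
]


def unpaid_by_school(rows):
    """Combine multiple entries per school and total the money still owed."""
    totals = defaultdict(int)
    for school, promised, paid in rows:
        totals[school] += promised - paid
    # Drop schools that are owed nothing, then sort largest amount first.
    owed = {school: amount for school, amount in totals.items() if amount > 0}
    return sorted(owed.items(), key=lambda item: item[1], reverse=True)


ranking = unpaid_by_school(records)
for school, amount in ranking:
    print(f"{school}: ${amount:,}")
```

Sorting the combined totals puts the schools with the most at stake at the top of the list, which is the kind of ordering a reporter could use to decide whom to call first.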

Fast: Americans are constantly bombarded with research studies, some well researched, and others clearly biased. What’s the best way for journalists not to get used by special interest groups or by the politicians? Both Lucas and Lynn suggest one thing: always get the raw data.

Lynn: You know you would never want to take what a source tells you at face value and the same is true when they start including numbers or data in their conversations with you.

Fast: That was Lynn’s advice; for Lucas, it’s all about asking tough questions.

Lucas: What’s the data behind that number? How did you get that number? Right, make them show their work; make them show you the data that they’re using. That way you can, the best that you can, try to figure out what some of the potential problems might be with how they’re reporting something.

Fast: In the end, numbers, decimals, and percentages don’t have to be deadly to reporters and readers, as long as data journalists are making sense of them for the rest of us. For Stats and Stories I’m Austin Fast.

Long: Joining me on Stats and Stories for our discussion of Science Journalism are Miami University Statistics Department chair John Bailer and Media, Journalism and Film chair, Richard Campbell, and our special guest today is Trevor Butterworth, the editor of a joint project between the American Statistical Association and Sense about Science USA, promoting statistical literacy in the media and society in general. He’s also the director of Sense about Science USA and Trevor has written for publications such as Financial Times, The Wall Street Journal, Washington Post, Forbes, Harvard Business Review and Trevor, we welcome you to our show today.

Butterworth: Thank you for having me.

Long: So I’ve got a question for you of how you got drawn into the whole field of Science journalism. Was it something you set out to do or something that kind of happened along the way?

Butterworth: You know, I give a lot of talks to scientific groups, particularly chemists, and I tell them that no one goes into journalism to write about chemistry. And that chemistry, unlike physics, doesn’t have a big hole in the ground where they’re trying to recreate the beginning of the universe, and unlike biology it doesn’t have all these exciting sort of findings about what will make you live longer or love better. So, chemistry is really sort of the poor man out. And I suppose somebody had to wade in and write about chemistry. So, I’m a reluctant science journalist, and I’ll say why: because we have this notion, or we’ve had for many years this notion, of two cultures. The idea that you’re either with the arts and humanities wing of society or you’re a scientist, a mathematician, a statistician, and obviously this came about through C.P. Snow’s famous book of the same name. But for me the issue is that, even though I have been a science writer for Newsweek, I reject the label science writer because I believe we all live in one culture, and we all depend on one method, which is the scientific method, for producing valid findings. So I guess that’s a non-answer to your question, but I do think it gets at the heart of what I really believe in, which is that statistics and science is something that we all should be engaged in no matter what major we had in college.

Long: John Bailer, I’ll go to you for the next question.

John Bailer: Well, you’re going to find no opposition here in terms of that perspective that you just shared, Trevor. Thanks a lot for that broad view. So, what led you to where you are, and what you did? Tell us a little about your background.

Butterworth: Right, well I have one of those classic degrees that’s supposed to be super useless, which is a degree in art history. I did the classic wandering around the arts and humanities and ended up, after several tries at graduate school, I ended up in journalism school, and weirdly enough it was in journalism school that I began to see how journalism reflected so many of my interests in graduate school. Principally what counts as valid knowledge? And this to me, obviously there are all sorts of academic debates in the humanities as to what valid knowledge was, and you had really interesting perspectives in philosophy for instance, and I guess, had I been good at math I probably would have become a philosopher. But I saw that one didn’t need to be an academic to begin asking really interesting questions that were of deep relevance to the public. I suppose really what I felt in graduate school was that there were lots of disputes that really weren’t reaching out to the concerns that people had. And yet, in the public realm, there were all sorts of claims to knowledge and claims to evidence where the evidence often wasn’t there. And that, to me was what was interesting about journalism. It was, it was actually a nineteenth century idea of journalism. Journalism was conceived, in many ways, as a science, as a social science. One in which journalists would shed new light on a dramatically changing economy and political environment, by thinking about things in a really concrete analytical way. And of course the famous expression of that comes from none other than Joseph Pulitzer in his 1904 article The College of Journalism in which he extols the romance of statistics, in which he talks about journalists being able to find truth in statistics, but that it wasn’t necessarily easy. 
And of course the interesting thing, I guess my perspective as a historian, you know I spent some time in graduate school in history, was that Pulitzer would have no idea about how complex statistics were about to become in the next twenty years.

Long: Richard Campbell, go to you for the next question.

Richard Campbell: Alright Trevor, welcome. You hit on something there about the origins of journalism starting more as a social science, but the truth is that the real tool that journalists have are narratives, are stories. So I think the challenge for us, with our students, often is how do you tell stories that are about complicated numbers, and what are the challenges there?

Butterworth: A great question. I’m reminded of a quote from Best Newspaper Journalism in 1982, you know one of those companions that come out, or at least did come out, maybe they’re historical artifacts now, but they used to come out every year around the holiday season, and one of the comments by one of the journalists was, “I never let two paragraphs with numbers bump into each other, because I think numbers are deadly.” And that, I think, was very much a conventional view, and there was some truth to that; at some point numbers aren’t reducible to words. Numbers are numbers, and it’s really difficult to tell a story when you have this, essentially this chasm between conventional storytelling and even just the ability to describe a numerical concept in words. So yeah, that’s a real problem. But I also look to, you know, Tom Rosenstiel, director of the American Press Institute and before that a giant in the field of sort of journalism criticism with what he did at the Pew Center for the People and the Press, and he said something quite recently which I thought, you know, really actually captured the moment we’re in right now. We’re entering a new age of empiricism, with big data, with data analytics; we have become a quantitative, quantified society in ways that are not only resulting in the multiplication of knowledge but are really more complex than ever before, and the question is, is it true? And that’s the question of journalism: are all these numbers true? And I think one of the problems we are in right now is that certain data fits certain narratives really easily, usually, you know, heroes and villains. Something’s going to kill you; something’s going to save your life. Very, very simple kind of old stories, actually kind of an updated yellow journalism, you know, we’re holding power accountable, they want to make us drink this horrible drink and it’s killing us and duh, duh, duh.
But actually when you start to think about things statistically, which, when I say that, I mean when I go and talk to a statistician and say “help me out here, I don’t know what I’m reading,” you actually find that with the combination of having that quantitative view of the world, plus all the old shoe leather reporting, plus all that sitting and reading the history books, you can find really, really interesting stories that you wouldn’t have seen if you had just gone with the conventional narrative.

Campbell: So, get at that a little bit more because I think you’re right, I think that a lot of journalists are going to be attracted to the kind of narrative that’s going to allow them to have a villain and you know, who’s the good guy, who’s the bad guy. So, what are some of those stories that they’re missing by not looking more closely that are also interesting and compelling stories?

Butterworth: Well you have, right now, the National Institutes of Health warning of a reproducibility crisis in science, at least in biomedicine, and that’s a tremendously interesting story in itself: how have we funded so much research that can’t be reproduced? Well, there may be some good reasons for that. It’s not often easy to reproduce a study, and it doesn’t mean that something’s gone wrong, but when you have papers being published with, say, like Amgen and Nature, where they were only able to reproduce 6 of something like 51 landmark cancer studies, what is going on there? And the key thing is that when you ask that question, what are the tools you need as a journalist to actually answer that question? And I think the challenge is making the news media realize this isn’t something you can do on your own; you actually need a statistician to come along, or several, and hold your hand and work on it together.
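
Editor's note: to put the figure Butterworth cites in proportion, the short Python sketch below turns "6 of something like 51" into a rate with a rough uncertainty range. The counts are taken from his spoken approximation, not from the published paper, and the use of a Wilson score interval (and the `wilson_interval` helper name) is our own illustrative choice, not something mentioned in the interview.

```python
import math


def wilson_interval(successes, trials, z=1.96):
    """Approximate 95% Wilson score interval for a proportion."""
    p = successes / trials
    denom = 1 + z**2 / trials
    center = (p + z**2 / (2 * trials)) / denom
    half_width = (
        z * math.sqrt(p * (1 - p) / trials + z**2 / (4 * trials**2)) / denom
    )
    return center - half_width, center + half_width


# Counts as spoken in the interview ("6 of something like 51"); approximate.
reproduced, attempted = 6, 51
rate = reproduced / attempted
low, high = wilson_interval(reproduced, attempted)
print(f"Reproduced: {rate:.1%} (approx. 95% interval {low:.1%} to {high:.1%})")
```

Even with the uncertainty that comes from a small sample, the interval stays well below one half, which is why the result drew so much attention.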

Long: I’m Bob Long, you’re listening to Stats and Stories and we are exploring the topic of becoming a science journalist. Joining me are Statistics Department chair, John Bailer, Media, Journalism and Film chair, Richard Campbell and special guest, Trevor Butterworth, the director of Sense about Science and the editor of which is a joint project between the American Statistical Association and Sense about Science USA. Trevor, one thing I noticed that I think we’re kind of getting at: you’ve got the statistical part, that a lot of times people like yourself want to really share with people out there because you know it’s important. On the other side of the coin, you have the editors who, a lot of the time, have their own idea. Can you give us an example of the tension that exists there between what they want as a headline or a story and what you think the story should be?

Butterworth: So, at a magazine, well yes, when I was writing for Newsweek I pitched what I thought was a really interesting story about sleep research. And actually, a great book had come out by a German sleep researcher; he coined the term social jetlag. And there was an amazing, really interesting study by a guy called Orfeu Buxton at Harvard, which simulated shift work in a group of people and showed that they would develop biomarkers that are commonly described as pre-diabetes, which is itself somewhat controversial but also very interesting because a significant amount of the American workforce is on shift work of some kind, so if that’s correlated with weight gain then maybe we’ve got a really big problem here. So, I’d arranged the interview, I’m talking to the guy in Munich, I’m talking to this other guy, and you’ve got a fairly narrow window of time to do this, and then I get an email from my editor: can you drop the sleep story and write about dinosaur farts instead? And, if I can labor the story a little bit, I was incredulous, and particularly because I was one of these people who had no interest in dinosaurs; I knew nothing about paleontology. So I protested, I’ve got all these interviews lined up. “OK,” he said, “you can do that next week, but you’ve got to write about dinosaur farts.” And the reason was that a major study, according to other news media, had claimed that dinosaur farting may have caused global warming in the Mesozoic period. So I’m going like, where do you begin? Where do you begin with that?
So I googled “paleontologist” and then the first email I get, I email him and say, this is kind of strange… I get an email back quite quickly and he goes “We’re all talking about this study, it’s kind of crazy, it’s not really, you know.” So we have this conversation, on email, and he points out, “You know what else occurred during the Mesozoic period, well the Atlantic Ocean formed and that might just have had a little more of an effect on climate than dinosaur farts.” So anyway, I got the study, it wasn’t a major study, it was one of those short little notes that had been written in a journal, 700 words. And then the next day I’m on the phone to the researcher and I’ve been prepped, I’ve been prepped, I know how to ask the awkward questions. And immediately the researcher says “well, you know, this is just a back of the napkin calculation. I did what I tell my students never to do, which is to extrapolate.” And basically he extrapolated from an elephant to an Apatosaurus. Which, just in case you’re interested, that’s 200 liters of methane a day. So it clearly would not have been fun to stand downwind of an Apatosaurus, which I believe is now called a Brontosaurus. But the point was that the story that I ended up doing was different from a lot of the other stories because I’d spoken to a scientist who had clued me in as to the weaknesses of this study. And what I’ve found, and what I’ve seen, is that it often pays to take the time to read a scientific study and find out its limitations; reading that limitations section is the most important thing in any paper. And then when you have an honest conversation with a scientist they will be much more open about the limitations and uncertainties of their research, if there’s good reason for that. Whereas, there is another tendency which is if you keep prodding the scientist to say, “Well what does this really mean?
Give me the takeaways.” They will be forced to be more certain than they would be if they were simply talking to other scientists. And I think that’s a real danger that happens; this desire for narrative simplicity is just ruining science.

Long: John Bailer, we’ll go to you for the next question.

Bailer: One aspect of what you just described makes me think that when people look at scientific papers that they’re tempted to just look at the abstract, introduction and discussion, that in essence the nuance is found in the heart of this, in the methods, and in some of the results and the characterization of that. And one concern that I often have when I think about how people are looking at evidence and processing evidence is that they’re going to have this idea, here’s my idea for this story, or here’s my argument that I wish to make and once I find studies that are consistent with it, full stop. So how do you help people avoid that temptation?

Butterworth: Well, that’s a great question and it is a huge temptation, to cherry pick the studies to fit your narrative. Journalists are under a lot of pressure to turn out stories that people will read. I think that you have to create a culture in which that’s called lying. I mean it is, that’s fraud, it’s fraud, it’s a kind of, maybe not full-blown academic fraud but it’s fraudiness. You know, Stephen Colbert had this concept of truthiness; we need fraudiness, and that needs to be, it’s like, no sorry, you don’t run this kind of thing. I have a great example. So there was a study that came out a few years ago, it was in Pediatrics and again, there are thousands of studies on BPA, but this study claimed that gestational exposure to BPA in mothers led to bad behavior in girls, age three. So, it gets lots of air play, the head of the National Institute of Environmental Health Sciences comes out and tells reporters ‘it’s a really good study, it’s really well done, sample size is good’ etc. etc. Now I know from my contacts at the EPA and the FDA that they did not believe that to be the case. So, I just handed the study over to a statistician, I said, ‘what do you make of this, what do you make of the design?’ Actually, I’m jumping ahead, I said ‘what do you make of the statistics?’ and he comes back and he said ‘you know what? All the fancy statistics in the world can’t solve a study that’s been badly designed.’ He said ‘this study wasn’t designed to answer the question.’ Which is kind of interesting, you know, why did nobody pick that up? But when he looked through the data, he saw something that you would completely miss if you were just looking at the abstract, not even looking at the abstract, if you were reading the press release, as most journalists would have been reading the press release, and that was that gestational exposure to BPA improved behavior in boys. But that wasn’t mentioned anywhere.
So, of course he said this was meaningless, it still wasn’t designed to answer this question, but it was interesting that nobody mentioned this finding that was completely at odds with the thrust of the- that’s what you get when you give somebody who knows how to read data, access to the data.

Long: You’re listening to our discussion of becoming a science journalist, I’m Bob Long along with our regular panelists, Miami University Media, Journalism and Film chair Richard Campbell and Statistics Department chair John Bailer and special guest, Trevor Butterworth. John, go back to you and then we’ll go to Richard as we wrap up our discussion here.

Bailer: Thanks, Richard. Just a quick follow up. One aspect of this that seems critically important is the issue of variables. What’s measured and how measured and I think that a lot of times people get hung up in some of the methods that are applied without thinking about questions of design, as you had recently mentioned, but also, just the nature of how things are measured or if they make any sense at all? I think, how do you encourage that kind of sensitivity as part of the critical reading of such evidence?

Butterworth: Well, that’s, again, well, one of the things we’re doing at STATS, with the stats project is that we’ve got a volunteer team of statisticians from around the country, who are ready to help journalists read through the data, whatever data they have. When they have a problem, or there’s something they don’t understand, email us and we’ll talk you through it. And when I say “we,” they will talk you through it, I’m not going to talk you through it. And the journalists that have used us have found this service invaluable, because it gives them insight. You know, simply talking through the numbers is a great way to get a journalist’s brain thinking. And we’ve done this on an ad hoc basis; now with the American Statistical Association, we’ve really kind of pumped this up and we really think this could be a game changer. But ultimately, what we want to see is the statistician being in the newsroom. We want to see the future of news being statistical, at least in parts of those news stories that need that kind of analysis. It is a challenge, but I do think that we are seeing, in this new age of empiricism, new news organizations being created with new journalists; Julia Belluz has been doing terrific work at Vox. And they’re thinking about science and data in a new way, not so much a new way from your perspective as a statistician, but from a new perspective as a journalist. They’re seeing the value in doing this. They want- You know Nate Silver’s famous line about punditry being useless, because you know, pundits’ forecasts were just like the Philip Tetlock research on the chimpanzee throwing darts at a board. So, I think that represents a shift, a cultural shift.
A cultural shift born of a more technologically empowered generation who are more comfortable with technology, and the tools, the data visualization tools, all of these things, there’s a shift in culture that’s come about with digital media that may be helping bring about this greater awareness. It’s a huge problem, but I do think there’s a realization, that we’ve got a lot of these stories wrong, let’s stop getting them wrong.

Long: Richard Campbell, go back to you.

Campbell: We started out; we learned that you were trained first in art history, a very different field. And, we have an obligation here, I think, to train our journalism students as best we can. And one of the things we do is we ask them to take a second major, so they all double major. If you were in my position, what would you be asking us to be teaching students in this new technologically advanced world? Because I think you’re exactly right, there’s a window that’s open here, to change the way we do things.

Butterworth: I think it goes without saying, I think the most valuable thing that any journalist could be taught today is quantitative reasoning around statistics. I mean, really, it’s so vital to so many sectors of society. But I will, I will sort of shift gears a little bit and give a nod to art history, or to art, because from the New story I did, I went to Covance, a big laboratory just outside of Indianapolis, that does, I think, the majority of the world’s analytical work on blood samples. I talked to their chief data officer and he was telling me of the kinds of people he needs to do the data analytics on all of this, to start to build sort of a Google of blood, I think was the phrase I came up with, which he obviously liked. And he said that they had to be top-flight computer scientists, statisticians, mathematicians, really all the hard academic disciplines, but he said that they also had to have a sense of art. They had to have a sense of design, they had to understand how to design beautifully, and he actually used those words. Coding had to be beautiful, and I thought that that was also very interesting, so the aesthetics and the analytics, you’ve got to combine both of those.

Long: John Bailer, got time for one more question, we’ll go to you.

Bailer: Wow, like it. Aesthetic and analytics, we had a guest a number of episodes ago that talked about numbers as narrative plot elements, which we also liked very much as an image, and a sense of this. Now you had mentioned the idea, earlier, about a quantitative, quantified society, and you know I wonder how much that pushes down in terms of expectation for community, are people ready for the stories that have this increased level of quantification and quantitation of information?

Butterworth: That’s again, if I’m supplying answers you want to hear, you’re supplying really difficult questions. That’s a really good one. I think, again, there’s a wider issue of science communication here, and how people understand science as something beneficial. In fact, Umberto Eco gives a great analogy in his essay Science, Technology and Magic: he says people largely understand science through technology, which is a series of instantaneous effects, which, in other words, is the same as magical thinking, because that’s what magic is, instantaneous effects. And obviously magic can be good, magic can be bad. One of the areas in which ‘good magic’, or at least the hope for, the desire for ‘good magic’, shows up is of course all of personalized medicine and precision medicine, and all the data initiatives that are going on to really sort of open up the inner cosmos of the human body. If that’s communicated well I think that people will understand why their data is so important to humanity, to the common benefit of all. That’s one of the reasons why, as an aside, Sense about Science is running the AllTrials campaign in the U.S., which is all trials registered, all results reported; we’re all about the importance of data. Good data and bad data, doesn’t matter, we need to know all the data in order to figure out answers to our problems. In medicine, it is difficult to forecast how imminent the quantified future is, but everyone’s trying to get there. And I think there are huge benefits to that. But like any movement, people will react against things that seem alienating. I mean, we’ve had great reaction against the Enlightenment dream, which was, you know, science for the benefit of mankind. We had the reaction against that in the 20th century, which was that science was something alienating, impersonalizing, instrumental reason working against people’s authentic experience.
So, we have to be careful as well not to go so far into the glories of data that people think we’re creating a matrix.

Long: Trevor Butterworth, we want to thank you very much for joining us on Stats and Stories to share your insights on the importance of science journalism today. If you’d like to share your thoughts on our program you can send your emails to Be sure to listen for future editions of Stats and Stories where we’ll look at the statistics behind the stories and the stories behind the statistics.