COVID By Numbers | Stats + Stories Episode 210 / by Stats Stories

Dr. Anthony Masters (@anthonybmasters) is a Chartered Statistician, a Statistical Ambassador for the Royal Statistical Society, and a frequent blogger and explainer of statistical ideas. In his voluntary role as a Statistical Ambassador, Dr. Masters has contributed to BBC and Full Fact articles, among others, and he writes about statistics, survey research, and coding in R on Medium.

Check out their weekly column here

Sir David Spiegelhalter (@d_spiegel) is the Chair of the Winton Centre for Risk and Evidence Communication and has dedicated his work to improving the way that quantitative evidence is used in society. He is the former President of the Royal Statistical Society as well as a three-time former guest on Stats and Stories.

Sex by Numbers (2017) / I'd Give That Study 4 Stars (2017)

The Statistics of the Year (2018)

Episode Description

There's so much data out there about COVID-19 that it can be hard to make sense of it all. Over the last year, a couple of statisticians have been working to help the readers of the Guardian get a handle on the numbers. Dr. Anthony Masters and Sir David John Spiegelhalter have a new book out based on their weekly blog, titled Covid By Numbers: Making Sense of the Pandemic with Data, which is the focus of this episode of Stats and Stories.

Timestamps

How do you come up with topics? - 7:55

Media coverage of COVID generally - 9:39

Communicating uncertainty - 14:29

What stories are we missing? - 18:50

Public need for data - 22:43

Words of wisdom for reporting stats - 26:09


Full Transcript

Rosemary Pennington
There's so much data out there about COVID that it can be hard to make sense of it all. Over the last year, a couple of statisticians have been working to help the readers of The Guardian get a handle on the numbers. They have a new book out, Covid By Numbers, and that's the focus of this episode of Stats and Stories, where we explore the statistics behind the stories and the stories behind the statistics. I'm Rosemary Pennington. Stats and Stories is a production of Miami University's Departments of Statistics and Media, Journalism and Film, as well as the American Statistical Association. Joining me is regular panelist John Bailer, Chair of Miami's Statistics Department. We have two guests joining us today. Anthony Masters is a Chartered Statistician, a Statistical Ambassador for the Royal Statistical Society, and a frequent blogger and explainer of statistical ideas. In his voluntary role as a Statistical Ambassador, Masters has contributed to BBC and Full Fact articles, among others, and he writes about stats, survey research, and coding in R on Medium. David Spiegelhalter is the Chair of the Winton Centre for Risk and Evidence Communication and has dedicated his work to improving the way that quantitative evidence is used in society. He's the former President of the Royal Statistical Society, as well as a three-time former guest on Stats and Stories. Masters and Spiegelhalter have spent much of the pandemic writing about COVID stats in a series of pieces for The Guardian. They've got a new book out, Covid By Numbers: Making Sense of the Pandemic with Data, which uses stats to help readers better understand COVID. Anthony and David, thank you so much for joining us today.

I wonder if you could just talk about how your column in The Guardian got started.

David Spiegelhalter
Okay, I'll start on that one then, because first of all I got asked to do one last January, and they gave me 350 words. Now, have you ever tried to write about complex statistical issues in 350 words? It is difficult. I mean, I always say it's twice as difficult as 700 words, which is a sort of standard blog length. And 350 is really difficult. Anyway, I had a go, and then they said, oh, can we have some more? And I thought, I'm never going to be able to keep this up. I had already been working with Anthony on some FAQs for the Royal Statistical Society website, which we wanted to put up. I'm on this sort of COVID task force for the RSS, and we wanted some FAQs and asked for volunteers to start writing FAQs about all these statistics: what does a COVID death mean, and all that kind of stuff. So I knew he could do it, and he wanted to do it. So I said, would you like to share the writing with me? In other words, do it.

And we're still going. After 40, 42, 43 or so of these things, we keep saying we're going to stop, that we can't keep on going, but we keep on getting asked. And this is not only online. You can find all these online, on the Guardian, but it's actually in the Sunday version of the Guardian, which is the Observer. And it's in print, you know, it's in the Comment and Analysis section in the center of the paper, and we get the bottom of a page. So we get this actual printed column every week. And so we've been doing that, and it's amazing that I don't think we've repeated any topics yet.

Anthony Masters
Yeah, yeah. As I recall, I think the suggestion was that it would be a small number of columns, and that they would eventually stop asking for them. But yeah, a rather different future has unraveled.

David Spiegelhalter
We keep on asking, do you want us to stop? Hoping they're going to say yes. Anthony usually does the first draft. We talk about them on the Tuesday, perhaps Tuesday or Wednesday, just as a draft, and we edit them and get them in Friday evening or Saturday morning for the Sunday. And they really accept them without editing. Early on we got into such arguments, because they started writing the headlines, and writing inappropriate headlines, and we got really cross. And so we not only get a pretty well unedited article, we write the headlines. Can you believe it? Unbelievable. I've never known that at all. But we got into such a challenge. Yeah, we got into such arguments.

John Bailer
But I'm curious, for the first article, were you kind of assigned a topic, or was it just a general invitation to write about this?

David Spiegelhalter
Oh, good. I can't remember what it was about. Vaccinations? Or was it on excess deaths. Oh, all right. Okay. Yeah.

Anthony Masters
When was the first one that David wrote? And then we wrote one on vaccine effectiveness?

David Spiegelhalter
Yeah, explaining vaccine effectiveness was right back in the day, you know, early clinical trials, the Pfizer trials, and explaining vaccine effectiveness, which everyone was getting wrong all the time in the media, about what it meant. And then, I mean, basically they got more and more sophisticated, the arguments. Our last one was on the proportion of pregnant women who've been vaccinated. But we get into some quite subtle issues about the difficulties of estimating case rates in unvaccinated populations. We kind of think one of the ways we might try to get out of it is making them increasingly obscure and difficult. And we've actually got one ready, which Anthony's written, on collider bias, which is this issue behind the fact that smoking appears to be protective for COVID, and so on. And so we think, when we really get fed up, we'll wheel that one out, and then nobody will ever ask us to do it.

Anthony Masters
Well, yeah, I'm kind of planning on that probably being one of the last. But yeah, and then a kind of lessons learned for the last one. Yeah, the collider bias one is a properly difficult topic. I have written into the draft a recommendation for people to, you know, calm their mind, get a cup of tea, kind of thing, because you do need to think whilst reading it.
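
For readers who want to see the mechanism, here is a minimal sketch of collider bias in Python. It is not the argument from the column, which is not reproduced in this transcript. Every count and selection percentage below is invented for illustration, and the names `population`, `selection_percent`, and `covid_share` are hypothetical. In the made-up population, smokers and non-smokers have COVID at the same rate; the apparent protective effect appears only after restricting to a sample whose membership depends on both smoking and COVID.

```python
from fractions import Fraction

# Hypothetical population in which smoking and COVID are independent:
# 10% of smokers and 10% of non-smokers have COVID.
population = {
    ("smoker", "covid"): 2_000,
    ("smoker", "no covid"): 18_000,
    ("non-smoker", "covid"): 8_000,
    ("non-smoker", "no covid"): 72_000,
}

# Assumed percentage of each group that ends up in the studied sample
# (for example, people who are tested or admitted to hospital).
# Both smoking and COVID raise the chance of being included.
selection_percent = {
    ("smoker", "covid"): 60,
    ("smoker", "no covid"): 20,
    ("non-smoker", "covid"): 50,
    ("non-smoker", "no covid"): 2,
}


def covid_share(counts, smoking_status):
    """Proportion with COVID among people of the given smoking status."""
    with_covid = counts[(smoking_status, "covid")]
    without_covid = counts[(smoking_status, "no covid")]
    return Fraction(with_covid, with_covid + without_covid)


# Keep only the selected people from each group.
sample = {
    group: count * selection_percent[group] // 100
    for group, count in population.items()
}

for status in ("smoker", "non-smoker"):
    print(
        f"{status}: COVID share in population "
        f"{float(covid_share(population, status)):.1%}, "
        f"in selected sample {float(covid_share(sample, status)):.1%}"
    )
```

Within the selected sample, smokers show a much lower COVID share than non-smokers, even though the two groups were constructed to be identical in the full population. The gap comes entirely from how people got into the sample.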

Rosemary Pennington
I wonder, besides this one, has there been a story that you wanted to tell that was really difficult to get into that 350 word space?

David Spiegelhalter
Oh, I mean, anything on diagnostic tests and false positive rates, and, you know, essentially Bayes or something, is difficult. It really is. You know, the standard way you do it, if you had images and time, is, well, what it means for 1,000 people, and you'd go through a sort of expected frequency argument, and we just haven't got space to do that. And so it has to be done in a slightly more hand-wavy way. Many people have obviously written those articles, but they're a lot longer. And so we've had to do that. What else do you think? What's been the one where we struggled the most?

Anthony Masters
Modeling, I think. Like, trying to model the future in terms of scenarios and things like that, and trying to get people to understand that some of the model outputs are very assumption-driven, as in, you know, how many contacts people would have, and how that changes. And yeah, that was incredibly difficult.

David Spiegelhalter
Yeah. Trying to get the idea of producing scenarios, assumption based scenarios rather than straight prediction.

John Bailer
You know, you talked a little bit about your workflow, about the Tuesday and Friday deadlines. When do you come up with the general topic? So Anthony, when you're thinking about this, do you have an idea of what you're going to be contributing? Well, obviously, tomorrow's is done, but how about next Tuesday, a week from now?

Anthony Masters
Yes, so often I'm looking at the news, looking at social media, trying to find what people are discussing, because that's what the column is directed towards. It's about numbers in the headlines and other news stories, and trying to ascertain the statistical problems, you know, behind what may seem a very simple number. So yeah, I undertake some very rudimentary media monitoring, and then try and work out a good topic from there. And sometimes there are just lots of good topics in a particular week, so we might actually write about something, say, two weeks after it appeared in the headlines as a major news story. But yeah, it's about sort of scouring the news for ideas, essentially.

David Spiegelhalter
Yeah, for example, and we haven't decided next week's, but, you know, the announcement today that we've passed the record, you know, of 5 million COVID deaths. Well, you know, that's deaths that have been recorded as COVID, which is an absolutely hopeless underestimate of the total mortality. So, you know, we could write one on how spurious that is, but in a way it's going over the same ground, because we have done excess deaths before. And so, you know, inevitably you do start mentioning some of the same things again, but it's hardly surprising after 40-something articles.

Rosemary Pennington
David, you mentioned earlier some frustration you may have had with some of the coverage of COVID stats, and I wonder if you could maybe give us a bit of a rundown of what you see as some of the bigger missteps journalists are making, and what they might do to address those issues.

David Spiegelhalter
Okay, shall I go first on that one, while you store up yours? I've got to say, I think the journalists have done pretty well on the whole. I always say this. They really have, some of them fantastically well. And, you know, the Financial Times in this country has been excellent, but others have been very good, you know, the BBC, The Economist. And right through, we will help everybody, all the tabloids, any newspaper. There are some radio outlets I will not help, but basically I'll help just about everybody. You know, I've gone into a sulk about some people: where I have been stitched up, I will not help them again, but I will help nearly everybody else. And they are very good. The problem is if this kind of story gets out of the hands of the specialist journalists and the health journalists, who've got very adept at understanding the limitations of the numbers and what they mean. So, you know, occasionally we still get messages about the daily number of deaths announced. People might say, oh, it's the number of deaths that happened in the last 24 hours, or, oh look, it's gone up from the day before. And as we write in the book a lot, and everybody knows it, it depends so much on the day of the week. You know, roughly speaking, there are twice as many deaths on Wednesdays as there are on Sundays, and that's just because of the way they're recorded, when the records come in. So that's nonsensical. And I guess the one that we've had most trouble with, and it's still ongoing, is vax, which is essentially countering the anti-vax arguments.
And we've had to do two on that one. One was on the fact that, back in June or something, out of the people dying with COVID, more were vaccinated than were not vaccinated. And this was picked up on, and I think we were one of the first to spot it and write an article saying this is not a problem. This is exactly what you'd expect from a vaccine that nearly all the high-risk people have had, and yet isn't perfect. You know, somebody said, a small proportion of a big number can be bigger than a large proportion of a small number. I thought it was quite a neat way of saying it. I wish I'd thought of that one. Anyway, we got absolutely hammered, because anti-vaxxers picked up the headline, where we pointed out that more of the people who were dying were vaccinated than not vaccinated, and went with it. And we got all sorts of flak coming back, abuse on Twitter, threats, and all this kind of stuff. So that was quite fun. And that was our attempt at pre-empting misunderstanding, which is something I deeply believe in. But we may have pre-empted it a bit too strongly, I think, because we seemed to promote the misunderstanding.
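
The "small proportion of a big number" point can be checked with a few lines of arithmetic. The sketch below is only an illustration in Python: the population size, the 90% vaccination coverage, and both death rates are made-up figures, not numbers from the column or from Public Health England, and the function name `expected_deaths` is hypothetical.

```python
def expected_deaths(people: int, deaths_per_100k: int) -> int:
    """Expected number of deaths in a group, given a rate per 100,000 people."""
    return people * deaths_per_100k // 100_000


# Hypothetical high-risk population in which nearly everyone is vaccinated.
POPULATION = 1_000_000
VACCINATED = 900_000
UNVACCINATED = POPULATION - VACCINATED

# Assumed risks: the vaccine cuts the death rate by 80% but is not perfect.
RISK_UNVACCINATED_PER_100K = 100
RISK_VACCINATED_PER_100K = 20

deaths_vaccinated = expected_deaths(VACCINATED, RISK_VACCINATED_PER_100K)
deaths_unvaccinated = expected_deaths(UNVACCINATED, RISK_UNVACCINATED_PER_100K)

print(f"Expected deaths among the vaccinated:   {deaths_vaccinated}")
print(f"Expected deaths among the unvaccinated: {deaths_unvaccinated}")
```

Under these assumptions each vaccinated person is five times less likely to die, yet the vaccinated group still accounts for more of the deaths, simply because it is nine times larger.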

Anthony Masters
Yeah, that episode was particularly difficult, right? I think the reason why some people jumped on ours was because it's on the Guardian website, even though it's in the Observer newspaper. So people were going, like, oh, The Guardian reports this. When, of course, actually the figure had been reported by the organization formerly known as Public Health England at that time. And I noticed, for instance, BBC Reality Check wrote a similar article, Full Fact wrote a similar article, but neither suffered the same kind of backlash, which I would infer represents some issue of primacy, and probably some issue around the headline that was written for that one. But yeah, it's always tremendously difficult when you try and, I guess, unpick people's misunderstandings, in that they then just reassert the misunderstanding.

David Spiegelhalter
But I still, I still believe it's the right thing to do to preempt it, and to get in there. And tell it, tell it how it is, even if some people will deliberately take what you say and misrepresent.

Rosemary Pennington
You're listening to Stats and Stories. And today we're talking COVID by the numbers with David Spiegelhalter and Anthony Masters.

John Bailer
Yeah, it seems like there are so many challenges in trying to communicate about this system, where it's difficult to define endpoints, it's difficult to think about, like you were mentioning, even defining denominators for some of the calculations that you want to do. And then you're making predictions where there's uncertainty in the prediction. And a lot of times the story seems simplest when there's just a single number to be reported. So when you raise this in the stories that you write, how do you deal with communicating uncertainty, and the fact that we're building on ground that is not fully set?

Anthony Masters
Yeah. Okay, I'll go first. So I think our articles have actually included credible and confidence intervals in them, which is, you know, quite unusual even for specialist publications, to try and get people to recognize that actually there isn't a single point, there is uncertainty around each estimate. And that, you know, the world is a lot less sharp, and things are much fuzzier, especially when you're looking at, say, observational data and so on, than a single news headline like, oh, this vaccine has 95% efficacy, or whatever it may be. And it's also, I guess, ways of using language to try and describe uncertain situations, as in, you know, trying to point out that we might expect this many cases, or this many hospital admissions in future, or that it may fall in a range from this number to that number. And yeah, trying to express that you're uncertain is really quite difficult. But I think the better way to write about statistics is to be upfront that you don't know, and it is better to give an uncertain answer rather than one with a false sense of certainty.

David Spiegelhalter
But I think, you know, numerical uncertainty, while more complex, is still easier than the ambiguity of terms, and the fact that, as you said, John, it's often not even really clear what you're actually talking about. And there are two ways that can happen. First of all, even with how a COVID death is defined, etcetera, the actual object isn't clear, exactly what it means. But the most common misunderstanding is just, you know, what's the denominator? When people start talking about a proportion, a false positive rate, well, it's such an ambiguous term. What do you mean? Do you mean the proportion of people who have not got the disease who are falsely claimed to have got the disease, the one minus the specificity, as we would say? Or do you mean, out of the people with a positive test, how many turn out to be false? And so you have to lay that out in detail. Crucially, what is the denominator? When you see people talking about a percentage or a rate, out of what? What's the actual group you're talking about? That is just the most common mistake made, I think, and you just have to keep on banging it home. And, I don't know, I don't like using percentages, almost ever. I just much prefer "out of 100 x's, we would expect blah, blah, blah", which takes more words, and out of your 350 words you don't want to waste them. But you do tend to use a few in order to have absolute clarity about who you're actually talking about. So, out of 100 people who test positive, we would expect this many to be false positives, etcetera. To be absolutely clear. Yeah.
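
The two readings of "false positive rate" can be laid out as expected frequencies. The following is a minimal sketch in Python under invented assumptions: 10,000 people tested, 1% prevalence, 80% sensitivity, and 99% specificity are illustrative values, not figures for any real test, and `expected_test_counts` is a hypothetical helper.

```python
from fractions import Fraction


def expected_test_counts(tested, prevalence_pct, sensitivity_pct, specificity_pct):
    """Expected counts for a group of people tested, using whole-number percentages."""
    infected = tested * prevalence_pct // 100
    not_infected = tested - infected
    true_positives = infected * sensitivity_pct // 100
    false_positives = not_infected * (100 - specificity_pct) // 100
    return {
        "infected": infected,
        "not_infected": not_infected,
        "true_positives": true_positives,
        "false_positives": false_positives,
        "all_positives": true_positives + false_positives,
    }


# Assumed values for illustration only.
counts = expected_test_counts(
    tested=10_000, prevalence_pct=1, sensitivity_pct=80, specificity_pct=99
)

# Reading 1: denominator is everyone WITHOUT the disease (1 - specificity).
share_of_uninfected = Fraction(counts["false_positives"], counts["not_infected"])

# Reading 2: denominator is everyone WITH a positive test.
share_of_positives = Fraction(counts["false_positives"], counts["all_positives"])

print(
    f"Out of {counts['not_infected']} people without the disease, "
    f"we would expect {counts['false_positives']} to test positive "
    f"({float(share_of_uninfected):.1%})."
)
print(
    f"Out of {counts['all_positives']} people who test positive, "
    f"we would expect {counts['false_positives']} to be false positives "
    f"({float(share_of_positives):.1%})."
)
```

The same count of false positives gives a rate of 1% with the first denominator and more than half with the second, which is why stating the group explicitly matters more than the percentage itself.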

Rosemary Pennington
I wonder, given that you guys are combing through this data so frequently, if there are stories you think journalists are missing or under reporting that you think they should spend more time with?

Anthony Masters
Oh, I would say I've only seen a very small handful of articles about the continued and persistent shift, in England and Wales certainly, and it may be the case in other countries as well, to dying at home, where people are now choosing not to go into hospitals, possibly out of fear of infection, or because they believe that they may have a better experience at home for the end of their life. But yeah, this seems somewhat unexplored, or under-explored.

David Spiegelhalter
And we've been writing about this, and it's something I've noticed you feel very strongly about. For personal reasons, I suppose, end of life care, I think, is extraordinarily important. And it's something on which, in general, there are not enough statistics. You know, it's difficult data: to measure the quality of end of life care, what do we mean? How do we assess it? Where's the data? And yet it is absolutely vital. We learn all about what's being done in hospitals. But as Anthony said, there's a systematic shift, and these aren't COVID deaths. It happened even when there was almost no COVID around, that a third more people are dying at home than did before the pandemic, and there's actually no sign of that changing. So some journalists have picked this up and gone with it, and there have been articles and everything. But I think one of the problems is that these experiences happen to just individual families, or individuals even, on their own. There are no institutions involved, there's no organization involved that can speak on their behalf. Nobody speaks on their behalf particularly, apart from, you know, maybe the hospice organizations, or the ones who organize this care. I have seen Marie Curie, who in this country organize end of life care, and they have now been making a thing about it. But nobody really speaks on behalf of these, you know, fragmented individuals having these experiences.

John Bailer
But I've really been enjoying reading your book, and I'm about two thirds of the way through. And I just want to say how much I delight in some of the turns of phrase that you employ. I mean, you know, in one part you were discussing that the shadow of the pandemic is cast far into the future. Man, that's just nice. Nice touches and nice collaboration. It's one of these.

Anthony Masters
Yeah, I must say, I always laugh when I see people praising David for, you know, very nice turns of phrase in the columns. And then, you know, I occasionally want to go, yeah, I actually wrote that. But the assumption seems to be: if it's very well written, it's David; if it's clunky and, you know, wooden, I wrote it.

David Spiegelhalter
It's so untrue. And we really did divvy up the writing. I mean, writing this book was a real challenge. You know, they're very short chapters. Well, we were told about 25,000 words. That was the contract we signed. We rapidly negotiated a contract, with a very rapid turnaround, and it had to be written in a few months. And we handed in 38,000. We thought, you know, that's about 25,000, we said, on the log scale.

John Bailer
Well, no, I think that's what I just wanted to say, how much I've been enjoying it, and kudos to both of you. So I would like to ask you to explore a little bit at the very end. As part of your postscript, you talked about some of the lessons learned from the UK's experience, and I thought there was a lot of wisdom embedded in that list of 10 items, ranging from invest in public health data, to publish the evidence, and so on, up through put evaluation at the heart of policy. I mean, if you could have one item on a wishlist fulfilled into the future, what might that be?

David Spiegelhalter
I mean, we should say we stole those that-

Anthony Masters
We didn't steal; we have a citation for the Statistical Society.

David Spiegelhalter
Yeah, exactly what we did. Oh, no, we did. We didn't say where we stole them from. But we-

John Bailer
But they're very good. No, no, yeah, they are very good. And they're very important to highlight.

David Spiegelhalter
I suppose I contributed a bit to them. It's from the Royal Statistical Society COVID-19 working group, and we got these out, and yeah, I think they were very good. So we thought, well, why invent our own when we could use somebody else's? It's very difficult. The whole thing about transparency, publishing the evidence, is obviously unbelievably important. Building decision makers' statistical skills, or insight, data literacy. I don't know. I wouldn't like to choose one. Anthony, what do you reckon?

Anthony Masters
I'd have to make clear, for the benefit of the tape, that, you know, both authors were looking at the book, but, so...

John Bailer
Well, I was gonna say if you deliver 25,000,

Anthony Masters
Yeah, yeah, this is really tough to say. I think, given the sort of issues about the underlying data, that is to say, published journal articles and preprints and things like that, I think greater openness about the data, like the actual underlying data set, what everything means and so on, would probably be a great benefit. But it's still really tough, because even now you have, you know, numbers floating into the public realm that don't have a proper source and things like that. It's really tough to choose.

David Spiegelhalter
I think, you know, you've got to read slightly between the lines, because we were trying not to make this into a polemic. But basically, I think what Anthony says is absolutely right: the transparency, the openness about the data, is the number one thing, and in particular getting the release of data out of any political control. And that's kind of implicit. We don't go into that, but people have been going into that. And I feel very strongly that, you know, we shouldn't have to wait for approval by Number 10 Downing Street before we get a look at the evidence behind the decisions that are being made. And that, I think, is number one for me.

John Bailer
I was gonna say, if you delivered 38,000, when asked for 25, I was certain that that one would not be constraining.

I thought you might think that you'd have somewhere between two and five. So you know, but I will say good, though, they are very good.

Anthony Masters
Yeah. Yeah, it's about what you're asking for is for us to tear them rather. Yes. Yeah. Robin, right.

John Bailer
So, you know, one question I had, when I looked at building decision makers' statistical skills, and Rosemary was asking the question about journalists' coverage here, and you talked about the specialist journalists versus maybe some of the non-specialists and how things were being reported: do you have any words of wisdom for trying to help gear up and build up these decision makers' skills?

David Spiegelhalter
Yeah, I've actually got some experience with that, back in last year, with Number 10. I was just being a bit rude about Number 10 Downing Street, but Number 10 Downing Street set up its own, you know, data science group, which has been extremely effective. And one of the first things they did was develop a data masterclass for not only the senior civil servants, but politicians, ministers and diplomats and things like that. And I took part. I went down to Number 10 Downing Street, through the front door, met Larry the cat, which was great, and did some filming there. They got all sorts of people in and filmed it really quickly, but extremely effectively. And it has now been taken over by the Office for National Statistics, which is great. And it's been very successful. You know, I don't know, 2,000 people have done it or something like that, senior civil servants. Of course, it's not how to do the stats, it's basic, you know, what it can do. And, you know, it also goes up into machine learning and gives some examples of, you know, really cool stuff that's being done. And it'll need revision, but the idea was that we would then have sort of levels, grades. This is just a basic one, and people go up to a higher level. And this seems to be popular, and supported at a high level. And so I think this is a very, very exciting development that would never have happened without COVID. I mean, everyone's been banging on about this for years, that we want senior civil servants not to be trained to do it, but to know what questions to ask and what to expect, and not to have unrealistic expectations about what the data could say, at the same time as realizing how strong it can be, particularly linking data sources and things like that.
And also, I mean, my conflict of interest here is that I'm on the board of the UK Statistics Authority now as a Non-Executive Director, which oversees the work of the Office for National Statistics, etcetera. And there's the Integrated Data Service starting up, which, again, I think has really been born from COVID, which is, you know, an initiative to bring together data sources from right across government and commercial organizations in order to answer important questions, as has been done with COVID. And so it's a really exciting time.

Rosemary Pennington
Well, that's all the time we have for this episode of Stats and Stories. Anthony and David, thank you so much for joining us today.

David Spiegelhalter
Oh, thanks. Thanks very much.

John Bailer
It's been great. Thank you.

Rosemary Pennington
Stats and Stories is a partnership between Miami University's Departments of Statistics and Media, Journalism and Film, and the American Statistical Association. You can follow us on Twitter at Stats and Stories, Apple Podcasts or other places where you find podcasts. If you'd like to share your thoughts on the program, send your email to statsandstories@miami.oh.edu or check us out at statsandstories.net, and be sure to listen for future editions of Stats and Stories, where we discuss the statistics behind the stories and the stories behind the statistics.