Days of R COVID Lives | Stats + Stories Episode 366 / by Stats Stories

Gavin Freeguard is a freelance consultant focused on, among other things, data, information, and policy. He formerly worked at the Institute for Government in the UK, where he served for a time as the programme director, head of data and transparency. Years ago, Freeguard was the deputy director of The Orwell Prize, which recognizes the best of British political writing. He authored a 5-part series in Significance about reproduction rates and COVID.

Check out his Significance Article Series

Episode Description

Early in the COVID pandemic, as we figured out how to live our lives solely at home, news stories began to be filled with stories about COVID’s spread and reproduction rates. Soon, social media were filled with amateur epidemiologists trying to make sense of those rates and sometimes making a mess of it. A series of articles in Significance examined the discourse around reproduction rates during COVID and it’s the focus of this episode of Stats and Stories with guest Gavin Freeguard.

Full Transcript

Rosemary Pennington
Early in the COVID pandemic, as we figured out how to live our lives solely at home, news stories began to be filled with stories about COVID's spread and reproduction rates. Soon, social media were also filled with amateur epidemiologists trying to make sense of those rates and sometimes making a mess of it. A series of articles in Significance examined the discourse around reproduction rates during COVID, and it's the focus of this episode of Stats and Stories, where we explore the statistics behind the stories and the stories behind the statistics. I'm Rosemary Pennington. Stats and Stories is a production of the American Statistical Association in partnership with Miami University's Departments of Statistics, and Media, Journalism and Film. Joining me is regular panelist John Bailer, emeritus professor of statistics at Miami University. Our guest today is Gavin Freeguard, a freelance consultant focused on, among other things, data, information and policy. He formerly worked at the Institute for Government in the UK, where he served for a time as the programme director, head of data and transparency. Years ago, Freeguard was also the deputy director of the Orwell Prize, which recognizes the best of British political writing. He authored a five-part series in Significance about reproduction rates and COVID, which he's here to talk with us about today. Gavin, thank you so much for joining us.

Gavin Freeguard
Well, thank you very much for having me. Great to be talking to you.

Rosemary Pennington
Why reproduction rates, and how did this become a five-part series?

Gavin Freeguard
So it's a long story, which I'll try to keep relatively short. As you said, I spent a lot of time at the Institute for Government thinking about how the UK Government thinks about data and uses data, and one of the things that we often find is that people hear the D word and they can switch off. Their eyes can glaze over, and it can sometimes be quite difficult to bring the importance of data to life for people for whom it's not their main job or main occupation. And I came up with this idea: wouldn't it be good if you could bring some narrative storytelling to some of those stories behind the data, actually delving in behind the numbers, understanding the choices that are made, the things that shape how these often neutral things, as we see them, are actually put together and actually have a lot of assumptions and stories baked in. So I mentioned this idea, which was tentatively called The Birth of a Number, one day to Natalie Banner, who at that point was running Understanding Patient Data, which is a great organization in the UK that has lots of research on how the public feels about the use of their health data, and lots of resources to help people talk about it in a sort of normal but useful way. And then the pandemic struck, and Natalie, having thought this was quite a good idea, quite an interesting idea, got in touch and said, "Have you thought about doing a COVID one of these?" I thought, well, that's a rather good idea. That might actually be quite a useful thing to talk about in public. And that's how we settled upon using the R number as the way to do that.

John Bailer
I love that. Early on you talk about it being the breakout star in the early weeks of the COVID lockdown, and in the first part of this you have this wonderful collection of quotes that ranged from magic and mythic and mystical to iconic. Also, this one magic figure governed our lives, the apex upon which jobs and lives hinged. I mean, certainly it sounds like they're soft-selling the importance of this. But can you tell us a little bit about this one number and how it's defined?

Gavin Freeguard
Sure. So the R number, which is the reproduction number, is essentially the average number of people that will be infected by one person who has a disease. I think the UK official description of it is the average number of secondary infections produced by a single infected person. So if the R number equals one, then the disease is at a stable rate, because if I had COVID, I would give it to one other person on average. If the R number is two, I'm giving it to two people, and it's doubling; it's rising exponentially. And if the R number is below one, then the disease is falling in the population, which is what all of those lockdowns and all of the vaccines and all of the other efforts to try to control the disease in populations, in sort of early 2020, were aiming for: to try to keep that R number below one.

Rosemary Pennington
So I know this R number was used prior to COVID. How was this developed, and how is it used typically?

Gavin Freeguard
So it sort of started with quite a different disease, malaria, way back in the early 20th century, through various British epidemiologists, including Sir Ronald Ross, who'd spent some time in India as a surgeon, was bothered by all these mosquitoes, and realized that moving a water tank actually stopped them from bothering him. That led to Nobel Prize-winning work on understanding that mosquitoes were spreading malaria. He then turned to mathematics to try to further understand how diseases spread, thinking about diseases in a conceptual way, rather than just based on observations in the field. One of his successors as the head of what became the Ross Institute, which is now part of the London School of Hygiene and Tropical Medicine (who were actually doing quite a lot of the R number modeling during the COVID pandemic), George Macdonald, then comes along and revisits Ross's work. He says the whole field has gone too mathematical, we need to go back to what Ross was actually saying. He goes back to malaria, goes back to what Ross had written, looks at some of the things that he himself had written about malaria, and ends up defining in 1952 the basic reproduction rate in a very similar way to the way that we've just talked about it. So it's been there as a concept since the sort of 1950s. You'll find it on gov.uk, which is the UK Government website, talking about, I think, poultry influenza in Vietnam in the mid-2000s and the Ebola outbreaks in the sort of mid-2010s as well. And it's been there as an epidemiological value, but it never quite took off until, as you were describing, the sort of iconic, magic, apex, all of those big words about how it suddenly came to prominence in early 2020.
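
The definition discussed in this exchange (the average number of secondary infections per infected person) can be illustrated with a short sketch. This is only a toy calculation, not how any of the modeling groups estimated R: the function name `expected_cases_by_generation` is hypothetical, and it assumes every case passes the infection on to exactly R others on average, one "generation" at a time.

```python
def expected_cases_by_generation(r, generations, initial_cases=1.0):
    """Return expected new cases per generation when each case infects r others on average."""
    cases = [float(initial_cases)]
    for _ in range(generations):
        # Each case in the current generation produces r secondary infections on average.
        cases.append(cases[-1] * r)
    return cases


if __name__ == "__main__":
    # R below one shrinks, R equal to one is stable, R above one grows exponentially.
    for r in (0.5, 1.0, 2.0):
        print(f"R = {r}: {expected_cases_by_generation(r, 5)}")
```

With R = 2 the expected case count doubles every generation, with R = 1 it stays flat, and with R = 0.5 it halves, which matches the three situations described in the conversation.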

John Bailer
Yeah, you describing it as the star of a disaster movie seemed dead on. But, you know,

Gavin Freeguard
I think I'd just watched Contagion.

John Bailer
That will do it. We were living it, and you decided to watch it too. I mean, you're a glutton for punishment there.

Gavin Freeguard
I think you can see a spike in the general population that suddenly started watching Contagion, which actually has a very good explanation of the R number quite early on.

John Bailer
Well, you know, one aspect of the R number is that there are different flavors of it; there's not just a single number. And I think that makes it even harder to convey when there's nuance to it. So there was this R sub zero, R naught, the natural one; the R sub e, the effective number; and the R sub t, an R that was excluding immunity. Can you help flesh out a little bit why those were distinctions that were made and are important?

Gavin Freeguard
Yes, so starting with that first one, the R naught, as you say, that's the kind of natural, underlying one: if you did nothing else, that would be the R number for a particular disease. And I think that's, in the very early days, what people were trying to get hold of with COVID-19. I think most of the time people would then have been referring to the next definition you gave, which was the Re, sometimes the Rt, sometimes the Reff, which is the effective R number at any given time. So that takes into account all of the countermeasures that people were trying. So that will be things like social distancing, things like lockdowns, and also immunity in the population to a certain extent as well. And then that final one: sometimes you will find Rt, which takes into account the countermeasures but does not assume that anyone in the population has immunity. So it's a little bit of a twist on that one. Confusingly, there are also various other letters that relate to some of that as well. And I suppose the thing with all of those different definitions is you can't measure it directly. The sort of missing statistic, if you like, is at any point how many people even have COVID in the population. And it's even more difficult, therefore, to try to work out what the R number is. So you can try to get there by proxy, through things like what's often called DOTS: the duration of time for which a person is infectious, the opportunities they have to spread it, the probability that they will transmit it, and the susceptibility of the population. So it's a breakout star of a number we actually can't properly get at, and we can't really properly get at how many people had COVID at any one moment in time.
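
The four DOTS components listed above are commonly presented as factors that multiply together to give R. The conversation does not spell out that formula, so the multiplicative form below is an assumption, and the input values are made-up illustrations rather than COVID estimates. The function name `reproduction_number` is likewise hypothetical.

```python
def reproduction_number(duration_days, opportunities_per_day,
                        transmission_probability, susceptibility):
    """Combine the DOTS factors into a reproduction number (assumed multiplicative form).

    duration_days: how long a person is infectious
    opportunities_per_day: contacts per day that could spread the infection
    transmission_probability: chance that a single contact transmits it
    susceptibility: fraction of contacts who can still be infected
    """
    return (duration_days * opportunities_per_day
            * transmission_probability * susceptibility)


if __name__ == "__main__":
    # Illustrative values only: nobody immune, no countermeasures.
    baseline = reproduction_number(5, 4, 0.15, 1.0)
    # Same disease, but contacts halved (for example by distancing measures).
    fewer_contacts = reproduction_number(5, 2, 0.15, 1.0)
    # Contacts halved and half the population no longer susceptible.
    with_immunity = reproduction_number(5, 2, 0.15, 0.5)
    print(baseline, fewer_contacts, with_immunity)
```

Read this way, the first call loosely corresponds to a "do nothing" R naught, the second to an R that reflects countermeasures but assumes no immunity, and the third to an effective R that reflects both.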

John Bailer
So that provides a nice transition to thinking about how you're having to do some modeling to estimate this number, and you've described some of the data that you need. And the lament is important, that the kind of data you'd want wasn't really there directly. So I think it's impressive how many models were built up very quickly, and some of the data that was used. You mentioned that there were something like, in the UK, maybe 12 different modeling teams that were producing products that were being used for decisions, and the data ranged from deaths and hospitals to vaccine uptake and ICU admissions. Can you describe a little bit how these different modeling groups came to be, and how they got the data that they were then incorporating in some of the models they were developing?

Gavin Freeguard
Yes. So as you say, there were about a dozen modeling groups at various points. Some of those were things like Public Health England or the UK Health Security Agency, the sort of official government bodies, working with various universities. You had the London School of Hygiene and Tropical Medicine, you had the University of Cambridge with PHE, the University of Oxford with UKHSA, all sorts of different groups. And some of those had grown out of previous epidemiological modeling experiences, particularly thinking about flu outbreaks. They often took different approaches. So first of all, they would use different data sets, and you've mentioned some of those: deaths, hospitalizations, cases, serological data (so from blood donated for transplant, for instance), school attendance, bed occupancy in hospitals, various studies that were trying to understand social distancing and social contact between different parts of the population, and Google mobility data, so people who had not got that switched off on their phones, where they were using phone location to help understand how much the population was moving about. So all of those different data sets are available, and different modeling groups would take different cuts of those data sets. Some of them would use what are sometimes called compartmentalized models: they take the data and then try to break it down by different regions or different demographics, so they had different things going on within their model. Others used deterministic models. Those are when you plug the numbers in, you will always get the same results. Others used stochastic ones, which have a little bit of randomness thrown in, so you'll get slightly different outcomes even if you put the same numbers in. And you've got all of these different modeling groups using all of this different data in all sorts of different ways.
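
The deterministic versus stochastic distinction described above can be sketched with a toy outbreak model. This is a minimal illustration and not any modeling group's actual method: the function names, the fixed number of contacts per case, and the per-contact coin flip are all assumptions made for the example.

```python
import random


def deterministic_outbreak(r, generations, initial_cases=10):
    """Same inputs always give the same output: each generation is the last one times r."""
    cases = [float(initial_cases)]
    for _ in range(generations):
        cases.append(cases[-1] * r)
    return cases


def stochastic_outbreak(r, generations, initial_cases=10,
                        contacts_per_case=10, rng=None):
    """Each contact is infected by chance, so repeated runs differ around the same average.

    Assumes every case has contacts_per_case contacts and each contact is
    infected with probability r / contacts_per_case (so the mean is r per case).
    """
    if r > contacts_per_case:
        raise ValueError("r must not exceed contacts_per_case")
    rng = rng or random.Random()
    p = r / contacts_per_case
    cases = [initial_cases]
    for _ in range(generations):
        # Flip a weighted coin for every contact of every current case.
        new_cases = sum(
            1 for _ in range(cases[-1] * contacts_per_case) if rng.random() < p
        )
        cases.append(new_cases)
    return cases


if __name__ == "__main__":
    print("deterministic:", deterministic_outbreak(1.5, 6))
    for seed in range(3):
        print("stochastic:", stochastic_outbreak(1.5, 6, rng=random.Random(seed)))
```

Running the stochastic version with different seeds gives a spread of outcomes from identical inputs, which is one reason results from such models are usually reported as a range rather than a single figure.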
And then they bring all of those different models together. And that's spreadsheets being sent by email to part of the Ministry of Defence, who had the computational capacity and software to crunch it in a particular way. And then you have a Zoom call every Tuesday where, that modeling having been done, they present the probable R number back to all of the modeling groups and they discuss it. Normally there'd be quite a lot of consensus. Sometimes people would have slight outlying results, and they might be asked to explain those away. But once those modeling groups had agreed that that's a reasonable consensus on the R number, it would then go through approvals, administrative approvals, and get published at 4pm on a Friday on the UK Government website. There's quite a lot of process involved in that.

Rosemary Pennington
You're listening to Stats and Stories, and we're talking about reproduction rates and COVID with Gavin Freeguard.

John Bailer
Yeah, thank you for that. That workflow, to me, was really fascinating. There was an amazing amount of buildup of these models very quickly that was impressive right from the start, just how many groups around the world ended up developing models that were useful, and some were building on previous work that they had done. There had been previous concerns about pandemics, and there had been at least some foundational work, but the fact that this was available so quickly was really striking to me. The idea that you were then processing these types of models weekly with updated data, then going through and trying to make decisions. I think this leads, you know, you're doing a great job of leading us through your story here, Gavin. You get full marks. I mean, this is,

Gavin Freeguard
I think that's the benefit of being in five parts, doing the hard work.

John Bailer
But then you go from this workflow, and ultimately there are two aspects that you wonder about. One is, what was done with that number from the government's perspective? And then, ultimately, how did the media interact with that number? So where we started was describing this quantity, how it was then estimated, the teams producing it, and the workflow, and then a number is presented. So now what?

Gavin Freeguard
That's a very good question, because to a certain extent, the answer is clear, which is that politicians were talking about the R number quite a lot. It was there in the press conferences that were happening from Number 10 Downing Street. It was coming up in Parliament. There was a very strong impression given that keeping the R number below one was a real government priority. It's in all sorts of different government strategy documents, lots of political discussion. But there are definitely moments when it's not entirely clear exactly how, not just the R number, but particularly the R number and other statistics were actually being used to make political decisions, because there's a bit of a lack of transparency. For all the transparency about who was calculating the number, how it was being calculated, the data going into it, and all of that being published and made available brilliantly on the UK Government website, when it actually gets to the political decision making, there's a lot in the rhetoric, but it's not absolutely clear who was taking responsibility for making what decisions based on which particular numbers were pointing in a direction at a particular point in time. So I think a bit more transparency around exactly that might have been a bit useful.

Rosemary Pennington
So how were journalists in the UK covering this? And do you have examples where you think news media were doing it particularly well, and maybe some where you thought they failed?

Gavin Freeguard
So in terms of how the media were talking about it, I think certainly at the start and throughout, there were definitely a few moments of putting weight on the R number that it couldn't quite bear. I was speaking to one of the epidemiologists who said they'd done a very quick bit of working to answer a question, and within 24 hours that had become a sort of "scientist says X" as a massive tabloid newspaper headline. I spoke to another of the epidemiologists who at one point had come up with a forecast of an R of 1.01 in a particular region of the UK, and they were really worried that that number becoming a headline had led to schools being kept closed, when actually, I mean, 1.01. And as we've already heard, it is a sort of estimate of estimates, and has lots of different data limitations built into it. And yet a lot of decision-making weight was being put on that number, not just from politicians, but from the media in the middle of that. To be a bit more optimistic, I think certainly as the pandemic went on, our Office for Statistics Regulation, which is part of the UK Statistics Authority, definitely thought that journalists got better at conveying the fuzziness of the R number, talking about it as a range and being much more circumspect in how they were reporting it. Full Fact, our major fact-checking charity here in the UK, similarly thought they'd seen an improvement, but they did point out that there were still some misinterpretations going on. They pointed to one article in the Daily Express which had been looking at some preprint research papers, those which hadn't yet gone through an academic peer review process. And I think the journalist had accidentally ended up writing that COVID had been genetically engineered for efficient spreading in the human population, which was definitely not what the academic paper was trying to say. Oh, my gosh. Oh boy.

John Bailer
You know, this is a really hard story to tell. As you're talking about this, there's this issue of uncertainty, the fact that a single number often has this shine, as if there's an excess of precision associated with it, not that there's uncertainty in the models that are used or in the input data, as you've already discussed. There's the aspect of it being exponential growth. Maybe people understand that when they're investing for retirement in the long term, but not necessarily in terms of how fast a disease might go through a population. So there's a real story of statistical literacy, numerical literacy, that comes into this, and scientific literacy about how diseases work. You even wrote in one of your segments that, with this increasing familiarity, it wasn't clear whether there was an increase in literacy or if it was just an increase in familiarity from seeing it used. What's your takeaway as you reflect back on it?

Gavin Freeguard
I think probably a bit of both in terms of that literacy and familiarity, which I think by the end had definitely bred a lot of contempt of, "Oh God, it's the R number again." But I mean, it's remarkable that not just the R number, but so many other epidemiological terms had to become common parlance for large parts of the population. And actually, there is polling suggesting that, I think, more than 40% of the UK population thought they could give a clear explanation of what the R number was, which is not something you would expect from what was normally an obscure statistic. To go back to something else you were saying as well, that sort of one number versus all of the uncertainty, I think there's a real tension there, because having that one number, which does give you a baseline on which to build, on which people are broadly agreed, which can encapsulate part of the spread of the disease really simply and really quickly, is incredibly valuable. I think one of the real challenges was that people perhaps became too reliant on that single number. There's a great quote from Jeremy Farrar, who was leading the Wellcome Trust during the pandemic. He said, well, actually, you've got to zoom out and look at the whole oil painting. You can't just look at that one little bit. And you need to see that, rather than it being a breakout star, it has to be part of an ensemble cast of different numbers and metrics which will help you at different parts of the pandemic. And going back to that, there's a wonderful quote as well on that exponential point, because we as humans are quite bad at trying to understand that. I was speaking to Fliss Bennée, who was co-chair of the Welsh Government's advisory group on a lot of this.
And she said the way that she tried to explain this to politicians was to say: imagine that you've got a sports stadium and you've got a puddle in the middle of that sports stadium, and that puddle is doubling in size every five seconds. So for the first few minutes, it's just a puddle, it's not really anything, but soon your stadium is half full, and then five seconds later, that stadium is going to be completely full. And if you put that to most people, that's not what they think. Let's say the puddle is doubling in size every day, and after 10 days the puddle has filled half the stadium. You ask somebody, well, how many days is it going to take for the puddle to fill the whole stadium? Most people will say 20 days, because they're thinking geometrically. In reality, it's the next day that it's suddenly going to take up the whole stadium. So how you talk about those things, not just conveying the uncertainty, but also unfamiliar concepts, in a way that policymakers and the general public can understand, I think is a real challenge.
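
The daily-doubling version of the stadium puzzle can be checked with a few lines of arithmetic. This is a small sketch with hypothetical function names; it assumes the puddle doubles exactly once per day and is half the stadium on day 10, as in the example above.

```python
import math


def fraction_filled(day, known_day, known_fraction, doubling_time_days=1.0):
    """Fraction of the stadium filled on a given day, given one known (day, fraction) point."""
    return known_fraction * 2 ** ((day - known_day) / doubling_time_days)


def day_fraction_reached(target_fraction, known_day, known_fraction,
                         doubling_time_days=1.0):
    """Day on which the puddle reaches target_fraction of the stadium."""
    return known_day + doubling_time_days * math.log2(target_fraction / known_fraction)


if __name__ == "__main__":
    # Half full on day 10 with daily doubling: when is it completely full?
    print(day_fraction_reached(1.0, known_day=10, known_fraction=0.5))
    # How small was the puddle on day 1?
    print(fraction_filled(1, known_day=10, known_fraction=0.5))
```

Under these assumptions the stadium is full on day 11, not day 20, and on day 1 the puddle covers less than a tenth of one percent of it. The intuitive answer of 20 days is what you would get by extrapolating at a constant rate instead of doubling.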

Rosemary Pennington
So I used to be a prolific Twitter user (didn't we all?). Yeah, during the pandemic it was really fascinating to watch. There were so many epidemiologists and public health officials who were in that space attempting to talk people through things like reproduction rates and other kinds of concerns related to COVID. But there did seem to be this strange industry that rose up of people who became armchair epidemiologists, who were breaking down these stats and other information in ways that perhaps were not exactly the way the people who are trained in those fields might have done. And I wonder, given this work that you have done, if you have thoughts on how social media has complicated this work to try to help people understand what reproduction rates are.

Gavin Freeguard
I think, as you suggested, it's very much a mixed picture, because I think some of the work that a lot of the epidemiologists were doing was absolutely brilliant. They were trying to talk to the public where they were. People like Meaghan Kall, who I quote in the piece, but also people involved in the COVID-19 dashboard in the UK, were taking a lot of time to talk to their followers about what was going on, explaining things that people were raising, a sort of brilliant open communication approach, which I think made a huge difference. You're always going to get some people who are not going to respond in kind, I suppose. And I think in a sense it was natural. People were overwhelmed by what was happening, and I think they were trying to find certainty in those numbers. And as we've discussed, it's much more complicated than that. But there's also quite an interesting thing that somebody said to me, which was that it's not entirely clear how the R number suddenly became the statistic in the UK. I think obviously it was quite prominent globally, but there was something about the UK Government talking about it in a way that was perhaps much more pronounced than elsewhere. And somebody who was working within the system said to me they thought it might be that, because armchair epidemiologists were talking so much about R and it was getting a fair bit of media coverage, that might have been one of the reasons why it suddenly came to the top, partly because it was a bit more prominent, partly because, actually, people could understand that single number to a certain extent, and that may have helped crowd out some of the other statistics, which would also have been helpful. So quite an interesting, nuanced set of consequences, I think, to those social media discussions.

John Bailer
It's very interesting to see how the use of this number also changed with time. I mean, the use of single numbers is also not uncommon. GDP, percent unemployment, other kinds of indices are commonly used by countries to try to gauge action. But the one thing that matters is that you have data, and that these data are available to help support those decisions. And you described in one of your articles the idea of Frankenstein data sets, that a lot of the information that was desired was indeed missing, and how important it was (and apologies if this is not a good quote, but a paraphrasing of it) to think more carefully and practically about where data and other vital epidemiological statistics come from, and invest in it. It always seems like all the great intentions to support the types of background and foundational work are present in the midst of the crisis, but quickly dissipate in its absence. So what are the sorts of lessons that have been learned from your work and your review of this that you would like to have not disappear from the public eye?

Gavin Freeguard
I think the one that you just described, absolutely. And we saw this happen in the UK and the US to differing extents. So for instance, the journalists at The Atlantic suddenly realized that the US federal government was relying on the data The Atlantic was pulling together, and that caused them to say, well, pandemic plans stressed the importance of data-driven decision making, but largely assumed that detailed and reliable data would simply exist. They were less concerned with how those data would actually be made. Similarly in the UK, Adam Kucharski, who's one of the epidemiologists at the London School of Hygiene and Tropical Medicine, said some people working in the wider tech field were quite naive about how patchy epidemiological data is. They were complaining it wasn't being shared, when actually it wasn't being collected. And if you look at the UK's pandemic preparation, there was a big exercise back in 2016 called Exercise Cygnus, which was supposed to simulate a flu outbreak. That simulation takes place in week seven of an outbreak. So again, that is assuming that all of the data that you'd actually need is there. Those assumptions were not tested. And you mentioned the Frankenstein data sets, and that comes from people desperately trying to repurpose what was available, in really ingenious ways and under incredible pressure, to try to make it mean something. But even looking at Public Health England, which was our main public health authority at the start of the pandemic, they were getting death data as individual emails about individual patients from individual hospitals. They were then turning that into spreadsheets. They were then cross-referencing that with everything they could get their hands on.
At one point, they used a system they had for their research projects, which they used to make sure that somebody who'd died while that research was being conducted would not be contacted, so their families would not be upset by receiving correspondence. They used that system to try to work out who had died overnight at one point. So, again, ingenious, but they should never have been put in that position. And you read even the evidence given to select committees in the UK Parliament from some of the people who were at the center of government at Number 10 Downing Street in the early days of the pandemic, and they talk about wheeling in whiteboards and the head of the NHS having to read numbers off the phone, and Microsoft Word documents and random Excel spreadsheets and random emails, random phone calls, all of these different things bombarding the center of government because those data pipelines had not been thought about.

Rosemary Pennington
It's astounding to think about, years from now, looking back on this and figuring out how we survived all of it, because everyone just felt like they were doing what they could, as they could do it, with what they had, in such a frantic, crisis-mode situation. It's really remarkable what people were able to put together.

Gavin Freeguard
Yeah, it really is. I think it is a story of incredible industry and ingenuity from a lot of those scientists, a lot of those civil servants, a lot of those public servants, and politicians as well, for all that we might criticize their lack of openness and definitely some of the decisions that they made, again, trying to react and adapt in a really uncertain world. I mean, we're now five years on from it really starting to happen in terms of lockdowns, and in fact, I think it's just five years to the day, almost, since the R number for COVID first got published on the UK Government website. Again, at that point, it wasn't the main thing. It was one number among many. But yeah, it is remarkable to look back on it. And I suppose we do have to ask how much has changed, and how much better prepared do we think we might be this time round? I have to say, some of the answers I've had to that question from people who are much more embedded in the system than me are not optimistic.

Rosemary Pennington
That's all the time we have for this episode of Stats and Stories. Gavin, thank you so much for joining us.

Gavin Freeguard
Thank you. It's been great to talk to you.

Rosemary Pennington
Stats and Stories is a partnership between the American Statistical Association and Miami University's Departments of Statistics, and Media, Journalism and Film. You can follow us on Spotify, Apple Podcasts or other places where you find podcasts. If you'd like to share your thoughts on the program, send your email to statsandstories@amstat.org or check us out at statsandstories.net, and be sure to listen for future editions of Stats and Stories, where we discuss the statistics behind the stories and the stories behind the statistics.

Transcribed by https://otter.ai