How Esports Stats are Tracked | Stats and Stories at JSM by Stats Stories

Brian McDonald is currently the Director of Sports Analytics in the Stats & Information Group at ESPN. He was previously the Director of Hockey Analytics with the Florida Panthers Hockey Club, an Associate Professor in the Department of Mathematical Sciences at West Point, an Adjunct Professor in the Department of Management Science at the University of Miami, and an Adjunct Professor in Sports Analytics in the College of Business at Florida Atlantic University. He received a Bachelor of Science in Electrical Engineering from Lafayette College, Easton, PA, and a Master of Arts and a Ph.D. in Mathematics from Johns Hopkins University, Baltimore, MD.

Tarran: Hi, it’s Brian Tarran from Significance Magazine and I’m here at JSM 2019 in Denver, Colorado with Brian McDonald, Director of Sports Analytics at ESPN, hi Brian?

McDonald: Hi, how are you?

Tarran: Very good, thank you. Welcome to JSM and today we’re going to be talking about e-sports data, which is something that I’m sort of interested in from a personal perspective because I play a lot of games, but not competitively because I’m pretty rubbish at them. So, tell me, e-sports data, how long have you been involved and interested in this area?

McDonald: I've only been working with e-sports data for about a year or so. I just kind of got interested in it because of the booming popularity of e-sports. I’ve done a lot of sports analytics work, so I’ve worked with sports data a lot. E-sports seemed very similar to that just because there’s a lot of team games- like five-on-five type team games, so that has some similarities with hockey and basketball, it seemed. But it also seemed different enough so that it wouldn’t be exactly the same as those. So, I thought it would be interesting to start studying those things, and with a quick search of the web, I didn’t really see a lot of the analysis that has been done in sports. I haven’t really seen it done in e-sports.

Tarran: That’s quite interesting because you’d imagine it being all based on computers, that you’d have access to the data and that would almost be built into the sport from the ground up.

McDonald: Yeah, I think the data- definitely, there are benefits of the data. In theory, it should be pretty much perfect data, in theory. So that’s one of the benefits. But it seemed like a lot of time was spent on other things other than analyzing that data. Some folks spend time creating AI robots to play the game very well, and things like that, but as far as analyzing player performance, which is one of the things that is done a lot in sports analytics, it didn’t seem like there was a whole lot done in that area.

Tarran: So can you- I mean you spoke a bit about being interested in sports analytics before, but can you pinpoint the moment that you thought that this e-sports data might be worth digging into? Is there anything that catalyzed that interest?

McDonald: I don’t know that I can remember the point where that happened. I think it was just I saw some statistics on what the revenue was or what the viewership was for these things, and I also had a couple of people who I work with who would watch Fortnite on Twitch, like during the workday, for example, so that might have helped catalyze it. They were a younger group of colleagues and it was obvious that it was much more interesting to those folks and so e-sports must have a pretty good future, as those folks grow older. I think viewership and revenue will likely grow.

Tarran: So where did you start when you wanted to start digging into this data? Were there ready-made sources of information online that you could download and analyze? Or did you have to approach companies about getting access to their databases?

McDonald: Yeah, there were a couple of sources online that we heard about through a friend, where they have done a lot of the pulling of the data from the company that makes the game, and they’ve sort of cleaned it and put it in a pretty usable format. And there’s also an R package that helped with pulling some of that data, so I think we got data from two different sources: the R package and web scraping.

Tarran: What sort of data are we talking about here? What are you dealing with? What sort of data are you looking at?

McDonald: The basic data that we use is basically game-by-game data for players. And it’s- the data just tells you how many kills, assists, deaths. The one game that we’ve been focusing on is Defense of the Ancients 2 (DOTA 2). So they have gold per minute is one of the big things with that game and so it has those things, and then you can tell what team they’re on and whether or not that team won or lost the game, so those are the main data points that we’ve been using.

Tarran: And what are the questions that you’re looking through with this data? What are the areas that you’re focusing on in particular?

McDonald: I think the most important thing that we were looking at was- we wanted to come up with an evaluation of player performance. It’s a common thing done in sports analytics, where you have these team games and each player has an individual contribution, but looking at the most basic stats won’t tell the whole story because the- what we would call “box-score” type stats- that’s what we call them in regular sports, so just looking at kills, deaths, gold, something like that- it’s not going to tell you the full story because those are highly dependent on the player’s role. They’re also highly dependent on how good the player’s teammates are. And whether or not they lose the game is also highly dependent on who the player’s teammates are. So we wanted to come up with a metric for player performance that accounts for the player’s teammates and their opponents.

Tarran: And what is that metric that you’re using?

McDonald: So it’s a regression-based metric. We’re calling it Adjusted Plus Minus- that’s what it’s been called. For example, I've done it in hockey before with a few other folks. I think it was originally done in basketball, but the idea is the plus minus statistic in basketball or hockey is you know, you get a plus one for every point that’s scored when you’re in the game, and a minus one for every point that the opponent scores when you’re in the game. So that’s what plus minus is. The adjusted plus minus is just referring to the fact that you kind of have a plus minus type statistic but you’re adjusting for the player’s teammates and opponents.
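As a rough illustration of the difference between plus minus and adjusted plus minus as described above, here is a minimal Python sketch. Everything in it is an assumption made for illustration: the stint data, the player labels A to D, the use of a score margin as the response, and the ridge penalty that keeps the small system solvable. It is not the model McDonald and his co-authors fitted.

```python
# Toy data, invented for illustration. Each record is one stretch of play.
# margin = points scored by the "home" side minus points scored by the "away" side.
stints = [
    {"home": ["A", "B"], "away": ["C", "D"], "margin": 3},
    {"home": ["A", "C"], "away": ["B", "D"], "margin": 1},
    {"home": ["A", "D"], "away": ["B", "C"], "margin": 2},
    {"home": ["B", "C"], "away": ["A", "D"], "margin": -1},
]


def raw_plus_minus(stints):
    """Add the margin for players on the home side, subtract it for the away side."""
    totals = {}
    for s in stints:
        for p in s["home"]:
            totals[p] = totals.get(p, 0) + s["margin"]
        for p in s["away"]:
            totals[p] = totals.get(p, 0) - s["margin"]
    return totals


def solve(a, b):
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [x - factor * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def adjusted_plus_minus(stints, ridge=1.0):
    """Ridge regression of margin on +1 (home) / -1 (away) player indicators."""
    players = sorted({p for s in stints for p in s["home"] + s["away"]})
    index = {p: i for i, p in enumerate(players)}
    n = len(players)
    # Start X'X with the ridge penalty on the diagonal (hypothetical choice of 1.0).
    xtx = [[ridge if i == j else 0.0 for j in range(n)] for i in range(n)]
    xty = [0.0] * n
    for s in stints:
        row = [0.0] * n
        for p in s["home"]:
            row[index[p]] = 1.0
        for p in s["away"]:
            row[index[p]] = -1.0
        for i in range(n):
            xty[i] += row[i] * s["margin"]
            for j in range(n):
                xtx[i][j] += row[i] * row[j]
    return dict(zip(players, solve(xtx, xty)))


raw = raw_plus_minus(stints)
apm = adjusted_plus_minus(stints)
```

In this toy data, players B and D end up with the same raw plus minus, but the regression separates them because it accounts for who each of them played with and against.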

Tarran: The metric that you’ve got, the data that you have, are you starting to see e-sports teams interested in it in the way that baseball, basketball, hockey teams are really investing in data now?

McDonald: Yeah, we hope teams get interested in it. We haven’t published it yet, so we’re giving a presentation here at JSM. We have a paper that’s close to being a final draft that we’ll be submitting sometime soon. But hopefully, after that, we’ll see some teams get interested. I think some of the leagues are coming up with their own advanced stats. I think the Overwatch League released something a couple of days ago that was sort of a more advanced player type metric. It didn’t seem to be the same kind of thing as what we’re doing. It didn’t seem to be a plus minus thing, but it seems like an improvement over what’s out there. So leagues are definitely interested in providing these metrics and I imagine teams would be interested as well. So that’s kind of the hope.

Tarran: The use case is going to be the same in terms of trying to optimize the strength of the team. Make sure that you’ve got a good number of attacking players and defending players and whatever it might be.

McDonald: Yeah exactly, that kind of thing, and then also eventually, hopefully, we’ll be able to model what sort of players play well with which other players. Or what characters or heroes that the players used mesh well with other ones. And so I think there's a lot of things that people know by experience from playing the game, but maybe there’s something new that we uncover or maybe we quantify by “how much does this actually matter?” We know it matters, but how much does it matter? Does it matter a little or does it matter a whole lot? So those are the kinds of things. The chemistry between players, team chemistry and things like that. Pretty much things that you might ask about a regular sport we ask the same kinds of questions with e-sports.

Tarran: Well, of course, the interesting thing you’ve got with e-sports is that you’ve got the players and then you’ve got their avatars, haven’t you? So you’re kind of dealing with almost two characters, or two individuals because they have to –well that player, the human player performance might vary depending on which character they choose, which role they decide to play?

McDonald: Yeah, exactly it’s – that adds an interesting element to e-sports, that’s not really there in sports. I mean it’s sort of there in that there are some things about an athlete that they can’t really do- by training they can’t change this attribute about themselves. So, height, for example, can’t really change your height. But something like strength you could change. So these avatars, the heroes, they have these attributes and so that’s almost kind of like the physical body that an athlete has. So it’s an area that is one of the most interesting things that we hope to look at is just the role that heroes play. And it’s pretty different from the kinds of things that we would see in sports analytics.

Tarran: Are there other differences between your experience with traditional sports and e-sports?

McDonald: Yes, I think there’s definitely a lot of similarities. I think the big difference is- we kind of just talked about the heroes- I think age might be another big difference. Not really sure. So, typically in sports, it’s much more difficult to project a player’s performance if the player is young. So it’s much more difficult to project a 16-year-old than a 21-year-old, for example, because of the growth- the physical growth and development that they undergo during those years. I guess it’s not totally clear yet whether the fact that e-sports athletes are much younger, it’s not clear yet whether that will make it more difficult to project their future performance, or whether just the nature of e-sports is such that it’s actually easier to project future performance for some reason. Maybe the kinds of skills or physical growth maybe doesn’t matter as much- the physical development. And so maybe players are roughly the same when they’re 16 versus 21. So it’s another unanswered question that would be pretty interesting to try to tackle.

Tarran: And are there other challenges that you foresee on the way to getting e-sports analytics established within the e-sports community? Getting that investment that we see elsewhere?

McDonald: Yeah, I don’t know. I don’t have a good feel for what the hurdles would be. Part of me thinks that you’ll have the same hurdles as you do in sports. Part of me thinks that the community might be more familiar with or more accepting of technological advances or just making use of data, just because of the nature of- you know the folks that worked on the game and created the game you know, a lot of computer scientists, things like that- so, in that sense maybe it would be more quickly accepted than it is in more traditional sports.

Tarran: I guess one of the interesting aspects is that by and large, traditional sports don’t change very often, whereas- when I say don’t change, I mean the rules just stay fixed. E-sports, there’s an expectation that games will be updated, new maps, rebalancing of weapons and character skills and things like that. I guess that adds another layer to consider?

McDonald: Yeah, that’s another interesting part of this. One of the- I should have mentioned it when you asked about the differences, but that’s another one of the big differences that’ll be interesting to look into. But I think the Overwatch League metric that I mentioned before takes into account patches. So different ratings for players, depending on which patch- game patches, and so I think that’s a really- I’m not totally sure the best way- I don’t know right now how we’re going to go about dealing with that. But it’s definitely something that should be dealt with because it does change the game mid-season. I mean, regular sports might have rule changes in the off-season, or enforcement rules- sort of change from regular seasons to the playoffs sometimes, depending on the sport, but there's nothing where in the middle of the sport- in the middle of the regular season, that a rule change that affects the player's abilities-

Tarran: Where your kick becomes less powered or anything like that.

McDonald: Right.

Tarran: So with your role at ESPN, will you be working specifically on e-sports data? Because I know they show a lot of the e-sports competitions now don’t they?

McDonald: Yeah, I think at some point I’ll be working more on e-sports. I think there's more of a focus on football, college and pro football, and college and pro basketball. You know I’ll be doing some hockey there as well. I think eventually I’ll do some e-sports as well. Especially if it’s something that ESPN starts covering more and more just as the popularity of e-sports grows. But for now the most popular things on ESPN are football and basketball, so that’s where a lot of the focus will be.

Tarran: But I can definitely see, from my own experience of playing games and watching these online competitions, you can imagine the data becoming a much richer part of the fan experience than say, it would be- you know you’ve got your football fans and baseball fans that are into statistical analysis in that they follow the numbers, but I think that maybe video gamers are naturally predisposed to kill ratios, kill-death ratios, that sort of thing, that they’re going to be really hot on that, I think.

McDonald: Yeah, I think so. That was sort of one of the hopes when we started this, that the kinds of things that we’re working on would be adopted and maybe with less hurdles than are in the traditional physical sports. So hopefully yeah, folks are more predisposed to being interested in this sort of thing.

Tarran: Well, as they say on TV, watch this space. Thank you very much, Brian, it’s great to talk to you today, and I hope you enjoy the rest of the JSM.

McDonald: Great talking with you too, thank you.

Brian Tarran: My name is Brian Tarran and I’m the editor of Significance Magazine. Find us online at significancemagazine.com. For this special JSM series of podcasts we’re collaborating with Stats and Stories. Stats and Stories is a partnership between Miami University’s Departments of Statistics, and Media, Journalism and Film and the American Statistical Association. Follow us on Twitter, Apple podcasts or other places where you can find podcasts. If you’d like to share your thoughts on our program send your email to Statsandstories@miamioh.edu, or check us out at Statsandstories.net. And be sure to listen for future editions of Stats and Stories, where we explore the statistics behind the stories and the stories behind the statistics.


Using the Stats to Improve Your League of Legends Game | Stats and Stories at JSM by Stats Stories

Michael Schuckers is the Charles A. Dana Professor of Statistics at St. Lawrence University in Canton, NY. An applied statistician, he has received funding from the US National Science Foundation, the US Department of Defense and the US Department of Homeland Security. He is the author of over three dozen publications including Computational Methods for Biometric Authentication (Springer, 2010). Additionally, Schuckers has done work in sports analytics, particularly ice hockey, including consulting with an MLB team and an NHL team. For his work in this area, he was named a "Significant Contributor" by the American Statistical Association's Section on Statistics in Sports.

The Statistics of the Year | Stats + Stories Episode 76 by Stats Stories

David Spiegelhalter is Winton Professor for the Public Understanding of Risk in the Statistical Laboratory at the University of Cambridge, Chair of the Winton Centre for Risk and Evidence Communication, and President of the Royal Statistical Society.

(Background music plays)

Rosemary Pennington: As 2018 winds down, everyone from social media users to mainstream media outlets are releasing their lists of top albums, top books or top films of the year. Earlier this month the Royal Statistical Society got in on the action by announcing its statistics of the year. That's the focus of this episode of Stats and Stories, where we explore the statistics behind the stories and the stories behind the statistics. I'm Rosemary Pennington. Stats and Stories is a production of Miami University's departments of Statistics and Media, Journalism and Film as well as the American Statistical Association. Joining me in the studio are regular panelist John Bailer, Chair of Miami Statistics department and Richard Campbell of Media, Journalism and Film. Our guest today is David Spiegelhalter, I should say maybe Sir David Spiegelhalter, Chair of the Winton Center at the University of Cambridge. He's also the president of the Royal Statistical Society or RSS, which as I said just announced its choices for statistic of the year and I want to point out that he's the first three-time guest on Stats and Stories…

John Bailer: That should have been statistic of the year!

(Collective laughter)

(Voices overlap)

Pennington: David thank you so much for being here today.

David Spiegelhalter: A great pleasure to be back again!

Pennington: Why choose a statistic of the year in the first place?

Spiegelhalter: Well you know we are statisticians, we think statistics are immensely important and we launched this last year as an experiment just to see if it would catch on and we were amazed at the interest in it. We were in print, on the popular radio programs, and we don't just do a statistic of the year, we've got 10 of them and people loved the variety and the choice so we thought we’d do it again!

Bailer: What was the criteria that you used? I mean you said you had hundreds of submissions so how do you…

(Voices overlap)

Spiegelhalter: We got hundreds of submissions. The first criteria was that it was faintly true.

(Collective laughter)

Richard Campbell: How many did you get rid of?

Spiegelhalter: Some of the entries were the old joke, you know, 95 percent of all statistics are made up. We expect those, but unfortunately that's actually one of the truest statistics judging by the entries we got, because they come in and they sound very impressive but then you start doing the fact checking and so many of them just don't stand up. I suppose this is not news to anybody. There's a lot of fake news around the world, a lot of false claims being made, a lot of them statistical and we ended up getting sent these. So we've had to do some serious filtering to try to get things that we actually think are fairly accurate.

Bailer: So after you kind of filtered out the fake, how did you pick among the real?

Spiegelhalter: Oh, very difficult. We got a good panel, we got journalists, we got all sorts of official statisticians, you know, with some difficulty. We wanted a variety. We didn’t want them all gloomy. You know you could pick 10 gloomy statistics. We don't want to reinforce the impression that statisticians are all just such miserable people. And we wanted ones that were covered also, so some of the stories that you know, we know have been going on throughout the year. I should say the one thing that we haven't got is a Brexit statistic, but that's you know our own local problem that we're having to deal with.

Pennington: I remember that it’s so local. Do you remember that?

(Collective laughter)

Bailer: So I guess there are…how many did you pick? You picked 2 winners and how many runner ups?

Spiegelhalter: Yep, we've got two winners and then 8 runners up, highly commended statistics.

Bailer: So one of the things that I was curious about is, you know, there's lots of ways to report a statistic. And so I’m going to let you talk about some of the ones that you picked, I'm curious about the winners. I think…we know the people are just sitting on the edge of their seat, waiting to hear this result. So after you talk about the winners, I'd be curious for you to comment a little bit about why this representation versus some other representation of the story was compelling to you.

Spiegelhalter: Exactly. I mean it's terribly important because I know, we all know that we can make any number big or small, depending on how we frame it, what comparisons we make, what units we use and so you know we would try to frame them I think in a way that is most realistic. So when we do the winner, we actually reported it in multiple frames in order to get a more balanced feeling about it.

Campbell: When you did this last year for the first time, I know the statistic, the international statistic I think, or was it the American statistic or U.S.?

Spiegelhalter: It was the international.

Campbell: The international one, that was in the Huffington Post and Kim Kardashian picked it up. How much news generation came out of that?

Spiegelhalter: You know we got a lot of coverage that was about essentially over the last 10 years. I think our main statistic was the number of U.S. citizens killed by lawn mowers over the years. That of course was just a hook to try to draw people in, just compare the number of people that are being killed by you know immigrant jihadist terrorists which is on an average of 2 a year, compared with the number for example killed by fellow Americans.

Campbell: Yes, and that was 11000.

(Voices overlap)

Spiegelhalter: So these are very stark figures, and we received some criticism about that, you know and I can see why, because it suggests well, that's the future risk. We didn't mean that; it's the past rates of what has happened. These are the statistics of last year, they are not predictions about what's going to happen next year.

Pennington: So what are your statistics this year for the international side of things and I know you also identify U.K. one as well. So what are the winners?

Spiegelhalter: OK the International one is a slightly negative one, it was more than negative. So it's 90.5 percent. And that's the proportion of plastic waste that has never been recycled. We also frame it to say well 9.5 percent has been recycled but still not a very large number given you know you're talking about you know 6000 million metric tons of plastic that’s actually not in use anymore, that has been got rid of. And so you know that means that only 10 percent has been recycled and out of the rest of it, about 12 percent is being incinerated and the rest is just lying around at landfills or will be dumped in the environment and you know I'm sure that in the States, certainly in the U.K., plastics has received a lot of attention this year….Blue Planet, these pictures, whales and fish and things like that with all this plastic in them, and this has become a very strong story. And then this was a really strong study done from the University of California, you know published in Science Advances. They made this assessment of the amount of plastic that was not being recycled.

Campbell: So I'm a general listener and say, I am watching cable news in America and I see the statistics come on and I'm saying, OK. How do they know 90.5 percent of the plastic waste has never been recycled? So I'm putting you on the spot here. So how would we respond to that? Because we get a lot of that, you know, people not believing in statistics and certainly not willing to do the work to find out where that information came from.

Spiegelhalter: Actually it was reported in a UN paper, in a report but it comes from a published paper in Science Advances from 2017 and kind of….Oh interesting! So they got plastic production data. They can get that from industrial production system statistics and then they can look at product lifetime distributions from eight different industrial use sectors. So by breaking it up into the different sectors, packaging and so on and then they have got data on how long within each sector plastic is in use, and then by knowing about the productions they can work out how much plastic is out there. So that's how they work out that you know only 30 percent of plastic ever produced is currently in use. That means 70 percent has gone and then…I'm just trying to work through how did they get at the amount that’s being recycled and they know from other sources…then they look at the recycling rates broken down around the world, from Europe and China. And in the United States plastic recycling has remained steady at 9 percent since 2012. So essentially I can stop and do this again. It's really cool. So they build a big model. First of all the model for plastic production, looking at industrial data. Then a model for how long plastic is in use. That enables them to estimate how much plastic is actually in use at the moment, which is you know, at least 30 percent of what's being produced. And then by looking at incineration and recycling data from different countries, they can work out how much is being recycled out of everything that's being produced and is not in use anymore.
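The chain of estimates described above (production by sector, how long plastic stays in use, then what happens to the plastic that is no longer in use) can be sketched as a small mass balance. This is only a toy under stated assumptions: the production tonnages, sector names and lifetimes are invented, fixed lifetimes stand in for the lifetime distributions the study used, and the recycled and incinerated shares are simply the rough figures quoted in the interview.

```python
# All production figures and lifetimes below are invented for illustration.
production = {  # sector -> {year: tonnes produced}
    "packaging": {2013: 100, 2014: 110, 2015: 120},
    "construction": {2013: 50, 2014: 50, 2015: 60},
}
# Simplification: one fixed lifetime per sector instead of a distribution.
lifetime_years = {"packaging": 1, "construction": 35}

# Shares of discarded plastic, taken from the rough figures quoted in the interview.
recycled_share = 0.095
incinerated_share = 0.12


def mass_balance(production, lifetime_years, as_of_year):
    """Split everything produced so far into 'still in use' and 'discarded'."""
    in_use = 0.0
    discarded = 0.0
    for sector, by_year in production.items():
        for year, tonnes in by_year.items():
            if year + lifetime_years[sector] > as_of_year:
                in_use += tonnes
            else:
                discarded += tonnes
    return in_use, discarded


in_use, discarded = mass_balance(production, lifetime_years, as_of_year=2015)
recycled = recycled_share * discarded
incinerated = incinerated_share * discarded
landfill_or_environment = discarded - recycled - incinerated
```

Each input carries its own uncertainty, so the final shares inherit uncertainty from every step.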

Bailer: So a natural question is, you just described the models, that's estimating a lot of components. And you know, none of these things are known, and so there's uncertainty associated with all of this and you know what would you say when people say well, by reporting a single number that perhaps this is conveying an overly strong sense of precision?

Spiegelhalter: And I would completely agree. And it would be much better to give a range of these numbers at a minimum. Actually I believe giving ranges would make it more trustworthy, and I'd be happier having a range than a single number. I mean one can qualify it by saying around or an estimate and so on. So they’ve got a relative measure of plus or minus about 6 to 7 percent, which isn't too bad. So that would only take it, if the total is 10 percent, you know you might say the total is between 8 and 12 percent for something that’s being recycled.

Bailer: OK. I just think that is such an important point. All the time we see the kind of headline statistics, there's always in my mind, kind of two things that come - one is, you know, how well do they know this number, and then even when you have some of these other components like the 6,300 million metric tons, do people have a sense of how much that represents?

Spiegelhalter: Yeah, these are just big numbers. What does it mean? And that's why people will be so much more influenced by seeing a picture of a turtle you know with his head through a piece of plastic or something like that, what drives the emotional reaction to these things. You know what does that 6,300 million metric tons mean? It is extremely difficult to judge. I mean one way of course is to do it at per head of population, for a million people in the world, that’s a ton each, that’s enormous. So I think there is a problem with all these big numbers. It is amazing it is almost exactly a ton each of plastic for each person that is no longer in use. Wow! That debate is more impressive than the 6000 million metric tons which I haven't got a clue what that means!
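The per-head reframing can be checked with one line of arithmetic. The sketch below uses the rounded 6000 million metric tons quoted in the interview; the world population of roughly 7.6 billion is an assumption added here and does not come from the interview.

```python
# Figure quoted in the interview: about 6000 million metric tons no longer in use.
discarded_plastic_tonnes = 6000e6
# Assumption (not from the interview): world population of roughly 7.6 billion in 2018.
world_population = 7.6e9

tonnes_per_person = discarded_plastic_tonnes / world_population
print(f"{tonnes_per_person:.2f} tonnes per person")
```

Under those assumptions the result is a little under one ton per person, which is the scale the per-head framing is meant to convey.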

Pennington: You're listening to Stats and Stories and today we're talking about the statistics of the year according to the R.S.S. with society president David Spiegelhalter. I'm going to ask you to talk now about the U.K. stat of the year, because I think it's interesting that both of these statistics of the year are somehow related to environmental concerns.

Spiegelhalter: That was a deliberate choice and we’ve also chosen one negative and one positive there. The U.K. one is a positive environmental one: 28.7 percent is the figure, and that's the peak percentage of all electricity produced in the U.K. that was solar power, on the 30th of June. So that means that amazingly for this country, solar power was the biggest producer of electricity. Briefly, extremely briefly, and that number is exact. That is a true statistic. But of course it was only brief, but it's a staggering change from you know, when it was so low, nobody thought about it 10 years ago in this country.

Bailer: So could you give us the list of kind of the highly commended statistics international?

Spiegelhalter: Yes. We've got some very positive and negative ones. The positive one is that in spite of all the stories you know that we hear about the decline of living standards in the West, worldwide the percentage of the population that is considered to be living in absolute poverty has more than halved since 2008, that’s in the last 10 years. It has gone down from 18 percent to essentially 9 percent and it is a quite extraordinary benefit that this happened to people. And this isn't a story that makes the international news, that far fewer people are living in absolute poverty than 10 years ago.

Bailer: And then, just as a…well before we go to the other ones, I had a question for you in terms of reporting this. When I saw it, I was wondering if 50 percent reduction in absolute poverty would be a more impressive statistic to me than 9.5 percent...

Pennington: Yeah, maybe.

Spiegelhalter: No exactly. We chose deliberately to use the percentage point reduction. Then we can say it’s halved, essentially, but in this case we would have a bigger emotional hit to say poverty has halved in the last ten years. But we want to do this statistic, which is the percentage point reduction. We could frame this and give it a stronger emotional hit, but we chose not to.

Bailer: You are a risk difference guy, rather than a risk ratio guy here.

Spiegelhalter: Yeah exactly I believe in absolute risks, absolute proportions. We know that relative risk, relative changes can be highly manipulative. The way in which to communicate changes over time.
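The two framings discussed here, a percentage point reduction versus a relative reduction, differ only in whether the change is divided by the starting value. A minimal sketch follows, using the rounded poverty figures from the conversation (18 percent to 9 percent). The second example, a rare risk that doubles, is hypothetical and is included only to show how a relative change can sound far larger than the absolute one.

```python
def percentage_point_change(before_pct, after_pct):
    """Absolute change, in percentage points."""
    return after_pct - before_pct


def relative_change(before_pct, after_pct):
    """Change as a fraction of the starting value."""
    return (after_pct - before_pct) / before_pct


# Rounded figures from the conversation: absolute poverty, 18 percent to 9 percent.
poverty_points = percentage_point_change(18.0, 9.0)    # -9.0 percentage points
poverty_relative = relative_change(18.0, 9.0)          # -0.5, i.e. halved

# Hypothetical rare risk: 0.01 percent rising to 0.02 percent.
rare_points = percentage_point_change(0.01, 0.02)      # +0.01 percentage points
rare_relative = relative_change(0.01, 0.02)            # about +1.0, i.e. "doubled"
```

The same pair of functions gives a modest-sounding answer or a dramatic one depending on the baseline, which is why reporting the absolute figures alongside any relative change is the safer habit.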

Campbell: With the statistics of the year, how much of what's behind the final decisions is about what's going to attract a news story? We need to get people interested in and learning about statistics. What's going to get the New York Times to cover this, what's going to get the British press to cover this?

Spiegelhalter: Yes. There is a trade-off there. We can't just have a whole lot of negative stories and they can’t be too dull. We want them interesting, but at the same time they can’t all be about celebrities or whatever. Last year's was quite a nice mix. We couldn't find quite like that this time. We want good news stories but we also want ones that are just important and frankly ones that have a story that’s not generally being told, rather than just the celebrity stories. The stories about poverty being halved in the last 10 years, nobody's written a story about that this year. That’s not in our news.

Bailer: So you had 3 more that were in your highly commended group. So you want to just run through them real quickly and then we can…

Spiegelhalter: Yeah. Well the second one, I think this is terribly important. Amazingly from November 2017 to October 2018, the number of measles cases in Europe was 64,946, nearly 65000 measles cases, and 2 years ago it was 4000 cases.

Bailer: Oh my goodness!

Pennington: Wow!

Spiegelhalter: Isn’t that staggering? That’s a 15-fold rise in 2 years. This is really terrifying. This is very serious indeed and we know why, because of all the stories about vaccines giving kids autism, in spite of being disproved, and in Britain we've recovered from that story, largely because we've exported Andrew Wakefield to the States.

(Collective laughter)

(Voices overlap)

Spiegelhalter: But the number of anti-vaccine websites and the fact that this has become politically acceptable, for example in Italy, major parties are arguing against vaccination. This is very dangerous, and you know the kids will die, and you know this is a really bad story.

Bailer: And so then the next one related to the Russian men.

Spiegelhalter: Yeah, this is really extraordinary. This year Russia raised the retirement age for men from 60 to 65. Unfortunately for Russian men, 65 is their current life expectancy. It's only just above that, so it's estimated that 4 in 10 Russian men, 40 percent, will actually die before they get to that pensionable age, which is quite troubling compared with, say, the US, where 80 percent of men will reach their retirement age, and the U.K., where 87 percent of men will live past 65. I’m 65, I’m just taking my pension. I’m a lucky one of those 87 percent.

Bailer: I really like that part of your reporting, the idea of putting in that context. You know, when you first report that 40 percent, people think: what is that? Is that big or is it little? Then they are given that other example with the U.K. and US. I find that a really nice part of contextualizing the story.

Spiegelhalter: Yeah, so it's still, in the U.S., 20 percent of men, 21 percent, who won't do it. So, you know, it's about half the figure. In the U.K., about 13 percent won’t make the retirement age. So, you know, it is bad, but in Russia that's 3 times that, right? Which is very high. You need the international context with that data.
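
The international comparison above can be checked with a few lines of arithmetic. The Python sketch below uses the rounded percentages quoted in the conversation (40 percent for Russia, about 20 percent for the US, about 13 percent for the U.K.); the variable and function names are illustrative assumptions, not part of any published analysis.

```python
# Approximate percentage of men expected to die before the pension age,
# using the rounded figures quoted in the conversation.
not_reaching_pension_age = {"Russia": 40, "US": 20, "UK": 13}


def compare(country: str, baseline: str, data: dict) -> tuple:
    """Return (percentage point difference, ratio) of country vs baseline."""
    difference = data[country] - data[baseline]
    ratio = round(data[country] / data[baseline], 1)
    return difference, ratio


russia_vs_uk = compare("Russia", "UK", not_reaching_pension_age)
russia_vs_us = compare("Russia", "US", not_reaching_pension_age)

print("Russia vs UK:", russia_vs_uk)
print("Russia vs US:", russia_vs_us)
```

On these rounded figures the Russian percentage is roughly three times the U.K. one and twice the US one, which matches the "3 times" and "about half" comparisons made in the conversation.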

Bailer: And how about your last one?

Pennington: Kardashian, I guess.

(Collective laughter)

Spiegelhalter: This was a bit of a celebrity one. 1.3 billion, this is extraordinary. The amount wiped off Snapchat’s value within a day of one Kylie Jenner tweet. So this is a bit of a flagrant appeal to populism. You know, just a brief tweet that she made in February 2018: "so does anyone else not open Snapchat or is it just me?" Oh yeah. 367,000 likes! I mean, it is extraordinary. I mean, there are other things that were changing about Snapchat also. Again, we've got to be careful with drawing, you know, a causal pattern; we know we can't draw a straight causal pattern. But this is too good a story to miss.

Pennington: Yeah. I had a question about the U.K. statistics, and maybe we can talk about some of the other highly commended ones, but I wanted to ask about Jaffa Cakes.

Spiegelhalter: Oh yes.

(Voices overlap)

Pennington: …to explain what exactly they are and why this is a noteworthy statistic for people who are not in the UK.

Spiegelhalter: They are kind of a form of biscuit, but they had to go to court to claim they are a cake, because they didn't have to pay VAT on them if they were called a cake. It is a type of biscuit with a soft bottom but a chocolate top, with a bit of sort of orangey, you know, jammy stuff inside as well. I love them! They are a real sugar rush. I love them, I have to keep them out of my way. They normally sell them in smallish boxes, but at Christmas they release what used to be called a yard of Jaffa Cakes, which was 36 inches long, an old yard.

Pennington: A lot of sugar!

Spiegelhalter: Last year that contained 48 Jaffa Cakes. Well, now it only contains 40 Jaffa Cakes. The cakes generally are the same size, you just get fewer of them in your box, and actually the box has shrunk and they couldn't fit it into a yard anymore. So now they have to call it a sort of Christmas cracker of Jaffa Cakes, and you know what this is? The end of Jaffa Cakes. They are incredibly…they sell billions of these things. I know I love them. But some say…but this is just the one example of, you know, the shrinking size of products, and you could say this is a good thing. It could be a great thing if people didn't eat so many Jaffa Cakes. You know, Mars bars and other things going smaller, this is a very good thing. Portion control is incredibly important, it’d be wonderful if people didn’t eat so much. But the price has gone down.

Bailer: I love the way you describe it as shrink inflation too!

Spiegelhalter: Shrinkflation, yeah! And Toblerone got a lot of interest last year as well when they reduced the size of the chocolate but not prices. So this is not a matter of perhaps global importance but some people notice this kind of thing, and again it made a good news story, where they got a lot of coverage.

Bailer: Were you surprised by the one report about the amount of shopping that was in store versus online?

Spiegelhalter: Yeah, this is the issue where you have to decide, what about the framing of this? Do you frame it as saying that 18 percent of all shopping is now online, you know, the big one in five of spending is online…or do you frame it as 82 percent of shopping is still done in the shop, rather than online? Do you do a positive or a negative frame? Because I have seen this story reported in both ways. Actually, for us, we found it quite surprising. Given, you know, the huge publicity around the rise of online shopping, the closure of so many shops in the high street now, I thought there was going to be a big effect from this. I'm surprised it was at 82 percent. But then again, of course, you've got food, you’ve got a lot of stuff that’s not done online as well. But still, you know, 82 percent is still done by people walking into a shop and paying.

Campbell: How does that compare with the U.S.? It would be interesting to see that!

Spiegelhalter: I don’t know what that is in the US.

Campbell: It seems very high.

Pennington: That does seem high.

Campbell: Here in the US, everybody is using Amazon here you know.

Spiegelhalter: Yeah, well people use that here as well, you know. It was a huge amount as well so…I don't know the U.S. figure, I’m about to find that out.

Bailer: You know, another one of the commended stats related to the trains running on time and, you know, all of us do travel, and it is about rail travel, but I was wondering how rail travel in Great Britain compared to that in Europe, or how it might compare to air travel…I was thinking about some of this contextualizing and framing here too.

Spiegelhalter: Yeah, we really should. Again, I think that's a very good point that we need to look at, because the reason why that story's in here is that we had an utter disaster this year with regards to trains. They introduced a new timetable that wasn’t planned properly, there were huge numbers of cancellations, absolute chaos, and there were strikes as well. So, I mean, this 86 percent of trains running on time is terrible, because they know this must be above 95 percent of the time, that’s what they claim to be able to do, and that's where you can start getting compensation as well. They paid out a fortune in compensation. I was travelling on trains in the summer and, you know, they were just announcing on every train how to claim compensation. I was making the claim even before the train got to my destination. I had my online compensation claim submitted, so it was absolutely shambolic. So this is far worse than it generally is, it’s the worst for nearly 15 years in this country, it has been quite recently late. But I don't know the international comparisons. That’s something I should find out. But actually it was so noticeable this year, the whole system really fell apart in the summer.

Pennington: We are getting ready to wrap up, but before we go I do want to ask you about the first listed commended statistic for the U.K., about female executives of 250 companies.

Spiegelhalter: Yeah, so that's the figure 6.5 percent…sorry, the figure is 6.4 percent, which is the percentage of female executive directors within 250 companies, especially the big companies in the U.K. And the gender pay gap has been a massive issue in this country, because in this country, for the first time, by law, larger and medium-sized employers have to report gender pay gaps. Unfortunately those are just reported as what women get paid compared with what men get paid. And we were going to use those figures but actually they’re not…they can be very misleading because they include many women who are in part-time work, and they are not adjusted for the kind of work. So what we wanted to do is to pick a job in which, you know, everyone is roughly comparable and then look at what the percentage of women is, and it's extraordinarily low. And it doesn't seem to be getting any better. I mean, it changed, it went from 38 to 30 in a year. I don't think that’s really statistically significantly different, but it's certainly no indication of things getting better.

Pennington: So that’s all the time we have for this episode of Stats and Stories. David, thank you so much for being here, it has been a really interesting conversation today.

Bailer: Always a pleasure David, I still think three should have been on there, number of times David Spiegelhalter has been on Stats and Stories.

Spiegelhalter: It’s going to be okay, I seem to be fumbling around as you see! You can tell it's the first interview I've done on this, I'm going to do a bit more preparation, some background on them. Yeah but that was very helpful to me in fact!

Campbell: Good!

Pennington: Stats and Stories is a partnership between Miami University’s departments of Statistics and Media, Journalism and Film, as well as the American Statistical Association. You can follow us on Twitter, Apple Podcasts or other places you can find podcasts. If you'd like to share your thoughts on the program, send your email to statsandstories@miamioh.edu. You can check us out at Statsandstories.net, and be sure to listen for future editions of Stats and Stories, where we discuss the statistics behind the stories and the stories behind the statistics.


Better Bayes Winner Revealed | Stats and Stories Episode 73 by Stats Stories

Stephen T. Ziliak is Professor of Economics at Roosevelt University and Conjoint Professor of Business and Law at the University of Newcastle-Australia.  A major contributor to the American Statistical Association “Statement on Statistical Significance and P-values” (2016) he is probably best known for his book (with Deirdre N. McCloskey) on The Cult of Statistical Significance: How the Standard Error Costs Us Jobs, Justice, and Lives (2008), showing the damage done by a culture of mindless significance testing, the history of wrong turns, and the benefits which could be enjoyed by returning to Bayesian and Guinnessometric roots.

The Fab Formula | Stats and Stories Episode 68 by Stats Stories

Mark Glickman, a Fellow of the American Statistical Association, is Senior Lecturer on Statistics at Harvard University, and Senior Statistician at the Center for Healthcare Organization and Implementation Research, a VA Center of Innovation.  He is well-known for his work in games and sports, having created the Glicko and Glicko-2 rating systems that are widely used in online gaming.  Mark co-organizes the biannual New England Symposium on Statistics in Sports, has been Editor-in-Chief of the Journal of Quantitative Analysis in Sports, and has been the chair of the US Chess Ratings Committee since 1992.  More recently, Mark has embarked on projects in music analytics.  His work on authorship attribution of Lennon-McCartney songs has received widespread media coverage.

Balancing Rigor And Entertainment When Telling Stories About Data | Stats + Stories Episode 49 by Stats Stories

Nick Thieme (@FurrierTranform) is a research fellow at University of California Hastings Institute for Innovation Law and freelance writer for a variety of outlets. Currently, his work focuses on AI regulation, cybersecurity, and pharmaceutical patent trolling. His writing has appeared in Slate Magazine, BuzzFeed News, and Significance Magazine. He was the 2017 AAAS Mass Media Fellow at Slate Magazine, writing about technology, science, and statistics.

Will You Be One Of The 8% Who Keep Their New Year's Resolutions? Understanding Behavior Change | Stats + Stories Episode 23 by Stats Stories

Dr. Rose Marie Ward is a professor in Miami University's Department of Kinesiology & Health. She studies college student health, with a focus on both addictive/harmful behaviors (alcohol use, disordered eating, unsafe and unwanted sexual behavior) and prosocial activities (women’s leadership, life satisfaction, scholastic achievement, exercise, and athleticism).

What Do Seinfeld, The Tonight Show And Stats+Stories Have In Common? | Stats + Stories Episode 7 by Stats Stories

Rick Ludwin was hired by NBC Entertainment in 1979 and made director of variety shows there in 1980. He then became vice president for specials and variety programs in 1983; senior VP for specials, variety programs and late-night in 1989; and executive VP for NBC’s late-night and prime time series in 2005. In its 57 years, The Tonight Show has had five permanent hosts, and Rick has been the boss of three of them. His late-night division at NBC developed the hit comedy Seinfeld. Rick, a 1970 Miami University grad, joined the Stats+Stories regulars to discuss the use and impact of ratings on television programming.
