The Dark Statistical Story of the World Cup | Stats + Stories Episode 295 / by Stats Stories


Dr. Megan Price is the Executive Director of the Human Rights Data Analysis Group. Price designs strategies and methods for statistical analysis of human rights data for projects in a variety of locations including Guatemala, Colombia, and Syria. Her work in Guatemala includes serving as the lead statistician on a project in which she analyzed documents from the National Police Archive; she has also contributed analyses submitted as evidence in two court cases in Guatemala. Her work in Syria includes serving as the lead statistician and author on three reports, commissioned by the Office of the United Nations High Commissioner for Human Rights (OHCHR), on documented deaths in that country.


Episode Description

Women’s World Cup action in Australia and New Zealand has wrapped up and Spain’s been crowned the champion. After players and fans headed home, residents were left to clean up after them. Hosts of such tournaments are also left to tackle the human rights implications of hosting an event that massive. The human rights impacts of something like the World Cup are incredibly hard to measure, and that is the focus of this episode of Stats+Stories with guest Dr. Megan Price.

Full Transcript

Rosemary Pennington
Women's World Cup action in Australia and New Zealand has wrapped up and Spain's been crowned the champion. After players and fans returned home, residents were left to clean up after them. Hosts of such tournaments are also often left to tackle the human rights implications of hosting an event that massive. The human rights impacts of events like the World Cup are incredibly hard to measure. And that's the focus of this episode of Stats and Stories, where we explore the statistics behind the stories and the stories behind the statistics. I'm Rosemary Pennington. Stats and Stories is a partnership between Miami University's departments of statistics and media, journalism and film, as well as the American Statistical Association. Joining me in the studio is John Bailer, professor emeritus of statistics at Miami University. Our guest today is a returning guest to Stats and Stories, Megan Price. Price is the executive director of the Human Rights Data Analysis Group, or HRDAG, a nonprofit that works with human rights groups to figure out what questions can be answered using quantitative data, or how such data can be used to help people better understand human rights issues. In 2021, HRDAG won the Rafto Prize, awarded to individuals or groups for their work on human rights and democracy. In the run-up to the 2022 World Cup in Qatar, Price wrote an article for Significance about the difficulty of measuring the human rights impacts of events like the World Cup. Megan, thank you so much for joining us again.

Megan Price
Well, thank you. It's always really fun to be here.

Rosemary Pennington
I guess just to start the conversation, what sort of got you interested in writing about human rights in relation to things like the World Cup?

Megan Price
Absolutely. So the story idea really came out of a series of conversations that I had with my colleague, Minky Worden, who's the Director of Global Initiatives at Human Rights Watch. And I should really say that the intersection of human rights and sporting events is her beat. And she has done some extensive research and writing on this that I'm sure we'll get into over the course of our conversation. But it came up with the two of us specifically because she knew there was this data complexity and this incomplete and missing data challenge to a lot of the questions that were being raised. And that has a tendency to be HRDAG's wheelhouse. And so that was where the idea for the story came from. And then the more I read her work and the work of others, the more interested I got in trying to unpack that, specifically for the audience that reads Significance magazine, because I think this is a problem that statisticians can contribute a solution to.

John Bailer
So let's hit rewind a little bit and talk about: what was the story? You did this piece, and we're in the same month that Labor Day is recognized in the United States. So what led you to do this exploration and what did you see?

Megan Price
Yeah, so on the general intersection of human rights and sports, and the work that tackles and raises awareness around it, I would say there's two pieces to it. One piece, which I hope we'll return to, and which has definitely come up more in the Women's World Cup, is around things like equality and equity, gender equity within the sporting world, and also sexual harassment and sexual violence within the sporting world. The specific piece that came up in the World Cup context in Qatar was around this bigger picture question of all of the infrastructure that's involved in these kinds of sporting events. And so we looked at it in the specific case of the FIFA World Cup, but this is a question that gets raised around any large sporting event, World Cups or Olympic Games, where, if you think beyond the sporting event itself, it requires not only all of the venues where the sport itself takes place, but also all of the infrastructure for the fans who come to enjoy that sporting event. So that can be hotels, roads, public transportation, airports. And in a lot of these settings, part of the pitch to gain access to being able to host that event is the promise to build a lot of that infrastructure, which requires a significant amount of labor, and the way that plays out varies across settings. And the way that played out in Qatar in particular, for folks who may not be familiar with Qatar, is that it has this, I think, unique distribution of local citizens and migrant labor: depending upon how you think about the denominator, anywhere between 85% and 95% of the labor force in Qatar is brought in from outside the country. And so that creates a unique position that I think Minky and others argue, and I'm inclined to agree, warrants some specific attention and some specific investigation to better understand the way that plays out when it's brought into these large kinds of sporting events.

Rosemary Pennington
I know in relation to Qatar, there was a lot of writing about alleged abuses of migrant workers. But I think the thing that stuck with me was sort of stories of migrant labor deaths in relation to being ready in Qatar for this tournament. And I wonder, in connection to something like that, how do you track? You know, a point like that, like deaths related to this event? How do you quantify that? How do you track it? How do you figure out, yes, it was related to being a laborer or building this thing, and not something else?

Megan Price
With great difficulty. And that's really where I framed the article in Significance as well, that I think this problem is going to require a very multidisciplinary approach. And I'll just say from the outset that we will be talking about deaths related to work and the deaths of migrant laborers. And I'm going to drill into the statistical challenges of that, but I want to, at the outset, acknowledge that each one of those deaths is of course a tragedy for the community and for the friends and family members of that victim. And so my attention to technical detail is not intended to convey a lack of sensitivity or a lack of weight to the importance of that. But I do think that part of giving that the weight that it deserves, and paying attention to it, requires a better understanding than what we have. And that was really what I learned over the course of preparing this article: thinking as a statistician, I had exactly the questions that you just raised. There are many things we need to know, from the specific population size, how many workers there are, to what projects they're working on. And then if we're going to attribute causality, we need to know specifics about the circumstances of someone's death, and then specifics about the project they were working on. If we're going to attribute responsibility, was that a project under the purview of something related to the FIFA World Cup, or was it a different construction project? And basically, at every point in those questions, we have challenges of missing and incomplete data. And that's where I think statisticians have a role to play, because that's very often the world we live in, in all of our analyses. And so we have statistical tools, and we have ways of thinking about what's the information that needs to be collected and what's the best way to analyze the information that I do have to try to bound what's knowable about these questions.

John Bailer
Yeah, I was struck when reading your article by just this challenge of classifying cause of death being such a critical question. As you go through different sources, some deaths were unclassified, and other deaths were listed as cardiac arrest when it could have been, in fact, something that was induced by working in a very extreme temperature environment. This is sort of a challenge, it seems, with any occupational fatal injuries. You know, if you're exhausted at work, and you fall asleep driving home from work, is that an occupational fatality? Well, if you link it together, it's not something that's probably easily teased out. So when you are going through some of these issues of causality and responsibility, just the definition, or the attribution, as you were describing, of what happened and led to this outcome, and whether that outcome is something that would reflect it, is hauntingly difficult, it seems. So what are some of the things that you thought about in trying to approach that with these data?

Megan Price
Yeah, it definitely is. And that brings me to where I think this is really an interdisciplinary challenge that needs to be tackled with researchers from a variety of fields. Because the way this played out in this specific example was exactly what you were just describing: sometimes the cause of death that was recorded was a general category or was a phrase or a description that many physicians who were interviewed by organizations like Amnesty International and Human Rights Watch were saying is really not best practice. And in fact, there was about a decade of time when Qatar had been awarded the World Cup hosting position, and was undergoing all of this construction, and was working on a lot of the rules and regulations and policies around how they were going to manage these big construction projects. And the consultants that the government itself brought in made some of the same recommendations: “you need to collect more complete data, you need to write down more detailed information for cause of death and conduct more autopsies and conduct more investigations.” And so that's something where I think, from a statistical perspective, you can sort of see, over time, changes in the kinds of language that get used and in the kinds of cause-of-death categories and their frequencies; you can look at the changing distribution. And that's not causality, but that can start to point you in directions and start to raise other questions. But I also want to say that this question of what's best practice around assigning causes of death and conducting autopsies is a very active area of research. And so I think that is somewhere where statisticians can work very closely with physicians and other kinds of investigators to really tease out what is the piece of information that you need, and to just acknowledge that certainly determining the cause of death is in no way something simple.
And we've seen that throughout the pandemic: knowing what was the primary cause of this death versus what were some underlying contributing factors, and in what order things should be listed, is not necessarily straightforward.
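
As a rough illustration of looking at the "changing distribution" of recorded cause-of-death categories over time, the sketch below tabulates, for each year, the share of records that fall in each category. The records and the category labels are invented for the example and are not data from Qatar or from the Significance article; a tabulation like this can only point to patterns, not establish causality.

```python
from collections import Counter, defaultdict

# Hypothetical (year, recorded cause-of-death category) pairs; not real data.
records = [
    (2012, "cardiac arrest"), (2012, "cardiac arrest"),
    (2012, "unclassified"), (2012, "workplace injury"),
    (2018, "cardiac arrest"), (2018, "heat-related"),
    (2018, "workplace injury"), (2018, "workplace injury"),
]


def category_shares(rows):
    """Return {year: {category: share of that year's records}}."""
    counts = defaultdict(Counter)
    for year, category in rows:
        counts[year][category] += 1
    shares = {}
    for year, counter in counts.items():
        total = sum(counter.values())
        shares[year] = {cat: n / total for cat, n in counter.items()}
    return shares


shares = category_shares(records)
for year in sorted(shares):
    # Print each year's categories alphabetically with rounded shares.
    print(year, {cat: round(s, 2) for cat, s in sorted(shares[year].items())})
```

A shift such as a category disappearing, or a vague label shrinking as more specific ones appear, is the kind of change that might prompt further questions for physicians and investigators.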

Rosemary Pennington
I wonder, too, how difficult it is to do research that relies on data around migrant populations in relation to things like this, but even more generally. In relation to Qatar, it is a population of migrant workers largely coming in, but there are other situations where you have migrant populations that are coming in. As a statistician, and as a scholar, how do you make sense of that? So I wonder, when you approach things like this, beyond sort of simply figuring out, like, is this data telling me what I need it to tell me: how do I handle the population itself that I am wanting to study in a way that is ethical, but also captures that experience?

Megan Price
Yeah, this is definitely one of those cases where numerically, both the numerator and the denominator are difficult to determine. And then, as you just alluded to, the experience piece, the qualitative part, is important to capture as well. And that's where I think that fortunately, there are a lot of different organizations and different resources that can be called on. In the case of Qatar, some of the most interesting research I saw was relying on information from embassies. And so I think that is perhaps not a data resource that statisticians would always think of. But when we think about migrant labor, or immigrants of any sort, that's going to be an institution that knows what that experience is, and has some records about it. One of the other sources of information in Qatar was hospital records, which I think for those of us who work in public health is a much more conventional avenue for learning about these kinds of problems. But I think in general, I would consider migrant populations to be a hard-to-reach population. And so you do have to be creative, you do need to think about what are the institutions that they're going to interact with, and what are the ways that we can learn about their experiences, and at the same time be very wary of the ethics and the risk of things like the cost of surveillance, and things like the cost of triangulating these multiple data sources, which can be very useful to our specific statistical question but may pose risks to the community. And so thinking through all of those pieces can be very daunting.

Rosemary Pennington
You're listening to Stats and Stories, and today we're talking with HRDAG's Megan Price about the human toll of large sporting events like the World Cup.

John Bailer
You know, as you were talking about this, it seems like the critical nature of these multiple data sources, in part, reflects the fact that this is being done after the fact. If you could take a time machine back and put into place a registry or a system for doing this type of work, you might have something different. So I'm really intrigued by this idea of how you align these sources, the embassies and the records that you've talked about. You've talked about another context with us before, in fact. So what are some of the ways that you try to bring this together and use this to do the analysis? One aspect of this was talking about numerator and denominator, you know, just the idea that you need to know how many people have had an event, and then you need to know how many people were at risk of that event. So do those different data sources inform those different components of that question?

Megan Price
Definitely. And I think that is one of the things that makes this problem so challenging, is that very often those different data sources are collected for different purposes. And so they are representing different parts of the population that you as the researcher may actually be interested in. And so that definitely comes into play in this migrant worker case, where you may have some records that refer to a workforce in general, and may not necessarily disaggregate by citizen status. And you may have other datasets that disaggregate by citizen status, but not necessarily age or worker status. And yeah, as you alluded to, this has played out in other HRDAG projects where we're interested in particular types of violence, but certain kinds of vital records may record a wider variety of deaths than the kind that targeted research is interested in. And so yeah, figuring out how to capture that in the analysis and how to transparently represent the uncertainty that comes from that is one of the jobs of the statistician on the project.
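
One simple way to represent that uncertainty transparently, when both the numerator (deaths) and the denominator (people at risk) are only known as ranges, is to report the lowest and highest rate consistent with those ranges. The sketch below is a minimal illustration under that assumption; the function name `rate_bounds` and all of the numbers are hypothetical and are not estimates from the Significance article or from any HRDAG analysis.

```python
def rate_bounds(deaths_low, deaths_high, pop_low, pop_high, per=100_000):
    """Bound a crude rate when numerator and denominator are known only as ranges."""
    if not (0 <= deaths_low <= deaths_high and 0 < pop_low <= pop_high):
        raise ValueError("ranges must be ordered low <= high, with a positive population")
    # Smallest rate: fewest deaths spread over the largest population at risk.
    lowest = deaths_low / pop_high * per
    # Largest rate: most deaths concentrated in the smallest population at risk.
    highest = deaths_high / pop_low * per
    return lowest, highest


# Made-up ranges, for illustration only.
low, high = rate_bounds(deaths_low=400, deaths_high=600,
                        pop_low=1_500_000, pop_high=2_000_000)
print(f"crude rate is between {low:.1f} and {high:.1f} per 100,000")
```

Bounds like these are deliberately crude: they say nothing about which values inside the range are more plausible, which is where model-based estimates of the kind statisticians build would come in.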

Rosemary Pennington
I'm going to shift us to a discussion of the Women's World Cup that just wrapped up because you sort of primed us for a discussion about issues facing the women's sport, because one of the big stories going into the Women's World Cup was the fact that several of Spain's big players weren't playing because of frustrations and anger over the coach, and then post their win, right, which should have been this huge celebratory moment, we had the head of Spanish football forcibly kissing one of the players. And now the coach has lost his job, the head of Spanish football lost his job, is that right? I remember that and then I think maybe facing some charges. And that sort of feels like this very visible moment of like, a discussion around sexual harassment and assaults in the women's game. And I wonder what kinds of things HRDAG and other people were looking at as far as sexual harassment or sexual assault in the women's game? And again, it feels like something very difficult to measure and sort of track. So how do you do that?

Megan Price
Yeah, very slowly. So here, I will, again, defer to Minky, who really is spearheading this work and was hosting or was part of a conference towards the end of the Women's World Cup specifically to raise attention around this. And I think you're exactly right. What we're watching play out in Spain is a particularly visible moment. And certainly what I'm taking from it, and what I can only hope will become part of the larger message, is that this is an endemic problem, that it is not just this one particular individual, this one particular instance. And that unfortunately, just like in many other professions, in the profession of sport, there is a system in place that allows this kind of behavior to carry on for a long period of time. And so I think the response that's needed is some kind of top-down infrastructure policy change. And I think that it is exactly appropriate that FIFA, as one of the governing bodies, is being called on to take some action here and to take some responsibility. And as a former gymnast, I feel so many parallels and so many similarities to what we watched play out in the world of gymnastics. And similarly, I think the governing body of gymnastics had a lot of responsibility in the Larry Nassar case and in other stories that we've heard. And so I think, or I guess I hope, that what we'll see is a more institution-level reckoning.

John Bailer
Yes, it seems like there's some serious challenges to studying something that may or may not be reported at all right. And so it's, you know, the system has to be in place so that you can actually start to be able to track this. So do you have ideas about how that might be implemented, what that might look like?

Megan Price
It's very tricky to imagine. I'm not sure what it would look like in the professional sport setting. I know in a different setting, there is an organization that is creating an end-to-end encrypted, secure way for women who have experienced sexual harassment or assault on a college campus to report that. And so I think those are some of the kinds of ways that we can start to collect the information we would need to track this. But I think part of it is also that we need to have a will to do something with that information, because in a lot of these cases, it does exist. Reporting has happened to some kind of an institution or governing body, in some cases to a state entity, perhaps a police force or something like that. And so I think, as much as the statistician in me thinks about the data collection piece, I think we also have to have a will to take action when that data does exist.

John Bailer
So we noticed that HRDAG received the Rafto Prize, so congratulations on that.

Rosemary Pennington
Yes, congratulations.

Megan Price
Thank you so much.

John Bailer
And just a description of that: this prize sheds light on human rights violations and gives recognition to human rights defenders who deserve the world's attention. And there's no geographic issue or constraint. So they have other descriptions, but that's really cool. So what was the work that HRDAG did that really was exciting to this funding agency?

Megan Price
Well, thank you. It was a tremendous honor to be recognized in that way. And it was a wonderful, humbling experience. The whole team got to go to Norway and spend some time with the foundation and with some other members of that community, and it was really fantastic. And what they described to us that made HRDAG notable and honorable that year, in particular, was the combination of work we do bringing data and data science to questions of human rights advocacy. And that really is something pretty unique. And I say that because I actually wish we were less unique. I would like there to be more of us and more organizations bringing that combination of rigorous data analysis to human rights advocacy. And so I guess I'll make that my plea, my call to action. There's plenty of work to be done. And it would be nice to have a bigger community doing that work.

Rosemary Pennington
Well, I am wondering what is HRDAG focusing on now? And is there something that you've been following that you really think people should be more aware of?

Megan Price
Yes, so HRDAG has a long history of working outside of the United States on transitional justice issues, working with branches of the United Nations and with truth commissions, and we continue to do all of that work. But in the last 10 years, we've expanded our work to also include projects within the United States. And so those are the ones that are really at the front of my mind right now. We've been partnering with a variety of community organizers and activists, largely investigative journalists, across the United States, who are examining police behavior and instances of police violence. And the way those projects have a tendency to go for HRDAG, and the role that we play in them, is that our partners very often gain access to data through some kind of a legal mechanism: they win a court case, or they file a Freedom of Information Act request. And so in return, that information is shared with them in a way that I think of as almost being adversarial, because it is legally compelled. And so often our partners are handed essentially a thumb drive with hundreds of thousands of pages of PDFs. And so what we do is we come in and we help them make sense of that. We help them turn that haystack into a structured, searchable database that then we can ask more substantive questions of: how many times did something like this happen? Are there patterns over time? Are there patterns over space? Very standard statistical questions. And so that's what's on my mind in a general sense. And one of our most tremendous partners is based in Chicago, an organization called the Invisible Institute, and Trina Reynolds-Tyler is our lead partner there. And she and my colleague Tarak Shah recently gave a keynote address called "The Community Built a Model." And it was about the work that they did together, using volunteers from the community to develop training data and build machine learning models in very close collaboration.
And the call that my colleague, Tarak, made at that conference, which I want to echo, is that this situation arises over and over and over again, where community activists and advocates gain access to data, but then don't necessarily have the resources to make the most sense possible out of it. And so I think that's another place where statisticians, computer scientists, data scientists, that whole collection of skills, could really be brought to bear on extracting important information from PDF files. And there's a lot of work to be done there.

Rosemary Pennington
As a journalist who's had to deal with a PDF file of data, I really appreciate this work because it is so incredibly important because you do get massive amounts of stuff that you don't have any idea what to do with.

John Bailer
So you talked about extracting information from PDF files as being one of the tasks that's done here. You know, I'm curious about what other tools and skills should someone bring to the table if they're interested in working with us? And that's the first part of the question. And this drives Rosemary crazy, because there's always a second part. It's just inevitable. So then I would say, and how would a statistician that might be interested in this pair up possibly with groups that have data that would value collaboration or partnership? See? Wasn't a bad second question.

Rosemary Pennington
It's a great question.

Megan Price
So the skill that I think comes to bear most often in this setting, the extracting data from PDFs, is really a meticulous sense of data organization. So all the stuff that all of us, myself included, complain about the most in school, the processing and the cleaning of data. But thinking really hard about, you know, where should these files be organized with respect to each other? What should they be named? I mean, this is all stuff that seems kind of boring, but if you get that right, then you can do the exciting, interesting stuff. And so then I think in terms of really specific skills, it really is that data science place in the Venn diagram, I think: having some coding skills, having some comfort with things like scraping data from public websites, and being able to use these kinds of machine learning tools that are so good at classifying, is this the first or the last page of a document, is there a header on this document, you know, pulling out that information that you're ultimately going to want to put in a structured database. And then to your second part, how to get hooked up: you know, the good news is there are more and more organizations realizing the need for this combination of investigating and gaining access to the records and then processing and making sense of the records. And so while I think we have not yet figured out the best infrastructure for how to bring those things together, I think there's a real recognition of the need. And so I think if you have a local organization that you're already aware of, and you like their work, honestly, just reach out to them, if you have the skill set, and ask, you know, do you have this need? In a bigger sense, I think we've talked before about the organization DataKind; they often help to matchmake people with these skills and nonprofits that need them.
Similarly with Statistics Without Borders, which is kind of within the ASA umbrella. So I think there are starting to be more of these kinds of matchmaking efforts to try and identify projects that could benefit from a statistician. But the last thing I'll say is, if you do want to get involved in these kinds of projects, I think the most important thing is commitment, because these organizations are very under-resourced, and it does require resources to manage. Even if you're generously donating your time, someone still has to orient you to the project and bring you up to speed and make sense of the scientific and technical inputs that you're providing. So I think being really transparent about the amount of time and the length of time that you're able to commit to a project is really crucial for a successful project.
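
To make the "is this the first page of a document" step concrete, here is a minimal sketch of what might happen after such a classifier has run: given one true/false prediction per page of a large PDF dump, the pages are grouped into documents that can then be loaded into a structured database. The flags below are invented stand-ins for classifier output, and `group_pages` is a hypothetical helper, not HRDAG's actual pipeline.

```python
def group_pages(first_page_flags):
    """Group page indices into documents; a True flag marks the start of a new document."""
    documents = []
    for page_index, is_first in enumerate(first_page_flags):
        # Start a new document on a predicted first page, or when nothing has
        # been opened yet (the dump may begin mid-document).
        if is_first or not documents:
            documents.append([page_index])
        else:
            documents[-1].append(page_index)
    return documents


# Invented predictions for a six-page file; a real model would supply these.
flags = [True, False, False, True, True, False]
docs = group_pages(flags)
print(docs)
```

In practice the predictions would be imperfect, so a reviewer might spot-check the resulting document boundaries before relying on counts built from them.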

John Bailer
And just a quick follow-up comment on what you just said: I think the other opportunity is for the organizations to reach out to professional statistical societies, whether that's the American Statistical Association, whether it's the International Statistical Institute, which is based in The Hague, or the Royal Statistical Society, or, you know, thinking about the local statistical society or the national statistical society that might be a resource for them. So I think that there's this nice reciprocal component to this, and if that could be pushed out to these organizations, that would be beneficial.

Rosemary Pennington
Well, Megan, thank you so much for joining us again today. It's been a pleasure having you.

Megan Price
Yeah, thank you. Thank you so much. It's always a lot of fun.

Rosemary Pennington
Stats and Stories is a partnership between Miami University's Departments of Statistics and Media, Journalism and Film and the American Statistical Association. You can follow us on the social media outlet formerly known as Twitter, on Apple Podcasts, or other places where you find podcasts. If you'd like to share your thoughts on the program, send your email to statsandstories@miami.oh.edu, or check us out at statsandstories.net, and be sure to listen for future editions of Stats and Stories, where we discuss the statistics behind the stories and the stories behind the statistics.