Using Data to Protect Human Rights | Stats + Stories Episode 74 / by Stats Stories

Photo credit: Daniel Blue.

Megan Price is the executive director of the Human Rights Data Analysis Group (HRDAG), and designs strategies and methods for statistical analysis of human rights data for projects in a variety of locations including Guatemala, Colombia, and Syria. She has contributed analyses submitted as evidence in two court cases in Guatemala and has served as the lead statistician and author on three UN reports documenting deaths in Syria. Megan is a member of the Technical Advisory Board for the Office of the Prosecutor at the International Criminal Court, on the Board of Directors for Tor, and a Research Fellow at the Carnegie Mellon University Center for Human Rights Science. She is the Human Rights Editor for the Statistical Journal of the International Association for Official Statistics (IAOS) and on the editorial board of Significance Magazine. Before she was executive director at HRDAG, Megan was the director of research there.

Full Transcript

Rosemary Pennington: Protecting human rights is one of the core issues of the United Nations and is the mission of such agencies as Human Rights Watch and Amnesty International. Tracking human rights violations, however, can be difficult and dangerous. It often involves researchers travelling to conflict zones or countries in transition in order to document victim experiences and gather data. Increasingly, activists and researchers are turning to sophisticated technologies including machine learning tools in order to analyze human rights data. That's the focus of this episode of Stats + Stories, where we explore the statistics behind the stories, and the stories behind the statistics. I'm Rosemary Pennington. Stats + Stories is a partnership between Miami University's Departments of Statistics and Media, Journalism and Film, as well as the American Statistical Association. Joining me in the studio, our regular panelist, John Bailer, Chair of Miami Statistics Department, and Richard Campbell, Chair of Media, Journalism and Film. Our guest today is Megan Price. Price is the executive director of the Human Rights Data Analysis Group or HRDAG which is a non-profit organization based in San Francisco. It works with human rights groups to figure out what questions can be answered with quantitative data, or how such data can be used to help people better understand human rights issues. Thank you so much for being here today, Megan.

Megan Price: Thank you.

Rosemary Pennington: Can you just tell us, to begin the conversation, how Human Rights Data Analysis Group or HRDAG started?

Megan Price: Sure. It's a little bit of a long evolution, actually. The work started with my colleague and co-founder Patrick Ball in the early 1990s when he discovered that the thing he had to offer some of the problems that he was seeing with -- His skill is as a computer scientist, and at the time, he started this work formally at the American Association for the Advancement of Science. That was where he and his team first started using the name HRDAG. And then as they, over a few years, started to outgrow their very welcoming place there at the AAAS, they found a second home at another non-profit that does technology for good here in California called Benetech. The team was incubated there at Benetech for ultimately 9 years. So then, it was just in 2013 that Patrick and I rolled out HRDAG as an independent organization. That is the state we're in today.

John Bailer: So, how did you get involved in Human Rights Data Analysis, Megan?

Megan Price: I have a background in public health. My PhD is in biostatistics, and it was at public health school that I knew I wanted to do some kind of social justice work, and I started learning about human rights as a formalized field that could be studied and research conducted in this formal way. It was through some of my courses there in graduate school that I learned about the work Patrick Ball was doing and that I learned about HRDAG. I was fortunate enough to be in the right place at the right time: right around the time I was finishing my dissertation, HRDAG got a grant that enabled them to create a full-time position for a statistician. I applied and never looked back.

Rosemary Pennington: So, what are the kinds of things that you're doing there at HRDAG?

Megan Price: So, the kinds of things that I personally am doing, my two main projects are focused right now in Guatemala and Syria. And each of those is a little bit different in terms of the data source and the methodological approach. So in Guatemala, there's a historic archive from the National Police, and it's literally this warehouse full of documents. And the challenge there was, how can we learn about the content of these documents as quickly as possible? And so, we were asked to -- Well, when we were asked that question, we partnered with some volunteers from the American Statistical Association who helped us design a random sample of this very large, disorganized collection of documents. So, that's been one of my main projects, is analyzing that random sample and drawing inferences about the content of that archive. And then one of my other main projects is working with documentation groups based in or from Syria who are recording the names of victims who've been killed in that ongoing conflict. And in that project, I'm primarily using the methods of what's called record linkage and what we call multiple systems estimation to essentially identify records of victims that have been reported to one or more of those sources and then use that pattern of documentation to estimate the total number of people who've been killed in that conflict.
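
The idea of linking records across lists and then using the overlap to estimate what no list captured can be sketched in a few lines of Python. This is only a simplified illustration, not HRDAG's method: it uses the classic two-list Lincoln-Petersen estimator, which assumes the two lists are independent, whereas multiple systems estimation in practice works with more lists and models the dependence between them. The records, the `normalize` matching key, and the function names are all made up for the example; real record linkage is far more involved than exact matching on a cleaned-up name and date.

```python
# Illustrative sketch only: two-list capture-recapture (Lincoln-Petersen).
# All records, names, and helper functions here are hypothetical.

def normalize(record):
    """Crude linkage key: lowercased name with whitespace collapsed, plus date."""
    name, date = record
    return (" ".join(name.lower().split()), date)

def link_lists(list_a, list_b):
    """Return the set of keys in each list and the keys found in both."""
    keys_a = {normalize(r) for r in list_a}
    keys_b = {normalize(r) for r in list_b}
    return keys_a, keys_b, keys_a & keys_b

def lincoln_petersen(n_a, n_b, n_both):
    """Estimate the total population from two overlapping lists."""
    if n_both == 0:
        raise ValueError("no overlap between lists; estimate is undefined")
    return n_a * n_b / n_both

list_a = [("Person One", "2012-03-01"), ("Person Two", "2012-03-01"),
          ("Person Three", "2012-03-02"), ("Person Four", "2012-03-05")]
list_b = [("person  two", "2012-03-01"), ("PERSON THREE", "2012-03-02"),
          ("Person Five", "2012-03-04")]

keys_a, keys_b, both = link_lists(list_a, list_b)
documented = len(keys_a | keys_b)   # distinct victims seen on at least one list
estimate = lincoln_petersen(len(keys_a), len(keys_b), len(both))
print(documented, estimate)         # 5 6.0
```

In this toy case five distinct people are documented, and the overlap pattern suggests roughly one more who appears on neither list.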

Richard Campbell: So, Rosemary mentioned that work at the intersection of statistics and human rights can be dangerous. You were talking about Syria here and you're talking about countries that are in conflict. Can you talk about some of those? And I'm thinking about, what happens when governments don't like the data?

Megan Price: Yes. Well, I would say by and large, most of the risk is taken on by our partners who we work with who do the actual collecting and storing and managing of the data. And that can very often be very risky for exactly the reason you just stated. Sometimes, states in power are not happy about that particular data collection. Somewhat less frequently, they're not happy about our data analysis, but usually by the time a project gets to that stage, by the time we're trying to answer a quantitative question with statistical analysis, usually, it's post-conflict and there's been some kind of a transition. And so, sometimes, we do still have to deliver an answer that is unpopular, but there's usually some kind of infrastructure in place, or there's a different audience for that answer. It may not necessarily be getting delivered to the state in question. It may be getting delivered to the United Nations or to a truth commission or a local NGO.

Rosemary Pennington: You're listening to Stats + Stories where we discuss the statistics behind the stories and the stories behind the statistics. The topic today: using data to track and analyze human rights violations. I'm Rosemary Pennington. In the studio with me are Miami University Statistics Department Chair, John Bailer, and Media, Journalism and Film Department Chair, Richard Campbell. Our guest today is Megan Price, Executive Director of the Human Rights Data Analysis Group. Megan, I was watching a video in preparation for this in which you mentioned the importance of approaching this work from an interdisciplinary context. Why do you think that's important, to do the work that you're doing, that it has to be interdisciplinary in nature, or should be?

Megan Price: Yeah, I think it absolutely has to be. And I think for me personally, that's why I became a statistician. I want to work on difficult and interesting problems, and the way that our analysis has meaning and can be useful is by collaborating with local field experts and also local substantive experts who can not only help us interpret the data that we're analyzing, but then help breathe life into the interpretation of the analyses once we've completed them. I don't think any one of us alone can really do a sufficient job of telling the particular story of our work. I think we all have to collaborate together to identify the question that's most meaningful and most useful, and then to actually figure out what to do once we have the answer to that question.

John Bailer: So, I'm curious: How do problems come to HRDAG? Who are the clients for HRDAG's work?

Megan Price: It really varies. For the longest time, I used to say -- and this is still true. My colleague, Patrick Ball, is quite well-known in this field, and it's sort of a joke, but it is true that sooner or later, for anyone who was working on human rights violations or in a post-conflict area who had a little bit of data or was thinking about data, someone would eventually say to them, "You know, you should really call Patrick." And to a certain extent, that's still how projects come to us, is that folks know us and know our work, and will suggest that someone reach out to us. But sometimes also, it's -- So, I think of that as sort of a push, folks who come to us. Sometimes, it's also a pull. Sometimes, we'll read a news story, or be aware of a certain situation that we care deeply about and feel like our particular skill set could provide some added value. And so then, we'll start reaching out to our various partners and contacts at advocacy organizations and start asking, "Hey, is anybody collecting data about this? Is anybody thinking about doing an analysis? Is there some role that we can play?"

John Bailer: So, could you clarify that for Guatemala and Syria?

Megan Price: Sure. So in the case of Syria, the UN, the United Nations, approached us in early 2012 because they found themselves in this unique-to-them situation where they couldn't safely get people on the ground to conduct investigations and to collect their own data. And they were aware of these other organizations that were collecting information. But there were multiple of those organizations, and the United Nations didn't really know what to do with those multiple sources of information and kind of how to integrate them into a whole. And so, that was really the way the problem was presented to us, was, "We have these lists, what should we do?" And we kind of said, "That is exactly our wheelhouse. Let us help." And in the case of Guatemala, Patrick had been working in Guatemala for years at the point that the archive was discovered in 2005 and we were invited to come work with them a year later. And so, many of the folks involved with the National Police Archive were aware of Patrick, and HRDAG, and our work. And so, they reached out to him for help in trying to figure out what was the best way to make sense of this massive amount of data.

Richard Campbell: I suspect that a big part of your work is trying to explain to audiences whether they're government, whether they're journalists, who don't understand what it is you do or don't understand statistics and data. So, talk about some of the challenges of trying to explain to a more general audience the work that you do and your findings.

Megan Price: Yes. Oh my goodness, that is both one of the hardest parts and one of my favorite parts of my job. And I think in my case, the training that I got teaching Stats 101 to public health grad students is really what prepared me for that and what made me really love it. Because I think that explaining this work to folks who might come to these conversations with preconceived ideas about statistics being intimidating, or hard to understand, or boring, I just love winning those people over. But I would say that you're absolutely right, and it's been interesting to watch the evolution of data journalism. Because I think that these days, journalists are some of our most receptive audience members and they often do really have the patience to hang in there with us and listen to all of our caveats and to publish an interval rather than a single number. Not always, but we're finding a lot more success with that. And then I would say another challenging audience that we've really learned better how to communicate with is when we're explaining in a courtroom the kind of analysis that we've done to judges and to lawyers. That is very challenging, because a courtroom wants a very, very narrow, very specific set of facts. And so, that's really how we've learned to narrow our analysis down to the specific question at hand.

John Bailer: So, when you're explaining this, there's always going to be some kind of uncertainty or variability in these estimates. How do you communicate this in a way that people don't say, "Well, there's so much variability. I mean, we can't say anything." How do you combat that in telling these very important stories?

Megan Price: Yeah. That, to a certain extent, is the million dollar question. And unfortunately, sometimes, that is our answer. Our answer is, we really don't know. And I think that's one of the most frustrating times, is when there's so much uncertainty that, to tell the truth and to be transparent about what we know, we have to say we don't know. But in other cases, I would say that when people reflect back to us, well, there's so much noise or we just can't draw this conclusion, we really do push back and try and say, "Well, but if you look at this distribution or if you look at this interval, we can show that they're not overlapping or we can show that there's a statistically significant difference between this relative risk and that relative risk." And we do, in those cases, tend to rely on some statistical jargon to try and emphasize the point that, no, we can draw some conclusion. And what the uncertainty is enabling us to do is identify when we can say that there's a difference and when we can't.

Richard Campbell: You mentioned before too about some of the ways that you have to translate this to journalists for them to understand. Can you talk a little bit about mistakes that you see journalists make that are frequent or things that they might do better?

Megan Price: Yeah. One of my own personal pet peeves, and unfortunately, I don't have sort of a good solution to it, is maps. I understand that maps are very compelling. They're very interesting. As soon as we have any sort of geographic information, we want to put it on a map to tell a story to convey what's happening. But very frequently, when you draw a map and you put shading or circles or what have you on a map, there are blank spots. And I always want to know, is that spot on the map blank because, in fact, whatever it is that you're measuring didn't happen there, so there was no violence there or there was no drug use there or whatever? Or is it blank because you just don't know about that region, because you didn't have access to it, or because it's mountains and nobody lives there? And I think it's really hard to figure out how to explicitly convey that difference with the visual presentation of a map. Because all of us, we want to see those patterns, and our eyes are drawn to, clearly, there's a problem here and it looks okay over there. And so, I don't have a good solution to that other than using fewer maps, which is not a good solution, but that's one of the things that my eye always tends to get drawn to, is thinking about, "Well, what did that data really look like at the beginning?"

Rosemary Pennington: You're listening to Stats + Stories. And today, we're talking about data and human rights. So, what advice would you give, Megan, to journalists who are hoping to report or trying to report on some of the data that your group is producing? What advice would you give them to suss out what the story is in the data?

Megan Price: That's a great question. I guess the advice that I would give is -- it's not really advice, it's more of a wish. I would wish for journalists to have the lead time to ask a lot of questions and then the print space to tell a more complicated and nuanced story. And I think that one of the things that's hard is that I think those are the stories journalists want to tell, and I think that, unfortunately, sometimes they are on a tight deadline, and they just need a sound bite, and they just have a headline and a little blurb. They don't have the space or the patience for the five caveats that I'm going to give them.

Rosemary Pennington: Right. I think journalists would also like that space, too.

Megan Price: Absolutely.

John Bailer: In your article that you wrote in Significance Magazine a while back, you mentioned the idea of document dynamics versus conflict dynamics. And that was a -- I'd not thought about dynamics being modified by either of those words. So, could you talk a little bit about, what does that mean to talk about document dynamics and conflict dynamics?

Megan Price: Absolutely. So, one of the things that we think about, and by think I mostly mean worry about the most on our team, is what about -- because we were primarily studying conflict. What is it about the conflict itself that is affecting our ability to measure the violence that we're trying to measure? And so, specifically, when I talk about conflict dynamics and documentation dynamics, what we see over and over again are these cases where the violence increases and the security situation worsens. And yet, reports of violence decrease, specifically because it becomes impossible to do that documentation work. It's too unsafe. And in no way is that a criticism of the groups who are doing this incredibly important and difficult work. It's just reality. It's just what happens during a conflict. And so, if all we're doing is relying on what we're able to see and record, at some of the most crucial instances of a conflict, we may get exactly the wrong idea because we haven't made use of our statistical tools to estimate what we don't know.
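
A toy calculation makes the point concrete. The numbers below are invented for illustration and do not come from any real conflict: they simply assume that as violence rises, the share of events that can be safely documented falls. In practice that share is unknown and has to be estimated.

```python
# Toy illustration with made-up numbers; not data from any real conflict.
true_events = [100, 150, 300]   # hypothetical true totals in three periods
coverage_pct = [60, 30, 10]     # hypothetical share documented, in percent

# What observers are able to record in each period.
documented = [t * p // 100 for t, p in zip(true_events, coverage_pct)]
print(documented)               # [60, 45, 30]

# The true totals rise while the documented counts fall.
truth_rising = all(a < b for a, b in zip(true_events, true_events[1:]))
reports_falling = all(a > b for a, b in zip(documented, documented[1:]))
print(truth_rising, reports_falling)   # True True
```

Read on their own, the documented counts would suggest the violence is easing, which is exactly the wrong conclusion in this made-up scenario.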

John Bailer: Cool. What a hard problem.

Richard Campbell: It is.

Megan Price: It's a hard problem, but statistics is sort of built for it. We just have to use our tools.

Richard Campbell: We talk a lot, especially those of us who teach journalism, about the whole distrust or -- that comes I think partly from the polarization, the fake news movement, the distrust of science, the distrust of data. Is this -- how does this change the way you think about what you do and what you actually do? How often does this come to play in your everyday life?

Megan Price: I don't know that fake news specifically has had a big impact on our work, but something that I think of as being related to that is this paradoxical reaction that we see people having to data, and graphs, and visualizations, and this kind of thing, is that they both think they must be true, whatever they are, and sort of accept them unquestioningly, and don't have any skepticism, and just assume that that is the story. And also don't really have a lot of understanding of how the data were collected and have, very often, have this sort of, "Oh, I hated statistics class and I don't want to dig deeply into what that means. I'm a little intimidated and a little... You know, I feel negatively about the way that's being used." That's often one of the challenges that we have to figure out how to walk, where we know that a quantitative result is going to carry a lot of weight in terms of evidence, but we also have to give our audience the tools to evaluate whether or not that weight is appropriate.

Rosemary Pennington: This brings me to a question I've been thinking a lot about, especially in relation to the talk I mentioned earlier. You brought up, I think towards the end of that short little talk, the idea of bias and data and how, with the tools that we have available, sometimes we can lose sight of the fact that sometimes data can be biased and that we can import those biases into our analysis. So, how would you suggest researchers or people who are doing analysis on maybe data sets that are touchy, or like the police brutality stuff that you talked about a little earlier, or human rights violations? How can we as researchers think about bias and sort of mitigate bias as we sort of import the data and analyze it?

Megan Price: It always has to come back to the data. And I think we always have to ask ourselves, how is this data collected and how is it generated? And those may or may not be the same thing. And related to those questions, what's missing from this data? And I think actually, police data is a really good example to use, because it has been in the news a lot more, and I think folks are starting to get a lot more skeptical about the way that data is collected and what it really can tell us. Because if you think about arrest data, on the one hand, hopefully, it accurately represents a complete picture of the arrests that police have conducted. But an arrest is not synonymous with a crime because some crimes lead to an arrest and some crimes don't. And whether or not that crime leads to an arrest is a very complicated process related to what the crime is, the characteristics of the person who perpetrated it, the characteristics of the police force in that area and their relationship to that community, and the local laws and policies in that community. And so, that's a lot less clear when what you're looking at is just a count of arrests. And it's kind of hard to necessarily untangle all of the processes that led to the data that you have, that you observe.

John Bailer: So, one thing we often ask guests when they come on Stats + Stories is about what students might do to prepare to do the kind of work that you do. So, what kind of advice, what kind of guidance might you give them?

Megan Price: Well, this is my own bias, with three degrees in statistics: taking statistics classes.

Rosemary Pennington: John likes that.

John Bailer: No argument here, Megan.

Megan Price: I do think that that is the best place to be pushed, to think hard about where your data come from and how they were generated. Take some programming classes. I did not take enough in grad school. I did a lot of my programming learning on the fly on the job, and I think I would've been a lot better off taking some programming classes. And then I would also say to whatever extent you can within your program, design a curriculum of electives that you're excited about. Think about the kinds of problems that you want to tackle and then go take those classes. And maybe they're journalism classes, maybe they're law classes, maybe they are public health classes, maybe they're economics classes. But you know, really branch out. School is the time to take some of those risks and to go sit in on something that just sounds interesting, even if it maybe doesn't seem like it's directly on your career path.

John Bailer: Talk about writing. I've read some of your essays, you're a very good writer.

Megan Price: Oh, well, thank you.

John Bailer: Where did you learn how to do that? How did that get folded into all the statistics classes you've taken over the years?

Megan Price: Well, I was a journalism minor in undergrad.

John Bailer: Oh, there you go. These guys are going to be tough to live with now, Megan. That explains a lot.

Megan Price: Sorry about that.

John Bailer: Oh, that's okay.

Megan Price: I have to also give some credit to my mother, who is also a journalist and a copyeditor, and my college roommate, who is an English major. She read everything I wrote, and to this day, when I proofread things, I can still see her writing in the margin. "So?" with the big question mark because I haven't made my point yet.

Rosemary Pennington: Well, Megan, thank you so much.

Megan Price: Well, thank you, guys. This was a lot of fun.

John Bailer: Indeed.

Rosemary Pennington: That's all the time we have for this episode of Stats + Stories. Stats + Stories is a partnership between Miami University's Departments of Statistics and Media, Journalism and Film, and the American Statistical Association. You can follow us on Twitter or iTunes. If you'd like to share your thoughts on the program, send your email to, and be sure to listen for future editions of Stats + Stories where we discuss the statistics behind the stories and the stories behind the statistics.