Counting Costs of Conflict | Stats + Stories Episode 380 / by Stats Stories

Xiao Hui Tai is an assistant professor in the Department of Statistics at the University of California, Davis. Her research interests include the use of non-traditional data sources to study problems in data-scarce settings, with a current focus on global public health and estimating the consequences of violent conflict. She is the author of the Significance article “Counting the True Cost of War”.

Episode Description

When it comes to studying conflict, there is obvious data to examine: spending on arms, the number of people killed or injured, and the amount of land won or lost. What's harder to track are the indirect effects of conflict, the ways it produces deaths over time, or its impacts on public health. Researchers are trying to find ways to account for the sometimes less obvious impacts of conflict, and that's the focus of this episode of Stats and Stories with guest Xiao Hui Tai.

Check out the full article in Significance

Timestamps

Forms of conflict and focus on Afghanistan (2:34)
Indirect effects of conflict: social, economic, health, and intergenerational impacts (3:42)
Challenges of measuring conflict impacts and the role of statistical infrastructure (5:52)
Difference between refugees and internally displaced persons (IDPs) (7:27)
Data sources for studying conflict: media reports and mobile phone metadata (8:59)
Using mobile phone data to track displacement (11:07)
How displacement is defined and measured in the study (12:56)
Study period, findings on immediate and persistent displacement after conflict (15:52)
Differences in displacement based on type of violence and geography (17:26)
Surprising findings: anticipatory reactions to violence (20:07)
Applicability of methods to other conflict zones such as Ukraine (22:08)
Future projects: conflict and seasonal labor migration related to opium poppy cultivation (25:48)

Transcript

Rosemary Pennington

When it comes to studying conflict, there are obvious data to examine: spending on arms, the number of people killed or injured, and the amount of land won or lost. What’s harder to track are the indirect effects of conflict, the ways it produces deaths over time, or its impacts on public health. Researchers are trying to find ways to account for the sometimes less obvious impacts of conflict, and that’s the focus of this episode of Stats + Stories, where we explore the statistics behind the stories and the stories behind the statistics. I’m Rosemary Pennington. Stats + Stories is a production of the American Statistical Association in partnership with Miami University’s Departments of Statistics and Media, Journalism, and Film. Joining me, as always, is regular panelist John Bailer, emeritus professor of statistics at Miami University. Our guest today is Xiao Hui Tai, an assistant professor in the Department of Statistics at the University of California, Davis. Her research interests include the use of nontraditional data sources to study problems in data-scarce settings, with a current focus on global public health and estimating the consequences of violent conflict. She’s the author of the Significance article, “Counting the True Cost of War,” which she’s here to talk about today. Thank you so much for joining us today.

Xiao Hui Tai

Thank you for having me. It’s great to be here.

Rosemary Pennington

How did conflict become something you were interested in as a researcher?

Xiao Hui Tai

I would say it was during my postdoc that I got interested in this topic. I was a postdoc at the School of Information at UC Berkeley, where I was working with a professor who had access to these different types of data. His name is Josh Blumenstock, who is a co-author on this particular paper that we’re going to be talking about. He works mainly in development economics, but he also had access to mobile phone data in conflict-affected countries. As we’ll talk about, that became a very interesting source of data to work with in these sorts of settings. And as you already alluded to, it’s very difficult to get good data in such settings. The consequences of conflict are important to document, and it’s not an easy thing to do, and so that’s kind of what got me interested in this topic.

John Bailer

In your paper, you mentioned some striking summaries of just how common conflict is, right? So, can you, just as a little bit of background, tell us: what are some of the forms of conflict that you might be investigating?

Xiao Hui Tai

Conflict can take many forms. There are the all-out wars that you hear about so often in the news, but there are also many types of civil conflict within individual countries that take place between armed groups, between armed groups and the government, and social conflict over land disputes—things like that. The focus of this particular paper was the conflict in Afghanistan that, as you all know, went on for several decades. Mainly, the conflict involved the Taliban, who were insurgents, and government-related forces. This is just one type of conflict that can be analyzed. There are, of course, many others.

Rosemary Pennington

I’m going to build a little bit on John’s question and simply ask: I think we understand what the direct impacts of conflict are, right? Casualties, communities being erased. Your paper is examining the issue of people who have been displaced from their homes. But before we get into that, I do wonder: what are some of the other kinds of indirect effects of conflict, and what are the struggles in measuring them?

Xiao Hui Tai

There can be many kinds of social, economic, and institutional effects that are not as easy to measure and document as the direct impacts, which are casualties and physical destruction. For example, even if we talk about casualties, there are direct casualties due to, say, bombings or physical fighting, but there are also indirect casualties that come later—for example, due to destruction of physical infrastructure or health infrastructure. If hospitals are damaged, people can’t seek health care, and you might see these effects later down the line. Also, if water supplies are disrupted or people get displaced, they’re in vulnerable situations in which other types of diseases become more rampant. These are indirect effects that you won’t necessarily see immediately. Apart from death, there are also impacts to education. People’s schooling gets disrupted, and these have downstream effects as well, which could even be intergenerational. If mothers are less well educated, then you see things like higher infant mortality rates in the next generation, and so forth. These kinds of impacts are much less easy to quantify compared to direct impacts.

John Bailer

In looking at these direct versus indirect impacts, I was thinking that measurement of what’s going on in a society implies that there’s some infrastructure for doing that. National statistical offices do that. And I’m wondering if, in a lot of the countries where conflict is being observed, there’s also not necessarily—even before the conflict emerged—a high degree of infrastructure to support this. Is that really amplifying the challenge?

Xiao Hui Tai

That’s actually a really good point. A lot of these countries that are affected by conflict also tend to be lower- and middle-income countries that don’t have very well-developed statistical infrastructure. Even in the absence of conflict, conducting surveys or censuses is very time-consuming and expensive, so it’s very difficult to collect good data even without conflict. Of course, this is exacerbated during times of conflict, where physical as well as statistical infrastructure is destroyed. There’s physical insecurity, which makes it difficult for enumerators to go out and collect data. And also, during conflict, there is displacement, which means people are on the move. That makes them difficult to track down and collect data from. In general, these are very tricky settings in which to collect data.

Rosemary Pennington

You mentioned the issue of displacement. In your research, you were looking at internally displaced persons. Before we dive into that, I wondered if you could explain: what’s the difference between a refugee and an internally displaced person?

Xiao Hui Tai

Forced displacement means that people are forced to flee their homes or their places of usual residence. People can be displaced outside their own countries, which makes them refugees, or they could be displaced within the borders of their own country, which makes them internally displaced people. There are different types of reporting available for these different populations. If you’re crossing a border, typically there is some type of administrative reporting that is necessary, and that, in some sense, makes refugees more visible compared to internally displaced people. You often read about refugees in the news, whereas you don’t hear as much about internally displaced people because there’s not that kind of administrative reporting data. You might just go to a relative’s house in a different place in the country, and there’s no administrative reporting that is necessary. Typically, you can collect data either using surveys, or through limited reporting—for example, if you register for services or benefits. But of course, these data tend to be quite spotty.

John Bailer

Your study ends up using multiple data sources, not just the data source for where individuals are located—and we’re going to ask you about that in a sec—but also you need to know where conflicts occurred and when conflicts occurred. Can you talk a little bit about that piece first?

Xiao Hui Tai

Of course. There is much better data on conflict events these days. There are a few particular data sources that empirical conflict researchers tend to use. The one that we use in this paper is collected by the Uppsala Conflict Data Program. The source of these data is media reports. They have a team of people scouring media reports for information about when individual violent events occur. These could be a single air strike, a suicide bombing, and so forth. These are individual-events data, and we have information about specific locations in the form of GPS coordinates, as well as the time that the event occurred. With these kinds of granular, individual-level conflict-events data, we’re able to track conflict at a much finer granularity, both in time as well as in space.

John Bailer

Let’s get to your data. I’m realizing that I could be contributing to your dataset now, just with a device that I’m holding in my hand. The listeners can’t see it—I’m holding up my cell phone. Apparently, this is telling a lot of people a lot of things about me, and I don’t necessarily know. I actually have a sense of that, but still. Talk a little bit about the kind of data that phones give you, and how that led you to start trying to answer some questions about internal displacement.

Xiao Hui Tai

Basically, when you make phone calls, send a text message, or request a data packet from your phone, what happens is that—assuming you’re not on Wi-Fi—you connect to the cell tower that is closest to you. If we have the locations of these cell towers, then what your phone is collecting is a dataset of timestamps and the locations of cell towers whenever you make such a transaction with your phone. If we interpolate and remove noise from these data, then we can approximately track a person’s trajectory over time. In the data that we have, we call this mobile phone metadata. It’s basically collected for billing purposes. We have these data in Afghanistan over an approximately four-year period. They’re all anonymized, but we are able to track about 10 million subscribers. We have about 20 billion such transactions. Based on those, we interpolate their locations and aggregate them in a manner that helps us with our analysis.
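
As a rough illustration of the idea described above, and not the study's actual code, the sketch below turns a list of phone transactions into one location per subscriber per day. The tower names, district names, and the `daily_modal_district` function are hypothetical. Taking the most frequently seen district each day is a simple stand-in for the interpolation and noise removal mentioned in the interview, whose details are not given.

```python
from collections import Counter, defaultdict
from datetime import datetime

# Hypothetical lookup from cell tower ID to district (all names are made up).
TOWER_DISTRICT = {"T1": "District A", "T2": "District A", "T3": "District B"}


def daily_modal_district(transactions, tower_district):
    """Map (subscriber, date) to the district seen most often that day."""
    counts = defaultdict(Counter)
    for subscriber, timestamp, tower in transactions:
        district = tower_district.get(tower)
        if district is None:
            continue  # skip towers whose location is unknown
        counts[(subscriber, timestamp.date())][district] += 1
    # Keep only the most frequent district for each subscriber-day.
    return {key: c.most_common(1)[0][0] for key, c in counts.items()}


# Each record: (anonymized subscriber ID, timestamp, tower ID).
transactions = [
    ("s1", datetime(2015, 3, 1, 8, 0), "T1"),
    ("s1", datetime(2015, 3, 1, 12, 30), "T2"),
    ("s1", datetime(2015, 3, 1, 19, 0), "T3"),
    ("s1", datetime(2015, 3, 2, 9, 0), "T3"),
]
locations = daily_modal_district(transactions, TOWER_DISTRICT)
```

In this toy example, subscriber `s1` is assigned to District A on the first day (two of three transactions) and District B on the second.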

John Bailer

Okay, so now for each individual—just the few that you have, the only 10 million that you’re tracking—you’re going to infer location based upon their proximity to a tower. Then you need to define out-migration for this person. You want to define for each of these people: have they been displaced, in part in response to some event? Can you tell us a little bit about how you defined whether or not someone had been displaced?

Xiao Hui Tai

This uses an algorithm that is based on previous work. You can think of it as a scanning algorithm. Basically, you have these intermittent locations of a person. Perhaps they are in a single location for, say, a week, and then we see them in a different location—say, on a Saturday and a Sunday—and then they come back on Monday, for example. In this particular paper, we don’t want to consider short displacements like that. Maybe you’re just going away for the weekend. What we’re looking for are longer stretches of time, roughly like a week-long period. If you’re going somewhere else for a week, then it’s more plausible that you’ve been displaced, as opposed to just taking a trip on the weekend. This is the way we smooth out the noise. We do this for every individual, and then we aggregate it to what we call a district level. A district in Afghanistan is like a county in the US. Our analysis is conducted on this district level. When we talk about out-migration, we refer to a person being in a different district or county over roughly a week-long period. That’s how we define migration in this particular paper. Of course, there’s not really a standard definition of migration or displacement that people use. For example, in some survey data that was collected in Afghanistan, the survey question about displacement was just, “Have you been displaced in the last year from your home?” So it’s not like there’s a standardized definition that we could work with. We came up with something that seemed reasonable.
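
The week-long rule described above can be sketched as a simple run-length scan. This is a minimal illustration under stated assumptions, not the algorithm used in the paper: it assumes exactly one observed district per day, a known home district, and a hypothetical `min_days` threshold of seven. The function name `find_displacements` is invented for this example.

```python
def find_displacements(daily_districts, home, min_days=7):
    """Return (start_index, length) for each run of at least min_days
    consecutive days spent outside the home district."""
    runs = []
    start = None  # index where the current away-from-home run began
    for i, district in enumerate(daily_districts):
        away = district != home
        if away and start is None:
            start = i
        elif not away and start is not None:
            if i - start >= min_days:
                runs.append((start, i - start))
            start = None
    # Handle a run that is still ongoing at the end of the sequence.
    if start is not None and len(daily_districts) - start >= min_days:
        runs.append((start, len(daily_districts) - start))
    return runs


# A weekend trip (2 days in B) followed later by an 8-day stay in B.
days = ["A"] * 5 + ["B"] * 2 + ["A"] * 3 + ["B"] * 8 + ["A"] * 2
displacements = find_displacements(days, home="A")
print(displacements)  # [(10, 8)]
```

The two-day trip is ignored, while the eight-day stay starting on day index 10 is flagged. Real phone records are intermittent, so a practical version would also need a rule for days with no observation.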

Rosemary Pennington

What time period were you looking at? And I guess the question we probably both have is: what did you find?

Xiao Hui Tai

We were looking at 2013 to 2017. Basically, what we’re interested in is: if there’s a violent event happening in a district, do people react to it immediately, or how soon do they react? As well as how long do the effects last? What we found was that there is an immediate impact of displacement due to these individual violent events. It peaks within about a week or so. The magnitude of the effect is roughly that people are about 4% more likely to leave a district with a conflict event. These effects are persistent. Even three to four months after such an event occurs, we still see a higher likelihood of people being outside the district in which the event occurred, although there is some decrease in magnitude by that time.
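
To make the kind of comparison described above concrete, the sketch below expresses a district's out-migration rate on days around an event as a relative change from a pre-event baseline. The numbers are invented for illustration and are not results from the study, the `relative_change` function is hypothetical, and the interview does not describe the statistical model the authors actually fitted, so this is descriptive arithmetic only.

```python
def relative_change(rates_by_day, baseline_days):
    """Relative change in out-migration rate versus the mean baseline rate.

    rates_by_day maps days relative to the event (0 = event day) to the
    share of residents observed outside the district on that day.
    """
    baseline = sum(rates_by_day[d] for d in baseline_days) / len(baseline_days)
    return {d: (rate - baseline) / baseline for d, rate in rates_by_day.items()}


# Made-up rates: a 10% baseline well before the event, a peak about a
# week after it, and a smaller but still elevated rate three months later.
rates = {-30: 0.100, -29: 0.100, 0: 0.102, 7: 0.104, 90: 0.101}
changes = relative_change(rates, baseline_days=[-30, -29])
print(round(changes[7], 3))  # 0.04
```

With these invented values, the rate one week after the event is 4% above baseline in relative terms. The baseline window is placed well before the event because, as discussed later in the conversation, people may begin leaving before an event occurs.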

John Bailer

I’m curious if—maybe this was not possible because the observations were anonymized—but if there are differences between different demographic groups. Maybe that’s a future project, but I’m curious if that was even possible.

Xiao Hui Tai

There aren’t any “know your customer” type requirements with these data, so we don’t have demographic information on individuals. But we are able to do some work looking at differences in effects based on the type of violent event or the type of district where the event occurred. There are some pretty interesting results. For example, if the violent event was perpetrated by the Islamic State as compared to the Taliban, we see a much larger displacement response. That makes sense qualitatively because the Islamic State is seen as more like a terrorist organization. They do vivid displays of violence and so forth, whereas the Taliban is seen more as a legitimate governing force in some parts of society. We also see effects based on location. In what we call provincial capitals—which you can think of as state capitals—these are the local seats of government. There’s a higher presence of security forces, more aid being delivered, and they tend to be more urban areas. When these provincial capitals experience a violent event, the displacement response is smaller compared to if violence happens in a more rural area. Another interesting thing is that if violence happens in a rural area, people are more likely to flee to these provincial capitals. These are some of the interesting things we were able to do because of the granularity of the data we have.

Rosemary Pennington

You had so much data to work with. I wonder if, during your analysis, there were things that you found particularly surprising or interesting.

Xiao Hui Tai

The most surprising thing we found was that there’s this anticipatory reaction to violence. If you look at a district, even before the violent event actually happened, you see people leaving beforehand. That seemed a little surprising to us. When we dug into it a little bit more, we found news reports and other qualitative evidence saying that government forces actually warned civilians beforehand, before there was a large event that was going to take place. The Taliban also would inform the public through things called night letters, where they post messages on people’s doors informing them that there was going to be some violent event. Another thing is that this dataset only records fatal violent events. Only when there’s a fatality is that event going to be reported. Other types of events, like troop movements or other types of skirmishes, would not be in this dataset. Another possibility is that there are all these other things happening immediately before an actual event happens that are not being picked up in our data. Those are some reasons why we might observe that anticipatory effect.

John Bailer

I was curious about applying these ideas—the tools that you develop—to other conflict areas. I was thinking about the war in Ukraine and certainly the internal displacement associated with that. Do you imagine this would generalize immediately to another country’s context?

Xiao Hui Tai

The methods, I think, would generalize, but the findings—the specific substantive findings—might be more relevant to countries experiencing chronic conflict, such as Afghanistan: a lower-income context, or a Muslim-majority country, and so forth. In terms of the methods, I think they could generalize. A question that a lot of people have is: how easy is it to get access to these data, and what are some other options for getting similar data in a country such as Ukraine? There are many different avenues to get these types of data. Large tech companies like Google, Facebook, Baidu, and so forth—during COVID—started releasing aggregate mobility data to track people’s movements related to stay-at-home orders and so forth. So there are more publicly available sources that track movements. There’s also a dataset now from a mobility aggregator called Veraset. What this company does is aggregate location information collected from different apps. Whenever you let an app access your location, those locations are shared with aggregators. They’re able to pinpoint your location over time in a similar way to what I described with mobile phone metadata. This company is selling these types of aggregated data, and I’ve seen it being used in Ukraine—not to answer the specific question about internal displacement, but to look at the effect of warning systems on people’s subsequent evacuation plans and so forth. Another interesting example I saw in Ukraine was people using satellite imagery of cars to see how people were being displaced within the country. People saw that during the start of the war, people were going west. On the west side of the country, there were many more cars observed in satellite imagery compared to the east. There are many innovative things that people are trying to do with respect to measuring these types of movements.

John Bailer

I wish that you would be put out of business in terms of working on conflict, but I fear that you won’t be. What’s next for you? What’s the next project that you have a passion for exploring?

Xiao Hui Tai

There’s an extension of this particular paper that we’ve been working on, and hopefully that should be wrapping up soon. It looks at the effect of conflict on seasonal migration in Afghanistan. This seasonal migration is labor migration related to agricultural work. More generally, it’s looking at labor market effects of armed conflict. Specifically, we look at the opium poppy cultivation industry in Afghanistan. Afghanistan is one of the world’s largest producers of opium poppy, which is the raw material for heroin. A lot of rural farmers are involved in this industry, and it’s very labor intensive. During the harvest, a lot of people go into these areas to help out. We use these same types of mobile phone data, and we use satellite imagery to determine the timing of the harvest. If we look at mobile phone patterns during the time of the harvest, we can see if conflict changed that type of movement. That’s another interesting project we’ve been working on in this space.

Rosemary Pennington

Well, thank you so much for being here. That’s all the time we have for this episode.

Xiao Hui Tai

Thank you for having me.

Rosemary Pennington

Stats + Stories is a partnership between the American Statistical Association and Miami University’s Departments of Statistics and Media, Journalism, and Film. You can follow us on Spotify, Apple Podcasts, or other places where you find podcasts. If you’d like to share your thoughts on the program, send an email to statstories@amstat.org, or check us out at statsandstories.net. Be sure to listen for future editions of Stats + Stories, where we discuss the statistics behind the stories and the stories behind the statistics.