The Importance of Official Statistics | Stats + Stories Episode 116 / by Stats Stories


Chaitra Nagaraja is Associate Professor of Statistics at the Gabelli School of Business at Fordham University. Prior to joining Fordham, she was a researcher at the U.S. Census Bureau. She combined her various research interests with her love of history in a new book, Measuring Society, which explores the history and measurement of official statistics.

Full Transcript

Rosemary Pennington: There are things we think of as fundamental building blocks of our society. Some of them are easily quantifiable, and others not so much. Measuring society is the focus of this episode of Stats and Stories, where we explore the statistics behind the stories and the stories behind the statistics. I'm Rosemary Pennington. Stats and Stories is a production of Miami University's Departments of Statistics and Media, Journalism and Film, as well as the American Statistical Association. Joining me in the studio are regular panelists John Bailer, Chair of Miami's Statistics Department, and Richard Campbell, former Chair of Media, Journalism and Film. Our guest today is Chaitra Nagaraja, Professor of Strategy and Statistics at Fordham University. She's also the author of the book Measuring Society. Chaitra Nagaraja, thank you so much for being here.

Chaitra Nagaraja: Thank you.

Pennington: Could you describe what Measuring Society is about, and why you decided to write it?

Nagaraja: Much of my work has been on official statistics, which is stuff that governments produce to tell you about the economy or the society, and I also used to work at the Census Bureau. And I wanted to combine those things with my love of history to produce something that explains to a regular reader how these statistics are computed, why they are idiosyncratic in certain ways because of politics and historical impacts, and I wanted people to understand what it is they’re reading about when they read about unemployment and so forth in the newspapers and magazines.

John Bailer: So, what's one of the hardest things that we measure as a society using official statistics?

Nagaraja: I would say, among the things I focus on, the hardest are the heterogeneous ones, like price indices, for example. People buy all sorts of things. How do you combine all of that information into one number to say prices are increasing or decreasing? And that affects your cost of living, and your salaries and so forth. That I found to be the most complex set of operations to produce a single number.

Bailer: Can you give an example of how a price index is calculated?

Nagaraja: So, they start off by looking to see where people buy certain items. Then they’ll go visit those stores periodically and see how much your applesauce costs and they look at that same brand of applesauce and once that brand of applesauce is now missing from the store, they have to find a substitute. Then they have to also figure out ok, the quality of things changes, like your computer, your cell phone. All of those things change really quickly, how do you handle stuff like that? Or even fashion, which is something that changes very quickly as well. So, you can think about the number of products or number of types of goods and services you buy across an entire nation where prices are very different depending on where you live. So, the operation is really complex. Including interviewing people, physically visiting stores, and trying to combine all of that in some statistical way to produce the Consumer Price Index, for example.
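
To make the fixed-basket idea behind this concrete, here is a minimal sketch in Python of a Laspeyres-style index: price out one basket of goods at base-period prices and again at current prices, and take the ratio. The items, quantities, and prices are invented for illustration, and the sketch deliberately ignores the sampling, substitution, and quality adjustments Nagaraja describes.

```python
# A minimal sketch of the fixed-basket idea behind a price index.
# The goods, quantities, and prices below are invented for illustration;
# the actual CPI involves far more items, geographic sampling, substitution,
# and quality adjustments that this toy example ignores.

# Base-period basket: (quantity bought, base-period price).
basket = {
    "applesauce (jar)":  (4, 2.50),
    "gasoline (gallon)": (40, 3.00),
    "cell phone plan":   (1, 45.00),
}

# Prices observed in the current period for the same items.
current_prices = {
    "applesauce (jar)":  2.75,
    "gasoline (gallon)": 3.30,
    "cell phone plan":   47.00,
}

base_cost = sum(qty * price for qty, price in basket.values())
current_cost = sum(qty * current_prices[item] for item, (qty, _) in basket.items())

# Index = cost of the fixed basket now, relative to the base period (base = 100).
index = 100 * current_cost / base_cost
print(f"Price index: {index:.1f}")  # values above 100 mean prices rose overall
```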

Richard Campbell: Another thing that’s complex is the challenge, and you talk about this, of writing about technical concepts without using equations or specialized jargon. Tell us about how- tell us your challenges in doing that kind of work, and in writing your book.

Nagaraja: Yeah, this was the biggest challenge for me. That was the one rule the editor had: cannot include equations. I have one equation, but it's simple enough, the equation for poverty, that it was accessible to the general reader. I did find that a big challenge. I mostly decided to try to explain equations through history. To say, okay, for example, the poverty measure looks at how much food cost in 1963, and they roughly said about a third of people's budgets went to food. They multiplied that number by three, and they update it every year for inflation. So, I try to take that apart. Why did they use food as the basis? I did a little bit of historical work on that. And why use inflation? Well, prices change. So, I tried to set up the book so that as you read through the chapters, stuff from previous chapters gets included in the measures for later chapters, to give people a sense of intuition rather than give a formula. I would just say it was very challenging- challenging in a good way, for me. Most of my writing as a university faculty member has been technical.
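
A minimal sketch of that poverty-threshold logic, in Python, with invented numbers: the food budget and the cumulative inflation factor below are hypothetical placeholders rather than official figures, and the real thresholds also vary by family size and composition.

```python
# A minimal sketch of the logic Nagaraja describes for the official U.S.
# poverty threshold: take the cost of a basic food budget (circa 1963),
# multiply by three (food was roughly a third of family budgets), and
# carry it forward each year with inflation. The dollar amounts and the
# inflation factor below are invented for illustration, not official figures.

food_budget_1963 = 1000.00   # hypothetical annual cost of a minimal food plan
threshold_1963 = 3 * food_budget_1963

# Updating to a later year means scaling by cumulative price change (a CPI ratio),
# not re-pricing the underlying food basket.
cumulative_inflation = 8.5   # hypothetical CPI ratio between 1963 and today
threshold_today = threshold_1963 * cumulative_inflation

print(f"1963 threshold: ${threshold_1963:,.2f}")
print(f"Today's threshold (inflation-adjusted): ${threshold_today:,.2f}")
```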

Bailer: I found it interesting that you described what you were doing as thinking about official statistics like an investigative journalist. That you start with the who, what, where, when, how. I just knew that would bring a smile to Rosemary and Richard’s faces. It did for me as well. That was a great model for trying to pitch this.

Nagaraja: Yeah, I’ve found that in order to make it interesting and less about first this step is done, then that step is done- I started thinking about why did they turn out this way? If you look across different countries- to use poverty as an example again, in the U.S., we have a subsistence level of poverty measure. Whereas in Europe they look at more about whether you are able to function relative to other people in the society. And those are very different philosophies for approaching a measure. So, I got interested in that sort of thing, just to sort of explain why the U.S. has these particular formulas. It didn’t have to turn out that way. It could have turned out some other way. So, I felt like the journalist who, what, when stuff was helpful when trying to approach a way to explain those formulas.

Richard Campbell: So, in your training did you- I mean, you're a very good writer, and no offense to John, but a lot of statisticians aren't. So, no, John is actually a very good writer. So, what did you do in your training? What helped you become someone who could translate difficult, complex topics and equations into more accessible language?

Nagaraja: I would say two things. One, I read a lot of stuff outside of statistics, so I've used some of those as models. The second thing I would say is, when I was in my graduate program and I was preparing for my defense, some of my professors agreed to let me do a trial run. At the time it was quite brutal, because they sort of made comments about every slide. But I learned a lot from that in terms of how to present, and what to think about what the audience will know coming into my talk. And actually, related to that, I was a T.A. one semester for the MBA course at Penn, and for that course- essentially, there are so many MBA students- there are about four professors. They had the exact same content in terms of the slides, but they presented that information in very different ways. So, I sort of rotated. Every day, every class, I would go to a different professor's class and see how they brought their own personality into what was literally the same content in terms of what they showed on the screen.

Campbell: Very good.

Nagaraja: And that also showed me that there's not one way of making an explanation, and you should feel free to use your own personality, as opposed to trying to sort of standardize yourself, in a way. So, I would say those are the things that over the years I've picked up.

Campbell: Very good.

Pennington: So, for someone who's not an academic or who's not working at a think tank or as a journalist, why should they care how these official statistics are created and used?

Nagaraja: Well, one, they affect you. For instance, a lot of Social Security payments, or even cost of living increases in your salary, come from looking at how the Consumer Price Index, or the index made by the Bureau of Economic Analysis, changes over time. They also matter in terms of what your politicians tell you about how things are going well or poorly, and financial markets change a lot based on the unemployment numbers that come out every month and so forth. So, a lot of things around you are affecting what you do, even if on a personal level you aren't looking at those numbers and making decisions.

Bailer: So, just to follow up, these numbers do have a lot of impact when they come out, whether it's the unemployment rate or growth, and as you state in your book and as we recognize, there are a lot of assumptions, there's sampling involved in this; this number is constructed in some way, and part of that construction is uncertainty and variability. But often when we see this reported, it's reported in the context of a single number, as if it was brought down from a mountain on a stone tablet. I'm wondering about the reporting on uncertainty and variability- is there kind of an over-interpretation of swings in these point estimates?

Nagaraja: I think there is. I think in general humans don't like uncertainty. If you watch any news program, they bring on experts to predict something, and in order to be called back a second time you have to come across as very sure of your answer. And I do think that extends to numbers. If people see a number, it seems like it's more objective- what I was trying to show in my book is that numbers are actually not objective; we shouldn't think of them that way at all, even though they look that way because they are solid as opposed to fuzzy. So, I don't think it's necessarily a media problem, I think it's just a human problem. I think it's really hard to think about how uncertain things are, even if you do give a confidence interval- like recently in an Economist article they had confidence bands on a graph, which surprised me. But you know, that means the journalists had enough knowledge to put a confidence interval in their graph, but it also means you assume that the reading public has that same level of numeracy, and I'm not really sure that's always the case.

Pennington: You're listening to Stats and Stories, and today we're talking with Chaitra Nagaraja, Professor of Strategy and Statistics at Fordham University. I want to go back to this issue you're raising- sort of, I think, around the issue of journalism- and you've mentioned the issue of the expert going on, and if you communicate uncertainty maybe you're not going to be asked back, and the issue of how you present numerical information for a reader. What have you found frustrating in the coverage of things like the unemployment rate or poverty levels? What do you think journalists could do better when reporting these statistics?

Nagaraja: I have read that journalists don’t really get to pick their headlines.

Pennington: We don’t generally.

Campbell: We don't, that's right.

Nagaraja: Those are kind of- especially with medical studies, for example: coffee is wonderful for you, coffee is terrible for you, don't vape, you know, that kind of thing- they're supposed to catch your eye, but I feel like they make you think that the results are more definitive than they are. So, you would probably know better about whether or not journalists get to pick their headlines. But I feel like that's one of the biggest problems- the language used makes it seem like scientists came up with this result and it's a definitive fact, as opposed to, you know, we did this research, this is what we found. But obviously, it's always possible that results could be slightly different the next time around.

Campbell: When you teach about uncertainty do you have tips or things that would explain this better to both your students and to journalists?

Nagaraja: I would say- actually I do. There is a good article in the New York Times that was published during the last presidential election where they gave a certain data set about election poll results to four different groups of people, all qualified to analyze the data. And they go through, in a lot of detail, about what assumptions they made about figuring out who likely voters are and so forth, and they got different results. And I show students that article because it shows that even if you are a qualified researcher, and not acting in a malicious manner, you could still have disagreements about how to analyze and how to think about the fact that you are making certain assumptions. And assumptions are one expression of uncertainty.

Bailer: You know, I was interested in your comment about poverty indices and how they differ between the U.S. and Europe. We've had guests on previously from the U.N. Statistics Division- Stefan Schweinfest was a guest on Stats and Stories previously- and we were talking about what the U.N. is trying to do in tracking official statistics from around the world. I was just trying to imagine how difficult it is to think about how you would compare countries in terms of these indices, given these very disparate constructions. How is that done?

Nagaraja: I think it's done multiple ways. I know the International Labour Organization tries to do something like that with unemployment. For example, in many countries they look at unemployment and focus on a population that's 15 and older, but thanks to labor laws in the U.S. we use 16 and older. So, they try to use, for instance, demographic information to make an adjustment so these numbers are comparable. But the more complicated the statistic, the harder it becomes. There's also a project, the World Wealth and Income Database, done by Thomas Piketty and some other people from France, where they try to do the same sort of thing for inequality levels; trying to get external information to adjust estimates and so forth to get estimates that are comparable across countries. I think it's extremely hard to do, because most of these definitions, in order to actually be implemented in practice, are very complicated, and there's no way around that. But I think what they attempt to do mostly is to find other information that you could use to adjust the estimates that are produced.
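
Here is a toy illustration in Python of the general kind of demographic adjustment described above: using separate population estimates to strip the 15-year-old cohort out of a 15-and-older unemployment figure so it roughly matches a 16-and-older definition. All numbers are invented, and the real harmonization work done by agencies such as the ILO is far more involved than this.

```python
# Toy illustration of a demographic adjustment for comparability:
# a country reports unemployment for ages 15+, but we want a figure closer
# to the U.S. definition (ages 16+). All numbers below are invented.

# Reported totals for ages 15 and older (thousands of people).
unemployed_15_plus = 2_100
labor_force_15_plus = 33_000

# Hypothetical estimates for 15-year-olds alone, e.g. from demographic data.
unemployed_age_15 = 60
labor_force_age_15 = 500

# Remove the 15-year-old cohort to approximate a 16+ definition.
unemployed_16_plus = unemployed_15_plus - unemployed_age_15
labor_force_16_plus = labor_force_15_plus - labor_force_age_15

rate_15_plus = 100 * unemployed_15_plus / labor_force_15_plus
rate_16_plus = 100 * unemployed_16_plus / labor_force_16_plus
print(f"Unemployment rate, ages 15+: {rate_15_plus:.1f}%")
print(f"Unemployment rate, ages 16+ (adjusted): {rate_16_plus:.1f}%")
```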

Campbell: So, in your book you talk about working with other professionals outside statistics: ethnographers, translators. You've also talked about your interest in history, and some of my favorite parts of your book are when you go back and tell how they used to collect data, and how we do it today. Can you talk about the benefits of learning outside of your field and what meeting with those other professionals has brought to the table for you?

Nagaraja: Yeah, when I was working at the Census Bureau I worked in the research division, and it's not just statisticians; it's psychologists, people who did ethnography and anthropology, and so forth, and people who looked at user experience- how well people can navigate the websites and forms and things like that. I felt like I learned a lot about how all of those things affect people being encouraged to fill out a form, or give good information on the form. To give an example, census forms are printed in multiple languages, not just in English. So, there's a lot of effort put in to make sure that the translated version reflects what the Census Bureau wants to collect from people and that people can understand what's happening. They do a lot of testing around that. And I feel like all those sorts of things gave me a more complete picture of how data collection is not just statistical technical issues; there are a lot of other things involved, and I feel like I learned a lot from them just hearing them talk about their work.

Campbell: Very good.

Bailer: So, in the process of working on your book I’m sure you learned some things that are very new and different for you, what was the biggest surprise that you had in working and doing the research for your book?

Nagaraja: I underestimated how much I like reading government documents.

[laughter]

Bailer: That was not the answer I expected.

Nagaraja: There are a lot of really fascinating things that I unearthed- because much of it is digital now, there are a lot of digital archives, international archives, all that kind of thing- and I went down a lot of wormholes just reading about what people did and why they did certain things. I try to include some of those fun facts, from budget proposals and things like that. But I would say, on a more serious note, one of the things that I really got a better appreciation for was how much effort many people over the years took to try to do their best to improve society through data collection, and the idea that in order to have a functioning society, a functioning government, and democracy, you need these numbers to be able to represent your society and then make decisions about it.

Campbell: Very good. Chaitra, you would have made a good investigative journalist, by the way, given your interest in data and documents. In your postscript you talk about the 2010 census and the coming census. Do you have some concerns about the 2020 census in terms of how it's going to be executed, and some things that we should be anticipating that maybe we haven't faced before in the challenge of gathering all this data?

Nagaraja: I mean- I guess the looming one is the fact that there was the citizenship question issue. Even though it’s not on the census form, it still caused enough controversy that people who would normally feel a little bit uncertain about filling out such a form, maybe feel even more uncertain. And those are exactly the kinds of people you might want to include in a count, right? So that’s probably the main one. I have no qualms or issues about the Census Bureau itself in terms of trying to do its best to do its data collection. But, given the fact that there were some reductions in how much testing they could do for certain things, given budgetary constraints, there’s some unknowns, I suppose. Do you think that was a little too—

Bailer: No, no, that's good. That's important. I think we've also heard some things about the new internet response component of the upcoming survey, so it's clear that the census has to be pretty serious about the research that it does before it launches, and I think the coverage of it is pretty critical. One of our previous guests, Mark Hansen, and colleagues of his have been talking about Newscounts, helping these issues get good airing and good coverage. So, I think these are really interesting issues. Speaking of official statistics, we had Andreas Georgiou as a guest on Stats and Stories previously. And you know, if you don't think that people care about it- there's the story about how a debt estimate associated with a country led to some serious pushback from the government, and the importance of these agencies presenting information that the world needs. It's not just something for internal use, but for external use as well. Have you followed any of that story? Have you thought about the importance of the independence of these agencies?

Nagaraja: I think the fact that they operate independently is key, because there is no point in making an estimate if there is no trust in that estimate, and maintaining a sense of independence is critical for that. If people think that you can just fudge the numbers in the way that you want, then there's really no point to producing them in the first place. And I do think the Census Bureau- and all the federal statistical agencies- are filled with employees who are really committed to trying to be as objective as possible in producing their statistics, to say this is our representation of what's happening in the country. Any actions that diminish that are hard to come back from, right? Once you break the trust it's harder to rebuild it; that takes a really long time.

Bailer: So, you’ve worked in official statistics and you’ve worked now in an academic setting, if you’re advising students that are interested in working in the world of official statistics, what’s some of the preparation you would recommend for them?

Nagaraja: Take a survey sampling class. That was key, I think. It's not necessarily one of the hot topics, you know, like machine learning or any of that. But that is the number one thing to prepare, I would say, in terms of getting enough knowledge, or having a sense of exactly how difficult it is to implement some of these kinds of operations.

Bailer: How about political science, or sort of the breadth as well as the technical stuff?

Nagaraja: I think it depends on what you're planning to do in official statistics. I guess if your work is mostly technical, obviously it would help to get that context, but maybe on the day-to-day job that wouldn't really be relevant. If you're in more of a research division, that probably would be helpful, because then you can sort of have a bigger picture about what's happening. I will say though, to your point, that if you do have a sense of agency-specific history, like Census Bureau history or Bureau of Labor Statistics history, that would give you some information about how people have approached and solved problems in the past, because some of these things have always come up, right? In the beginning, like I say in my book, people went on horseback and collected this information by going from house to house. Then in the 1960s they started using mail forms. That was a technological innovation of sorts. Even using an adding machine as opposed to hand counting things, that's a technological innovation, and yes, looking back from today, obviously that change would happen. But at the time, you know, this was new technology- would it work well? Would it be accurate? Those are all questions. How would you implement a new technology and check to see that it worked okay? All of those types of questions keep reappearing. So, agency-specific history might be helpful in trying to see how people solved problems in the past.

Pennington: Well Chaitra, that’s all the time we have for this episode. Thank you so much for being here.

Nagaraja: Thank you so much for having me.

Pennington: Stats and Stories is a partnership between Miami University's Departments of Statistics, and Media, Journalism and Film, and the American Statistical Association. You can follow us on Twitter, Apple Podcasts, or other places you can find podcasts. If you'd like to share your thoughts on the program, send your emails to statsandstories@miamioh.edu or check us out at our website at statsandstories.net, and be sure to listen for future editions of Stats and Stories, where we discuss the statistics behind the stories, and the stories behind the statistics.