Statistics History Chronicles | Stats + Stories Episode 298 / by Stats Stories

Chaitra Nagaraja is a Senior Lecturer at the University of Exeter. Her research interests are primarily in measurement, particularly macroeconomic and socioeconomic indicators, time series, and the history of statistics. Prior to joining Exeter, she was a faculty member at the Gabelli School of Business at Fordham University in New York City, where she wrote the 2019 book Measuring Society, and a research mathematical statistician at the U.S. Census Bureau, focusing on the American Community Survey. The book is a history of US official statistics like unemployment, inflation, and poverty. In addition to her university research and teaching, she is the chair of the American Statistical Association’s Scientific and Public Affairs Advisory Committee, a member of the Royal Statistical Society’s History of Statistics Section, and the book review editor for the International Statistical Review. She also recently accepted a co-editorship position for the new history of statistics column in CHANCE magazine.

Episode Description

The history of statistics is filled with interesting facts about the development of the field and stories of the people who helped shape it. A new column at CHANCE magazine will explore the history of stats, which is the focus of this episode of Stats+Stories with guest Chaitra Nagaraja.

Full Transcript

Rosemary Pennington
The history of statistics is filled with interesting facts about the development of the field, and stories of the people who helped shape it. A new column at CHANCE magazine will explore the history of stats. And that's the focus of this episode of Stats and Stories, where we explore the statistics behind the stories and the stories behind the statistics. I'm Rosemary Pennington. Stats and Stories is a production of Miami University's departments of statistics and media, journalism and film, as well as the American Statistical Association. Joining me is regular panelist John Bailer, emeritus professor of statistics at Miami University. Our guest today is Chaitra Nagaraja, a senior lecturer at the University of Exeter. Her research interests are primarily in measurement, particularly macroeconomic and socioeconomic indicators, time series and the history of statistics. In addition to her university research and teaching, she's the chair of the American Statistical Association's Scientific and Public Affairs Advisory Committee, a member of the Royal Statistical Society's History of Statistics Section, and the book review editor for the International Statistical Review. She also recently accepted a co-editorship position for the new history of statistics column in CHANCE magazine, "History Chronicles." Thank you for joining us here on Stats and Stories again.

Chaitra Nagaraja
Thank you very much for having me.

Rosemary Pennington
How did History Chronicles get started?

Chaitra Nagaraja
Well, actually, one of the new co-editors of CHANCE magazine, Wendy Martinez, was part of the ASA history of statistics interest group, and she's the one who contacted Penny Reynolds, who's my co-editor and is also involved in that group. And she contacted me. The reason why we know each other is that we had started talking about how to bridge the Royal Statistical Society history of stats group and the ASA, American Statistical Association, group, to see what we could do jointly.

John Bailer
So what got you interested in history in the first place in terms of the discipline?

Chaitra Nagaraja
I have always loved history. And as an undergraduate at the University of Chicago, Stephen Stigler was my professor. And honestly, I've always wanted to be like him. So that's been my sort of goal in life since then.

John Bailer
Have you and Penny talked about what you hope this column does, what work it does for the education of history around stats? Have you sat down and said, here's where we need to be if we think this has been successful?

Chaitra Nagaraja
I think we're still experimenting, since the column has just started. And we're trying to be open in the sense of trying to find history in many different kinds of places, to try to make it a little bit global. So we have both the columns, and Penny has put together different dates that are important in statistics history, and we'll possibly include, you know, some voices from prominent statisticians, as a way to kind of put into perspective statistical ideas and how they've traveled through time. I think a lot of times, at least for academics, when you read papers, you have the literature review part of it. But that really doesn't do much justice in terms of thinking about statistics as a field, which has its own sort of philosophies and motivations that don't necessarily appear in a dry academic article.

John Bailer
Okay, so why is it important to have an understanding of the history of statistics?

Chaitra Nagaraja
I think a lot of times when I read articles that are talking about algorithmic bias, and various ways that algorithms can be sexist or racist, or, you know, harm certain groups of people, much of that boils down to: what was your data? How was it collected? And a lot of those questions depend on historical ways that people were thinking about something, and either it's sort of stuck, as in, like, a formal government policy, and then it becomes difficult to dislodge. You can look at poverty measurement as an example of that in the United States. Or even thinking about just a concept. For example, back in the 1800s, there was a lot of work on, you know, thinking about how to take something that is a physical science and port those ideas onto people, especially through astronomy. But that is a very specific way of thinking about people. People are, you know, fuzzy characters, not in the same way; you can't think of us as physical constants and things that stay the same through time. So trying to port ideas from one type of science to another really factored into how it motivated their own research and how statistical ideas came about. And I think following those through can help us think about how to approach statistics now, in that it's not just a collection of tools, but rather a way of thinking.

John Bailer
So you mentioned that one of the things that you'll put together is some of the highlights and milestones. And I saw that in one of the first columns that you had contributed to CHANCE, you had mentioned some of these highlights and milestones. Would you want to share some of your favorites?

Chaitra Nagaraja
Oh, honestly, I can't remember any of them together. So I would have to ask Penny for that, because she's the one who sort of collated all of those. So I will say 1790, because that was the first US Census. I think that would be a date that I personally know, I think is a good one.

John Bailer
Yeah, I also notice that I'm looking to see I have the advantage. I'm reading them now as I'm talking to you. So that gives me yeah, I've memorized all of them.

Chaitra Nagaraja
How come you didn't know about dates?

John Bailer
No, but things like the idea of introducing descriptions of distributions, like kurtosis and some of those, introduced in 1905, or the run-up to D-Day, thinking about the codebreakers, or in 1876, pointing to the Guinness story of Student's t statistic. It's always fun to think about how long-standing and old some of these ideas are, but also that there was a context in which this work was done, and the importance of it. And I think that's one of the appeals to me of thinking about the history of statistics and how it plays out.

Chaitra Nagaraja
Yeah, I would agree. I mean, in some ways, statistics is very old and very new. If you look at some of the fundamental papers, they're in the past 100 years. On the flip side, statistical thinking itself is very old. You know, there have been censuses that have been administered for long periods of time. A census is even mentioned in the Bible, for instance, as the reason why Mary and Joseph were traveling in the first place: to be counted. So those are all, you know, they didn't have the label of statistics, but they definitely had the essence of counting and thinking about chance, even in our everyday language. And so some of statistics required more formal mathematics to actually be able to write that down, and eventually computing to be able to actually calculate some of those things that were just theoretical ideas. So I think it's interesting how it's both very old and very, very new.

Rosemary Pennington
I'm looking at the column you and Penny wrote, a fresh perspective, where you're sort of laying out what History Chronicles might be. And in it, you've talked about erasures from the story of stats in certain ways, or people who've just been sort of erased for the work they did. And I wonder how you're thinking the column could help address some of these erasures. So, the idea that stats has been around for a long time in different forms: how do you think columns like yours can help address some of the erasures in the history of stats?

Chaitra Nagaraja
I think there are a couple of ways of thinking about it. One, I mean, my area is mostly government statistics. So, unearthing various really old government documents, and seeing what people were thinking at that time. Sometimes I feel like the narratives have been flattened to, you know, oh, these people were just very racist or very sexist. And actually, they were having some more nuanced conversations. And bringing that to the forefront, I think, would help people understand that, you know, this has always been a dialogue between people with different ideas and different motivations, and there isn't necessarily one framework to fit all needs. There are also, just like with the discovery of the double helix and Rosalind Franklin, people who have worked on things and didn't necessarily get the credit for it, for whatever reason, so hopefully this column will help with that as well. One of our future columns actually will talk about the suffragette movement and the interplay between how people considered that from a statistical standpoint, so hopefully that will be something to look forward to. My colleague Altea Lorenzo-Arribas, who's a committee member of the Royal Statistical Society history section along with me, is writing that. So, I hope so. I think there are two ways of looking at it, not one.

John Bailer
You know, when I was in grad school, I was very taken with Stephen Jay Gould’s writing. You know, I really enjoyed it a lot. There was a lot of history, and ultimately, even though he was a paleobiologist, geologist type of person, one of the messages in some of his work was the cultural and historic context and the time in which science is done, and how that helps shape what is done. Do you see that type of idea being infused in some of the future columns of this effort?

Chaitra Nagaraja
Yeah, I mean, things are products of their time. And sometimes it's hard to see that, at least, you know, in my own sort of development, reading and writing about the history of science. You know, you look at your textbook, and there are these theorems that kind of appear out of nowhere. But they are a product of slow tinkering through time with an idea, you know, like maximum likelihood, or what does it mean to have a representative sample. And some of those ideas are, you know, coming back. To give you an example, design of experiments is not a very glamorous topic; it used to be taught in terms of, like, agricultural experiments, and you can see in a lot of universities that courses on that have sort of been downplayed or removed for, you know, sexier topics like machine learning, or neural nets or whatnot. But you saw a resurgence in that in tech companies with their A/B testing, and the fact that every time you Google something, you're probably part of some complicated experiment. So I think there are a lot of ideas that are coming back into play in a different way, just because of the computing power that we have now, that I think would be interesting to look at, to see how they've changed. And, you know, what ideas can we take from the past and sort of bring to a different context?

John Bailer
I remember being struck by it when I was in grad school, and this was in design, when talking about randomization to test conditions. The instructor at the time said something like, well, this was a novel idea when it first came up, that somehow the idea of randomizing subjects to conditions before an intervention was novel. And it's like, wow, how could that be novel? I mean, it's such an accepted, you know, given way that we do business. The fact that at one time it was thought of as this strange, new, emerging idea is fascinating.

Chaitra Nagaraja
Even taking an average, something that, you know, people do in the third grade, was at one time a very novel concept: that you can get more information by collapsing your 10 data points into one. It doesn't feel right from an intuitive perspective, because you have 10 pieces of information, and now we're just sort of combining them into one number. How could that be telling us more? So that in and of itself was, you know, mind-blowing for the time, though we don't really think of it that way. It seems so obvious.

Rosemary Pennington
You're listening to Stats and Stories. And today, we're talking with Chaitra Nagaraja about history and statistics. Chaitra, I know you have a column that is in the September issue of CHANCE for the History Chronicles about measuring race for census... why don't I know what the plural of census is? It's censuses. I was like, it's not… But, you know, I was reading that, and I was so interested in it and sort of the history of race data collection. I mean, I am someone who is multiracial by background. And when I was a kid, I remember on all of those, like, test forms, there was never something that reflected that. And I would often make my own boxes and get in trouble and get yelled at for it, and write multiracial, or I would check all the boxes and get yelled at. Like, I would never fit myself in a box. And so it's something I'm always thinking about. And I was really excited to see this column. And I kind of just wanted to hear where this grew out of for you and why you decided to dive into this particular topic.

Chaitra Nagaraja
Well, I used to work at the Census Bureau, so I've always liked official statistics, but this particular topic came about because of my work on the policy advisory committee that I have with the ASA. And through that was circulated some possibilities for commenting on the current changes that are going to be made to race data collection by the Office of Management and Budget. So in the US, this entity, which is part of the executive branch, is tasked with setting some standards, and those are called statistical policy directives. They are rules that everybody has to follow just for consistency's sake, and race data collection falls under one of those directives. The first one was written in 1977 and was updated in 1997. And now it's going to be updated again. So they spent a lot of time speaking with various stakeholders, the regular public included. They had a sort of session where people could ask questions at the most recent Joint Statistical Meetings in Canada, which I attended. So it was really interesting to see, you know, what kinds of things came out of their focus groups, and so forth, and how to better have these racial categories depict what people are thinking about themselves. And that includes, you know, multiracial, so being able to check as many boxes as you want, as opposed to checking one. In the past, you were only allowed to check one. Further in the past, you had categories for multiracial, but that was mostly to figure out, you know, exactly how much black versus white you were, for not-so-savory reasons. So I feel like that's a different kind of question for multiracial than what we have now, where we want to have people be able to represent themselves as accurately as possible. So that's how I got interested in it. And I recently moved to the UK. So it's like, oh, I wonder, you know, how things are in England. And so I started looking at that and realized it was vastly different, despite what I had kind of anticipated. So that's sort of where I got interested in the topic.

John Bailer
So, as you started looking at the US and UK, maybe we'll start with your modern look, before we talk about a historical evolution of some of these categories. What are some of the dramatic differences that you see between the way the US Census deals with this versus the UK?

Chaitra Nagaraja
Currently, the US allows you to check as many boxes as you want, and you can give some additional detail if you prefer. In the English census, so this is England and Wales, run by the Office for National Statistics (Scotland and Northern Ireland actually have their own agencies that do their censuses), they have a separate category for multiple ethnic groups. And so you could be white and black Caribbean, you can be white and black African, white and Asian, and then something else you write in. From what I had read, it seems like they had actually talked to people in the US and Canada about how they do multiracial and decided to go a different way, as in not do the, you know, check all that apply. Also, some major differences are, you know, actually, there's no way to get around this, historical, because in the US you have certain kinds of immigration patterns. You have a lot of people who come from Central and South America, Mexico, whereas in Britain you have a lot of people coming from former British colonies: India, Pakistan, Bangladesh, so those are really highlighted, and certain African countries, Caribbean countries, and so forth. So those have labels attached to them, and you wouldn't be in the "other, fill in what you need" category. So those are sort of the main differences. Actually, one other difference: if you're from the Middle East and you consider yourself Arab, you would have your own category in England, but not in the US, where you would be considered white. But that's actually a change that the Office of Management and Budget is considering making.

Rosemary Pennington
Your column was really interesting to read, because it shows how our ideas about what race is change. At one point, you talk about how, you know, Chinese, Hawaiian, Japanese, all of these were imagined as races at one point, and now, certainly, I think we would see them as national identities, not racial identities. And I wonder, as you were doing this work, was there something that you found particularly interesting, or that you were surprised by, when you were digging into this?

Chaitra Nagaraja
Yeah. One, you know, if you look at a census form for different countries, it's really ambiguous what counts as a race, what counts as ethnicity, what counts as nationality. I'm living here in England. I'm a person of Indian origin, but I'm American. So on the British form, there's Asian or Asian British, of which I'm neither; I'm Indian American. And, you know, granted, there are probably not that many people in that sort of combination of things, so it's not necessarily surprising that there isn't a box I can check. But it gets very complicated quickly, because some of these things are national origin and some of these things are not. You know, if I write just American on my form, what would that mean for their calculations? What actually surprised me, which I didn't know before looking into this, is that Hindu was on the census form in the US for a brief period in the early 1900s. And from what I can tell, that probably just means, you know, the Indian subcontinent in general, as opposed to Hindu, like the religion. But that's something I think I would like to look into more, just from a personal background.

John Bailer
You know, before we started the recording, you had mentioned that you had just gone out and grabbed a bunch of different census forms from all around the world. And at that time, you said there's lots of very different types of information being queried by different countries. Can you describe some of the different types of information that were maybe a surprise to you, that you saw on these?

Chaitra Nagaraja
Yeah, and, you know, I've not been everywhere on the planet. So local indigenous groups, you know, how you would represent them on the census form, was quite interesting. In the US, you would have, you know, Native American groups and tribal affiliations that wouldn't necessarily appear elsewhere. In England, and this is one where, when I show American students the English and US census forms, this is the first thing that they see: under the section "white" it says "Gypsy or Irish Traveller," and the word gypsy is not considered an okay word to use in common language in the United States. So people are very shocked to see that on a census form. And this is, you know, a form where people in England are asked about the categories; this is not necessarily, like, forced upon the population. But if you move to, like, Australia, for instance, you have Aboriginal groups and Torres Strait Islanders. You know, just to see the variety of boxes. And I guess you know, in an intellectual sense, that racial categories are arbitrary, but it becomes so much more apparent when you just compare these forms, that these are, you know, obviously made up. And we all know that, but looking at the forms just makes it so apparent.

Rosemary Pennington
It just sort of gives you insight into how a society is seeing itself at a particular point in time, or maybe the tensions that are inherent in that society too, when that census is being conducted.

Chaitra Nagaraja
Definitely, because it's sort of an interplay between, you know, government categories and, you know, lobby groups to get a certain kind of terminology used, or removed, and so forth. So I think in the US, since the 1960s, when people started filling out their own census forms, as opposed to having somebody who shows up at their house fill it out for them, there's been a more collaborative approach to racial categories than there previously was.

John Bailer
I was also taken by the comment that you made that some countries don't collect this information.

Chaitra Nagaraja
Yes, I think there are a few others that I can't remember off the top of my head, but France in particular, because it sort of goes against their constitution about being egalitarian and, you know, treating everyone the same, regardless of where they come from. And that's the thing, there's been a lot written about, you know, why collect these in the first place. It could be for sinister reasons. You know, the census was used inappropriately for targeting people of Japanese origin during World War Two, to put them in camps. So, you know, people are scared of that, with, obviously, good reason, even though there are more laws in place to prevent that from happening now. But on the flip side, there are groups that would like to be recognized. And that is definitely, I'm sure you've seen it in the US, especially among Asians. You know, there's Asian and Pacific Islander, which is an absolutely enormous group of people. And originally Asian wasn't thought of in the same way as that monolith, but actually, Asian groups themselves in the 1960s thought, you know, let's band together and have, you know, power with unity. But I think that has had mixed results, in that, you know, you have groups of people that have moved to the US under very different circumstances. For instance, people from the Indian subcontinent came mostly after 1965, I think that was the year of the Immigration Act, and so tend to be much more educated, and, you know, they came for schooling. That's how my own family, you know, my parents, came over. And so they have a very different trajectory than other types of groups that maybe came as refugees, for example. And so combining them into one group, with very different languages, cultures, and different parts of the world, may not make as much sense. And so now it's sort of very muddled.

Rosemary Pennington
It is interesting thinking about that new change that's happening, where people who are from MENA, Middle East or North Africa, will no longer be collapsed under white, and can choose. And so, you know, again, thinking about the choices we make, I wonder if you could talk a bit about how that decision got made. Was it pressure brought by people who are from that background, bringing that to the census? Or is that something that sort of... I don't know, I don't know if anything develops naturally, right? But how did that happen?

Chaitra Nagaraja
I think it's a bit of both. I mean, some of it probably was, you know, after September 11, it was something that came to the forefront a lot more, obviously for not-so-good reasons. But I do know, at least from the session I attended at the Joint Statistical Meetings, that they did a lot of focus groups on figuring out how to define that category. How does someone know that they may belong to that category? And they had a lot of trouble with that, in terms of, you know, do you list countries? Like, what exactly does that mean? So I think there's some discussion still happening in terms of how best to describe this category in a concise enough way, because obviously no one wants to read, like, 20 pages, right, to fill out a census form. So that seemed to be something that they were still discussing at that point.

John Bailer
You know, as you were describing that, there's this mental picture of grabbing a lot of different opinions and bringing them together. And so, you know, there's a sense that these are systems that are being implemented by these national statistical offices. So can you comment on what the role of these organizations is? And why are they so, you know, purposeful in trying to gather this additional input?

Chaitra Nagaraja
Yeah, so in the US, it's very decentralized. There are considered 12 federal statistical agencies, and there's been discussion about combining them under one, but people are suspicious that they would have too much power; they would have so much information. So, you know, the usual mini version of the state versus federal rights type debate plays out on many different levels. But in more recent times, there's a lot of sense, and this is true in England too, specifically with race data collection, that you need it to see if civil rights are being enforced. That is one of the purposes of collecting this information. So the goal is to try to, you know, how do you apportion funds? You can think of, and the English census does this, but the US Census no longer does, asking a lot of questions about, you know, is English your first language? Or do you speak something else at home? You know, information needed to try to figure out where to apportion certain kinds of federal funds. In the US, that function falls a lot to the American Community Survey, which is sort of the hived-off version of what's called the long form of the decennial census. Previously, people used to get a much, much longer census form, but they sort of decided that, you know, maybe we need that kind of information a little bit more frequently than once every 10 years. So the American Community Survey is held every year, with data collected throughout the year. So that's sort of the main reason. And there was a good quotation, I think it's James Madison, if I'm remembering correctly, who said, you know, I'm paraphrasing here, without information, how do you govern? If you don't know your population, how can you best serve them? So I think those kinds of reasons are good ones. Obviously, protections need to be put in place for confidentiality and privacy, and a lot of that is getting much, much trickier with independent private databases that have been amassed. You know, like, everything we touch, right, goes into some database, and they can be connected to government databases in ways that, you know, might identify people. So there was a lot of discussion about how we can give granular information that's useful to local authorities, so this is not just the federal government, but, you know, the local city you live in, your municipality, but still manage to keep people's confidentiality maintained. So it's definitely a very tricky question.

Rosemary Pennington
Well, that's all the time we have for this episode of Stats and Stories. Thank you so much for being here today. Stats and Stories is a partnership between Miami University's departments of statistics and media, journalism and film, and the American Statistical Association. You can follow us on Twitter @statsandstories, Apple Podcasts or other places where you find podcasts. If you'd like to share your thoughts on the program, send your email to statsandstories@miami.oh.edu, or check us out at statsandstories.net, and be sure to listen for future editions of Stats and Stories, where we discuss the statistics behind the stories and the stories behind the statistics.