Andrew Gelman is a professor of statistics and political science at Columbia University. He has received the Outstanding Statistical Application award three times from the American Statistical Association, the award for best article published in the American Political Science Review, and the Council of Presidents of Statistical Societies award for outstanding contributions by a person under the age of 40. His research interests span a wide range of topics, including why it is rational to vote, why campaign polls are so variable when elections are so predictable, and why redistricting is good for democracy.
Episode Description
With the 2020 U.S. presidential election all but upon us, media are rife with prognostications about which way voters are going to swing. Will reliably red states stay red or will voters produce a blue wave that crashes across the country? Will economic uncertainty trump concerns over COVID-19? Is political polarization really as set in stone as some have suggested? Understanding voter behavior is a focus of this episode of Stats and Stories where we explore the statistics behind the stories and the stories behind the statistics with guest Andrew Gelman.
Full Transcript
Rosemary Pennington: With the 2020 US presidential election all but upon us, media are rife with prognostications about which way voters are going to swing. Will reliably red states stay red or will voters produce a blue wave that crashes across the country? Will economic uncertainty trump concerns over COVID-19? Is political polarization really as set in stone as some have suggested? Understanding voter behavior is a focus of this episode of Stats and Stories, where we explore the statistics behind the stories and the stories behind the statistics. I’m Rosemary Pennington. Stats and Stories is a production of Miami University’s Departments of Statistics and Media, Journalism and Film, as well as the American Statistical Association. Joining me are regular panelists John Bailer, Chair of Miami’s Statistics Department, and Richard Campbell, former Chair of Media, Journalism and Film. Our guest today is Andrew Gelman. Gelman is a professor of statistics and political science as well as director of the Applied Statistics Center at Columbia University. He’s received the Outstanding Statistical Application award three times from the American Statistical Association, the award for best article published in the American Political Science Review, and the Council of Presidents of Statistical Societies award for outstanding contributions by a person under the age of 40. His research interests include why it’s rational to vote, why campaign polls are so variable when elections are so predictable, the statistical challenges of estimating small effects, and research methods, amongst a variety of other things. Gelman is also the author of the book Red State, Blue State, Rich State, Poor State: Why Americans Vote the Way They Do. Andrew, thank you so much for joining us today.
Andrew Gelman: I’m glad to be here.
Pennington: How did voting become a research interest for you?
Gelman: I’ve just always been interested in politics. I was a political science minor in college, and then when I was in graduate school studying statistics, my best friend was a political science Ph.D. student, so I would go to seminars in the political science department, and I found out that it was possible, as a statistician, to contribute. One of the first things that I worked on was estimating what’s called the seats-votes curve in Congress, which is the percentage of seats won by, say, the Republican Party in the legislature as a function of the percentage of votes that they get. In any given election, the Republicans might get 48% of the vote and 49% of the seats, or whatever it is, and you can look at what’s happening in one election after another. The techniques that had been used previously were based on fitting simple curves to the seats-votes curve, or fitting things like normal distributions or other distributions to the votes that people had received. My colleague and I went further by fitting a latent variable model, which allowed for swings from one election to another but also for uncertainty. That approach has since been used in many redistricting cases, and so that’s how I got started.
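Editor's illustration: the seats-votes curve described above can be traced in a few lines of code. This is a minimal sketch under a uniform-swing assumption, where every district shifts by the same amount. It is not the latent variable model Gelman mentions, and the district vote shares and the function name `seats_votes_curve` are hypothetical.

```python
def seats_votes_curve(district_shares, swings):
    """Trace a seats-votes curve under a uniform swing (a simplifying assumption).

    district_shares: one party's two-party vote share in each district.
    swings: shifts to apply equally to every district.
    Returns a list of (average district vote share, seat share) pairs.
    """
    n = len(district_shares)
    curve = []
    for swing in swings:
        # Shift every district by the same amount, clipped to [0, 1].
        shifted = [min(1.0, max(0.0, v + swing)) for v in district_shares]
        avg_vote = sum(shifted) / n
        # A seat is won when the party has more than half the two-party vote.
        seat_share = sum(1 for v in shifted if v > 0.5) / n
        curve.append((avg_vote, seat_share))
    return curve


# Hypothetical vote shares for one party in five districts.
districts = [0.35, 0.44, 0.48, 0.55, 0.62]
curve = seats_votes_curve(districts, [-0.04, 0.0, 0.04])
for avg_vote, seat_share in curve:
    print(f"votes {avg_vote:.3f} -> seats {seat_share:.2f}")
```

With these made-up districts, a small swing in votes moves a close district across the 50% line and changes the seat share by a full seat, which is why the shape of the curve matters in redistricting arguments.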
John Bailer: So, has that changed a lot over time? I mean, it sounds like this is a classic strategy for looking at things like gerrymandering and other kinds of restructuring.
Gelman: Gerrymandering is worse than it used to be for a few reasons. Traditionally, there are countries that do nonpartisan redistricting. In the United States we have a nonpartisan Census Bureau, right? They count; they’re career employees and they do their jobs, and there are a lot of government agencies like that. There are countries where redistricting is just done in that way. It’s not considered to be part of the spoils system. In the United States there are some states that do things in a nonpartisan way. Other states do bipartisan redistricting; they have a bipartisan commission. But there are states that do it in a partisan way, although even there they are somewhat constrained by the courts to not be too biased. What’s happened since 1990, when we started working on this? First, technology: they have a lot more data, so it’s easier to draw a more concise plan than before. The second thing is that with political polarization, states tend to be more dominated by one party or another, so there are fewer constraints. The norms have changed; it’s become considered more acceptable to do a partisan redistricting. Sometimes the attitude is, well, we’ll do it when we’re in power and the other party can do it when they’re in power; it’s just part of the system. The courts have generally ruled that it’s up to the legislature to decide these things, so they tend not to rule against a redistricting plan. The Voting Rights Act has been weakened, but that’s another story. The courts perhaps have become more partisan. But again, if you have less split-ticket voting, even the actual legislators in the state are more polarized, so you have fewer members in the middle, and that affects things in all directions. And of course once you get gerrymandered districts, that can make things even more extreme, because when you have districts that favor one party or another there is less motivation for candidates to run in the middle.
Richard Campbell: Andrew, I have a two-part question; John loves two-part questions. The first is, as the election comes down to the wire here, I’m starting to see a lot of discussion comparing where Biden is and where Clinton was at this point in time, and it’s about the same. Clinton ended up losing, not the popular vote but the Electoral College. How should we make sense of what’s happening? How is it different from last time, for somebody like me who follows this stuff but doesn’t really understand statistics the way I probably should, being on this podcast?
Gelman: Biden is doing better in the polls, consistently better, than Clinton was four years ago. They’re not really in the same position, and Biden has been at about 54% of the two-party vote for a couple of months.
Campbell: The other part of the question was, I hear pollsters or pundits talking about polls that lean conservative and those that lean liberal. Rasmussen, for instance, is always considered one that leans more conservative, and I saw that in its approval ratings Trump often is up. But they had something like a ten-point swing in the approval rating on Rasmussen after, I think, the first debate. What do we make of polls that swing like that?
Gelman: When a poll swings a lot, you want to look at the percentage of Democrats, Republicans and Independents among the respondents. We wrote a research paper about this a few years ago where we found that when there’s a big swing, it tends to be a poll that oversamples one party or another, and it’s not just chance. Part of it is random; it’s a random sample, and who knows who you’re going to get. But part of it is that when a candidate has bad news, his supporters are often less likely to respond to a poll. Now you might say, well, if they’re less likely to respond to a poll, maybe they’re less likely to vote, but it’s not quite the same thing. 60% of people vote; only about 1% of people respond to polls. So, whether you respond to a survey is much more contingent on your mood, I would think, than whether you vote. Now, interestingly enough, it makes sense to ask why anyone responds to a poll. In the 1950s, arguably, it was more rational to respond to the Gallup poll than to vote, because if you vote you’re one of, whatever it is, 50 million voters, but if you respond to the poll you’re one of 1,500 people whose poll responses will be splashed all over the nation’s newspapers the next day. So, it’s no surprise that survey response rates were at 80% back then, higher than voting rates. But now if you respond to a poll, that’s just one more of many, many, many surveys.
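Editor's illustration: the check Gelman describes, looking at the party mix among respondents, can be sketched as a simple reweighting. The code below is a minimal example that assumes a known target party mix and compares the raw estimate with one reweighted to that mix. It is not the method from the paper he refers to; all numbers, the target mix, and the name `weighted_support` are hypothetical.

```python
def weighted_support(responses, target_mix):
    """Compare raw candidate support with support reweighted by party ID.

    responses: list of (party, supports_candidate) tuples.
    target_mix: assumed share of each party in the population (sums to 1).
    Returns (raw_support, adjusted_support).
    """
    by_party = {}
    for party, supports in responses:
        total, yes = by_party.get(party, (0, 0))
        by_party[party] = (total + 1, yes + (1 if supports else 0))

    # Raw estimate: every respondent counts equally.
    raw = sum(yes for _, yes in by_party.values()) / len(responses)
    # Adjusted estimate: each party's support rate, weighted by its target share.
    adjusted = sum(
        target_mix[party] * yes / total
        for party, (total, yes) in by_party.items()
    )
    return raw, adjusted


# Hypothetical poll of 100 people that oversamples Democrats.
responses = (
    [("D", True)] * 45 + [("D", False)] * 5
    + [("R", True)] * 2 + [("R", False)] * 18
    + [("I", True)] * 15 + [("I", False)] * 15
)
# Assumed population mix; in practice this would come from an outside source.
target_mix = {"D": 0.35, "R": 0.35, "I": 0.30}

raw, adjusted = weighted_support(responses, target_mix)
print(f"raw support: {raw:.2f}, party-adjusted support: {adjusted:.2f}")
```

In this made-up sample the gap between the raw and adjusted numbers comes entirely from who answered the poll, not from anyone changing their mind, which is the pattern described above for polls that swing a lot.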
Bailer: So, are there aspects of this device of always talking about red-state, blue-state differences that drive you a little crazy? Is this an overly simplistic worldview? And I’m going to do a Richard and ask a second part of this question: if there are aspects that drive you crazy, what can be done to help inform and shed light on that?
Gelman: Yeah, it’s hard. I have a big problem with talking about states, for a few reasons. First, it is ultimately voters who vote, not states. There’s obviously a lot of variation within states: urban, suburban, rural. You’re in Ohio, which is a Republican-leaning state, but you’re in a college town, and college towns have a lot of Democrats and cities have a lot of Democrats. That said, I would guess that the college towns in Ohio are a bit more conservative than the college towns in California. And I would expect that even in Wyoming and Montana the college towns are more liberal than the rest of the state, but they’re probably much more conservative than the college towns in Illinois or even Ohio. So I guess I like the statistics framing, where we think of predictors. What state you live in is a predictor; how to think about that is another story. And there’s been a lot of discussion: is there something about Ohio? Is it the people in Ohio, who happen to live there? Is there some interaction? The other reason talking about states is problematic is that it’s kind of weird to put all states on the same footing, right? The area of Columbus, Ohio, has as many people as a lot of states. So why should we talk about Montana and Wyoming and then talk about Ohio? We should talk about Montana, Ohio, Columbus, Cleveland, Cincinnati and so forth.
Bailer: Can I ask a quick follow-up? You wrote your book about 12 years ago, give or take. What’s been one of the biggest shifts or changes between when you wrote the book, describing individual versus state differences in voting patterns, and what we’re seeing now in 2020?
Gelman: So when we wrote our book, what we found was that within any state richer people were more likely to vote for Republicans, but the richer states voted for Democrats. That’s kind of what the book was about. But nowadays the pattern is much less clearly drawn; within a given state the richer people are not necessarily voting for Republicans anymore. I think we’ve seen with recent campaigns that more educated voters have moved toward the Democratic Party, and that’s new. When I was a student the more educated voters were consistently Republicans; there was a clear income-education gradient. Income is a little different from education; the pattern is clearer there. But for a while we found that the most educated people, like people with graduate degrees, were a little more Democratic. You could just think of it in terms of professions. Doctors and lawyers used to be conservative Republicans. Not the most conservative, but those professions tended to be conservative, and as you may know, one reason we didn’t have a national health care plan after World War II, in the 40s, 50s and 70s, is that the American Medical Association opposed it. Now they may still oppose it, they’re an organization, but doctors in general have changed. One reason is that doctors are more likely to work for somebody now and less likely to be self-employed. So there’s a shorthand that I like to use, which is that people who pay taxes are often Republicans, often conservative. People who get taxes tend to be liberal. So, university professors, even if you work for a private university like I do, arguably I get taxes rather than pay taxes, because the university is supported in so many ways by the government. And if you’re a teacher you get taxes, or a nurse or a doctor, even if you work for a private hospital; it’s quasi-government.
Things have changed in terms of individual professions. Income is more complicated. When I think of income and education, one thing I like to think about is teachers and social workers. A lot of teachers and social workers have master’s degrees and they tend to be very liberal. They don’t get high salaries. People who are well educated in higher-salary positions are more conservative, so doctors are more conservative than teachers and social workers. There is even a graph that someone sent me a couple of years ago from a survey of doctors by subfield, and the richer subfields, like radiologists, were more conservative than pediatricians, who get paid less. But I will say the aggregate pattern has changed. It used to be very clear that within most individual states the Republicans were getting the votes of the richer people, and you’re not seeing that so much. Of course that does change politics, if it changes who’s getting the votes, and in some sense it’s kind of stunning that the parties still are so different on economic policy. Although there are some populist aspects of the Republican platform, and the Democrats are certainly very corporate-friendly, nonetheless the Democrats still tend to favor redistribution policies and the Republicans tend to favor policies that keep power in the hands of richer people. And I’ll just say, not as a political comment, that based on your political ideology both of these attitudes can seem to make sense. If you’re a Democrat, if you’re a liberal, you can support redistribution not necessarily out of a desire to punish anybody but because you think it’s more fair and you actually think it benefits the general economy to spread out the spending power.
Conversely if you have a conservative view, you can certainly argue that institutions that have more money have earned it and also that they can spend it more efficiently than a government can. So as political scientists we have to kind of separate the ideological content from the empirical content because it’s not kind of our job to judge someone’s views it’s more that we’re trying to describe them.
Pennington: You’re listening to Stats and Stories, and today we’re talking with Columbia University’s Andrew Gelman. Andrew, this is going to come out a few weeks before the election, so what advice would you have for people as they’re consuming news about the election? There are a lot of stories still out there trying to say this voter in a café in rural Ohio is going to vote this way and therefore they’re representative of every rural voter in every café in Ohio. Are there things you think we should be keeping an eye on or be critical of when we’re consuming news stories that are trying to predict voter behaviors in particular ways?
Gelman: I think that good journalists are informed about the statistics and that they use these interviews as a way of bringing that to life. So responsible journalists would look at polls and say, here’s how people seem to be voting in rural Ohio or the rural Midwest, and then they’ll interview voters who are consistent with that general pattern. It’s like if you write a textbook: you have an example that illustrates your main point, but you had the point already, and the example has a life of its own. So if they interview someone, that’s the way it should be, and I think media organizations tend to be pretty careful about being balanced in how they interview people. We could talk more about stories, but somehow the point of getting the individual interview is that you can learn something surprising from it. You shouldn’t take that interview as being some sort of statistical representation; it has a different value to it.
Campbell: You talk about stories, and you’re one of the few social scientists I know who has actually written scholarly articles on storytelling. In your article, I think this was a 2014 article, When Do Stories Work?, you make a point that I want to follow up on because it relates to something you just said. You said that stories are useful to illustrate ideas, but you also said that they are often evidence themselves. Can you talk a little bit about what you mean by that?
Gelman: Sure, so my colleague and I were interested in the idea that we learn from stories. The usual way that storytelling is presented to social scientists, or even to statisticians, I think, is as a tool. What we’re told is that people think of the world in terms of stories, people are not natural calculators, so therefore if you want to convey your idea, embed it in a story. Kind of make your research into a TED talk. Give the elevator pitch. And it’s very much a broadcast idea, an outward idea. You have your brilliant idea, but you have to convince these stupid idiots out there, so you’d better tell a story and sell it like that. We felt that we had a more inward take on it, which was that when we reflected on how we made our decisions and how we came to our social science understanding, it was influenced by stories. Even in statistics, why do you use this statistical method rather than that one? Beyond, well, my advisor taught it to me, often it’s, oh, I went to this talk and there was this great example and I saw this. It’s not like there was a theorem. So we’re influenced by stories ourselves, and I wanted to understand how that could be, and it struck me that effective stories have two characteristics. One is that they’re anomalous and the other is that they’re immutable. By anomalous I mean a story has a twist; it has a surprise in it. A story is like exploratory data analysis; it tells us something we didn’t already know. And what that suggests is that when interpreting a story, we should think about what our preconceptions were. It’s good to understand your priors, because then you’ll understand what you appreciated about the story. Even if it’s Little Red Riding Hood, you know, what’s the twist, right? What’s the surprise? The whole story is that a girl goes into the woods and there are wolves. We knew that there were wolves in the woods, right?
So, in some ways that’s a very simple story. The simplest version is that the preconception is, hey, girls can walk wherever they want, kids can do whatever they want; no, you’ll get eaten by a wolf. But then you get to the whole twist: oh, well, I was expecting she was going to get eaten by the wolf, but actually she was very resourceful, and the wolf was resourceful too. People are not always what they seem, et cetera. But to really get the most out of such a story, you also want to understand where you’re coming from. So that’s part of it. The other part is that the story should be immutable. This is like the famous rock that Samuel Johnson kicked to refute Bishop Berkeley: a real story has these grits of unexpected truth, so it’s different. We distinguish between what we call a story and a parable. A parable you can adapt to make whatever lesson you want; a story is just there. You have to tell it the way it happened. So if I think about this interview of some lady on the street in Ohio, you’re coming into it with your expectations about what country folk are like, and then you’re going to learn something interesting. The interviewer just lets her talk a bit, right? Because there are more bits of information in ten minutes of her speech than there are in all of your social science theories put together. It’s just the nature of social science theories that they have less information; that’s kind of what a theory is about, it’s a form of data reduction. Now, the paradox of storytelling is that in statistics we say that we learn from representative data or even random samples, but the essence of a story is that it’s anomalous, which means it’s surprising, which means it’s atypical. So how do you resolve that? The way I resolved it is to say that in statistics there are two things we do.
One is that we get representative samples and random samples and random assignment in order to learn about populations and distributions; that’s what might be called normal science. The other thing is what we might call revolutionary science, which is that we look for residuals and outliers and problems. Now, we always have to teach our students that the purpose of looking for an outlier is not to throw it out, and not to just record it and say our data have four outliers; it’s to actually look at it and see what went wrong, or maybe what didn’t go wrong, right? So in statistics we have this second mode, where we’re actually looking for nonrepresentative things and trying to learn from them. And that’s how I feel learning from stories is. But again, it’s tricky: if you have a million data points you’ll find all sorts of weird outliers, and you don’t want to generalize from them. Similarly, an unscrupulous data journalist could do a little bit of selection bias: you could conduct 500 interviews and then find the one interviewee who brings up one weird issue. The interviewee is like, blah blah blah, I care about plastic straws, what’s up with the plastic straws, and then they start shouting, and then you hear it on the radio and you’re like, oh yeah, plastic straws, that’s what the people really care about, that’s the real issue. But really there was some selection bias. So it’s not easy. We learn from stories because of their anomalousness and their immutability, but there are enough stories out there that we can certainly do selection bias, and of course that’s what social media is about. The outrage of the week. Any week you can find something ridiculous being done by any party. You can find someone who’s done some embarrassing crime and so forth.
Bailer: You know, this is great. You had mentioned just before we started the recording today that you were involved in teaching a class on communicating statistics. I was wondering, how did you integrate some of these ideas about stories into that class?
Gelman: Yeah, so we call it communicating data and statistics, because I guess data is a good selling point. I tell students that every statistician is going to become a teacher. Now, most of our students don’t become statistics professors, but if you’re a statistician working in government or industry you end up spending much of your time explaining statistics to people. That’s just kind of your role, and so we have to learn that. So, for communication, we discuss several different skills: teaching, writing, speaking, collaborating, programming and statistical graphics. About half the class was actually about statistical graphics, so I kind of alternated weeks of graphics with weeks of softer skills like storytelling, writing, communication, collaboration and teaching. And it’s funny, because you can have a whole class on any of these. I used to teach a class on teaching statistics, where the students were all teaching assistants in various classes, so they’d visit different classes and every week we’d discuss teaching tricks. I have a book with Deb Nolan on teaching statistics, and we did all these techniques. You could do that whole class, but there’s not time for everything, so I ended up just folding that in and it became just two weeks of this larger class. But storytelling is part of it, so I do have them practice storytelling and writing. I think it’s maybe even the first or second day of class that we have an assignment where they have to write a story in five minutes. First I demonstrate, so I show off. I say, pick a topic, and they pick a topic, and I type it and they can see it displayed on the screen.
I spend one minute outlining the story, what I’m going to write, and then I start writing, and then when one minute’s left I look at what I wrote and I revise it. So I say, you should be able to write a story in five minutes, and I have them do this in pairs. They practice that, and then, having done that, we talk about what makes the story work.
Bailer: Very cool.
Pennington: Well, that’s all the time we have for this episode of Stats and Stories. Andrew, thank you so much for being here today.
Gelman: Thank you.
Pennington: Stats and Stories is a partnership between Miami University’s Departments of Statistics and Media, Journalism and Film, and the American Statistical Association. You can follow us on Twitter, Apple Podcasts, or other places where you can find podcasts. If you’d like to share your thoughts on the program send your emails to statsandstories@miamioh.edu or check us out at statsandstories.net and be sure to listen for future editions of Stats and Stories, where we explore the statistics behind the stories and the stories behind the statistics.