Siddharth Suri is a computational social scientist whose research interests lie at the intersection of computer science, behavioral economics, and crowdsourcing. His current work centers around the crowd workers who power many modern apps, websites, and artificial intelligence (AI) systems. This work culminated in a book he coauthored with Mary L. Gray titled Ghost Work: How to Stop Silicon Valley from Building a New Global Underclass (May 2019).
Full Transcript
Rosemary Pennington: The Internet is ubiquitous, as are the algorithms and programs that shape our experiences. They invisibly guide what ads we see, what search results we get, and what content is flagged as inappropriate in social media spaces. What is also invisible, though just as ubiquitous, is the human labor that holds it all up. A new book calls such labor “ghost work,” and it’s the focus of this episode of Stats & Stories, where we explore the statistics behind the stories and the stories behind the statistics. I’m Rosemary Pennington. Stats & Stories is a production of Miami University’s Departments of Statistics and Media, Journalism and Film, as well as the American Statistical Association. Joining me in the studio are regular panelists John Bailer, Chair of Miami University’s Department of Statistics, and Richard Campbell, former Chair of Media, Journalism and Film. Our guest today is computational social scientist and Senior Researcher at Microsoft Research AI, Siddharth Suri. He and anthropologist Mary Gray, a senior researcher at Microsoft Research, are authors of the new book Ghost Work: How to Stop Silicon Valley from Building a New Global Underclass. Sid, thank you for being here today.
Siddharth Suri: Thanks for having me, it’s a pleasure.
Pennington: Before we get the ball rolling, could you talk about how the book came about?
Suri: The book came about in an interesting and coincidental way. I was interested in conducting classical psychology-style experiments, and I was working at Yahoo at the time. Typically these types of experiments are done with undergraduates in a laboratory, but at Yahoo we didn’t have any undergraduates, and we didn’t have any laboratories, so necessity was the mother of invention, and we, Duncan Watts and Winter Mason and I, figured out how to put these experiments online, and we used crowdsourcing sites like Mechanical Turk. For those who might not be familiar with these types of sites, it’s just a place where someone can put up a job and workers can come and do that job for pay. What we did was make our experiment the job and the workers our subjects, and they could come and do it for pay. And I did that from about 2008 until now, but if you fast forward to about 2012, I’d just joined Microsoft, and Mary had heard about some of my work. She was in Boston at the time and I was in New York at the time. She came down and she said, “Hey Sid, I hear you do a lot of work in this space of crowdsourcing” and what’s now called the gig economy. And I said yeah. She said, “Would you like to write an ethnography of the workers?”, and I said, “Sure, let’s do it! What’s an ethnography?”
John Bailer: You may want to change the order of those questions.
Suri: And maybe I should have, but it all worked out in the end. And the slightly deeper reasoning was that I thought to myself, Mary is an anthropologist and I am a computer scientist. We’re at almost the same place (she was in Boston and I was in New York) at the same time, interested in the same thing, and I thought to myself, when is that going to happen again?
Suri: Let me jump on this and see where it leads.
Bailer: Can you describe an experiment or two that you did with this online reference?
Suri: Absolutely. So I’ll describe my favorite cocktail party experiment, and it’s the following, it’s about the simplest experiment that you could imagine. We told the workers to roll a six-sided die, and if you don’t have a six-sided die handy, we gave them a link to a website called random.org which would roll one for you. And we said whatever you roll, report the number and we’ll give you $0.25 times the number you roll. So now, I can’t tell if you John, rolled a six or not, I can’t tell if you’re lying or not, but I know that if I get about a few hundred rolls, I know the general amount of lying in the population.
Bailer: That’s pretty good.
Suri: And it turns out people underreported ones and twos and overreported fives and sixes. And then what we did was we changed the pay rate to see if that mattered, and it didn’t matter too much. But then the other thing that we did was we said: “Okay, now roll the die 30 times and tell us all the rolls.” And again, we’re going to give them $0.01 times the sum of all the rolls, and it turns out the amount of lying goes way, way, way down. And the inference and the conclusion were that when you ask someone to roll a die 30 times, they’re not going to say 30 sixes, because they know that you know they’re lying. So they’re going to try to change it a little bit if they are lying, or just tell the truth. So that was an example of probably the simplest experiment you could imagine.
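The aggregate reasoning described above can be illustrated with a small simulation. This is only a sketch, not the analysis used in the actual study: it assumes a deliberately simple, hypothetical model in which a lying worker always reports a six, and the function names (`simulate_reports`, `estimate_lie_rate`, `face_frequencies`) are invented for illustration. No individual report can be checked, but the mean of many reports reveals roughly how much lying there is.

```python
import random
from collections import Counter

FAIR_MEAN = 3.5  # expected value of one honest roll of a six-sided die


def simulate_reports(n_workers, lie_rate, rng):
    """Simulate reported rolls.

    Hypothetical model (an assumption, not from the study):
    a liar ignores the true roll and always reports 6.
    """
    reports = []
    for _ in range(n_workers):
        true_roll = rng.randint(1, 6)
        reports.append(6 if rng.random() < lie_rate else true_roll)
    return reports


def estimate_lie_rate(reports):
    """Estimate the share of liars from the mean reported roll.

    Under the model above, E[report] = (1 - p) * 3.5 + p * 6,
    so p = (mean - 3.5) / 2.5. The result is clamped to [0, 1].
    """
    mean = sum(reports) / len(reports)
    p = (mean - FAIR_MEAN) / (6 - FAIR_MEAN)
    return min(1.0, max(0.0, p))


def face_frequencies(reports):
    """Share of reports for each face; honest reporting gives about 1/6 each."""
    counts = Counter(reports)
    return {face: counts.get(face, 0) / len(reports) for face in range(1, 7)}


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so the run is repeatable
    reports = simulate_reports(n_workers=500, lie_rate=0.2, rng=rng)
    print(face_frequencies(reports))
    print(estimate_lie_rate(reports))
```

With a few hundred simulated workers the estimate lands near the true lie rate, though it will not match it exactly because of sampling noise. A real analysis would need a richer model, since people may shade their reports upward rather than always claiming a six.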
Pennington: You called this the gig-economy, I think when you were responding to John, and I believe Mary has called what’s happening ghost work. Could you talk a little bit about how that term ghost work came about? And maybe whether it has reshaped how you think about the experiences of people like your participants?
Suri: In the book, we give a fairly technical definition of ghost work. We talk about work that is shipped, billed, managed, and sourced through an API. But, a little bit more colloquially, there are many apps and websites on the internet now that look completely automated. What’s actually happening is there’s a fair amount of human labor behind them, and that’s what we mean by ghost work. For example, when you look at your Facebook feed or Twitter feed, why do you not see any hate speech? In the case of Facebook, how come you don’t see any adult content? This is the internet. People will use it to say anything they want. So how do you get this very sanitized view? And the answer is people training algorithms to sanitize your feed. Another example would be search engine rankings. All the big search engine companies use humans to judge what a better and worse search engine ranking algorithm is. But it doesn’t appear that way to the end user. Search engines appear to be completely automated, and that’s where we came up with the term ghost work.
Bailer: So Sid, tell us about how many workers are we talking about here? How big is this network and what kind of people do this work?
Suri: So, it’s really funny you asked me the last part of that question, what kind of people do this work? Back when I was doing those psych experiments I would have to explain to the audience what crowdsourcing is and what Mechanical Turk is. Invariably, after I did, someone would raise their hand and say, “Who are the workers?”, “What kinds of people do this work?”, and I would give them the statistics and demographics and they would put their hand down, and five minutes later they’d always raise their hand again and say, “Yeah, but, who are they?”, and that’s the genesis of this project for me. So, how big is this? There isn’t a ton of great data measuring how big this population is. The best numbers that I can give come from [Pew?]. They added a question into one of their surveys, and I know this because Mary wrote the question, about “have you ever…” I forget the exact wording, but basically, it was asking “have you ever done work through an API?”, “have you ever done work through a website, through a mobile phone app?”, and that kind of thing, and it turned out about 8% of the US population – excuse me, 8% of the US working population had done work this way in the year prior to 2016.
Bailer: You know, I was curious about the generalizability of results from this kind of work. I mean, for example, when you talked about rolling the six-sided die and the lying, you start to think about this and say, well, I wonder if it’s the same if I looked at respondents of one age versus respondents of another? Or respondents from different religious affiliations, or countries of origin? You name the sort of stratification that you’d consider. Looking at some of the work that you’ve done with ghost work, is it not the case that you tend to have generally younger participants in this economy? And often more educated participants in this economy?
Suri: Yeah, so in the case of behavioral experiments and generalizability, the way I look at it is in terms of shades of gray. So on the one hand, there are undergraduates in a laboratory who, it’s been studied, are “WEIRD” in the sense that they’re from Western, educated, industrialized, rich, democratic nations. One step away from that would be, I would say, [inaudible] workers and crowd workers, because they come from a more diverse age range, more diverse income brackets, more diverse educational backgrounds, etc. Now is that perfect? No. Is it a representative sample of the US? No. Is it a representative sample of the internet population? No. Is it maybe a step in the right direction? I would say yes. So that’s the way I think about that.
Pennington: So, what exactly did you and Mary do in Ghost Work? I know that this is based on research that was done in the United States and India. Can you talk about how you found the people you worked with and what it was like for you to engage in ethnographic research?
Suri: What we did was, we teamed up early on, and one of the first things we did was a fairly lengthy survey that we gave to workers about their relationship to this kind of work. By the way, we paid them for answering the survey, and at the end we asked, would you like to be interviewed? And so that was one place where we recruited. Another thing we did: in a lot of ethnographic work, there’s a pretty well-defined field site. If you want to study workers at a factory, for example, you just go to the factory. But this is a distributed online platform, so what’s the field site? So what we did was we had a mapping task. It’s about the simplest task you could imagine. We said to workers, please put a pin on a Bing map wherever you happen to be right now and we’ll give you 25 cents, and it literally took five mouse clicks. And we got I think about 10,000 workers to do it and it gave us the geographic distribution of the workers, and that helped Mary figure out where to interview people geographically. It turns out they’re more popular in southern India than northern India, and it showed which parts of the United States to look at. And so that was one example of this interplay between her field and mine. You know, I could sort of geographically map the population as a whole and that would allow Mary to focus on certain areas to go interview people.
Pennington: You’re listening to Stats and Stories, and today we are talking with Siddharth Suri, a senior researcher at Microsoft Research AI, about his book Ghost Work: How to Stop Silicon Valley from Building a New Global Underclass.
Richard Campbell: Sid, I’m interested in the global underclass that was part of the subtitle of the book. It seems like this is going to be a hard labor force to organize, since most of this work is done – you know, there’s no place where it’s done. And I think I read somewhere that a general description is that a lot of these people are college-educated and under forty. So, what are the chances of, first of all, these folks organizing, and also is there any will in our political system to regulate this so these folks aren’t exploited? And I guess my final part of this question is, what company is doing this the right way, and what’s a good model?
Suri: So, right. In terms of organizing workers, there’s an effort at Stanford to provide exactly that. There have also been efforts out of Germany to organize workers there, and the way I think about it is the workers self-organized to create these online communities around these platforms, and they did this on their own time, of their own volition, on their own initiative. I think the answer is not to organize anything for them, but just to give them a little bit of a hand, as they need it, to organize themselves. And I think that’s kind of the right approach. In terms of the political will, one person who I would point out who’s thinking very forwardly in this space is Senator Mark Warner, I believe he’s from Virginia, and he’s got some very advanced thinking on this. But there’s another thing I want to point out. The way I’ve been thinking about it, there are three ways to make progress here. One is through governments, as we just discussed, and that is great because it would be broad in impact and lasting in the legal system. The second, as you mentioned, would be having the workers organize themselves. And the third is something I just started looking into, which is: what about the requesters, the people putting these jobs on these platforms? Me and my team have interviewed almost a hundred of them, and by and large, they want to do the right thing and they want to give the workers a fair deal, but they need a little help doing that. For example, say you wanted to hire a programmer in Romania. What’s a fair wage for that person? That’s a hard question, and as a developer, I don’t have that information readily at hand, so can we give them a little bit of help to figure that out? And then the third part of your question was who is doing well, and in the second-to-last chapter of the book, we talk about a few companies doing it well.
One would be, well, one thing they do well is they hire groups of people that actually know each other and have working relationships already, and they even have a path to promotion. So that would be one example. Another example in that chapter would be CloudFactory – they were based in Nepal, and when that very tragic and horrific earthquake hit, they mobilized the entire team to help people in their countries, and the workers and their families. So there are people in this space. And then the third example we give in that chapter does video transcription and captioning and translation, and they’re sort of a part-volunteer, part-paid organization, and they’re working very well. So we do describe a few platforms that are doing things quite a bit better.
Bailer: You know, I was curious when I was thinking about what you guys were talking about and the idea of labeling observations being a big part of what’s going on with this type of workgroup. What would you like people that are working in data science and thinking about machine learning methods and predictive modeling methods to know about this workforce? Or to appreciate about how the methods that they’re building rely on this workforce?
Suri: It’s common in machine learning and data science to come up with a new algorithm and then look for benchmark data sets to test your algorithm and compare it to the state of the art. That’s the common workflow, and what I would like computer scientists and data scientists to realize is that the data set was made by humans. When it’s time to construct that data set, think about what’s the best way to do it, what’s the best way to mitigate biases, what’s the best way to make sure that humans are getting a fair deal, etc., and bring those humans into the computational pipeline. Instead of just saying, you give me the data set and I’ll munch it, why don’t we take a holistic view and say, okay, this is a data set, it was brought about in this way, is this valid for the research question that I am studying? If so, great. If not, can we modify it, augment it, should we throw it away and start over, what do we do?
Bailer: So I’m curious, have you or Mary ever participated in this work?
Suri: Oh, you mean like- absolutely. So, I’ve spent time as a worker doing work on some of these platforms, especially Mechanical Turk, just because I wanted to understand things from their perspective.
Pennington: I also have. I did it in grad school as a side hustle sometimes, and actually, I learned about it from a research talk at Indiana University.
Suri: Do you remember who gave that talk?
Pennington: No, I don’t remember who gave the talk honestly, it’s been so many years.
Suri: I’ll bet you $100.00 it was Winter Mason because he was a graduate of Indiana and he and I and Duncan kind of – we did our early doctorates here.
Pennington: It’s very possible. I do want to ask you a question about interdisciplinary research like this. You are a computational social scientist or a computer scientist or whatever you want to call yourself, I’ve seen both those labels, and Mary is an anthropologist. And so what were some of the challenges of bringing these two different perspectives to explore the experiences of these workers?
Suri: So I’ve been very fortunate in my career. I started interdisciplinary work back in grad school and I didn’t even know it. My Ph.D. is in Computer Science, and I was doing behavioral experiments with people. What we would do is put them in a network, have them play a game, and then we would change the structure of the network, have them play again, and figure out how the structure of that network affects their behavior. And I continued that through my postdoc. And then at Yahoo, we were in a very interdisciplinary group of sociologists, physicists, marketing scholars, political scientists, and economists. That interdisciplinary lab then moved over to Microsoft and kept the same kind of interdisciplinary vision and perspective. So I’ve been kind of living the interdisciplinary dream for quite some time. Nonetheless, there were some challenges here. For example, I think a big challenge is the language barrier. Every field has its own jargon. It’s a way to signal that you’re an insider versus not. And when you do interdisciplinary work you have to work with someone who’s going to speak in plain English and avoid that jargon, so that you can establish basic communication, and Mary certainly has that quality.
Campbell: Sid, there are two journalists at this table here and I’d like to ask- I’ll ask a two-part instead of a three-part question.
Bailer: Try to ease up – he was very good at remembering the three parts.
Campbell: So, one would be- how are journalists doing in covering this topic of ghost workers in the stories that you see covered? And the second part is are they missing stories? Because the journalist's job is to take this complicated work that’s going on and try to make it understandable to the general public, so-
Suri: Yeah, I would say in the last few years journalists have been covering this more and more, and I think a common refrain in the way it’s covered is about uncovering the hidden workers and exposing the issues and problems those workers face. That’s a fine approach. What are they missing? What they’re missing, I think, are two things. First, a lot of workers do this work because it fits into their lives. And it fits the social constraints of their lives. A lot of workers, for example, want to be able to scale up or scale down how much work they do in a given week. A lot of workers want to choose who they work for or what they work on. And this work allows them to do that. The second thing I think they’re missing is, and we tackle this in the book head on, the work can be kind of dehumanizing. It’s like I go to a platform and put up a task and workers come and do it, and the workers almost feel like interchangeable cogs to me, and that’s a downside. And journalists have certainly uncovered that. But there’s actually also an upside to it, which we talk about in the book: I now can’t be discriminated against. So, say I’m part of some discriminated class, whether it’s because of my race, gender, sexuality, religion, whatever. If people online just know me by a unique identifier, they don’t know all that about me and therefore they can’t discriminate against me. And that’s a hidden upside that I think is largely missing, and I think Mary uncovered that in some of her interviews, where she spoke to Muslim women in India for whom working outside the home was looked down upon, and they do this kind of work to skirt that social constraint.
Bailer: Very interesting. So Sid, what’s next for you?
Suri: What’s next for me? I’ve spent the last few months working on that. One of the things I’m starting to work on is- I’m sort of going back to my experimental roots and I’m trying to understand empathy. I was once at a presentation by Bill Gates, I was one of 1,000 people in the audience, and he gave his presentation and he took questions, and I waited my turn in line and I grabbed the mic and I asked him, “What’s the biggest problem facing humanity today?” But then I also asked, “What’s the biggest problem facing humanity today that people in this room can solve?” What I meant was, you know, Bill Gates has infinite resources. I have very finite resources, so what’s the biggest problem I can solve? And I spent a lot of time thinking about that, and actually- screw the whole Bill Gates thing. So I spent a lot of time thinking, what’s the biggest problem facing humanity that I could try to solve? And what I came up with was empathy: can I figure out ways to help people empathize with each other? For example, can I help people empathize with those with less money? Can I help people empathize with those who are from a different race? Or sexuality? That kind of thing. And that’s where I’m going next.
Bailer: Oh boy, that sounds like a great topic for a future episode. You’ve got to keep us in the loop Sid.
Suri: Sort of off the—well, one of the things I’m thinking about- well, here’s my approach – is instead of – the approach I’m taking is… I can describe to you how much harder a poor person has it in the United States versus a rich person. I could do that with a narrative, I could do that with an image – but what if I actually put you in their shoes? What if I actually put you in the decision-making process of someone who has less resources versus the decision-making processes of someone who has more resources? And then to show you how different the decision-making process is, regardless of how you decide to make this choice. And I’m trying to see if that causes people to have more empathy.
Bailer: Very cool.
Pennington: Sid, thank you so much for talking with us today. This has been really interesting, and good luck with the new project.
Suri: Thanks, I’ll need it.
Pennington: Stats and Stories is a partnership between Miami University’s Departments of Statistics and Media, Journalism and Film and the American Statistical Association. You can follow us on Twitter, Apple podcasts or other places where you can find podcasts. If you’d like to share your thoughts on the program send your email to firstname.lastname@example.org or check us out at statsandstories.net, and be sure to listen for future editions of Stats and Stories, where we discuss the statistics behind the stories and the stories behind the statistics.