A Not So Standard Podcast | Stats + Stories Episode 212 / by Stats Stories

Hilary Parker (@hspter) is a Data Scientist, previously of Stitch Fix, Etsy, and the 2020 Biden for President Campaign. Her work focuses on the intersection of data science and product, from deeply understanding users to designing new experiences that depend on innovative data pipelines and client interactions.

Check Out “Not So Standard Deviations”

Support on Patreon

Roger D. Peng (@rdpeng) is a Professor of Biostatistics at the Johns Hopkins Bloomberg School of Public Health and the Co-Director of the Johns Hopkins Data Science Lab. His research focuses on the development of statistical methods for addressing environmental health problems and he has made major contributions to our understanding of the health effects of indoor and outdoor air pollution.

Episode Description

Our lives are increasingly shaped by statistics and data. However, they remain concepts that can be difficult for broad audiences to understand. A number of outlets, including this one, have sprung up to help make them more accessible. Today another one, the “Not So Standard Deviations” podcast is the focus of this episode of Stats+Stories with guests Hilary Parker and Roger D. Peng.

Timestamps

  • What was the motivation behind starting the show? 1:54
  • First episode 3:52
  • Tricky lessons learned along the way 5:48
  • How do you feel about episodes before going live? 9:12
  • TikTok 13:10
  • Structure of the show 16:23
  • What do journalists do right/wrong when it comes to data science? 18:54


Full Transcript

Rosemary Pennington
Our lives are increasingly shaped by statistics and data. However, they remain concepts that can be difficult for broad audiences to understand. A number of outlets, including this one, have sprung up to help make them more accessible. Today another one, the Not So Standard Deviations podcast, is the focus of this episode of Stats and Stories, where we explore the statistics behind the stories and the stories behind the statistics. I'm Rosemary Pennington. Stats and Stories is a production of Miami University's Departments of Statistics and Media, Journalism and Film, as well as the American Statistical Association. Joining me as always is regular panelist John Bailer, Chair of Miami's Statistics Department. Our guests today are the two hosts of the Not So Standard Deviations podcast. Hilary Parker is a data scientist, previously of Stitch Fix, Etsy, and the 2020 Biden for President campaign. Her work focuses on the intersection of data science and product, from deeply understanding users to designing new experiences that depend on innovative data pipelines and client interactions. Roger Peng is a Professor of Biostatistics at the Johns Hopkins Bloomberg School of Public Health and the Co-Director of the Johns Hopkins Data Science Lab. His research focuses on the development of statistical methods for addressing environmental health problems, and he's made major contributions to our understanding of the health effects of indoor and outdoor air pollution. Peng is the author of the popular book R Programming for Data Science and 10 other books on data science and statistics. He's also a Fellow of the American Statistical Association. Roger and Hilary, thank you so much for joining us today.

Hilary Parker
No, thank you.

Rosemary Pennington
I just wonder how your podcast was born?

Roger Peng
Well, that's a good question. It's been a couple years now. So I had to remember.

John Bailer
So revisionist history already, come on.

Roger Peng
Well, so Hilary graduated from the Department of Biostatistics at Johns Hopkins. So I knew her for quite some time, but we kind of went our separate ways after she graduated. And then she and I met up at a conference, I guess it would have been in 2015, and just started talking. And then after we met at the conference, I emailed her and just said, "Hey, you know, would you want to start a podcast about data science?"

Hilary Parker
Yeah. And I do remember the subject of that email was "I want to broadcast your voice to the world."

John Bailer
Oh, boy. That's brilliant.

Rosemary Pennington
Yes, absolutely.

Hilary Parker
It was very effective, very effective.

Roger Peng
Yeah. And I think that the general idea was to create a podcast that I'd like to hear, which is, you know, two people who are knowledgeable talking about what it is that they do, and just kind of the ins and outs of their lives and their jobs and how it works, and kind of in a conversational way. I think we tried to do that, and I think we were successful.

Hilary Parker
I think something I'll add is that while I was at Hopkins, Roger actually started this kind of weekly casual meeting with faculty and students called Tea Time, where we were discussing the week. It was the day after a weekly seminar, and so we would get together and ostensibly talk about that, but also talk about just whatever was going on. I was in charge of Tea Time for a while as a grad student, and Roger was a very regular participant. And so I feel like we kind of became friends there. And then there was an idea with this podcast of emulating that again, like the types of conversations we had there. Can we just kind of continue having them at a distance?

John Bailer
Well, it's clear that you have just a great rapport as part of the conversation on your show. If you continue taking this time machine back, what was your first episode? Oh, I think that's a really hard one, isn't it?

Roger Peng
Well, we literally just recorded our 144th episode, so it'll take a while to go back. Our first couple of episodes, I struggle to even listen to them, because first of all, we had, like, no equipment. We were just recording off of, I think, our phone headphones. And I think the first episode was just, essentially, "What should we name this podcast?"

Hilary Parker
We also talked about this community building, or like creating different spaces within the R community, the stats community, because at the time, a friend of mine and I were at useR!, the R conference. And somehow someone had a picture of a cat in their slides, and we were like, "Oh, we love that." And we were kind of aware of the fact that we were some of the only women at this conference. And so somehow this turned into a Twitter hashtag of R cat ladies, and so we actually talked about that, I think, in the first episode as well, about kind of feeling like we didn't quite fit into, like, sports statistics or this computer. So it was a little bit like, here's the ways we've tried to kind of subvert or create our own thing. And I made stickers for R cat ladies eventually.

Rosemary Pennington
I know nothing about R. I mean, truly, I know a little tiny bit from talking to people, but I feel like I need one of those stickers if you make more.

Hilary Parker
I need to reprint. I think I ran out, and so I just need to go put in a reorder there. I really enjoyed that process, because I had, like, a designer contribute. And they made the sticker have a cat just lying on the keyboard of a laptop. I think if you own a cat, you've had that experience at some point. Yeah.

John Bailer
Yeah. With newspapers preceding that in the history of my cat experience.

Hilary Parker
Exactly.

Rosemary Pennington
I wonder if there's been a topic that has been sort of tricky for you to figure out how to talk about. So I know stats, but obviously not as well as Mr. Bailer over here. And there are definitely times in the podcast where he says something or a guest will say something and I just look at him like, what was that? I don't know. So for me, there are often lots of things where I'm trying to figure out how to communicate well. And I wonder if there was something for you guys, as you were working your way through this, that you really had a tricky time figuring out how to talk about in a way that is accessible to someone listening to a podcast?

Roger Peng
Yeah, I mean, I think we tried to figure out, essentially, what the tone of the podcast was for a while. But actually, one of the questions that came up, I think in the very first episode, was how it's hard to tell someone how to do a good data analysis. I used to teach a methods class, and I'd often have students come up to me after and be like, well, I know all these methods, but what do I do now? Right? Like, I got this dataset, I got these methods, now what do I do? And, to be honest, that has been a theme of our podcast the entire time; I think it's been the running topic. And I think we've struggled to answer that question, the question of "what do I do?", for over six years now, actually. And so the challenge has not been so much technical as it has been deeper: what is the actual theoretical basis, or what are the actual fundamentals, of analyzing data and doing it well?

Hilary Parker
And I think, you know, to your point about the other person saying something and you have no clue what they're saying, or just these questions without answers, I think what has ended up being the most fun about the podcast is that we end up working through problems with the audience. And I think that makes it more accessible. I know that the feedback we've gotten, especially from beginners, is that it's nice to hear people be more vulnerable and say, like, "I don't know the answer." I think a lot of data scientists are used to kind of presenting, like, "here's the answer to this." And I know some other podcasts have taken that approach. And we were just not the personality type to want to do, like, "let's talk about this method today," or "let's dig into this." And so, not everyone will feel comfortable with that. I mean, ideally, everyone could feel comfortable with that, but that's not where everyone's at, I guess. And so I think that we haven't had a lot of issues where it's like, "Ooh, I don't know what to talk about," because that's sort of the premise of the podcast: we don't know, but we're going to talk about it. And then the other thing I was going to say is that we've actually made progress. I feel like I've learned a ton from having this consistent conversation about, like, what is data analysis? And that was just totally unexpected for me, that I would learn things and then give talks about it. And now I feel like more of an authority on certain things, all because of doing a podcast. That was not expected.

Rosemary Pennington
Yeah, I totally understand that. Because I definitely feel like I have a much better understanding of stats than I did, coming into this. So thank you, mister.

John Bailer
Oh, it's a pleasure. And I would say I've learned a lot more about journalism, too. I mean, we picked kind of different perspectives in coming to this podcast, and just hearing the perspective of how journalists process and think about stories and how they take the results of statistical analysis has really been a blessing to me, in terms of learning that foundational idea and how they think about things differently than I do. Has there been a particular episode that really generated just a lot of interest? You know, is there one that you would point to where you go, wow? And did you predict it in advance?

Roger Peng
I can answer the second question much more easily, which is that I've never been able to predict any, at all. Hilary knows, there have been episodes where we're like, should we even publish this? And then they generated all this discussion. And there are some episodes where you thought, oh, that was a really great conversation, and it's just crickets, for sure. So in terms of the ability to determine what is popular, I don't have that ability. We don't normally have guests on our show, but we have had a few guests sprinkled throughout the six years, and those have always been really popular episodes, I think. So people...

Hilary Parker
People don't actually want to listen to us, right? They're just waiting for the guests. Also, I will say the one without a guest that was very popular was after the election, when I kind of did a debrief on exactly what the election was like, especially election night and the days leading up to that. So yeah, I guess, Roger, when we actually have a topic, or other people, those are popular.

Roger Peng
Tend to be popular. Yeah.

Rosemary Pennington
I was actually gonna ask you about that. I mean, I have several episodes that I'm curious about, but I was actually going to ask you about the dive into the election data, because that has become such a fraught conversation since the 2016 election, where it was so, quote unquote, wrong. And I wonder why you guys decided that you were going to dive into that, because my inclination is to, like, run in the other direction.

Hilary Parker
Yeah. And I'm a coward, so that's not to say I don't believe that. I mean, I think part of the podcast is that we try to be really authentic with, like, what's on our mind this week. And so it was impossible to avoid that story. And I definitely know for a fact I said things in the podcast right after 2016 that I'm like, "Ooh, I don't believe that anymore." You know, the content was driven by what was the thing that everyone was talking about that day. And I did have more hesitation. I don't know about you, Roger. But that was such a fraught thing, and everyone felt so traumatized; like, we were clearly traumatized by it. And so it was a little more fraught, but I don't feel like I have anything profound to say, exactly. We just kind of did it because that's, like, the whole premise of the podcast.

Roger Peng
I do recall two really popular episodes that stuck out to me. One was when we did the book club with the design thinking book. That was surprisingly popular, because we went into quite some detail and people really stuck with it. It was interesting. And the other one was an episode where we talked about COBOL. And I think, like, every COBOL programmer in the universe came out of the woodwork and sent us an email about it. I mean, it was totally unexpected.

John Bailer
So I was required to write a COBOL program in a Pascal programming class, because the instructor said you should have experience writing at least one.

Roger Peng
I just think COBOL really, you know, stirs the passions.

Rosemary Pennington
You're listening to Stats and Stories, and today we're talking to Roger Peng and Hilary Parker of the Not So Standard Deviations podcast. I do have a question about TikTok. So I have an 18-year-old, and we communicate by sending TikTok videos of animals. Yeah, but I still don't really get TikTok, because I'm just not a visual person. So I wonder, in your dive into TikTok, what did you guys learn? And what should we understand, as old people teaching young people, about TikTok?

Hilary Parker
Did we dive into TikTok?

Roger Peng
We did, a little while ago. I think it's fair to say that. I don't know if we dived. We dipped our toes.

Hilary Parker
I think, yeah. I feel like we were in the same boat: what are these young kids doing? Like, this seems very foreign. I guess we talked about the data of TikTok, I do now remember.

Roger Peng
I was skeptical that the algorithm that TikTok uses to suggest videos could actually have been that good, and thought that perhaps it was the data themselves. Like, the videos that are being made are actually just inherently better, and so the algorithm doesn't actually have to be that good.

Hilary Parker
Yeah, I remember talking about how, to some degree, the breakthrough of that platform was just making it so easy to make really good content, especially by constraining the format.

Roger Peng
Yeah. As opposed to something like YouTube, where it's totally unconstrained. But it's interesting to see other platforms now introduce those constraints. Now you can see on other platforms that people are making videos that are more uniform. Like, you know, everything on YouTube now is 20 minutes because of the advertising structure and things like that. And so it's interesting to see how the rules of the platform drive all the data into a certain format, essentially.

Hilary Parker
So this is reminding me, though: who here has been following the bones versus no bones day?

Rosemary Pennington
I just figured this out like two days ago.

Hilary Parker
Speaking of animal TikTok, that's been so fun. Roger, do you know this? I'm not on TikTok. It's this 13-year-old pug, and every morning it's in its little bed. And he's like, "Is it gonna be a bones day or a no bones day?" And he takes the pug and, like, puts it as though it's standing, and then lets go. And most of the time, the pug just flops back. And so that's a no bones day, where you're supposed to relax, take care of yourself, like, get the bath bomb out. And then if it's a bones day, he, like, stays up for a minute. But that's exactly the type of creativity that platform likes. Yeah, it's so fun. Even though I'm not on there, and probably will not ever be on there because of the data concerns we've talked about. But I'll gladly look at it in a signed-off way.

Rosemary Pennington
Yes, yes. I am like, "Kids, send me all of the links to videos." I'm not going to actually use it, but I do appreciate the videos. Yeah.

John Bailer
So I, you know, I'm still just struggling with COBOL stirring passion, but I'm gonna, I'm gonna keep processing that in the back of my mind. So I'm just curious. You talked about a couple of these topics that you've selected. What's kind of a typical pathway to an episode for you? Yeah. Is it loosey goosey? You know, show up on the mic and start? Or is it you know, how much structure is in play in advance of starting?

Roger Peng
So I just want to be clear: when I proposed this podcast to Hilary, I told her she would have to do no work.

John Bailer
No bait and switch, except to show up.

Hilary Parker
And he's been totally true to that. So we call him the CEO and me the president, because, like, I'm just a figurehead, and he does all the actual work.

Roger Peng
Right. Hilary speaks in talking points, and I actually dig up all the information, right? No, I mean, it's not very formal. We don't even have an outline, really. We record every two weeks, and in each two-week interim, often I'll just save links to stories as they come up. And for six years straight, every time we've shown up to record, there's been a list of things that we could pick from: recent stories, or just things that I thought about in the middle of the week, or experiences that Hilary had. And, as you noticed, our episodes are longer, and we've never had a shortage of material, it seems.

Hilary Parker
Yeah, I would say there's always a background processing of, "Oh, that might be good for the podcast." So some situation will arise, or some tweet or news story. And it reminds me of the writers of Seinfeld. The way that they wrote that show was just mining everyone's personal life. It was like candy. Once people ran out, they'd hire new writers to come in and then mine those people's personal lives. So it's to some degree like that, where at this point, even when I think about jobs, I'm like with this week of content. So it's kind of there to some degree all the time. But then we've kept it so casual. I think that's why we've run so long, because it's just a fun conversation about things we're interested in. So it's not really that much work at all, for me.

Roger Peng
As promised

Hilary Parker
More for Roger. Yeah.

John Bailer
Good for you, Roger. Good Delivery.

Rosemary Pennington
I think I lost out in the podcast situation here. I have a lot of work that I do to prepare for this show.

John Bailer
Oh, and I sip mint juleps while I sit on the back porch as we're getting ready. Yeah. Give me that, Rosemary. No, right, Rosemary does a lot of work. I think that we're sort of in a similar position, in that we're often just monitoring what we're seeing and what we're consuming. And it might be, you know, we'll end up with news deserts; that might be something that comes up for us, thinking about journalism and its disappearance in smaller communities. And then it becomes, well, how would we talk about that? And who might we invite as a guest? So that added twist of thinking about inviting others to be part of the conversation ends up being part of the monitoring. So you've been on my radar for quite a while, both of you. Let me...

Rosemary Pennington
So I wonder, given that you guys are sort of talking through these data problems or statistics problems, what thoughts you have on the way data is covered. Like, there are all these stories constantly about big data. And certainly when Stitch Fix came out, there were a ton of stories about how that was working. And I wonder if you have thoughts on what you think journalists do well when it comes to the coverage of stats or data stories, and what you think they could do to improve that work.

Hilary Parker
It's sort of just a huge coincidence, because we were just talking about a data story in San Francisco that drove me up the wall. And that was not even covering data, but rather data journalism, where I felt like they did some things that, if you were more formally trained in statistics, you would know not to do. And so that can get really frustrating. And I can't totally fault anyone, because if you're going to do all that training in statistics, you'd probably go into a stats job, and journalism is such a tough environment. So that can always be hard. And obviously, you guys are thinking about this a lot: how do you do data journalism right? Again, we were just talking about this, where I don't think, as a society, we have a great handle on this profession and this field, and the realistic expectations around what the work is and what sorts of things you could expect from it. And that is in companies and on the inside, as well as a broader societal thing. And so I can't prescribe what's going wrong, because it's like, well, this is kind of going wrong everywhere. And I don't even think there's a right answer, too.

Roger Peng
I mean, one of the things that I think is not unique to journalism, but journalism is a part of it, is just the treatment of specific analyses, or maybe individual studies, if they're, like, scientific research. I think even in academia there's a tendency to want to assign a huge amount of weight to a single investigation, like, you know, this study is the answer, or this study proves everything's wrong, or whatever it is. It's hard to give up on just one study and be like, well, we have to look at a body of evidence. And that's hard for scientists and others to do. But I think journalism gets out there right in front of the public; the public is reading it. And so they kind of get picked on a little bit. But also, you don't want to be writing a review paper every time. So I think it is a challenge to say, "Here's the study, it's not definitive, let's wait for more." Like, that doesn't sound like a great news article to me. But unfortunately, I think that's kind of what it should be, I guess.

Hilary Parker
There is an aspect, in some ways, of data science positioning itself as, like, let's not give in to the human impulse to find patterns where there aren't patterns, or to trust anecdotes rather than the larger body of evidence. So in some ways, they are at war with each other, because that's what makes good journalism: let's have this macro thing, then let's zoom into some examples and have quotes from people. And then maybe it's a healthy tension, where you as a journalist have to synthesize those. And really, as data analysts and data scientists too, we should be doing that. But it's eating your vegetables. People don't necessarily want to think in these more abstract ways, because that's just not how we're wired.

Rosemary Pennington
I like that idea of healthy tension. Because I think in journalism, you know, our deadlines are obviously very different, and the space is very different. And I think there is that tension, too, of wanting to get the story right. But I also want to make sure that it's not all cauliflower, I think.

Hilary Parker
Yes, for sure. Yeah.

John Bailer
And yet, one of the common themes that we've had with a lot of the conversations that we've had is just trying to communicate uncertainty in the context of scientific studies. And then it's, it's a lot harder story to tell when there's fuzz. Mm hmm. When there's uncertainty and variability.

Hilary Parker
And the article I was talking about drove me up the wall. Part of what was infuriating to me was that it was asking for such a small level of uncertainty for a decision that didn't need that level of rigor. You know, it was like, hey, we're not talking about life or death here. We're talking about someone just saying, "Yeah, this thing seems like a problem." And so even if you get people on board with uncertainty, there's still the nuance of, well, is this a civil trial, where you just need clear and convincing evidence, or is this criminal, where it needs to be beyond a reasonable doubt? It's just literally hard. This stuff has inherent complexity. It's not just complex for complexity's sake.

Rosemary Pennington
That’s all the time we have for this episode of Stats and Stories. Roger and Hilary, thank you so much for joining us today. It has been wonderful to hear about your process and think about how I'm going to do a lot less work now.

John Bailer
Hey, thanks a lot, Roger.

Rosemary Pennington
Stats and Stories is a partnership between Miami University's Departments of Statistics and Media, Journalism and Film and the American Statistical Association. You can follow us on Twitter at statsandstories, on Apple Podcasts, or other places where you find podcasts. If you'd like to share your thoughts on the program, send your email to statsandstories@miamioh.edu or check us out at statsandstories.net, and be sure to listen for future editions of Stats and Stories, where we discuss the statistics behind the stories and the stories behind the statistics.