Statistical Literacy | Stats + Stories Episode 364 by Stats Stories

Every year, statistics classes are filled with math-averse students who white-knuckle it to the end of the semester in the hopes of getting a passing grade, and of forgetting about math and statistics for a little while. But what if it didn't have to be that way? What if, instead of white-knuckling it, students were actually excited about the subject, or at the very least not terrified of it? Two professors have been developing strategies to help students get over their fear of "sadistics," and that's the focus of this special two-part episode of Stats and Stories.

Read More

Explaining Science | Stats + Stories Episode 363 by Stats Stories

Ionica Smeets is chair of the science communication and society research group at Leiden University. She's also chair of the board of The National Centre of Expertise on Science and Society of The Netherlands. Her research examines the gap between experts and the public when it comes to science communication, with special interest in the problems that occur when those groups communicate and what scientists can do about them. Smeets is the author of a number of journal articles on this topic and engaged in science communication for the public when she worked on a Dutch TV show about math. She's also the co-creator of a children's book called Maths and Life.

Episode Description

In a commencement speech in 2016, Atul Gawande told the crowd that science is a "commitment to a systematic way of thinking, an allegiance to a way of building knowledge and explaining the universe through testing and factual observation." In the last ten years, that understanding of science has become muddied for the public. Social media has helped fuel the rise of conspiracy theories built upon so-called alternative facts, as people claiming to be experts spout anti-science ideas. Communicating scientific ideas was already difficult, and it's become even more so in this environment. Science communication is the focus of this episode of Stats and Stories, with guest Ionica Smeets.

+Full Transcript

Coming Soon

Counting on Official Statistics | Stats+Stories Episode 360 by Stats Stories

Erica Groshen is a senior economics advisor at the Cornell University School of Industrial and Labor Relations and a research fellow at the Upjohn Institute for Employment Research. From 2013 to 2017, she served as the 14th commissioner of the U.S. Bureau of Labor Statistics, the principal federal agency responsible for measuring labor market activity, working conditions, and inflation. She's an expert on official statistics, authoring a 2021 article pondering their future.

Episode Description

When people think of public goods, they most likely think of things like parks or schools. But official statistics are also a kind of public good. They help us understand things like housing prices, the costs of goods, and the spread of disease. However, this data infrastructure is under threat around the world. The work of official statisticians, and the obstacles they face, is the focus of this episode of Stats and Stories with guest Erica Groshen.

+Full Transcript

Coming Soon

Chart Spark | Stats + Stories Episode 359 by Stats Stories

Being able to create compelling data visualizations is an expectation in a diverse array of fields, from sports to journalism to education. But learning how to create charts that spark joy can be difficult if you're not confident in your abilities. A recent book is designed to help people become more comfortable creating compelling charts, and it's the focus of this episode of Stats and Stories with guest Alli Torban.

Read More

Randomized Response Polling | Stats + Short Stories Episode 341 by Stats Stories

Dr. James Hanley is a professor of biostatistics in the Faculty of Medicine at McGill University. His work has received several awards, including the Statistical Society of Canada Award for Impact of Applied and Collaborative Work and the Canadian Society for Epidemiology and Biostatistics Lifetime Achievement Award.

+Full Transcript

————————

John Bailer
Did you ever think that you could know something about a population based on measurements that you didn't know were correct for any individual, or even what a measurement meant for an individual in the population? That's something that's available through a method called randomized response. Not only could you use it to ask sensitive health questions, which was the motivation when the method was developed many decades ago, but there's also a recent paper in the Journal of Statistics and Data Science Education on investigating sensitive issues in class through randomized response polling. We're delighted to have James Hanley joining us to talk a little bit about this project. James, welcome back.

James Hanley
Thank you very, very much.

John Bailer
Yeah, so: randomized response in classroom settings. Can you just give a quick summary of what the randomized response method is for our audience?

James Hanley
Yes, the idea is that I'm facing you, you're answering a survey, and I would like to know whether you've cheated on exams or not. Well, not you, but the class.

John Bailer
Me? Never James. Me, never, no, no.

James Hanley
But what about your taxes? Or, you know, what about something else: I didn't give a book back to the library, or whatever. For a group, you can work out what proportion of them have or not, with a certain plus and minus on it, by giving each person one of two questions to answer. They could be the flip of each other, or one could be an irrelevant question, like: was your mother born in April? That's one version. Or: did you cheat on your taxes? So when I hear the answer yes from you, I don't know which it is. Are you saying your mother was born in April, or are you saying you cheated? The receiver can't interpret it. But when you put them all together, all the answers from the whole class form a certain aggregate, and the aggregate is now a mixture of the two types of answers. And if we know the mixing, which is set by the probability of being given one question or the other, we can then deconstruct it and separate out, at an average level, what's going on. So that's the basic idea. It's very clever. It hasn't worked very well, though, in sociology and in surveys. I remember giving a seminar talk about it when I graduated in 1973. The problem is that the general public doesn't understand it. They think you're cheating somehow, or recording it, or have a camera; that there's some way to figure it out. So I think it only works for a fairly sophisticated public. University students should be able to get it, but it's tricky. We were motivated because I was so annoyed that McGill wouldn't let us ask our students whether they had been vaccinated against COVID or not. It was a huge political war at our university. I was talking to my co-author, and I said, I am really steamed, and I've actually written up a way of doing it again and adapting it. And Christian Jenna, my first co-author, said, oh my goodness. He had written a popular article for a journal describing the method, but without a real example, and he said, no, we've got to do this in class for real. But the younger teachers at McGill didn't want to do it. They were afraid that the university would come down on them for breaking privacy laws, because in Quebec your medical record is private, and vaccination is part of your medical record. In your country, you had no problem; most American universities had no problem asking and insisting on vaccination. We were not allowed to, and it caused major trouble. And I sent the article to the provost the other day. I said, look, out of necessity come methods. So we adapted it.
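
To make the unmixing Hanley describes concrete, here is a minimal simulation sketch of the unrelated-question design. All the numbers (the chance of drawing the sensitive question, the April-birthday rate, the true cheating rate, the sample size) are illustrative assumptions, not figures from his paper.

    import random

    random.seed(1)

    p = 0.75        # assumed chance a respondent draws the sensitive question
    q = 1 / 12      # P(yes) to the innocuous question ("Was your mother born in April?")
    true_pi = 0.20  # assumed true proportion who cheated (unknown in practice)
    n = 2000        # number of respondents

    # Each respondent privately draws which question to answer,
    # so any single "yes" is uninterpretable to the interviewer.
    answers = []
    for _ in range(n):
        if random.random() < p:
            answers.append(random.random() < true_pi)  # sensitive question
        else:
            answers.append(random.random() < q)        # innocuous question

    y_bar = sum(answers) / n            # observed yes-rate: the mixture p*pi + (1-p)*q
    pi_hat = (y_bar - (1 - p) * q) / p  # deconstruct the known mixture
    print(f"observed yes-rate {y_bar:.3f}, estimated cheating rate {pi_hat:.3f}")

Because the mixing probabilities p and q are known by design, the aggregate yes-rate can be inverted to estimate the sensitive proportion, even though no individual answer reveals anything.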

Rosemary Pennington
You stole my question from me because I was about to ask you what spurred this particular–

James Hanley
Don't get me started. We're still upset at the university. In Quebec, your vaccination status is private; in Ontario and every other province, with a different kind of law, a different way of thinking about civil liberties, they had it the opposite way: if you weren't vaccinated, they didn't let you into class, that's it. And the American reviewers of our paper had a tough time understanding why we couldn't ask. So we had a lot of trouble, and we didn't get it accepted right away. It was all about COVID in the first version, and then we needed revisions, and we were all so busy we didn't get to them. We revised the article two years later, and by that time the whole story was stale. So that's when we had to broaden it so that it could apply to cheating or whatever. But the original impetus was COVID, and in the article I say that in my own small class of 10 or 12, we repeated it. The one new twist we have is that you can repeat the survey with people, ask them several times, and average the answers, and that's what gets you a narrower margin of error. In fact, one of the reviewers said, if I asked you often enough, I should be able to figure out, even for you, whether you were cheating or not, because the two mixtures kind of diverge; you'll see one or the other eventually. But that was going too far.
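
A quick sketch of that repeat-and-average twist, under the same illustrative parameters as above: averaging several randomized rounds per student shrinks the noise added by the randomizing device, so the class estimate narrows, though it plateaus at the person-to-person sampling noise. The same effect is what made the reviewer's point: an individual's repeated answers become increasingly revealing.

    import random
    import statistics

    random.seed(2)
    p, q, true_pi, n = 0.75, 1 / 12, 0.20, 30  # a small class of 30; all values assumed

    def estimate(k):
        """Run one survey in which every student answers k randomized rounds."""
        status = [random.random() < true_pi for _ in range(n)]  # each student's fixed truth
        total = 0
        for cheated in status:
            for _ in range(k):
                if random.random() < p:
                    total += cheated                  # sensitive question this round
                else:
                    total += random.random() < q      # innocuous question this round
        y_bar = total / (n * k)
        return (y_bar - (1 - p) * q) / p

    # Repeating the simulated survey shows the margin of error narrowing with k.
    for k in (1, 5, 25):
        reps = [estimate(k) for _ in range(2000)]
        print(f"k={k:2d} rounds: sd of class estimate = {statistics.stdev(reps):.3f}")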

John Bailer
Well, I'm afraid that's all the time we have for this rather short but very interesting episode of Stats and Short Stories. James, thank you so much for joining us.

James Hanley
It was a pleasure.

John Bailer
Stats and Stories is a partnership between Miami University's Departments of Statistics, and Media, Journalism and Film, and the American Statistical Association. You can follow us on Twitter, Apple Podcasts, or other places where you can find podcasts. If you'd like to share your thoughts on our program, send your email to statsandstories@miamioh.edu or check us out at statsandstories.net, and be sure to listen for future editions of Stats and Stories, where we discuss the statistics behind the stories and the stories behind the statistics.

————————

The Nation's Data at Risk | Stats + Stories Episode 339 by Stats Stories

The democratic engine of the United States relies on accurate and reliable data to function. A year-long study of the 13 federal agencies involved in U.S. data collection, including the Census Bureau, the Bureau of Labor Statistics, and the National Center for Education Statistics, suggests that the nation's statistics are at risk. The study was produced by the American Statistical Association in partnership with George Mason University, was supported by the Sloan Foundation, and is the focus of this episode of Stats+Stories.

Read More

Getting Into Music Statistics | Stats + Short Stories Episode 330 by Stats Stories

Dr. Kobi Abayomi is the head of science for Gumball Demand Acceleration, a software service company for digital media. Dr. Abayomi was the first and founding Senior Vice President of Data Science at Warner Music Group. He has led data science groups at Barnes & Noble Education and WarnerMedia. As a consultant, he has worked with the United Nations Development Programme, the World Bank, the Innocence Project, and the New York City Department of Education. He also serves on the Data Science Advisory Council at Seton Hall University, where he holds an appointment in the mathematics and computer science department.

Episode Description

On this show, we've always said that data science is a gateway to other fields. From climate change to medical research, knowledge about numbers can be useful in just about every aspect of life. That's why we've brought back Kobi Abayomi to talk about his journey using data to get into the music industry on this episode of Stats+Short Stories.

+Full Transcript

Coming Next Week


Making Ethical Decisions Is Hard | Stats + Stories Episode 321 by Stats Stories


Stephanie Shipp is a research professor at the Biocomplexity Institute, University of Virginia. She co-founded and led the Social and Decision Analytics Division in 2013, starting at Virginia Tech and moving to the University of Virginia in 2018. Dr. Shipp’s work spans topics related to using all data to advance policy, the science of data science, community analytics, and innovation. She leads and engages in local, state, and federal projects to assess data quality and the ethical use of new and traditional data sources. She is leading the development of the Curated Data Enterprise (CDE) that aligns with the Census Bureau’s modernization and transformation and their Statistical Products First approach.

Donna LaLonde is the Associate Executive Director of the American Statistical Association (ASA), where she works with talented colleagues to advance the vision and mission of the ASA. Prior to joining the ASA in 2015, she was a faculty member at Washburn University, where she enjoyed teaching and learning with colleagues and students; she also served in various administrative positions, including interim chair of the Education Department and Associate Vice President for Academic Affairs. At the ASA, she supports activities associated with presidential initiatives, accreditation, education, and professional development. She is also a co-host of the Practical Significance podcast, which John and Rosemary appeared on last year.

Episode Description

What fundamental values should data scientists and statisticians bring to their work? What principles should guide their work? What do right and wrong mean in the context of an analysis? That's the topic of today's Stats and Stories episode with guests Stephanie Shipp and Donna LaLonde.

+Full Transcript

John Bailer
What fundamental values should data scientists and statisticians bring to their work? What principles should guide the work of data scientists and statisticians? What do right and wrong mean in the context of an analysis? Today's Stats and Stories episode will be a conversation about ethics and data science. I'm John Bailer. Stats and Stories is a production of Miami University's Departments of Statistics and Media, Journalism and Film, as well as the American Statistical Association. Rosemary Pennington is away. Our guests today are Dr. Stephanie Shipp and Donna LaLonde. Shipp is a research professor at the Biocomplexity Institute at the University of Virginia and a member of the American Statistical Association's Committee on Professional Ethics, its Symposium on Data Science and Statistics committee, and its Professional Issues and Visibility Council. LaLonde is the Associate Executive Director of the American Statistical Association, where she supports activities associated with presidential initiatives, accreditation, education, and professional development. She's also a co-host of the Practical Significance podcast. Stephanie and Donna, thank you so much for being here today.

Stephanie Shipp
Well, thank you for having us. I'm delighted to be here.

Donna LaLonde
Thanks, John. It's always fun to have a conversation on Stats and Stories.

John Bailer
Oh, boy, I love that. I love getting that love from another podcaster. So thank you so much.

Donna LaLonde
Absolutely.

John Bailer
So your recent Chance article had a title ending in an exclamation mark: Making Ethical Decisions Is Hard! I'd like to start our conversation with a little unpacking of that title by having you describe an example or two where data scientists encounter decisions that need to be informed by ethics.

Stephanie Shipp
I might start with that, because I'm the one that's always saying making ethical decisions is hard. And Donna seized on that and said, that will be the title of our article for Chance. And I was like, okay, that's great. So I don't have examples, but I want to start by saying that I'm always on the hunt for tools to incorporate ethical thinking into our work. And I find conversations about ethics, especially with my staff, who are primarily young, a lot of postdocs and assistant research professors and students, often go flat. When we try to have conversations about our projects in the context of ethics, their reaction is, well, I'm ethical, do you think I'm not ethical? Or, we only use publicly available data, so what's the big deal? And so we do things like the traditional implicit bias training, and that's helpful, but that's more individually focused. It does translate to projects, because implicit bias is one of the areas to look at in ethics and projects, but it's not the entire answer. The focus of my work throughout my career has always been on: how do we benefit society? And thanks to Donna, if you notice, I'm participating in three ASA activities; I didn't actually realize that until they were listed, and I thought, that's why I'm always so busy. Okay, I digress. One of the first activities I got involved in came about because I asked Donna if I could join the Committee on Professional Ethics. There was a spot at that time, because it's a committee of nine members, although they do have a lot of friends. And I was fortunate to join in the year that they were revising the ASA guidelines; they have to revise them every five years. And I got to watch with awe as a subgroup met every two weeks and talked about how they would broaden those guidelines to incorporate data science and statistical practice across disciplines. At about the same time, I was invited to be part of the Academic Data Science Alliance, and they were coming up with their own guidelines. And the group decided we had enough guidelines; the ASA's are good, the ones for computing scientists are good. So why don't we create a tool instead? And I was like, this is great. And then I also became very involved in the ethos work focused on societal benefit. So that's not really answering the ethical dilemmas I've faced in my career, but it's why I find making ethical decisions hard and what I've set out to do to maybe make it easier, not only for me but for others as well.

John Bailer
So Donna, you want to jump in with some sort of your sense of some cases or places where data scientists encounter decisions that need to be informed by ethics?

Donna LaLonde
Yeah, actually, we probably could have just titled the article Making Decisions Is Hard. And one of the reasons that I was so excited to see the ADSA work, the Academic Data Science Alliance, is because I thought their focus on case studies aligned really nicely with the ethical guidelines for professional practice that the Committee on Professional Ethics had been involved in revising, and that the ASA board obviously approved. And I think the reason that making ethical decisions is hard, or maybe the two top reasons in my way of thinking: one is that there's often a power differential, and it's really hard to navigate that power differential just in your day-to-day work, right? If you're a junior investigator and there's a more senior investigator, it can be difficult, not to say that all of the conversation is too difficult, but it can be difficult to navigate a concern or a potential place for disagreement about what's the best practice. And that's part of where, I think, the melding of case studies and the ethical guidelines is really powerful, because it lets you practice before you're actually confronted with having to deal with a potential issue. The other issue I became more aware of, as I was sitting in on the deliberations of the Committee on Professional Ethics, is that there are a lot of stakeholders, and all of those stakeholders bring different perspectives and have different contexts. Just navigating that really complicated landscape also takes practice. So not specific examples of ethical decision making being hard, but sort of the bigger picture, which I think the ADSA tool and the ethical guidelines help support.

John Bailer
You know, one of the things that I find interesting about discussions of professional ethics, and ethics in data analysis, is that it's something that has evolved over time; there's a history there. You mention that in your article as well, going back to the late 1940s. I was wondering if you could give a little lead-in to some of the history of research ethics that then led to this latest layering of considering data science issues.

Stephanie Shipp
I started with the Belmont Commission, because that is the foundation for the IRB, the Institutional Review Board process; at least in the social sciences, we have to file an IRB protocol for every project that we undertake. Amazingly, there are a lot of disciplines that don't have to do that, although at UVA that's somewhat different. The Belmont Commission started because the ethical failures of researchers, primarily in the United States, were coming to the surface. Perhaps the most famous is the Tuskegee syphilis study, conducted over a period of more than 40 years, in which African American men were subjected to a study watching the progression of syphilis, even after penicillin had been discovered, and they were not told about the treatment, violating every ethical principle by today's standards. Because of that, I wanted to ask, okay, how far back does this go? It's not that ethical discussions haven't been happening for a long time, but the first written code that I could find was the Nuremberg Code, which was a result of the atrocities of World War Two. It had 10 ethical principles, and they were really clearly written, but ten is a lot to remember. So 30 years later, when the Belmont Commission formed, around 1979, I think they realized that, and they came up with three principles. Respect for persons, which means you must be able to volunteer for the study and you must be able to withdraw from the study; and that goes to the point Donna made about power differentials. You know, if somebody in authority is telling you that you have to be part of that study, you may feel you have no choice, but that's not true. Then beneficence: understanding the risks and benefits of the study, weighing that with doing no harm and maximizing the benefits over possible harms. And then justice: deciding on the risks and benefits so that the research burden is distributed fairly. I think these are really important, but I also think their language is sometimes a bit hard to wrap your arms around, and that's why I would advocate that you do need new tools and new ways of thinking. So that's a little bit of the history, but I think Donna's perspective was also really insightful when we looked at that, and at how we might expand our view with what the Menlo Report did as well.

John Bailer
So Donna, did the ASA have its own guidelines for professional ethics?

Donna LaLonde
It was informed by some of these discussions, including the Menlo Report. Well, actually, the most recent revision was approved prior to the work that Stephanie, Wendy Martinez, and I have been doing; we've since been joined by an ethicist colleague of Stephanie's. Stephanie mentioned she was on the Committee on Professional Ethics at the time the working group was working on the revisions, and the group certainly acknowledged the existence of the Menlo Report, and obviously the Belmont Report. I'm excited about the opportunity and feel it's really critical that the ASA play a role moving forward. Now we're talking about artificial intelligence technologies and how those technologies are going to impact science, but also society. I read, and I think this is close to correct, if not a direct quote, that Tim Berners-Lee said recently that in 30 years we'll all have a personal AI assistant, and it's up to us to work to make sure that it's the kind of assistant that we want. And I think that's a really important conversation that needs to be informed by the American Statistical Association; obviously the ADSA group is really important as well, and the Association for Computing Machinery. It has to be collaborative, because data science and AI are collaborative, but we have to be focused on it, right? And so I'm kind of excited that we might be able to use this Chance article as a jumping-off point to figure out how to move that conversation forward and how to build some consensus. I'll just share one other reading. I've just started the book The Worlds I See by Fei-Fei Li, who I guess is now being called the godmother of artificial intelligence. In one of the chapters, she says something like, we're moving from AI being in vitro to AI being in vivo, and I thought that was spot on. We have to be paying attention.

John Bailer
Well, you're listening to Stats and Stories. Our guests today are Stephanie Shipp and Donna LaLonde. Ethical uses of data have been legislated in parts of the world, including through the European Union's General Data Protection Regulation. Are similar laws starting to emerge in the United States?

Donna LaLonde
Well, I'm not an expert on the laws; I would say similar conversations are happening. And I know that NIST, the National Institute for Standards, is leading the way by having framework conversations. Obviously, the White House issued a memo on artificial intelligence. So I'm not aware of laws, but certainly we're talking about how AI needs to be legislated.

John Bailer
So my question, in part, was thinking about some of these rules of practice. In your article, you talk about the importance of ethical decisions throughout the entire investigative process. One aspect of that was data security and how you deal with the data, as a matter of trust. And that immediately got me thinking about things like the GDPR rules, which really codify and enforce this idea; they were an example of saying, look, there are certain informed uses of your data. So this ties into some of those issues you mentioned about informed consent, risks, benefits, and otherwise. Can you talk about some of the other components of an analysis where ethical decisions come into play? Stephanie, you hinted at this when you were talking about implicit bias that might be part of an analysis. Maybe you could expand on that a little for us.

Stephanie Shipp
Sure. I'll go back to your GDPR question for a second. That's primarily on the commercial side, making sure that companies aren't misusing data in ways that could cause unintended consequences. Claire McKay Bowen has written a book, Protecting Your Privacy in a Data-Driven World, and I highly recommend it, and maybe highly recommend her; maybe she's already been on Stats and Stories. She would be the expert to talk about that specific legislation. But definitely, in terms of implicit bias, that's probably one of the hardest parts, because we all think we're ethical, we all think we're very objective when we're doing our work, primarily as statisticians or economists or anyone in a quantitative field. I think it takes constant conversations and training. And I'll give a really simple example from work we were doing a few years ago, where we were bringing data and science to inform, promote, and support economic mobility in rural areas. It was a three-state project; we were working with colleagues in Virginia, Iowa, and Oregon. And one of the professors did something that, and this is what I find with ethics, when you see solutions they're deceptively simple and elegant, but thinking of them ahead of time is not always so easy. His students were just starting out working on a project in rural areas, so he used Mentimeter, a tool that collects answers from a team or a group anonymously and then provides some analysis; in this case, a word cloud. He asked them a really simple question: what is life in rural America like? These students quickly started putting in words and keywords and their thoughts. But when the word cloud showed up, they immediately recognized their implicit bias. There were a lot of positives or neutrals; they talked about rural areas being quiet, hardworking, healthy, small towns, crops, farming. And there were also a lot of negatives: uneducated, ignorant, isolated, forgotten. Well, they then went into their project working in a rural area with their eyes wide open. Now, when looking at the research questions, which would be asked about problems mutually identified with the community, they could ask, am I being biased? When looking at the data sources they were using: will these data sources have unintended consequences? What about my analysis? What about the results: will they harm a particular group, maybe to the benefit of another group? So I thought that was a very simple but excellent way to teach implicit bias specifically in the context of a research project. And that got me excited.

John Bailer
So when you think about the kind of workflow in a data analysis project, there's also the analysis that occurs: there's modeling, there's prediction. And you mentioned there are ethical issues even in how you train a model, how you build a model to make predictions for other cases. Could you talk a little bit about how that might play out in terms of an ethical concern?

Donna LaLonde
Well, I'll just jump in and say, I think we've started, appropriately, to pay more attention to vulnerable populations, right? If the data set isn't reflective of the population, then the model is going to be flawed. And we're all probably familiar with some of the concerns about facial recognition, where white faces are more likely to be recognized correctly than the faces of people of color. So I think it starts with the data being collected. Then there's also, I think, what we talk about as models being black boxes, right? Do we really understand what the model is doing, or do we just sort of trust it? I think many in our community are moving us to be more aware that we need interpretable machine learning, right? We need to understand what the model is doing, because otherwise we're likely to make flawed decisions. And I guess, John, I'll just say one thing: I think I left the T out of NIST earlier, so I want to make sure I give a shout-out to the National Institute of Standards and Technology, right?

John Bailer
Nailed that answer to a tee. Perfect. So it's interesting, when I was looking at some of the discussion in the paper, you talked about the idea that some of these resources, like the ADSA Ethos, describe different lenses for thinking about the work that's being done. Could you give a couple of examples of such lenses and why they're important?

Stephanie Shipp
I'm happy to jump in on that one. In their case studies, they gave good examples, and one of the simplest, and they say it was the simplest way to get the story across and get people thinking about this, was using cell phone data to conduct a census, focusing on the lifecycle stage of data discovery. Data discovery led them to the cell phone data, and so what are the kinds of questions you might ask? What was the motivation of the company for sharing their data? Are they sharing a complete set of data? What are the challenges with the data, and are they willing to be forthright about them? Or is it, again, a black box? And if it's a black box, maybe you can validate those data using other data sources. But it's really about going through the whole lifecycle and asking those questions, and about how important problem identification is, first, to identifying the data sources that are relevant. And then really questioning: how were the data born? What's the motivation for providing them? What's missing in those data? What kinds of biases might be implicit in the data as well? And then, always, the ultimate question: how might this harm one group at the risk of benefiting another? In some countries, cell phone data may be all they have; they may not have the resources to conduct a census. But then how might you validate the data if you are using them? So it's always weighing the pros and cons, the limitations and the caveats, against the benefits.

John Bailer
You know, it's interesting, as you're talking about some of these applications: in certain places, you can't get other kinds of data; they're not even available. And I know that existing datasets are becoming more and more important to our friends in the official statistics community, because they're a great supplement to the data sources they can already find. But I'm curious about this idea of provenance of data, just knowing where it comes from. And that's also something that makes me think a lot about the models being used, whether they're generative AI models or others used for prediction. A lot of times, in the good examples you've given, people have provided a lot of detail about where their data come from and their analyses, and they share the models on GitHub or some other repo. It's almost this kind of letting the light shine in: you can see what I've done. So is this a sea change in terms of how people are being asked to think when they're doing an analysis: when I publish my results, I'm also publishing everything that goes into them?

Donna LaLonde
So, I hope so, John. I think, and I hope, that we, the members of the American Statistical Association, are leading the way on that, which obviously builds on lots of great work around reproducibility and replicability. But I want to come back to your data provenance question and bring in another group of folks I think we explicitly want to acknowledge, who need to be a part of the ethical decision-making education process, and that is students and teachers. Not just at the undergraduate level, not just graduate students, but K-12. And I think a lot about this, because I don't know if we are doing a sufficient job of describing the data provenance of the secondary data sources that teachers might bring into their classrooms. I think that's on us, right? So in the work we are doing at the research level, where we're asking researchers to make their code available and make their data available, I think we need to be thinking about how we're describing the data sets that might be part of an educational experience, so that students are practiced in recognizing the provenance and the ethical concerns that could arise. I wanted to make that explicit. And I think that's the kind of nice complement that the ethical guidelines and the ADSA Ethos project bring to mind for us, right? Because the lenses are really interesting in terms of a socio-technical view, and the guidelines are really focused on you as the individual statistical practitioner. You take those two together, and we actually have a powerful way both to educate and to make sure that, in practice, researchers and data scientists and statisticians and computer scientists are behaving ethically.

John Bailer
You know, I'm really glad that you all have done this type of work, so I raise my cup of water to you in salute, because I think it's so important to have these resources. When I taught data practicum classes, I would use materials like these as an early assignment to get the students thinking: you're using data from someone; you have a responsibility to treat it with respect. We also used to bring people in to do the IRB training with these classes, just to get them thinking about it. But I really love this idea of pushing the conversation of where data comes from and what your responsibility is to handle it appropriately, not just assuming you can mechanically process it. I'm curious now, as we're sneaking up on a close here: what do you see as some of the future issues or challenges in thinking about ethics and the practice of data science and statistics?

Stephanie Shipp
I think we've already discussed some of them with AI, and how do we go forward? Donna, Wendy, the other co-author on the paper, and I have been talking about whether there needs to be a Menlo Commission version 2.0. And Donna brought up education at a young age. I remember when my daughters were learning statistics in first and second grade; I was so excited. But now, how do you incorporate questions like, okay, where did the data come from, and what are the ethical dimensions of this? You need, of course, to make those words a little easier to look at. I also think that, from this article, what I learned the most was the benefit of looking across disciplines. I have a colleague who likes to say statistics is the quintessential transdisciplinary science, and in this article we brought together science and technology studies through these four lenses, through the ADSA tool. I learned a lot from that. Again, a lot of the language around ethics, though, I think is very hard to grapple with, and I wish there were a way to simplify that language; understanding the concepts is also important. We also looked at the computing and IT world through the Menlo Report. And this is just the beginning of looking at these issues from a cross-disciplinary perspective, which is what statistics does, but we should encourage even more of that, given how much we learned just in doing this article and looking across disciplines. And then finally, just one last point: when I gave my very first talk on this, and I think now, in hindsight, how bold I was, not being an expert, and I'm still not an expert in this field, somebody from industry stood up and said, how do we bring this to industry? And she meant it. But I don't think industry always feels that way. So how do we bring these ethical dimensions of using data to industry? That's part of the premise of the GDPR, and behind it are the teeth of it.

John Bailer
Well, I'm afraid that's all the time we have for this episode of Stats and Stories. Stephanie and Donna, thank you so much for joining us today.

Stephanie Shipp
Thank you.

Donna LaLonde
Yep, thank you for having us.

John Bailer
Stats and Stories is a partnership between Miami University’s Departments of Statistics, and Media, Journalism and Film, and the American Statistical Association. You can follow us on Twitter, Apple podcasts, or other places you can find podcasts. If you’d like to share your thoughts on the program send your email to statsandstories@miamioh.edu or check us out at statsandstories.net, and be sure to listen for future editions of Stats and Stories, where we discuss the statistics behind the stories and the stories behind the statistics.


The Art of Writing for Data Science | Stats + Stories Episode 320 by Stats Stories

Sara Stoudt is an applied statistician at Bucknell University with research interests in ecology and the communication of statistics. Follow her on Twitter (@sastoudt) and check out her recent book with Deborah Nolan, Communicating with Data: The Art of Writing for Data Science.

Episode Description

Communicating clearly about data can be difficult, but it's also crucial if you want audiences to understand your work. Whether it's through writing or speaking, telling a compelling story about data can make it less abstract. That's the focus of this episode of Stats+Stories with guest Sara Stoudt.

+Full Transcript

Rosemary Pennington
Communicating clearly about data can be difficult. But it's also crucial if you want audiences to understand your work. Whether it's through writing or speaking, telling a compelling story about data can make it less abstract. Communicating with data is the focus of this episode of Stats and Stories, where we explore the statistics behind the stories and the stories behind the statistics. I'm Rosemary Pennington. Stats and Stories is a production of Miami University's Departments of Statistics and Media, Journalism and Film, as well as the American Statistical Association. Joining me, as always, is regular panelist John Bailer, emeritus professor of statistics at Miami University. Our guest today is Sara Stoudt. Stoudt is an applied statistician and Assistant Professor of Mathematics at Bucknell University with research interests in ecology and the communication of statistics. She's the author, with Deborah Nolan, of the book Communicating with Data: The Art of Writing for Data Science. Sara, thank you so much for joining us today.

Sara Stoudt
Yeah, no problem. Thanks for having me.

Rosemary Pennington
You have been doing a lot of work around data communication, about writing about data, why did communicating data become this passion of yours?

Sara Stoudt
Yeah, it started sort of serendipitously, in that Deb Nolan, when I was in grad school, was thinking about teaching this class for undergrads and reached out to me about maybe helping out. At that point, I hadn't really thought of myself as a writer; like, how do I claim that title? But through working on that class, and then writing the book after that, we both had to grapple with: yes, we're statisticians, but we do a lot of communicating, and at some point we have to claim that title of writer as well. I think by starting with that process and really working through the book, I got more into it and thought about how I might apply it to my teaching more and how I might apply it to my own work, and it sort of snowballed from there.

John Bailer
So now, I gotta ask you, are you a better writer now?

Sara Stoudt
Maybe I think I'm a better writer now. I think that I think more about my writing now than maybe I did before. I don't know if that helps or hurts, but I think that I pay more attention to it. And when I'm doing other things, I'm thinking more about reading, like when I'm reading just for fun. Now, I'm like, in my head about that a little bit. And I think that's a good thing.

John Bailer
No, I agree completely. I really love seeing the diversity of ways that you approach ideas in writing, ranging from a Significance article on reading to write, to another Significance piece, Can TV Make You a Better Stats Communicator? So I'd like to explore those, maybe in reverse order. For the one about the TV shorts, these small episodes being a model: can you say why you were inspired to connect what was going on in these small, episodic television shows to what that might teach us about writing?

Sara Stoudt
Yeah, I think for me, I was writing a lot of talks, and I was thinking about, like, zooming out: how am I writing this talk? Because you give a talk for lots of different audiences, and the job talk is maybe a little more formal, but more recently I've been doing more talks for broader audiences, and I had to mix up my approach. And it's also the 20-minute versus the 40-minute versus the five-minute talk; all those take different structures, and I was trying to think about that. But at the same time, it was right after the depths of the pandemic, and I had just watched a lot of TV, frankly, rewatching a lot of my old favorite shows, but from the beginning, and really paying attention to the pilot: how much actually has to get done in the pilot to set things up, and you don't appreciate how much effort went into that until you know what the story actually is. So I was thinking all about that, and I thought, oh, this is sort of related to how you do a talk: you know the whole storyline, so how do you set it up when you only have so many minutes to get the point across? So part of it was me justifying watching so much TV. Another part was just: how do you write a good talk? I think it's sort of elusive, and doing it for different audiences and different time slots, having a good structure, I think, can go a long way. That was the motivation for that piece.

Rosemary Pennington
As you've been doing this work on sort of communicating data broadly, have you noticed things that are particular hiccups for you and how have you sort of worked around them?

Sara Stoudt
That's a great question. Yes, I have many hiccups. I think that sometimes, and you might see it today, I can tend to monologue, and in my head I'm like, yes, this is all gelling. But because I have all of this extra context, I forget that the connections are not necessarily being made by the audience, right? The stream of consciousness makes sense for me, but not for everyone. And I think that gets back to the planning, and a lot of the work I've done recently is on the planning of writing, because you have to take that step back. I think we can forget to do that because we're pressed for time; we're writing that talk on a plane, and you just don't have that zoom-out, "what am I saying?" moment. So planning the talk, and the writing too, right? Just slowing down, I think, is my biggest hiccup. I'm sort of like, oh, I've got to do this, I've got to do this. But if I take the time to breathe and zoom out: what am I saying? What is the goal? What's the best way to do this? Even starting with pictures; sometimes I just start the talks with all of the plots, or the little doodles that tell the story. I think that has helped me a lot too, because I can jump in too quickly and then get in the weeds. So I've been trying to pull myself out of that.

John Bailer
Yeah, I recognize that same temptation. And, you know, when I've done this kind of writing, I think a lot about having to pull out and think big picture. One thing that really struck me, when I was reviewing some of your slides from this storyboarding talk you did on the process of writing, is the idea that the punch line gets organized in the form of a narrative. One of the things this podcast has taught me is to think a lot more about the narrative that goes along with an analysis, or with any kind of work that you're doing in research. So can you talk a little bit about the insights you've gained about structure from the idea of storyboarding?

Sara Stoudt
Yeah, I think the main thing is that when we do statistical work, we're so proud of all the stuff we did. We're like, I did this, I did this, and I did this fancy thing. But ultimately, that's not what the reader cares about; they want to know what you found. So there's this temptation to show what you did, but that's only ancillary to what you're actually trying to say, which is the findings. And this gets back to the taking-a-breath idea: you have to switch gears from doing the stuff to asking, what is the big picture? I think the storyboarding helps you shift gears. Don't talk about what plots you made or what analysis you did: what are the common themes? What did you find? How does this connect to a bigger picture? It also makes you kill your darlings; you can't put every plot in a paper or a talk, so you have all these things and you have to whittle them down. Storyboarding is iterative, and it's really tactile; there are, like, no numbers involved, maybe some plots, and you're rearranging. So I think it helps you shift that gear. And I do this all the time now, to write a talk or write a paper; I'm a very tactile writer. Doing that activity with students has really helped us all shift gears: fewer reports with "I made a histogram of this, I ran a regression," and more "this is skewed left, which means this, and the regression tells me this." Helping us get toward that sort of language is what motivated the storyboard and why I keep using it.

Rosemary Pennington
In my past life, when I was a journalist, I did science and medical reporting toward the end of my time, and I loved it. I absolutely loved it. But it was always a little tough to get scientists to talk to me, because they were so scared that their work would be misconstrued. I had more than one say, you know, five minutes is not enough time to communicate whatever it is. So what advice would you have for statisticians or scientists, or anyone who has data they want to communicate, around the fear that they're not going to have enough time to tell it clearly? Or that if they tell it, they're not going to do their work justice, because they have to make it very simple or turn it into a narrative?

Sara Stoudt
Yeah, I definitely feel that tension. Statisticians are so annoying, right? You could say Sara just reinforced her belief there. I think it comes down to the level of detail. Maybe we don't want to talk about that one regression result in five minutes, because there's nuance, but that regression result means something in context, and you want people to know about that thing. So, not to sound like a broken record, but I think it comes down to the zoom-in-and-out thing. You can zoom out in five minutes: what's the impact of your work? Let's not try to explain the details of how you got there in that form of communication, perhaps. But it's hard, because that's not the part we get the most practice with; we're in the weeds most of the time, and trying to navigate that is challenging. I feel that tension too. Sometimes I don't really want to explain what I'm doing right here until it's perfect. But then how are you going to get your work out there? So it's a balance, but maybe focus on the impact first, and try to step back from the things you feel most worried about the precision of.

John Bailer
You know, what you just said really resonates: what do you spend most of your time doing, and what is the focus of your effort? One of our former colleagues, Richard Campbell, was fond of saying that people are the best writers they'll ever be when they're just getting out of composition after their first year at the university, because they don't write a lot more after that. And, you know, the idea is that you become a better writer by writing, and having some structure, I think, really catalyzes that in a great way. So I find the challenge is trying to help get people out of the fully technical focus, and then expanding it to think about, okay, how do you take things from the technical out to the broader community? What are things that you've been doing to help the students that you work with, and the communities that you interact with, do that?

Sara Stoudt
Yeah, I think one thing is just the structure of a typical assignment: you do a final project, you turn it in, and that's it, right? You don't get the chance to iterate, and that's where you start to get at the question of what this is really saying. So what we've done at Bucknell is build the iterative process into the project more. We actually teach a writing-intensive designated intro stat course, and that comes with having to do revision throughout the semester; students get tons of feedback from peers and from the instructor, and they rewrite different parts that come together as a full report. So they just spend more time noodling on it, for lack of a better word. I do think we still need to push more on zooming out, on the big picture, because we spend a lot of time on the preciseness of how they talk about the results: what does that significance level mean, that kind of thing, in that kind of class. But I think just building in time to revise before the final deadline goes a long way. It's hard, because it does take a lot of feedback time in the semester, which is challenging to do quickly, especially at scale. But you have to show students that revision is part of the process, and to do that, they have to revise the final project, and that means pushing back deadlines so that you have time for it. The context part is important too, and I actually want to do more with that, because I don't think I'm doing a great job of pulling that out. I feel that tension; in a class like that, content seems king. But thinking about how to do that as they keep progressing as statisticians means thinking more about those conclusion sections and trying to workshop those more than the results sections, which is what we ended up having to focus on, at least in that class.

Rosemary Pennington
You're listening to Stats and Stories, and today we're talking with Bucknell University's Sara Stoudt about communicating with data. Sara, I'm going to take this question slightly sideways, asking as a former journalism professor: revise? Yes, we revise; those kids revise till the end of the semester. But I wonder what advice you would have for a working journalist who is trying to report on data. You know, most of us are generalists; many of us are not comfortable with numbers and stats. That is a stereotype that lingers because there's some truth in it. So, since we want to communicate this material clearly, because we think it's important to our audiences, what advice, given what you've been doing, would you have for journalists when it comes to reporting on stories that involve data, whether it's complicated or not?

Sara Stoudt
I think one thing is, like, have a buddy. Statisticians, we're friendly; if you find someone that you work well with, you can workshop it that way, because I have collaborators who just help me write better in general, and I think journalists can have that too. I would love to see more cross-pollination there, because statisticians want to be able to write better for broader audiences, too, so that seems like a win-win. I think there are some common statistical things that everybody is fussy about, and it's worth doing a little reading up on those. I'm not saying do more work, because I know journalists are busy and doing important things, but maybe, you know, a little community that talks about some of those big-ticket items: how to report on a p-value, how to report on a confidence interval. It's dry, but that's the stuff that gets you, and maybe doing it in a more community setting helps. And maybe I'll start getting a group of statisticians and journalists together to do that, because as teachers we face that too. It's like, how many ways can I explain this? It's still confusing. So it's good for us all to practice, I think. But I don't have any magic solutions. You never know, I guess.

John Bailer
Yeah. Before we started the podcast, I team-taught a class with a journalist, Richard Campbell; this was quite a while ago. It was interesting to me how different the style of writing he was talking about was from what I was thinking about and had done professionally. There was a sharpness and focus to what he would bring to writing that I found myself being surprised by, not in a bad way; it was just such a different style. And I had these multiple epiphanies about how often, in my own writing, I wasn't getting to the point as quickly as I could have: I was spending so much time talking about process, but maybe not getting to the punch line with the kind of emphasis it really deserved. So I think that exposure, for me as a statistician, of working with journalism colleagues has helped me become a much better writer and communicator, because of trying to think, well, gosh, if I tried to do what they're doing, what would that mean for how I produce a product, whether a written or an oral one? It sounds like you've learned a lot going through these processes, and also from the examples you found, whether from a television pilot or from other models. I know you're thinking a question will eventually emerge; yes, I always wonder that, and that's always the problem here. I would like, though, to get back to this idea of the pacing and timing of a story as it parallels, say, a Big Bang Theory episode. I love thinking about these parallels: early on you introduce the characters and the context, then you introduce some conflict, some resolution to that conflict, and a punch line at the very end. So could you give us a little talk-through of the parallels, starting with where the characters meet and what that means in terms of statistics, and then going through the rest, please?

Sara Stoudt
Yeah. So if you're giving a talk about your own work, you know everything, but people literally don't come in with any context, and they have to care about it by the end of your talk, because you want them to follow up; you're not going to tell them everything. Same with the pilot: you have this twenty-minute window to hook them and have them come back, and you have to set up everything. They don't know anything about the characters or the setting or what the show is going to be about, so you have to cover a lot of ground. If you think about how you want to present your work, people have to understand why you're doing it, because that's part of getting them there. Why is your work hard? Why is it a big deal that you're doing it, and how does it connect to what other people might be doing? So actually, in the first talk people hear from you, it doesn't even matter how you're doing the thing. They just need to know why you're doing the thing, and what makes it interesting or hard enough to be worth doing, because they'll follow up and read the paper after that if they care. Same with the pilot: they'll keep watching the show once they're brought in. So I think you have to strip it way back to when you started the project. Why did you pick it as an interesting problem? Who brought you the context? If you're a statistician working in an applied field, you also have the challenge of talking about the context of the work. I work in ecology, so if I'm presenting at a stats conference, there's some baseline ecology I also have to cover in that talk. So you can imagine that maybe the ecology terms are like the characters; you've got to learn what they're about. You have to learn what the major conflict is. There's an ecological conflict, as in, why do I care from that point of view? But then there's a statistical conflict: why is this a stats problem that's hard? And I sort of go from there. But once I've described all that, do I even have, like, twenty minutes left?

John Bailer
No, that helps a lot. The idea of the images ties back to some of your storyboarding. I love the idea of putting all of your plots up on some display and moving them around, maybe connecting them in terms of the story you want to tell, and nixing the ones that aren't effective. When I taught visualization or other kinds of data practicum classes, I would often say you'll make more than ten times the number of figures you'll ever include in the report you issue, just because you're trying to find the right way to tell the story. And ultimately, for me, I often found that if I could generate the figure that spoke to me, I could write the text that would describe it to others. So do you find that doing the visualizations serves as input and inspiration for the text you produce?

Sara Stoudt
Yeah. And actually, I've been doing a lot of things not even on the computer, but sketching: what is the graph I want that will show me what I need? Or, what do I expect this to look like if what I'm thinking is true? And then trying to make that graph. Because I think sometimes when I'm just making graphs on the fly, I'm making ones that are easy for me to code but are not necessarily the right graphs. So I've been doing a lot of that sort of doodling, and I think it has helped, especially if you're thinking about the right conceptual diagram for explaining your work; that is also something I need to draw first, because I'm not great with the shapes in Google Slides or whatever. But I think it has really helped me solidify the story, because sometimes if I'm just looking at a bunch of scatter plots and histograms, it's hard to really see what's going on. So I think about the maybe less traditional visualization that would really consolidate everything, and then I try to figure out: is this a plot I can actually make?

Rosemary Pennington
So you've been doing this work for a while now: you've done work around how to present, and the storyboarding, and you have the book. What's next for you when it comes to stats communication? What do you want to be working on next?

Sara Stoudt
Yeah, I think for me personally, I'm thinking a lot about creative writing that's related to stats and data: treating data or statistics concepts as constraints for something. Could you write a poem that's constrained in a way that's informed by data? Could you write short stories or speculative fiction that play with these sort of data-y concepts? There's all this sci-fi now that has to do with climate change, or the rise of machine learning and the ethics of those things. I think we could also write more stats-focused fiction, not just for the sake of writing it; I could see it being a useful teaching tool. I'm personally just trying to break this false binary of, you're a quantitative person or you're a creative type. So I'm really interested in trying to fuse those: can we do more artsy things with data? That's what I'm thinking a lot about. I don't know if that's necessarily going to end up being my professional take on communication, but I'm really trying to do it for myself. When I started down this road, I didn't really claim ownership of the title "writer," and now that I feel like I can say that, the next hurdle is: am I a creative writer? Can I write more than just nonfiction? So we'll see where that goes.

Rosemary Pennington
Well, thank you so much for being here today, Sara. That's all the time we have for this episode. It's been great talking with you.

Sara Stoudt
Yeah, thanks for having me again.

Rosemary Pennington
Stats and Stories is a partnership between Miami University’s Departments of Statistics, and Media, Journalism and Film, and the American Statistical Association. You can follow us on Twitter or Apple Podcasts, or wherever else you find podcasts. If you’d like to share your thoughts on the program, send your email to statsandstories@miamioh.edu or check us out at statsandstories.net, and be sure to listen for future editions of Stats and Stories, where we discuss the statistics behind the stories and the stories behind the statistics.


Data Visualization Contest Winner | Stats + Stories Episode 300 by Stats Stories

Nicole Mark is a visual learner and communicator who found her passion in the field of data visualization. She started out making maps of imaginary worlds and cataloging her volumes of The Baby-Sitters Club on her family's original Apple Macintosh. Now, she analyzes and visualizes data in Tableau and with code, always on a Mac! She writes about dataviz, life with ADHD, and the modern workplace in her blog, SELECT * FROM data. Nicole co-leads Women in Dataviz and the Healthcare Tableau User Group. She’s working on her master’s in data science at the University of Colorado, Boulder. Check out her Tableau site.

Episode Description

After producing hundreds of episodes, we have lots of data lying around, data we made available to you, asking you to crunch the numbers for a contest telling the story of our podcast. The winner of that contest, Nicole Mark, joins us today on Stats+Stories.

+Full Transcript

Coming Soon



Viral Statistical Capacity Building | Stats + Stories Episode 293 (Live From the WSC) by Stats Stories

Matthew Shearing is a private sector consultant working globally in partnership with the public, private and not-for-profit sectors on improving official statistics and other data systems, Monitoring and Evaluation, and embedding official statistics standards in wider international development.

David Stern is a Mathematical Scientist and Educator. He is a former lecturer in the School of Mathematics, Statistics and Actuarial Sciences at Maseno University in Kenya and a founding board member of African Maths Initiative (AMI).

Read More

Survey Statistics: Where is it Heading? | Stats + Short Stories Episode 292 (Live From the WSC) by Stats Stories

Natalie Shlomo has been a Professor of Social Statistics since joining the faculty in September 2012. She was head of the Department of Social Statistics from 2014 to 2017. Her research interests are in topics related to survey statistics and survey methodology. She is the UK principal investigator for several collaborative grants from the 7th Framework Programme and H2020 of the European Union, all involving research on improving survey statistics and dissemination. She was the principal investigator for the ESRC grant on theoretical sample designs for a new UK birth cohort and co-investigator for the NCRM grant focusing on non-response in biosocial research. She was also principal investigator for the Leverhulme Trust International Network Grant on Bayesian Adaptive Survey Designs. She is an elected member of the International Statistical Institute and a fellow of the Royal Statistical Society. She is an elected council member (to 2021) and Vice-President (to 2019) of the International Statistical Institute. She serves on the editorial boards of several journals as well as national and international advisory boards.

Read More

Are We Trustworthy? | Stats + Stories Episode 290 by Stats Stories

Communicating facts about science well is an art, especially if you are trying to reach an audience outside your area of expertise. A statistician in Norway, however, is convinced that how you say something is just as important as what you say when it comes to science communication. That topic is the focus of this episode of Stats+Stories with guest Jo Røislien.

Read More

C.R. Rao: A Statistics Legend by Stats Stories

The International Prize in Statistics is one of the most prestigious prizes in the field. Awarded every two years at the ISI World Statistics Congress, it’s designed to recognize a single statistician or a team of statisticians for a significant body of work. This year’s winner is C.R. Rao, professor emeritus at Pennsylvania State University and research professor at the University at Buffalo. Rao has made, and been honored for, a number of contributions to the statistical world over his more than 75-year career. That’s the focus of this episode of Stats and Stories, with our guests Sreenivas Rao Jammalamadaka and Krishna Kumar.

Read More

Judging Words by the Company They Keep | Stats + Stories Episode 269 by Stats Stories

The close reading of texts is a methodology that's often used in humanities disciplines, as scholars seek to understand what meanings and ideas a text is designed to communicate. While such close readings have historically been done sans technology, the use of computational methods in textual analysis is a growing area of inquiry. It's also the focus of this episode of Stats and Stories with guest Collin Jennings.

Read More

Rewards Points vs. Privacy | Stats + Short Stories Episode 262 by Stats Stories

Everyone can relate to being in a rush and needing to get just one last item from the store. However, upon reaching the checkout line and hearing the all-too-common refrain of “Can I get your loyalty card or phone number?”, you may wonder why this information is so important to a store. The annoyance and potential ramifications of giving up your data so freely are the focus of this episode of Stats+Stories with guest Claire McKay Bowen.

Read More

Talking to a Statistical Knight | Stats + Short Stories Episode 259 by Stats Stories

Sir Bernard Silverman is an eminent British statistician whose career has spanned academia, central government, and public office. He was President of the Royal Statistical Society in 2010 before stepping down to become Chief Scientific Adviser to the Home Office, a role he held until 2017. Since 2018, Sir Bernard has been a part-time Professor of Modern Slavery Statistics at the University of Nottingham and also holds a portfolio of roles in government, as chair of the Geospatial Commission, the Technology Advisory Panel to the Investigatory Powers Commissioner, and the Methodological Assurance Panel for the Census. He was awarded a knighthood in 2018 for public service and services to science.

Episode Description

Sir Bernard Silverman is an eminent British Statistician whose career has spanned academia, central government, and public office. He will discuss his wide-ranging career in statistics with Professor Denise Lievesley, herself a distinguished British social statistician.

+Full Transcript

Coming Soon



A Shared Passion for Math and Statistics | Stats + Short Stories Episode 257 by Stats Stories

At Stats and Stories, we love to have statisticians and journalists tell stories about their careers and give advice to inspire younger professionals and the next generation about what they can do with the power of data. However, we have yet to have a couple join us to talk about their careers and how statistics in Brazil has progressed over the past 30 years. That's the focus of this episode of Stats and Stories with guests Pedro and Denise Silva.

Read More