How the Bureau of Labor Statistics Gets its Data | Stats + Stories Episode 113 / by Stats Stories

Wendy Martinez has been serving as the Director of the Mathematical Statistics Research Center at the Bureau of Labor Statistics (BLS) for six years. Prior to this, she served in several research positions throughout the Department of Defense. She held the position of Science and Technology Program Officer at the Office of Naval Research, where she established a research portfolio comprised of academia and industry performers developing data science products for the future Navy and Marine Corps. She was honored by the American Statistical Association when she received the ASA Founders Award at the JSM 2017 conference. Wendy is also proud and grateful to have been elected as the 2020 ASA President.

Full Transcript

Rosemary Pennington: Understanding the realities of the workforce is important for workers, employers, scholars, and pretty much everyone else living in a particular economy. The United States Department of Labor monitors workforce issues at the national level, and the data managed and analyzed by its Bureau of Labor Statistics helps it do so. That’s the focus of this episode of Stats and Stories where we explore the statistics behind the stories and the stories behind the statistics. I’m Rosemary Pennington. Stats and Stories is a production of Miami University’s Departments of Statistics, and Media, Journalism and Film, as well as the American Statistical Association. Joining me in the studio are regular panelists John Bailer, Chair of Miami’s Statistics Department, and Richard Campbell, former Chair of Media, Journalism and Film. Our guest today is Wendy Martinez, incoming President of the American Statistical Association. She’s also the Director of the Mathematical Statistics Research Center at the Bureau of Labor Statistics. Wendy, thank you so much for being here today.

Wendy Martinez: Thank you for having me.

Pennington: Could you, just to get us started, talk a bit about the kinds of data your center is in charge of at the bureau?

Martinez: Yes. Actually, let me back up a little bit and talk about this- my office. So, I’m in the Office of Survey Methods Research and we have two sides to our office. One is the Mathematical Statistics Research Center, which is the one I belong to. And then we also have the Behavioral Science Group and they’re more involved with the questionnaire design and that type of thing. So, our office really supports all of the BLS programs.

John Bailer: So, what does it mean to say you support those programs?

Martinez: The Bureau of Labor Statistics has three main program areas, or offices, I should say. One is the Office of Prices and Living Conditions. So that office collects information and data on prices. They have the Consumer Expenditure survey, all of that information and data is used to do the Consumer Price Index, which is hopefully familiar to most of the listeners. The other office is Employee Working Conditions and Benefits, and they do things like collect information on what does it cost to have an employee, all the benefits that employers provide all over the country. Also, what I think is really important for employees is that they also collect information on injuries and illnesses. And the third one is the Employment/Unemployment Statistics, and that office is responsible for producing the monthly unemployment numbers that most everybody is familiar with. That usually comes out the last Friday of the month. And finally, we have another office that’s concerned with productivity. So, our office, these two sides that I mentioned- these two research centers support all of those surveys with doing basic research tasks.

Bailer: So, when I hear things like productivity- that seems like a really nebulous concept. And it seems like productivity may have changed over the years. I mean, if we were thinking about working on an assembly line and productivity in that context, versus thinking about software being developed, or a new app being developed and then distributed, how is productivity defined? And how has that evolved over time?

Martinez: I have to say- well, I’m going to back up a little bit again. Most of the employees at the Bureau of Labor Statistics are really economists. So, some of the economics is a bit of a mystery to me. And probably productivity is the thing that’s most a mystery to me. But yeah, I think- so our Office of Productivity- it’s kind of interesting because they do not collect data themselves. They rely on data collected by the Bureau of Economic Analysis, the Census Bureau, and I think other offices at the BLS. So, they do use macro-economic models, models that would be of interest to statisticians, you know regular regression, and so forth. So, they kind of take a lot of information together and look at productivity in areas like manufacturing or GDP and so forth. And I should say in closing on that question is that our new commissioner, Dr. William Beach- he’s very interested and emphasizes the issue of productivity.

Richard Campbell: So, in my own research- and two of us at the table here are very interested in the job-outlook for reporters, and so I’ll use BLS data, and I know [Pew?] does- how do we know how many reporters there are? That are actually working at daily newspapers, how do we gather that data? How do we know that information?

Martinez: Ok, that’s a good question. There’s a survey in our Employment/Unemployment Statistics Office, and it’s called the Occupational Employment Survey. If you go into BLS.gov you can find the data for these surveys. I really like that survey because the data- you can download it and it’s very easy to understand what’s there and to use the data. But they have statistics on occupations over different industries and throughout the country, so they would have information on journalists and other occupations.

Bailer: What’s the size of the staff of people who are gathering that kind of information?

Martinez: The BLS has about 1,500 employees in the National Office and there’s about another 1,000 around the region. We have different regional offices. The regional offices have the economists and others that actually collect the data. I may be getting [inaudible] a little bit but if it’s household data, the Census Bureau collects that for us. If it’s establishment, we collect that ourselves. And that’s across all the three program areas that I’ve mentioned: prices, working conditions and then employment/unemployment.

Campbell: One more question about this and it has to do with projections. So, I noticed in the BLS projection they say that between 2016 and 2026 there’s going to be a decline of ten percent in the number of journalists in projection, and the number of print journalists, not broadcast journalists. So, where do those projections come from and how do we know that that’s going to be fairly accurate?

Martinez: That is a really good question, I actually love that program.

Bailer: [Laughter] Richard is so happy that he asked a good question. He’s just dancing around the table.

Campbell: I usually don’t- I’m an English major- I usually don’t get complimented for my questions.

Martinez: Well, the reason why I like that is – I’ll kind of cut to the end of that story, from the standpoint that- what you just described is information that’s used for a very popular product that the Bureau of Labor Statistics produces which is called the Occupational Outlook Handbook. And it’s all online now, that handbook is used by students as young as in middle school so that they can decide what careers they might want to go into because it’s got information about the occupations, what type of education is usually expected that a person might have, what is the employment outlook, and so forth, so I think that’s what you were referring to. They have a really- it’s kind of an interesting process that they do, and the whole modeling and the whole process is outlined on the BLS website, but it does include information from the Census, the population numbers. They kind of look at what’s the projected population so they can get the workforce. Then they do models of the GDP, they have macro-economic models that project out to the future- you know, what’s expected in terms of productivity. There’s that word again. I’m trying to remember all the steps- but then they use information from the occupational employment statistics, and then they put it all together and make their projections. But they also use some additional information from, I believe, subject-matter experts to refine the model, so it’s not just purely math or economics or statistics, but they use some information about the culture, what they might be expecting in the future in terms of our cultural and sociological outlook.

Bailer: So, I’m curious Wendy, just in the process of conducting these big surveys, these impactful surveys that BLS does- how long does it take to design, collect, analyze, and report out on the survey? It seems like there’s so much time- and what fraction of the time is spent in each of these components?

Martinez: I think it varies, but that’s a really good question.

Bailer: Oh yeah Richard, you’re not the only one that can ask a good question.

[Laughter]

Martinez: Because most of our surveys have been around for many years, so the designs are pretty well established. It doesn’t mean we don’t go back and redesign when it’s needed, but it does take a while for everybody to collect the data, do editing, review, and those types of things and then do the estimates and the modeling. So, for example, that process that I was trying to describe for the employment projections- that can take a year.

Bailer: Wow.

Martinez: So sometimes there’s a lag. I think that occupational employment statistics- those are published with almost a year-lag there too. I should say all of the information that you’re asking about is on our website so you should refer to that. And I just want to say this too, because we are a government agency- we are transparent, we’re objective about collecting the data, publishing our statistics and we have a handbook of methods, which is on our website, and it describes the methodology from the survey design to collecting the data, making the estimates, estimating variance and so on, so all of that is on the web if people are curious.

Pennington: You’re listening to Stats and Stories and today we are talking with Wendy Martinez, Director of the Mathematical Statistics Research Center at the Bureau of Labor Statistics. Wendy, it sounds like a lot of this work is wrapped up in survey data, and I know one of the things that survey researchers often struggle with is the response rate, and that’s increasingly become a problem given the fact that there are no landlines, basically, right? So how does the work at the BLS navigate this issue of survey non-response? Or is that an issue you face given the kinds of work that you’re doing?

Martinez: That’s another really good question. That is a big issue with surveys now that people, you know, they either don’t answer the phone like you mentioned, or they maybe distrust the data collection, they don’t think it’s going to be put to the proper use. So, it’s something that we worry about because most of the surveys are voluntary. The other side of my office, the behavioral science side, they worry about those kinds of things. So, they do experiments on different modes of data collection, whether it’s web-based or something else. So they try to then- and I also- folks in our field office, when they try to collect the data they may give the respondents information about this is how your data is being used and why it’s important, and they try to contact the sample units several times to get them to respond. And for those who are interested in statistics they also look at the non-response bias because that’s important. Maybe the non-response might not affect the quality of the estimates, so they look at what’s the bias, because people aren’t responding.
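
[Illustrative sketch, not from the episode: one common textbook way to see why non-response may or may not matter is to write the bias of the respondent mean as the non-response rate times the gap between respondents and non-respondents. The function name and the numbers below are hypothetical, and this is not presented as the method BLS uses.]

```python
def nonresponse_bias(resp_mean, nonresp_mean, response_rate):
    """Bias of the respondent-only mean relative to the full-sample mean.

    full-sample mean = r * resp_mean + (1 - r) * nonresp_mean, so
    resp_mean - full-sample mean = (1 - r) * (resp_mean - nonresp_mean).
    """
    return (1 - response_rate) * (resp_mean - nonresp_mean)


# Hypothetical weekly hours worked. Even a low response rate causes no bias
# when respondents and non-respondents look alike on average.
no_gap = nonresponse_bias(resp_mean=40.0, nonresp_mean=40.0, response_rate=0.5)

# A higher response rate can still leave bias when the two groups differ.
with_gap = nonresponse_bias(resp_mean=40.0, nonresp_mean=30.0, response_rate=0.6)

print(no_gap, with_gap)
```

The point of the sketch is that the response rate alone does not determine the damage; what matters is how different the people who did not answer are, which in practice has to be estimated or assumed because their values are not observed.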

Bailer: So, I’m going to ask you a journalism question. These guys are putting the pressure on me. So, what do you think about how the BLS reports and some of the summaries and projections and other information that’s contained in the reports, how it’s covered in the press?

Martinez: Well, actually, this is going to sound terrible, but I try to ignore the press.

[Laughter]

Pennington: You’ve broken our hearts.

Martinez: I mean, that’s not totally true, but yeah, I think there’s always times where stories might attack the numbers or the Bureau, but I believe there’s also many cases where there’s positive stories about the Bureau of Labor Statistics, and we certainly have a long history of producing quality estimates, and we’re objective in what we do- we’re not political. So, I think we’ve stood the test of time in that respect.

Bailer: So, just as a quick follow-up, so some of the numbers that you all are producing are just launched with incredible intensity. Whether it’s the unemployment rates, in particular. And when you see these numbers reported, it’s as if they were carved in stone. You know, these point estimates have so much weight to them. But, you know, there’s some process- some sampling that’s happened. There’s some uncertainty with it. And it’s seldom- I don’t know that I’ve ever seen an interval of plausible values reported when we see here the percentages of unemployment or underemployment or whatever the endpoint is. So, what do you think about that? And telling the story where the number that’s being reported is really just the single best estimate, but really there’s uncertainty associated with it?

Martinez: Well as- now how would I get in trouble with this answer?

[Laughter]

Bailer: You’re not getting in trouble, we’re all friends here Wendy.

Martinez: Okay. But, as a statistician, I like to see the variance, or some error reported along with it, or an interval like you mentioned. I don’t know that all of our surveys report estimates of variance, but it is something that we look at. In fact, that’s something that my group is involved with, is how can we make better estimates of the variance, or even how can we improve our estimates? So, yes, that’s something we certainly look at. I don’t- I believe that type of information might be reported differently, depending on which survey that is. And sometimes depending on the situation they can’t maybe report a statistic because it might inadvertently disclose information, or somebody’s identity, or maybe just the variance is too high.
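
[Illustrative sketch, not from the episode: the "interval of plausible values" John asks about can be shown with a normal-approximation 95% interval for a proportion. It assumes a simple random sample, which is a simplification; real labor force surveys use complex designs with their own variance estimators. The counts below are made up.]

```python
import math


def proportion_interval(successes, n, z=1.96):
    """Point estimate and normal-approximation interval for a proportion.

    Assumes simple random sampling; z=1.96 gives roughly 95% coverage.
    """
    p = successes / n
    standard_error = math.sqrt(p * (1 - p) / n)
    return p, p - z * standard_error, p + z * standard_error


# Hypothetical sample: 370 unemployed out of 10,000 people in the labor force.
rate, low, high = proportion_interval(successes=370, n=10_000)
print(f"estimate {rate:.1%}, plausible range {low:.1%} to {high:.1%}")
```

Reporting the range alongside the headline number makes it clearer that a small month-to-month change can sit inside the sampling uncertainty.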

Campbell: So, one of the things that we are trying to do here is to help reporters do a better job of covering data and numbers. So, if you’re put off by a lot of reports, is there something that you think that journalists can do better and report on their stories they’re missing? Is there an over-emphasis on a number and not on another number?

Martinez: I think it’s important that our data is talked about in the news. Because that’s the way that I think the public have come to understand how important federal data are.

Bailer: I think that’s a challenge that I’m not sure there’s a general appreciation for what statistical agencies do, the kind of information provided. I mean, when you’re talking about the occupational outlook and you’re saying that this is something that kids might be reviewing and considering as they look at demands for future careers, or as journalists are thinking about what’s the future of our profession? Whether it’s broadcast or print, that seems pretty critical. And certainly, the unemployment rates are something that everyone- that the economy seems to respond to, so it’s really important work.

Martinez: Well, you’re right and the CPI, the Consumer Price Index, that drives our cost of living increases and so on. And it’s interesting- so, you asked me about the non-response and reasons why people don’t respond, and that was something I looked at in some of my research while working at the Bureau of Labor Statistics, which is to try to get a sense of why people were not responding by looking at information that was recorded in the notes from the interviewers.

Pennington: Oh, interesting.

Martinez: So that was a lot of fun and what came out of that was some of the reasons were that people were worried about their privacy and kind of anti-government sentiment, or distrust of the government. So, I think people don’t realize, like you said John, how the data are used, the statistics are used, and why it’s important and how it does affect our daily lives, really.

Bailer: So, I’m curious about some of the problems you are working on now. What’s one of the future challenges, or current challenges on some problem you’re considering and that you’re investigating that you and your team might be trying to understand?

Martinez: I’ll give you an example of one that’s not necessarily just in my office, although we’ve been helping with it. Some of our employees are working- researchers are developing methods for- they call it auto-coding, but essentially, it’s supervised learning, getting into the technical side. But essentially, it’s assigning a label to something. And the something here happens to be occupational descriptions of an employee’s occupation. The government has a set of occupational codes, and employers don’t necessarily use the same title or code. So, if we have a description of an employee’s job duties, or their title, then we need to assign a code to that. And before, it was mostly done by hand, but some of our researchers- they’ve been doing an excellent job at developing auto-coders that use past data that was labeled and then use models to then build these coders that will assign occupational codes. So, it’s really been a tremendous time-saver. So, I think that’s the wave of the future to use more of those techniques to save time and be more accurate.
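
[Illustrative sketch, not from the episode: a toy auto-coder that learns from past labeled descriptions and assigns a code to a new one. It uses a small naive Bayes word-count model as a stand-in; the episode does not say which models the BLS researchers use. The training examples and the code labels are invented placeholders, not official occupational codes.]

```python
import math
from collections import Counter, defaultdict

# Hypothetical past data that was labeled by hand: (description, code).
TRAINING = [
    ("writes news stories and interviews sources", "REPORTER"),
    ("reports on local news for daily newspaper", "REPORTER"),
    ("develops software applications and writes code", "DEVELOPER"),
    ("designs and tests mobile app software", "DEVELOPER"),
    ("analyzes survey data and builds statistical models", "STATISTICIAN"),
    ("estimates variance for survey statistics", "STATISTICIAN"),
]


def tokenize(text):
    """Lowercase the text and split it on whitespace."""
    return text.lower().split()


def train(examples):
    """Count words per code and how often each code appears."""
    word_counts = defaultdict(Counter)
    label_counts = Counter()
    vocab = set()
    for text, label in examples:
        label_counts[label] += 1
        for word in tokenize(text):
            word_counts[label][word] += 1
            vocab.add(word)
    return word_counts, label_counts, vocab


def predict(model, text):
    """Return the code with the highest naive Bayes log-probability."""
    word_counts, label_counts, vocab = model
    total = sum(label_counts.values())
    best_label, best_score = None, -math.inf
    for label in label_counts:
        score = math.log(label_counts[label] / total)
        denom = sum(word_counts[label].values()) + len(vocab)
        for word in tokenize(text):
            if word in vocab:  # words never seen in training are skipped
                # Add-one smoothing so unseen word/code pairs are not zero.
                score += math.log((word_counts[label][word] + 1) / denom)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


model = train(TRAINING)
print(predict(model, "writes code for software"))
```

A production auto-coder would need far more labeled history, a much larger code set, and some way to route low-confidence cases back to a human coder.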

Campbell: So, there are three professors here. You’ve been a teacher, you’ve been a professor, how would you- what would you recommend to students if they want to do your job, or get into statistics and labor statistics? What should they be majoring in? What courses should they take? Are there any courses you wished you would have taken at some point and didn’t?

Martinez: I love that question too. [Laughter] I guess I’ll just back up- I always keep backing up here- but I’ll just say a little bit about what I took with my Ph.D. I got it in 1995, and when I look back at the courses I took, I realized that it is pretty much what I believe data scientists would take today. A mixture of statistics, but both the theory and computational aspect of it. So how can we solve problems computationally rather than with the theory, I guess. Databases, parallel programming, using the computer, so having those skills. So, both a statistician/computer programmer. So, I think having those kinds of classes would really help. I always think too, that we should have as many tools in our toolkit as we can, because that gives us the flexibility to make connections and work on different problems. So, focusing on different types of software. So, learning R or Python would be helpful. And I always think it’s good to have good communication skills.

Bailer: Very good, we like to hear that.

Martinez: As you pointed out, you have to be able to communicate our results and what we’re doing to the general public.

Pennington: Well, that’s all the time we have for this episode of Stats and Stories. Wendy, thank you so much for being here.

Martinez: You’re very welcome. Thank you for having me.

Pennington: Stats and Stories is a partnership between Miami University’s Departments of Statistics, and Media, Journalism and Film, and the American Statistical Association. You can follow us on Twitter, Apple Podcasts, or other places you can find podcasts. If you’d like to share your thoughts on the program send your email to statsandstories@miamioh.edu or check us out at statsandstories.net, and be sure to listen for future editions of Stats and Stories, where we discuss the statistics behind the stories and the stories behind the statistics.