Linda J. Young is Chief Mathematical Statistician and Director of Research and Development of USDA's National Agricultural Statistics Service. She oversees efforts to continually improve the methodology underpinning the Agency's collection and dissemination of data on every facet of U.S. agriculture. She works on the surveys designed to characterize agricultural activity in the U.S.
Rosemary Pennington: When most Americans think of agriculture, they might imagine fields of corn or wheat or herds of cattle or broods of chickens. Perhaps they even stop to consider the regulations that determine what foods are safe to eat. What they might not spend much time thinking about are the agencies that oversee farming and food in the United States. The work of one such agency, the United States Department of Agriculture, or USDA, is the focus of this episode of Stats & Stories where we explore the statistics behind the stories and the stories behind the statistics.
I'm Rosemary Pennington. Stats & Stories is a production of Miami University's Departments of Statistics and Media, Journalism & Film and the American Statistical Association. Joining me in the studio is regular panelist John Bailer, Chair of Miami's Statistics Department. Dr. Richard Campbell, Chair of Media, Journalism & Film is out of town today. Our guest is Linda Young. Young is Chief Mathematical Statistician and Director of Research & Development for USDA's National Agricultural Statistics Service or NASS.
Thank you so much for being here today, Linda.
Linda Young : Thank you for having me.
Pennington : To help our listeners understand the work of your particular arm of USDA, could you explain the kinds of information NASS is charged with gathering?
Young: We have two primary responsibilities. The first is the Census of Agriculture which is conducted every 5 years in years ending in 2 and 7. We are preparing right now to send out about 3 million questionnaires at the end of this year. This census gives us information about the types of farms that are in the U.S., the types of people who are running them, their characteristics, the demographics, and it provides a foundation for policy makers. The Farm Bill is based on a lot of information that's produced from the census.
The other major component that we have is the Estimates Program. We produce estimates of the number of acres and the production and the yield for all types of crops. We conduct over a hundred surveys a year, produce more than 400 reports on all aspects of American agriculture. One thing that people may be aware of are the commodity markets. The numbers that we produce are very influential in those markets.
John Bailer: Wow. NASS is something that I think has probably influenced my life a lot in ways that I just can't appreciate.
Young : I did not realize just how much NASS did until I joined.
Bailer: I'm just impressed because I didn't realize that there was a census of agricultural production in the U.S. every five years. Can you say a little bit more about how you even find the 3 million farms? I assume when you said 3 million questionnaires, you have a sense that those 3 million are people or entities that are working and producing.
Young : That's our big challenge - just trying to find farms. Because a farm, by definition, is any operation that has the potential to produce a thousand dollars or more in sales in any given year.
Pennington: Oh wow.
Bailer : Oh, my!
Young: Because when you think about it, the person doing the big backyard garden is a farmer. So we keep this list as complete as possible of all farms or potential farms in the U.S. Some of the things, some of the people on the list aren't really farmers. But we send out the information until we can either confirm it one way or the other. In 2012, we found 2.1 million farms. Now that included adjustments that we had to make for undercoverage of our list, nonresponse, as well as some misclassification of farms.
Bailer : So I'm not surprised. That was going to be one of my next questions, the issue of what kind of response rates do you get when you send this out? Do you send this out in waves or are you sending out the multiple million by mail and then following up with calls or following up with other ways to try to promote this?
Young: We have quite a campaign going on. There are all kinds of producer groups that help us get the word out. It's on our website. But what we do is we start with a wave of letters and we're opening up a website. The big push this time is a new census web form that is more responsive and should be easier for people to use. And so we send out waves of letters saying, "Here's how you can get on the web and record your information." Then, we follow up with the paper questionnaires and those too come out in waves. Then we just get ready to collect the data and then analyze the data.
Pennington : Now, Linda, there are a lot of agencies that are feeling kind of a push to do more to sort of get the word out about what it is they do and also sort of help people understand how they can access their data set. Are there things that NASS is doing to sort of raise the visibility of the work the agency does and to sort of help make sure reporters or researchers know what data is available to them?
Young : Historically, NASS has made a big effort to keep in contact with its data users. There's a data users meeting every year in October that people are welcome to come to and get the information. We have an advisory committee with people from the farming industry that meets once a year and provides suggestions for improvement. We are in constant contact with the different commodity groups who are interested in what we're doing and how they can better help us get the statistics out. And so it's quite a diverse approach that NASS uses.
Bailer : It seems like you would have to.
Pennington : So, I didn't realize that the census of farms was happening every five years in these odd number years. Well, I guess 2 is not odd. It's an even number. But I think that, I didn't realize that these censuses were happening quite so frequently. Why do you think that - I'm assuming I'm not the only American who doesn't realize this - why do you think that Americans might not be so aware of the work your agency does in counting the farms?
Young : Well, as people have been increasingly moving to the cities, I think they have become a little more disconnected and actually a small percentage of the population is actually involved in agriculture now. Although I see signs that that is changing. Urban agriculture is getting more and more attention, as is organic farming and local foods. So I think over time that could change.
Bailer : So are you estimating characteristics such as the growth of urban agriculture, the local farm, the foods movement - is that something that you can capture in the data that you guys collect?
Young : That's one of the things that we are working on right now. We have been putting out an organics survey and report for a few years now. We just - I think it was 2015 - did a survey, which meant it was published last year, on local foods marketing. We haven't done anything on urban agriculture except for a pilot study. There are some real challenges with that because our list frame, it's extremely hard to find the small farms that are as widely dispersed as are those that tend to occur in urban areas. So there are some real challenges there and it is an active area of research for me right now.
Bailer : I'm just really intrigued at the idea of trying to figure out how many people have the potential to have a thousand dollars of product produced on land that they own, and possibly work.
Young: That is one of our real challenges with some of our estimates, especially when it comes to the census and why misclassification is an issue for us. In fact, we revised our analysis in 2012 to try to better capture undercoverage, nonresponse and misclassification.
Pennington : You're listening to Stats & Stories where we discuss the statistics behind the stories and the stories behind the statistics. The topic today is Agricultural statistics. I'm Rosemary Pennington. Joining me is panelist, Miami University Statistics Department Chair, John Bailer. Our special guest is Linda Young, Chief Mathematical Statistician and Director of Research & Development of the USDA's National Agricultural Statistics Service or NASS.
Linda, of course, I am sort of the journalistic end of this since Richard is not here. I'm a former journalist and did some work covering issues around farming when I worked in Alabama for a while. And I wonder, from your perspective, what do you think are some of the underexplored or underreported stories that are hiding in the data that your agency gathers?
Young : There are a number of issues that are facing agriculture today that one could explore with the numbers and that people are very interested in. One is that the average age of farmers has been increasing. That's a major concern because who is going to produce food in the future if people age out of the system? And yet, after the last census, we were questioned about the way we were capturing the demographic information and had a couple of panels and revised that section and I do think that as a consequence, we are going to better reflect the participation of women and beginning farmers.
Pennington : Oh, that's interesting.
Young : I don't think that, not necessarily the age of the oldest farmer is going to change, but I do think we are going to be able to better reflect the evolving nature of farming, which is a major concern.
Bailer : You know one of the things that, in talking with people who I've done work with in occupational safety & health is that farming can be a pretty dangerous occupation, in terms of working alone and working with heavy machinery. The rates of injury can be relatively high in such work forces. Do you track any of that as part of it? I know that CDC/NIOSH does some of that, but I was wondering if you guys were involved in that as well.
Young : I don't think so, John, at least I've never worked on any surveys related to that.
Bailer : Okay. Not completely surprising. Now how long have you been at NASS, Linda?
Young : Four years.
Bailer : Four years. What was the biggest surprise that you learned when you joined that group?
Young : Just that, whenever you change jobs, it's a new culture and it's just the culture, it's just a little different. Not good, not bad, just different.
Bailer : Okay. How about in terms of the work at NASS? The thing you didn't realize before you started there that kind of popped up at you, kind of "Wow, I didn't realize NASS did that!"
Young : Actually, I had worked on a contract with them for about four years before joining. So there weren't a lot of surprises in that way for me.
Pennington : One of the things that I found surprising when I was on NASS's website was that there was a section about, information about hurricanes and sort of the impact of hurricanes. I didn't dig around too terribly much, but are there things that you are gathering at NASS or are keeping track of that you think people might find surprising, that maybe you don't find surprising, because you've been there for such a long time, but maybe data that people might find surprising or compelling that might be useful to them?
Young: Well, one of the things that we've just done recently, that we're proud to be able to help with, is provide information on the impact of Harvey and Irma on the crops. What we've done is provide information about the number of acres, or at least the percentage of acres, for different crops that have been inundated with water as a consequence of those two hurricanes, which is useful to a number of people. We were able to do that using some radar imagery and satellite imagery, and so yes, a couple of people in my division worked really hard over some weekends to try to get that out in a timely fashion.
Bailer : That's very neat. So what are the types of statistical methods that someone would need to know in order to work for NASS?
Young: Well, we do a lot of surveys. Sample surveys are always really nice. We still do use the probability-based survey. However, more and more we are looking at modeling and in 2012, we actually put in place a new set of methods for the census of agriculture based on capture/recapture. That had a lot of modeling involved with it. And then we are increasingly developing models that will combine information, not just from our survey data but also from some administrative data that we have, so that we can put all the information together to get a really good estimate with a valid measure of uncertainty.
Bailer : So what about the - I got to know about the capture/recapture.
Pennington: That's what I was going to ask.
Bailer: Is this in terms of the people that are working in the business? Or what was capture/recapture used to do?
Young: Well, as hard as we work on our list frame, and although there are over 3 million people listed on that list frame, we miss some people. So we also have an area frame that covers all the U.S. And prior to 2012, we assumed that because of that, if we missed them on the list frame, we could use the area frame to figure out how many had been missed. And the problem was that we had misclassification because those little farmers go in and out of business, sometimes a little over a thousand dollars, sometimes a little under a thousand. And so we developed - that's what got me excited and actually led to my joining NASS. We developed an approach where we could quantify undercoverage, nonresponse and misclassification using capture/recapture between the list frame and the sample that we take every June off the NASS area frame.
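The capture/recapture idea Young describes can be illustrated with the classical two-sample (Lincoln-Petersen) estimator. This is only a minimal sketch, not NASS's actual method, which by her account involves considerable modeling and also accounts for nonresponse and misclassification. The function names and all counts below are hypothetical.

```python
def lincoln_petersen(n_list, n_area, n_both):
    """Classical two-sample capture/recapture estimate of a population size.

    n_list: farms found on the list frame (first "capture")
    n_area: farms found in the area-frame sample (second "capture")
    n_both: farms found in both sources (the "recaptures")
    """
    if n_both <= 0:
        raise ValueError("need at least one farm matched in both sources")
    return n_list * n_area / n_both


def chapman(n_list, n_area, n_both):
    """Chapman's bias-corrected variant, which stays defined when n_both is 0."""
    return (n_list + 1) * (n_area + 1) / (n_both + 1) - 1


# Hypothetical counts for one small region (not NASS data).
n_list, n_area, n_both = 900, 200, 150

estimate = lincoln_petersen(n_list, n_area, n_both)
# Share of area-sample farms that were also on the list: an estimate of list coverage.
coverage = n_both / n_area

print(f"estimated farms: {estimate:.0f}")
print(f"estimated list coverage: {coverage:.0%}")
print(f"Chapman estimate: {chapman(n_list, n_area, n_both):.1f}")
```

The intuition is that the fraction of area-sample farms that also appear on the list estimates how complete the list is, and dividing the list count by that fraction scales it up to the whole population.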
Bailer : Very cool. You know some people listening to this might not know some of those terms, those technical terms. Could you define just what a list frame is and what an area frame is?
Young : So a list frame is as you would think, it's a list. In our case it's a list of all the farms in the U.S. that we know of, farms or potential farms. The problem is, if you think about it, how can you keep up? There are all these small farms, especially the urban farms coming into business, going out of business, so although we devote a lot of resources to this, there is some incompleteness in the list and we have things on the list that aren't really farms.
The area frame is thinking about just looking at all of the U.S. and that's the area that we're covering. We don't cover Alaska but we cover Hawaii and the contiguous U.S. and so we divide those into little areas and that makes up our area frame. So by definition, you've covered the whole thing. It's just whether you've identified the farms that are there when you show up.
Pennington: I'm wondering, since you mentioned the use of satellite and radar for the information about the crops impacted by Harvey and Irma, is NASS exploring or using sort of satellite imagery to sort of help supplement these lists that they're creating, to sort of map, so here's where we see there might be agricultural land and then looking at the list, we say, "Oh yes, we have an idea of this is where a farm should be."
Young: We would like to move more in that direction. The challenge that we face is that we have tried to use satellite imagery to actually develop a list and have used web scraping as well to do a third type of list. The challenge is, with the satellite imagery we can afford, you can't see whether or not it's a farm.
Pennington : So what you're saying is that you need more money.
Young: You know, so people have said, "Well, use drones to fly over." But there's a confidentiality issue. Some have crashed on farmers' farms and that would be a public relations nightmare that we haven't been willing to face yet.
Pennington : I don't blame you.
Bailer : Me neither. So why not Alaska?
Young : For the area frame?
Bailer : Yeah.
Young : Because farming is sparse and the resources required to cover Alaska would be immense.
Bailer : I ask in part because of a colleague who has a flower farm in Alaska who works up there because they are able to produce at times that are out of season with many other places in the rest of the world.
Young : Now, we don't do the area frame in Alaska. But we do a list frame.
Bailer : Oh, okay.
Young : So we do report on Alaska, but we just don't use the area frame for that.
Pennington : We've been talking a lot with the people who've been on the program about the visualization of data, and I know that on NASS's website you can access data sets. I was actually playing around earlier today with some of the data about Ohio. I was wondering if there are things that your agency is considering doing to help visualize data to make it more accessible to maybe the public.
Young: We actually have an initiative started to put more visualization up on our website. The literature that we send out, like our fact sheets, tends to have more graphics than it used to. It's a work in progress for us. So yes, that is definitely a push within our agency.
Pennington : You're listening to Stats & Stories. Our discussion today focuses on agricultural data. Our guest is Chief Mathematical Statistician and Director of Research & Development of USDA's National Agricultural Statistics Service, Linda Young.
Linda, I'm wondering obviously as a reporter, a former reporter, what do you find frustrating in the way that agricultural data or information is reported in news stories?
Young : Sometimes, the full story is not given. I think people think it should be easy to get these numbers. We spend a lot of time and effort getting them out the best we can, but overall, I think the agricultural reporting in the U.S. is pretty good.
Pennington: Oh wow.
Bailer: I would think just in your descriptions of the challenges, adjustments, undercoverage and nonresponse and using capture/recapture in comparing different frames that you're adding a lot of nuance to a story. So when you think about reporting out some of this statistical nuance, how difficult is that to try to communicate?
Young : Often in the story, we don't try to overly communicate it. We put out our estimates and the measures of uncertainty. And then in supplementary materials, we write the information about how we come to the estimates. For the census, that's always in Appendix A, and I spent quite a bit of time getting that written in a way that we felt like communicated the information effectively and accurately.
Bailer: Very good. You know, I worry sometimes when I read things that the reports of the uncertainty estimates are ignored. Everyone focuses on the point estimate, even though there might be some bound or some margin of error, and it's not attended to. So I wonder about that communication.
Young: We can communicate it, but it is true that people often need a number. For example, there's the ARC-CO program in which, if the price of a commodity, say corn, drops below a certain level, then farmers get a payment. Well, or a yield, I should say, the yield for their county. And our numbers are used to set the county's estimate for yield and production. Production is just the average yield times the area. So it's how much is produced in the county. If the yield is below our number, they get paid, and if it's above our number, they don't get paid.
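A small sketch can make the arithmetic concrete: production as average yield times area, and a yes/no payment decision driven by a single point estimate. The threshold rule below is a deliberately simplified toy and not the actual ARC-CO formula. The function names, the standard error, and every figure are hypothetical.

```python
def county_production(avg_yield_bu_per_acre, harvested_acres):
    """Production is the average yield times the area."""
    return avg_yield_bu_per_acre * harvested_acres


def payment_triggered(estimated_yield, benchmark_yield):
    """Toy rule: pay only when the point estimate falls below the benchmark."""
    return estimated_yield < benchmark_yield


def interval_contains_benchmark(estimated_yield, std_error, benchmark_yield, z=1.96):
    """True when an approximate 95% interval around the estimate contains the benchmark."""
    low = estimated_yield - z * std_error
    high = estimated_yield + z * std_error
    return low <= benchmark_yield <= high


# Hypothetical county figures (not NASS or ARC-CO values).
est_yield = 171.0   # bushels per acre, point estimate
std_error = 3.0     # assumed standard error of that estimate
benchmark = 170.0   # bushels per acre, payment threshold
acres = 50_000

print("production (bushels):", county_production(est_yield, acres))
print("payment triggered:", payment_triggered(est_yield, benchmark))
print("benchmark inside interval:", interval_contains_benchmark(est_yield, std_error, benchmark))
```

With these made-up numbers the point estimate sits just above the threshold, so no payment is triggered, even though the threshold falls well inside the estimate's uncertainty interval.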
Bailer : Oh, boy.
Young : And uncertainty there is ignored. That's very frustrating because farmers sometimes feel, and accurately so, that that number is not exactly right. We know it's not. We know there's uncertainty associated with it. And if they're not getting paid, they want it lower. All of them would probably like it to be lower because it's money. And yet that's - I don't know what else they would do. They have to set a number and there has to be a point that determines payment.
Pennington: You've clearly made a case about why farmers pay attention or why anyone interested in agriculture should pay attention to what you're doing. But for the general public, why would the work you're doing be important for John or me, who, again, probably our whole lives have been impacted by the work your agency is doing but we have no idea about that impact.
Young : Well, one reason you probably don't have any idea is that you don't have to worry about getting food when you go to the grocery store.
Pennington : That's a good point.
Young : So, our estimates help people understand not only what is being produced, but how much is in storage and what the capacity we have as a nation to feed our people is, which is really important. And that's one reason that urban agriculture is becoming of greater interest, because people would like to be able to afford to buy fresh food close to where they live.
Bailer : So I had no idea that you are also involved in estimating what's in storage. If you're doing something like estimating what's in storage, you probably have to think about how long it can be kept in storage and still be viable as a product. So do you, is that also integrated into the analysis that you do?
Young : No, because the farmers are reporting to us what they have in storage, and I am sure that they are going to get it out before it goes bad.
Bailer : Okay, so when you say it's in storage, it's in storage on the farm.
Young : Or in other locations, yes. But generally the farmers will keep their corn or wheat or whatever they want in storage, either on the farm or in elevators.
Bailer : Oh, okay, okay. That was just my ignorance, sorry.
Pennington : So when it comes to survey data, the concern is always about the issue of self-reporting and how thorough and how accurate the information is that people are reporting. So with something like this where you are thinking about how much food is stored, how much arable land there is in the United States to produce more food, is there ever any concern about that issue of self-reporting and whether it's not as - honest is not the word I'm going to say - but maybe underestimated or overestimated, and are there ways that you, when you're crunching the numbers, will try to sort of control for that?
Young : Well, there are a few surveys in which it seems that there is a little bit of bias. Over time, we've begun to figure that out. For example, one way you can see that is if you have administrative data that is pretty solid. Examples would be slaughter data. So that we have information on how many cattle have been slaughtered over a given time frame and if our estimates of the number of cattle are way below those slaughtered, we know there is a problem. So we can make some adjustments there.
The same with cotton. We work very closely to see how many bales of cotton have been ginned in any given year. So that too is another way we continually look to see how our survey data are performing relative to these accurate administrative benchmarks.
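The benchmarking check Young describes, comparing a survey estimate with a solid administrative total, might look something like the following. This is an assumed, simplified sketch and not NASS's procedure. The 5% tolerance, the function names, and the figures are all hypothetical.

```python
def relative_gap(survey_estimate, admin_total):
    """Signed gap between the survey estimate and the administrative total,
    expressed as a fraction of the administrative total."""
    return (survey_estimate - admin_total) / admin_total


def needs_review(survey_estimate, admin_total, tolerance=0.05):
    """Flag the estimate when it differs from the benchmark by more than the tolerance."""
    return abs(relative_gap(survey_estimate, admin_total)) > tolerance


# Hypothetical figures (not NASS data): bales estimated from a survey
# versus bales recorded as ginned in administrative records.
survey_bales = 9_200_000
ginned_bales = 10_000_000

gap = relative_gap(survey_bales, ginned_bales)
print(f"relative gap: {gap:+.1%}")
print("needs review:", needs_review(survey_bales, ginned_bales))
```

A large negative gap like this one would suggest the survey is underestimating, which is the kind of signal that prompts a closer look at how the estimate is produced.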
Bailer : Oh, that's cool. That seems like a critical thing to do for calibration. That's a neat thing. Among all the studies that are conducted, what's the hardest study to conduct? What's the hardest question to try to answer in the work that you guys do?
Young: That's an interesting question. I immediately think of urban agriculture, because that's dominated by small farms that are very diverse in nature, widely dispersed, and going in and out of business. Some are more transient than the traditional rural agriculture that many people think of. And that type of a population is very challenging if we want to say, "How many urban farms are there in various states?" That's an extremely challenging question to answer.
Pennington : Is there anything that you've seen in the data over time that gives you pause, that makes you stop and think a little bit or that you find surprising?
Young : You know, I can't think of a specific example, but it's not uncommon for me to walk down the hall and have someone say, "I need to talk to you about such and such, because something doesn't look right to me." When people are working in a particular area for a particularly long time, any time they tell me that, I stop and look, because something has happened. It could be in the data collection. It could have been a weather event that disrupted data collection or caused some harm to the crop and something is happening. We just want to be sure that, if that is the case, that we begin to adjust for how we account for that commodity.
Pennington : Well, Linda, thank you so much for being here. I grew up in farm country, and I feel like I learned so much from talking to you that I had no idea about, having grown up amongst corn fields and cattle. So again, thank you for being here.
Young : Thank you so much.
Pennington: That's all the time we have for this episode of Stats & Stories. Stats & Stories is a partnership between Miami University's Departments of Statistics and Media, Journalism & Film and the American Statistical Association. You can follow us on Twitter or iTunes. If you would like to share your thoughts on our program, send your email to email@example.com. And be sure to watch for future editions of Stats & Stories where we discuss the statistics behind the stories and the stories behind the statistics.