Karen Kafadar, Rudy Professor of Statistics and Physics at Indiana University, joined the Stats+Stories regulars to discuss forensic science, the evidence used to help solve crimes. She was a member of a National Research Council (NRC) committee that produced the report Strengthening Forensic Science in the United States: A Path Forward, and she has written on Statistical Issues in Assessing Forensic Evidence.
Full Transcript
Bob Long: More than likely, you’ve found yourself glued to the television set on a Tuesday night watching Abby do all kinds of forensic evidence tests to help find the bad guys on the CBS show NCIS, or maybe you’ve seen the forensic analysts on CSI: Miami. Watching these shows may make you feel like, “Wow, I’m a forensic expert now,” but there’s a lot you don’t know. What does science tell us about, for example, the validity of fingerprints or footprints? Is DNA evidence a sure thing? I’m Bob Long; welcome to Stats and Stories. It’s a program where we look at the statistics behind the stories and the stories behind the statistics. Our focus today will be on forensic evidence. Before we start talking to our special guest, though, about that particular topic, we asked our Stats and Stories reporter Emily Potten to go out and talk to the Ohio Bureau of Criminal Identification and Investigation, BCI. It’s an agency that performs all kinds of lab tests, on everything from ballistics to fingerprints to DNA evidence for police departments throughout the state of Ohio.
Emily Potten: If you’re a fan of NCIS or CSI: Miami, you know the critical role forensic science plays in solving crimes today. People once thought eyewitness testimony was the most important in convicting someone accused of a serious crime, but prosecutors will tell you the most convincing evidence for a jury is scientific evidence: DNA, fingerprints, hair follicles, firearms testing, tire tracks, the list goes on. Local police detectives are schooled in proper techniques for collecting this evidence, but it’s the laboratory scientists at places like the Ohio Bureau of Criminal Identification who analyze it for possible matches. BCI administrator Erin Reed says most people don’t understand how specialized this work is. Reed notes there are numerous labs within BCI and specialists are assigned to only one, such as latent fingerprints.
Erin Reed: They examine things like ridges on feet so they can look at your footprints. Palms, the center of your hand, and of course, fingertips. And the ridges that they look at are developed based on movement in the womb, so no two individuals have identical prints.
Potten: Reed says the use of DNA has improved dramatically in recent years, and so have the precautions that are taken in handling evidence. It’s standard procedure to wear gloves for all types of lab work, but Reed says precautions go beyond that in handling things like hair follicles.
Reed: Our biological sciences people will often times be wearing masks, and they don’t speak to each other when they have evidence out. The tests that they can do now are growing to be more sophisticated and more sensitive, meaning they can find DNA on more evidence than they could previously.
Potten: Enhanced chemistry has allowed BCI to re-open some cold cases and find DNA that wasn’t found before, and it also has helped people who are wrongly convicted. Reed feels shows like NCIS and CSI may help people better understand the importance of DNA, but she says TV makes it look easier than it is.
Reed: Those shows simplify some very complicated matters. That can sometimes result in some unreasonable expectations. You don’t always find DNA evidence at a crime scene, and that’s not necessarily because you don’t look hard enough, it just may not be there.
Potten: Erin Reed says the Ohio Bureau of Criminal Investigation has seen a tremendous increase in demand for its services in the past few years. She says BCI has been able to expedite the results of its laboratory work thanks to increased staff and the use of robots. For Stats and Stories, I’m Emily Potten.
Long: And joining us on Stats and Stories today for our discussion of forensic evidence, the man who comes up with a lot of the topics for this show, that’s Miami University Statistics Department Chair John Bailer, and our very special guest today is Indiana University Statistics Professor Dr. Karen Kafadar, and she’s done a lot of research on forensic science. Karen, welcome to our show today.
Karen Kafadar: Thank you for having me.
Long: You know, I always date myself when I do this, but when I was a young journalist, back in the 1970s, covering criminal trials, I can remember things like blood work that was brought up as evidence, and fingerprinting, and things like that, but it just seems to me that this whole field has just grown exponentially in the last 20-30 years; is that correct?
Kafadar: That is correct. It’s not just fingerprint identification, as the BCI investigator was mentioning; it’s now shoeprints and tire tracks and blood spatter and arson and handwriting analysis. One of the more recent areas is computer forensics.
Long: I know, for example, Miami University has a gentleman on the police force here who’s involved in that kind of work, doing computer analysis of your hard drive. Again, that’s another area that’s just probably opened in the last five to ten years.
Kafadar: That is correct, in about the last five or ten years.
Long: Speaking of tire tracks, I remember a famous murder case I sat through where, and I had never seen anything like this, they actually rolled tire prints out on a great, big, long conference table, and it was one of the key pieces of evidence in the case, showing the guy’s tire tracks versus the ones that they had from the lab, just another pair of tire tracks that were similar. It’s really interesting what you find.
Kafadar: It raises a number of issues. For example, if they did that just once in the courtroom, what if they did it two times, or three times, or four times? How many differences would they have actually observed in a controlled situation like that?
John Bailer: That sort of starts the issue of my question for you, how did you get involved in this? I mean it sounds like there is this long history when you go back to Sherlock Holmes talking about forensic ideas and solving mysteries and cases. So how did you get involved in the statistical components? And what are some of the statistical components that you’ve encountered and addressed as part of your work?
Kafadar: That’s a really interesting question. I never really thought much about it. I certainly always loved mystery stories, but about ten years ago I was asked to serve on a committee for the National Research Council that was asked to look at the reliability and validity of the tests that they were using in comparing lead in bullets, so the lead that was found in a bullet at a crime scene versus the lead that was found in the bullets at a potential suspect’s house. So they would find a suspect and they would say, “I think you’re a suspect,” they’d seize the bullets, they’d do chemical analyses on the two bullets, and the question was “Are the statistical tests they are using valid?”
Bailer: So what does validity mean here?
Kafadar: Validity here would mean that if the tests according to their procedures indicated yes, there is a match, was it really a match? Did those two bullets really come from the same source? Did that crime scene bullet really come from the potential suspect? If the report comes back and says no, they don’t match, is that true? Were there really different sources? So the issue of validity is “Were the statistical tests valid for making that kind of an assessment? Were they coming up with the right answers?”
Long: You’re listening to Stats and Stories where we discuss the statistics behind the stories and the stories behind the statistics. And we’re focusing this time, as you can tell, on the whole issue of forensic science, especially in these criminal cases. I’m Bob Long. With me are our regular panelist, Miami University Statistics Department Chair John Bailer, and our special guest, Indiana University Statistics Professor Dr. Karen Kafadar. And we wanted to find out what people on the street know about what we’re discussing today, so one of the questions we wanted to ask them is “How reliable do you think fingerprint matches are?”
Woman on the street 1: Extremely important because fingerprints are so different, everyone is so different so makes it a lot easier to solve crimes.
Woman on the street 2: I think they're important because everybody's is different so it's kind of easier to identify the person if you have their fingerprints.
Woman on the street 3: I think they're really important because then you can find the exact person.
Woman on the street 4: I think it's very important because there's a lot of people who get accused of certain crimes, and because there's no real evidence that's against them or for them, it's just really hard to determine who actually did it.
Woman on the street 5: I've watched a lot of crime shows, and it's different than the real world. But it seems like fingerprints are very helpful and I know there are crime labs. I know New York uses them a lot. Cincinnati, I don't think we have a big problem with crimes, so we don't really have a lab, but it seems fingerprints are very useful in solving crimes.
Long: Well I think Karen, you mentioned your work with the National Research Council, you mentioned one you did about ten years ago dealing with basically ballistics, but you’ve also done some work with fingerprint evidence, have you not?
Kafadar: That’s right. That committee opened up another whole issue, which was: if there are issues with the reliability and validity of the testing of bullet lead, then what about other forensic methods? So about four or five years later, there was another National Research Council committee, which was asked to look at a broad range of forensic evidence: how reliable are these methods, some of which I mentioned earlier, such as shoeprints, handwriting analysis, tire tracks, and hair analysis? DNA was not covered because an awful lot of attention has been devoted to DNA, and the scientific procedures there were fairly well documented. But what about all these other procedures? That was where I became much more interested in all the other kinds of evidence, especially hair analysis and fingerprint analysis, because they are the most common.
Bailer: So you’ve talked about validity and reliability issues, and one thing that’s implicit in a lot of what you said is there are errors that can occur in these decisions. It seems like these types of errors are the same kind that seem to arise when we talk about health screening studies.
Kafadar: That’s exactly correct.
Bailer: So very similar kinds of ideas. Can you talk a little bit about how you might study how these systems work, or how these types of evidence work and how you would evaluate whether one’s better than another?
Kafadar: That’s a great question. And let’s use the health screening modalities as a baseline because there’s an area where, in medicine, they’ve worked very hard to design the studies so that they are above criticism. The National Institutes of Health just recently, through the National Cancer Institute, released the results of a study to see just how sensitive and specific is, say, the test for prostate cancer, the PSA test. And they have a randomized study, they flipped a coin, for someone who entered the study, do they get the PSA test on a regular basis, or do they follow their usual medical care? And at the end of it, they were able to compare the death rates in the two groups, a very well controlled study. Now in contrast, you would take something like the different methods of identification, say by fingerprints. So there’s really not been the same kind of parallel investigations. How often do they claim that somebody had prostate cancer when it turns out the biopsy shows they didn’t versus the other way around. Those kinds of studies haven’t been done with almost all of the other kinds of forensic evidence.
Bailer: So if you were going to design the perfect study to evaluate fingerprint evidence, what would be some of the things you would think about, some of the factors you would consider, and what would it look like?
Kafadar: And that’s a really important issue, to think about, working with the latent print examiner, what are the factors that can influence the decision that’s made? And some of them would be how much of the print is there, how good the quality of the print is, and whether there is even a measure of quality of the print, so that we could put in the study prints that are, say, low quality versus prints that are high quality. Is there an objective measure for that? We’re working on that right now. What are some of the other factors? The level of experience of the examiner, and how the fingerprint was collected. Was it collected on the table, was it collected on wood, was it collected on metal? So you would want to have a study where you would have a number of different factors, and you would especially want to make sure about the examiner who has been given two prints to identify whether or not they actually came from the same source. You would want to make sure that person didn’t have any prior knowledge and the person who gave them the prints didn’t have any prior information. That’s what we call, in medical studies, double-blind. At this point, there have been no solid studies that a statistician would claim as above criticism. If a study can be above criticism, then you’ll feel a lot more confident about it, rather than having to say, it’s a study, but they didn’t do this.
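The study design Kafadar outlines can be sketched as a small factorial layout with blinded trial codes. This is only an illustration under stated assumptions: the factor names, their levels, and the helper `build_blinded_trials` are hypothetical and do not come from any actual fingerprint study.

```python
import itertools
import random

# Hypothetical factors and levels, loosely based on those named in the interview.
FACTORS = {
    "print_quality": ["low", "high"],
    "surface": ["wood", "metal"],
    "examiner_experience": ["junior", "senior"],
    "ground_truth": ["same_source", "different_source"],
}


def build_blinded_trials(factors, seed=0):
    """Cross every factor level, shuffle the cells, and label each with an opaque code.

    Only the study coordinator keeps the returned key. The examiner and the
    person handing over the prints see just the code (double-blind).
    """
    names = list(factors)
    cells = [dict(zip(names, combo)) for combo in itertools.product(*factors.values())]
    rng = random.Random(seed)  # fixed seed so the assignment is reproducible
    rng.shuffle(cells)
    return {f"T{idx:03d}": cell for idx, cell in enumerate(cells, start=1)}


trials = build_blinded_trials(FACTORS)
print(len(trials), "trial cells")
```

A real study would replicate each cell many times and spread cells across many examiners; the sketch only shows how the factors cross and how the codes hide the ground truth.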
Long: I think what you’re saying, and a lot of times when you talk to crime scene investigators, one thing they’ll say is you do get, as you were talking about, partial prints, and obviously that’s going to be a little bit different than if you got my full hand on some object or something like that. But also, the material itself, I can see where it may not show up as well, for example, on wood versus metal, or glass, those kinds of things, there would be differences, wouldn’t there?
Kafadar: Absolutely. To give you an idea how hard it is to really collect reliable fingerprints, while I was on this committee looking at all methods of forensic science, one of the committee members had his house broken into, and the police could not lift even one fingerprint, not even his own. So that’s how hard it is to get really any kind of fingerprint, much less a usable one that could actually, later, be associated with something from a ten-print fingerprint system.
Long: The interesting thing, though too, you mentioned the expertise of the lab specialist. For example, in reading some things about BCI here in Ohio, they have lab analysts who have an average tenure of about twenty years. So when you’re talking about that, you’re talking about people who when they go into a court of law, people are much more likely to believe what they’re going to say based on their experience doing this.
Kafadar: I think that’s very correct, and there’s an advantage of that and a danger to it. The danger is, of course, if you’re sitting on a jury and somebody says I’ve been doing this for twenty years, you just don’t question their expertise. The advantage to that is that those people do have a lot of insight. So how can we use them to say, “What are you really looking at when you’re looking at a fingerprint?” And by that, try to get some ideas of which features should we investigate for how distinctive they are. As she mentioned in that introductory piece, the BCI investigator said, “We look at things like ridges, and furrows, and bifurcations, and so forth.” Well how distinctive are those? Do we have any measures of how distinctive they are? Those are the kinds of questions that need to be answered.
Long: You’re listening to Stats and Stories, and again we’re focusing this time on the importance of forensic science in our criminal investigative system today. I’m Bob Long; with me, our special guest today is Indiana University Statistics Professor Dr. Karen Kafadar, and of course our regular panelist Miami University Statistics Department Chair John Bailer. You know, one thing that I remember years ago when DNA was first being discussed, any intelligent defense lawyer is going to go, “Oh now wait a minute. Can we believe this new type of science?” I can remember some criminal cases where DNA, we’d find it hard to believe today, but DNA was called into question, was this really valid? So talk a little bit about DNA evidence and how it has developed in the last decade or so.
Woman on the street 1: No. I think they make it seem too simple. They make it seem as if the process is really easy and really quick when in reality, it's really long and tedious.
Woman on the street 2: Probably not because people say it's 'cause they're really attractive. But I think it's maybe they don't show any injustice that happens within the criminal justice system or how fast the process actually goes.
Woman on the street 3: Not at all, because it takes a lot longer than a week to figure out crimes.
Woman on the street 4: Way too easy for them. It shows them in the labs and stuff, so I figured that was pretty accurate. But I guess the time spent is inaccurate.
Woman on the street 5: It just seems like that is really not probable, like stuff kind of happens… and the whole… I don't know.
Woman on the street 6: I feel like they play it up a lot more. I feel like it's not all the truth, but I feel like there's some truth to it. I think they're trying to get people interested, and they're just trying to exaggerate just like you can do with other shows like the courtroom. You can make it seem a lot more interesting than it really is.
Kafadar: That’s a great story because in fact, when the National Research Council, I was not involved in this panel, but when the National Research Council was asked to look at this, their first report came out in 1992, and it did not have as much statistics in it as it really should have, and the statistics community poked holes in the report, and it was so embarrassing they had to conduct another report. And they did, and they did it well. That’s a situation where it’s a good contrast between fingerprints. Fingerprint analysis, based on a lot of experience, I will not discount the experience, but they’ll look at the print and they’ll identify features, but you don’t go to a DNA sample, and say, “Well, I’ll look at this feature and this one.” It’s all well specified. Which aspects of DNA are they investigating? Why? Because that NRC report identified thirteen specific features which are very distinctive; they’re very, very distinctive. So for John’s DNA at those thirteen places, to match exactly mine is just really unlikely to happen by chance alone. So there’s been a lot of science behind that.
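A rough sketch of why a match at thirteen distinctive places is so unlikely by chance: if the places are treated as independent, the chance of a coincidental match is the product of the per-place frequencies. The frequencies below are invented for illustration and are not taken from the NRC report; the independence of the thirteen places is also an assumption of this sketch.

```python
import math


def random_match_probability(locus_frequencies):
    """Chance that an unrelated person matches at every locus.

    Assumes the loci are statistically independent, so the per-locus
    frequencies simply multiply.
    """
    return math.prod(locus_frequencies)


# Hypothetical: at each of 13 loci, 1 person in 10 shares the observed profile.
hypothetical_frequencies = [0.1] * 13
rmp = random_match_probability(hypothetical_frequencies)
print(f"random match probability: {rmp:.1e}")
```

Even with a fairly common profile at each individual place, multiplying across thirteen of them drives the coincidental-match probability down very quickly.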
Bailer: So one of the things, just as a follow-up: the idea of a standard protocol for data collection seems like a key distinction between what you would say is really high-quality, validated forensic evidence and evidence that is less validated, less supported. You also mentioned a couple of concepts earlier, the ideas of sensitivity and specificity; can you talk about those? There’s this important distinction between what you know and what you’re trying to predict in that case, versus trying to get at a positive predictive value of some end point, so could you talk just for a second about that?
Kafadar: And those are really key concepts in statistics for assuring validity and reliability. Sensitivity says, suppose I give you two items and they really did come from the same source; I know that because I gave them to you. You’re going to give them to the examiner. The question is, if you did that a whole bunch of times, how likely is it that the examiner comes back and says, yeah, they came from the same source? You want that to be high, right? You really did give them two samples of the same thing. Specificity says, I know I gave you two different sources. I know that because I’m the one that prepared the samples. How likely is it that you came back and said, yeah, they’re different? And you got the right answer. So those are sensitivity and specificity. You want them very high. The question that comes up, as when Bob was in the courtroom and he sees examiners saying it’s a match, well, that’s the reverse. Somebody comes in and says it’s a match or it’s not a match; the question we want to know is how likely it is that they really came from the same source. That’s the real issue, because that’s what happens in real life. You’re presented with the results of a test and you want to know: did you get the right answer? Now those two concepts, what’s the predictive value of that? Are you positive about it or did you discount it? Those are related to sensitivity and specificity. They’re also related to how likely it is that they would match in the first place. So there are those three aspects, but they’re all very important for this final answer: how likely is it that the examiner came up with the right answer?
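The three aspects Kafadar lists (sensitivity, specificity, and how likely a match is in the first place) combine through Bayes' rule to give the positive predictive value. The sketch below uses made-up numbers; the function name and the 95% figures are assumptions for illustration, not measured error rates for any forensic method.

```python
def positive_predictive_value(sensitivity, specificity, prior):
    """P(same source | examiner reports a match), by Bayes' rule.

    sensitivity: P(report match | truly same source)
    specificity: P(report no match | truly different sources)
    prior:       P(same source) before the examiner looks at anything
    """
    true_positive = sensitivity * prior
    false_positive = (1.0 - specificity) * (1.0 - prior)
    return true_positive / (true_positive + false_positive)


# Hypothetical examiner who is right 95% of the time in both directions.
for prior in (0.5, 0.1, 0.01):
    ppv = positive_predictive_value(0.95, 0.95, prior)
    print(f"prior={prior:.2f}  P(same source | reported match)={ppv:.3f}")
```

With the same sensitivity and specificity, the predictive value drops as the prior falls, which is why a reported match cannot be interpreted from the examiner's accuracy alone.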
Long: Are there, from the work that you’ve done, the research you’ve done, the kinds of forensic evidence that you would say today are probably the most reliable?
Kafadar: So there are two issues here. One is the actual science; where has the science actually been validated with reliable studies. The other one is the procedures. You are always going to have quality control issues. You’re always going to have samples being mixed up, so apart from that, and those are quality control procedures, that’s the job of the manager of the lab to make sure that the process is going smoothly to minimize errors in process. But apart from that are the scientific underpinnings. DNA has a scientific foundation; there are still some issues with it, for example, mixtures. If there’s a sample that happens to have two different mixes of DNA, can we adequately resolve those? Do we have the technology to do that? It’s a good statistics problem. Apart from DNA, there have been no other, this is what the report came out with in 2009, and it said apart from nuclear DNA, there is no other forensic evidence which matches the scientific validity of DNA.
Bailer: So now we get back to these shows that were alluded to, and for people listening, the CSI: Miami is not CSI: Miami University, just in case there’s any confusion. What do you think about the impression that these types of programs give in terms of this type of evidence, in terms of forensic evidence, and just the power and the broad applicability and strength of it in identification?
Kafadar: I think they’re great shows. They probably don’t match reality; they solve the whole crime within an hour.
Bailer: They always seem to get matches too, on the fingerprints and the forensic evidence is highly reliable.
Kafadar: There is a great deal more uncertainty in the results than is being conveyed, and conveying uncertainty doesn’t sell shows.
Long: I think one thing that you also can have happen, which I’ve seen happen in courtrooms too, is, I’m thinking of one case where the palm print turned out to be the decisive piece of evidence, and the reason it was a decisive piece of evidence, a guy had broken into the home of a couple that he knew, thinking they were away at church on a Sunday night, but the wife was home sick. So he broke the window on the back door of the house and he reached in to unlock the door, and of course she was there in the kitchen and he ended up killing her because she would have been able to identify him. But the only way they really nailed him in this particular case: this BCI investigator testified that he broke the glass, and somehow a piece of the glass that would have been down inside the wood was pulled up; his palm print was on that. The only person that could potentially have put it there, according to that agent, was the guy who shattered the window, which was really an interesting thing. And when I talked to jurors afterwards, that’s what led them to this. And it was interesting though, the point I wanted to make was that there was another forensics expert, called by the defense, who tried to dispute the BCI claim. So do you run into some issues like that too, where you might have one scientist look at the evidence and say one thing and somebody else say something else?
Kafadar: Yes, that’s exactly the kind of thing that can happen because of the issue of not knowing which features of the palm print one would look at. So it’s not like DNA where they already have identified thirteen highly sensitive, highly, highly specific features. Forensic expert one might say, well I notice these four are the same, and another forensic expert may say well I notice that these four are different. So which features are the ones that you should be focusing on?
Long: You’re listening to Stats and Stories, again we’re focusing today on the importance of forensic science, and our special guest is Karen Kafadar from Indiana University Department of Statistics, and my cohort here, Miami University Statistics Department Chair John Bailer. We’re running out of time today, getting close, so John I wanted to turn it over to you for any kind of final questions you might have today for Karen.
Bailer: Well Karen, this has been great. It has been a delightful visit and conversation and I appreciated the distinction between what you’re conditioning on when you’re thinking about sensitivity versus predictive value. You made a statement just about the shows, conveying uncertainty doesn’t sell shows, but it seems that conveying uncertainty reflects reality, so one of the challenges that we might say is how might we facilitate the conveying of uncertainty in the reporting in this type of material in terms of the production and promotion of these types of ideas.
Kafadar: Uncertainty actually pervades everything, and we see it a little bit, for example, you see it now in polls. They will normally say the margin of error in this poll is three percent. So there’s a scientific underpinning in that three percent. The fact is that they are communicating some uncertainty. So how do you convey enough of an uncertainty and yet show that, despite this uncertainty, we still are able to make an association here, or a lack of association? And the only way to really do that is to encourage more studies, more science, and more research in these methods; so ultimately, you will narrow the uncertainty so that you can make these kinds of assessments. People always like yes or no answers, they don’t want, well yes, but maybe it’s not yes, and I think that everyone would feel a lot better if they knew that well, we believe that this is the answer, but there is a small chance that we’re wrong.
Bailer: So part of our role is to make sure that people hear that. And also maybe to make them expect that.
Kafadar: That is correct. It should be something where you come to expect it now in a poll. You come to expect to see what was the margin of error? Was it 3%? Was it 1%? Was it 5%?
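For readers curious where a poll's margin of error comes from, here is a minimal sketch using the usual normal approximation for a proportion. It assumes a simple random sample and a 95% confidence level (z = 1.96); the sample sizes are hypothetical, and real polls often adjust for weighting and design effects.

```python
import math


def margin_of_error(sample_size, proportion=0.5, z=1.96):
    """Approximate margin of error for a poll proportion.

    Uses the normal approximation z * sqrt(p * (1 - p) / n); p = 0.5 gives
    the widest (most conservative) margin.
    """
    return z * math.sqrt(proportion * (1.0 - proportion) / sample_size)


# Hypothetical sample sizes, to show how the margin narrows as n grows.
for n in (400, 1067, 10000):
    print(f"n={n:>5}  margin of error = {margin_of_error(n):.1%}")
```

Under these assumptions, a margin near three percent corresponds to roughly a thousand respondents, and cutting the margin to about one percent takes on the order of ten thousand.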
Long: And I know from talking to BCI analysts through the years, they spend a lot of time going to conferences learning different techniques, and I think it’s like what we’ve learned today. This whole thing is still evolving and always probably will be. We’re always going to be looking for more scientific ways to be sure that what we’re presenting as evidence is going to be valid in a courtroom.
Kafadar: This is a great role for statisticians because one of the important things we want to be able to do with statisticians is encourage the design of studies which evaluate which method is more reliable.
Bailer: And even having statisticians working with journalists has proven to be an okay thing.
Kafadar: Absolutely, but I think that’s a really good point. It’s a great time to be a statistician.
Long: Karen Kafadar, as always we want to thank you for sharing your thoughts today about forensic science; a very interesting part of our show today on Stats and Stories.
Kafadar: Thank you for having me.
Bailer: Thanks for coming.
Long: If you’d like to share your thoughts on our program, well you can do that too. You can do that by sending us an email to firstname.lastname@example.org. Be sure to listen for future editions of Stats and Stories, where we’ll always talk about the statistics behind the stories and the stories behind the statistics.