Dr. Mike Orkin is a Professor of Statistics Emeritus at California State University, where he was a professor and chair of the statistics department for many years before becoming a consultant and a nationally known authority on probability and gambling games. Since then he has appeared in numerous forms of media, including CBS Evening News, NBC’s Dateline, and a Google Tech Talk series, and has authored several books.
Check out the full article at Significance.
Episode Description
A long pass down the sideline is caught in bounds. Or was it? The referees ruled it a catch, but the opposing team was unconvinced. In the NFL there's a way to challenge a referee's call, and it comes with a potential risk. That is the focus of this week's episode of Stats+Stories.
Full Transcript
John Bailer
A long pass down the sidelines during a football game is caught in bounds. Well, or was it bobbled and incomplete? The referees ruled it a catch, but the coach of the opposing team was not convinced. In the National Football League, there is an option for a coach to challenge a referee's call. This challenge comes at a potential cost. Our guest today has a strategy to help a coach decide about challenging a referee's decision. I'm John Bailer. I'm joined by Rosemary Pennington, chair of the department of media, journalism and film. Stats and Stories is a production of Miami University's departments of statistics and media, journalism and film, as well as the American Statistical Association. Our guest today is Dr. Mike Orkin. Orkin is a professor of statistics emeritus at California State University, where he was a professor and chair of the statistics department for many years before becoming a consultant and a nationally known authority on probability and gambling games. He is the author of a number of books, including the recently published The Story of Chance: What's luck got to do with it? Our focus today will be on his Significance magazine article, To challenge or not to challenge? Well, that is the question. Thank you so much for joining us today, Mike. It's so nice to have you back as a guest.
Mike Orkin
And I'm very happy to be back here as a guest.
John Bailer
So can you start with a description of how the red flag challenge works in the National Football League?
Mike Orkin
Yeah. So for some situations, for some types of plays, the coach of either team has the right to challenge the result of the play as called by the refs if they think the ref is wrong. The example you just gave is the perfect example, because that's one of the situations you can challenge: the receiver's running down the field, the quarterback throws him a pass, he goes out of bounds, and he appears to have caught it. Looking on the TV monitors, it looks like he may have bobbled the ball, and so that would be an incomplete pass. The refs on the field call it complete. So it's up to the coach of the defensive team whether or not to throw the red flag on the field, which is the challenge flag, and to challenge the referee's call and say, Hey, wait a minute, that's an incomplete pass. So if this is a crucial situation, as it was in a couple of the examples I gave in the article, in particular the one you were just referring to, where the receiver on the team that has the ball catches it and seems to bobble it as he's going out of bounds, then the coach on the defensive team can throw the challenge flag. Now, you're only allowed a limited number of these challenges. And if you're wrong, you lose a timeout, and they keep changing the rules on how many challenges you're allowed. So starting in the 2024 season, you're allowed two challenges, but if you get either of those right, then you're allowed a third challenge sometime in the game. So the coach doesn't want to just start tossing the flag indiscriminately, because if he does and is wrong, he's going to lose a timeout, and he's going to lose another challenge opportunity. So the coach has to be somewhat knowledgeable.
You can see the instant replay on the screens in the stadium, and, of course, the coach has a staff up in the booth looking at the TV instant replays. Then you only get a few seconds to decide whether to throw that challenge flag.
Rosemary Pennington
I would say this is a relatively new thing in American football. What sense do we have of how often coaches are doing this, and are they doing it in particular kinds of situations? Are they more likely to do it in the playoffs versus a regular season game?
Mike Orkin
Well, it's been around for a little while. Various types of instant replay have been around for at least 20 years, instant replay in general. So there are certain times in the game, like the last two minutes of the game or of the half, when they will do an instant replay of any key plays, like a touchdown score, or things like that, or a turnover. And to answer your question, roughly 50% of instant replay reviews have been overturned. Now, that's not only coaches' challenges. That's also the general instant replay reviews that the refs have to do at certain times of the game. But that's a pretty high percentage, yeah. And the reason, of course, is that in situations that are difficult, you can't see exactly what happened. Fumbles are another one. And so the instant replay really helps you to see on TV, or on your video screen, exactly what happened.
John Bailer
I was gonna say it sounds like there's, there's sort of different value placed on the importance of timeouts there. Also, you're gonna tell us a little bit about using craps as a model for thinking about this. But there's also another model about kind of what, what weight do you place on the value of, you know, what's the benefit of winning versus the cost of losing a timeout, right?
Mike Orkin
Which is sort of part of my model, yeah. And that's exactly right. And in fact, Shanahan was quoted in my article about that very thing, and he has made that comment many times, a similar comment. So that's exactly right. That's sort of the key to understanding this. You have to weigh the value of a timeout and compare it to the value of a successful challenge. And then, of course, there's also the fact that you only get two challenges in the game, or maybe three, if you're successful on one of the first two. So that's another factor.
Rosemary Pennington
So you mentioned your model, and your model tries to help coaches figure out should they throw the flag or not? It was based on craps. I enjoy blackjack and poker. I do not really understand craps all that well, so I wonder why craps as this model for helping coaches sort this out.
Mike Orkin
Okay, so it isn't really craps, per se. It's the type of game that craps is, namely, you either win or you lose, and you win with a clearly defined probability. You just count combinations on rolling a pair of dice when you make one of many craps bets, and then when you win, you get certain payoff odds. So for instance, in the particular craps bet that I used in the article, if you bet that seven will come up, and you roll a pair of dice and seven comes up, you win, and you get paid four to one, which means for every dollar you bet, you get $4. And the chance of winning is 1/6, because if you roll a pair of dice, there are six ways out of 36 that a seven will come up. So those are the two factors that are involved when you have to decide, if you want to think about it probabilistically, whether it's a good bet, for simple bets like that. Now, you mentioned poker and blackjack, which are more complicated, because in poker, you play against the other players, not against the casino, and so things like bluffing can apply. And in blackjack, you have what are called dependent events, because when you deal cards out of a deck, once a certain card is dealt, the deck composition changes. And so there are strategies, what's called card counting, where you can have good strategies. But the situation that I'm talking about is more like a craps game, because the coach has to make a quick decision, and he has to weigh the winning payoff against the loss. And there has to be some idea of a probability involved. Now, this is not something where you can just count combinations like you can when you roll a pair of dice. So this is what we could call maybe a subjective probability, or the probability of an expert. The coach has to figure out, with his staff, two numbers. I think that most coaches can do that, and probably would like that kind of structure.
But even so, the coach can think of two numbers. One is the probability of a successful challenge, and there's data that I mentioned a few minutes ago to help with that. And then, what are the payoff odds? How much is it worth, or what's the value of a successful challenge to the coach?
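The dice count Orkin describes (six ways out of 36 to roll a seven) can be checked by listing every outcome. This is a minimal Python sketch; the function name `prob_sum` is illustrative and does not come from the article.

```python
from fractions import Fraction
from itertools import product


def prob_sum(target: int) -> Fraction:
    """Probability that two fair six-sided dice sum to `target`."""
    # All 36 equally likely (first die, second die) outcomes.
    rolls = list(product(range(1, 7), repeat=2))
    hits = sum(1 for a, b in rolls if a + b == target)
    return Fraction(hits, len(rolls))


print(prob_sum(7))  # 1/6
```

Exact fractions are used instead of floats so the result reads the same way it is spoken in the interview.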
John Bailer
So, you know, continuing that idea of sort of this payoff and the value, you talk about expected value as part of the story here, when you're setting up this model for the coaches. And, you know, the expected value being negative is one of the reasons why I don't frequent most, you know, casinos. Gaming doesn't attract me. So can you talk a little about what expected value means in the context of the craps games that are being played, and then how you use that as a place to step off towards thinking about the decision about the red flag?
Mike Orkin
Right. So when you're playing craps, or playing any casino game, there's this notion of expected value, which is really just how much you win or lose on the average in repeated play, where negative winnings are a loss. So take craps, for example. If you do this over and over, where the games, your bets, are independent of each other, then the law of averages will give us a nice, cozy end result if you do it a lot of times. Now, it's much harder in blackjack and a little different in poker, but in craps, if you bet on seven, the example I used, and you win, you win $4 with probability 1/6. Put another way, in the long run, if you bet $1, you'll win $4 about 1/6 of the time, and you'll lose your dollar the other 5/6 of the time, on the average. So if you win $4 1/6 of the time and lose $1 5/6 of the time, then out of every six plays, you lose $1 on average. That means your expected value, or your average per dollar bet, is minus 1/6, or about minus 17 cents. Now that, as John pointed out, is not a particularly wholesome environment. But some people, they think they're going to be lucky, or they'll bet, thinking they have some strategy, which they don't. And craps, it's a popular game. It's a group activity, lots of people standing around cheering, whatever. But it's a long run losing proposition. There's no way you can win a game of chance like that in the long run. And the way to measure that is with expected value.
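The arithmetic in this answer (win $4 with probability 1/6, lose $1 with probability 5/6) can be written out as a short calculation. The sketch below is in Python; `expected_value` and its parameter names are illustrative, not taken from the article.

```python
from fractions import Fraction


def expected_value(p_win: Fraction, payoff: Fraction, stake: Fraction = Fraction(1)) -> Fraction:
    """Average net result per play: win `payoff` with probability `p_win`,
    otherwise lose `stake`."""
    return p_win * payoff - (1 - p_win) * stake


# Bet $1 on seven: paid 4 to 1, chance of winning 1/6.
ev = expected_value(Fraction(1, 6), Fraction(4))
print(ev, round(float(ev), 2))  # -1/6 -0.17
```

The negative sign is the point: on average the bettor gives up about 17 cents per dollar wagered.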
John Bailer
You're listening to Stats and Stories. Our guest today is Mike Orkin, who's now going to tell us about how this expected value from craps translates into thinking about the red flag decision for a coach.
Mike Orkin
The expected value for just betting on seven in craps is not good. It's negative. So I don't tell the coach to do something like that, because that's a losing strategy on the average. So what I thought about was, how can you make craps a good bet? People sometimes cheat, using loaded dice to come into a casino and try to win at craps, because the probability of whatever they're betting on, let's say the probability of a seven, gets higher because they have trick dice. Now that's not a very good idea, because if they get caught, they'll be in serious trouble, but you can think about it mathematically. What probability for these loaded dice do you need to make the game have a positive expected value? And it turns out that's pretty easy. Namely, instead of the probability being 1/6, it has to be 1/5 for you to break even on that bet in the long run. So the way to think of it is four times 1/5 minus one times 4/5. You can see that that equals zero. That's your expected value for a bet on seven using these loaded dice, where the probability of getting a seven is higher than it is ordinarily. But then that got me to thinking that it's very easy to balance off the payoffs and the probabilities, and if you put that together, you can see very quickly when it's a good bet. So when you're going to throw the challenge flag, the coach has a few seconds, and all he or someone on the staff needs to do is give them two numbers, and those two numbers can be quickly weighed. In fact, I have a little table in my article to see if it's a good bet, namely, if it has positive expected value. This is not exactly like a positive expected value in craps, of course, because we're not in a casino. You can't count combinations. These are just subjective or expert opinions of the coach and the staff.
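The break-even idea and the two-number check can be sketched in Python. This is only an illustration under stated assumptions: the cost of a failed challenge is treated as 1 unit (like the $1 stake), the value of a successful challenge is expressed relative to that cost, and the function names are hypothetical. The table in the article may be organized differently.

```python
from fractions import Fraction


def break_even_probability(payoff, stake=1) -> Fraction:
    """Win probability p at which p * payoff - (1 - p) * stake equals zero."""
    return Fraction(stake) / (Fraction(payoff) + Fraction(stake))


def should_challenge(p_success: float, value: float, cost: float = 1.0) -> bool:
    """True when the challenge has positive expected value.

    p_success: the staff's subjective probability that the call is overturned.
    value: worth of a successful challenge, in units of `cost` (assumption).
    cost: worth of what is lost on a failed challenge, e.g. a timeout (assumption).
    """
    return p_success * value - (1 - p_success) * cost > 0


print(break_even_probability(4))  # 1/5, the loaded-dice threshold for a 4-to-1 payoff
print(should_challenge(0.5, 2))   # True
print(should_challenge(0.1, 4))   # False
```

The 0.5 in the second call echoes the roughly 50% overturn rate mentioned earlier in the conversation; the values 2 and 4 are made-up inputs for illustration.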
Rosemary Pennington
Have you shopped this around to the NFL?
Mike Orkin
No, I haven't yet. I'm planning to send it around a little bit. That's a good question. One of the reasons that I thought of shopping it around is that some coaches for teams that I root for, like the 49ers, living out in the San Francisco area, make foolish decisions, namely, really conservative decisions on when to throw the challenge flag. And I think someone like Kyle Shanahan, the coach of the Niners, who has a very smart staff, could very easily use a strategy like this. Just think of two numbers. In fact, one of those numbers is backed by data, namely, what's the chance, in general, of an instant replay causing a play to be overturned? So they could just think, well, let's see. Like the example I give in the article, the Eagles. This is the NFC Championship game in 2023. The Eagles have fourth down, and they're in 49ers territory. Jalen Hurts, the quarterback on the Eagles, throws a pass, and if that pass is not complete, it'll now be the Niners' ball, and they'll get out of a serious problem, namely, their opponent is getting ready to score. And so this is high value. The coach can tell immediately that it's high value here: when it's fourth down and you get the call overturned, it's the other team's ball. So anyway, it's very clear that in this particular case, there was a high value that a coach would assign to this play getting overturned, which would overwhelm the possible loss of a timeout. And then there's the probability of success. But the TV replay, even, as even Shanahan, clearly shows that the receiver is bobbling the ball.
John Bailer
You know, it's interesting. I was thinking about this value discussion, and I looked at your table. You had it as value versus probability of a win, and, like you said, there is data on how likely these types of calls are to be overturned. But boy, the context in which these calls would be made, you know. Well, if I don't have any timeouts left, I can't even throw the flag, right? But if I'm sitting with three timeouts, it's going to be pretty easy to think about throwing it. I might be willing to say, Okay, I'm sitting on two if I'm wrong here. So the context is in the time of the game and where it occurs. You know, this incomplete pass call that was being made at the 50 yard line, where I'm gonna have to punt otherwise, that might be different than if you're about to score. So there are sort of all these flavors of when things happen. So I imagine, you know, trying to think about doing this very quickly. You say you only have seconds to make this decision, because they're going to hustle back to the line and try to get the next play off, you know, before, right?
Mike Orkin
Yeah, well, yeah, so that's right. So it's not easy, but these coaches in the NFL have to make these kinds of quick decisions all the time, split-second decisions, and they have a staff backing them up. There's a team of people sitting up high in the stands with TV monitors in front of them so they can see instantly what happened. So yes, it's very difficult, but to be an NFL coach, you have the support and the knowledge to be able to do that sort of thing. And I think most coaches, once they sort of got fluent with it, wouldn't have much trouble estimating a probability of getting a call overturned, and also assigning a value, which is how much it's worth to the coach, where the loss part of that comparison is all the things you just mentioned, John.
Rosemary Pennington
Yeah, as we've been talking, I've been thinking about all these other moments during a football game when a coach is gonna have to make a tough decision. Do we go for it on fourth down? Detroit yesterday just pulled off this really amazing trick play, right, and got a touchdown. And it seems like there are a lot of coaches who are doing increasingly riskier things on the field compared to even just 10 years ago. And I wonder just sort of how something like this might work out for someone, or whether there's another kind of formula to help coaches sort of navigate the, Should we go for it on fourth down? That feels like maybe it's closer than, Should we do a flea flicker or something? Right?
Mike Orkin
So in general, some of these other situations that you mentioned might be more complicated than whether to toss the challenge flag, but yeah, the coaches have to make all these decisions. Nowadays, for some reason, you're right, we seem to see more trick plays, more going for it on fourth down. And going for it on fourth down has actually been studied using data, and the quantitative results about that have been surprising to the NFL: that they should go for it on fourth down more often.
Rosemary Pennington
So your paper will lead to that for challenge flags?
Mike Orkin
Well, actually, I like to try to quantify things. That's sort of what I do. And quantifying this by just having two numbers that the coach and the staff have to think of is something they can do. And I hope that Kyle Shanahan uses something like this. I don't care about the others. Andy Reid. I mean, I root for the Chiefs. Andy Reid does that. He does, yeah, he's, well, he's more liberal than Kyle Shanahan. Anyway, in fact, in the article, I give an example of where he threw the challenge flag, and he didn't get it. It wasn't successful, but he still was sort of using these basic ideas, even if he didn't quantify it with two numbers. So he does that. He's more liberal. Let's put it that way. So go ahead.
John Bailer
I was now sort of picturing, you know, this whole red flag and review has come about because of technology. You know, this has been something that, right, that's a change in the rules, and the change of play is a consequence of having this novel technology. And I'm just sort of imagining now, you know, if you're going to take your time machine into the future, Mike, and thinking about technology, is there going to come a time when there'll be no need to have a red flag challenge, because it'll immediately be processed? You know, there'll be some kind of predictive model behind the scenes that will say, No, no, that was not successful. Yes, that was. I mean, I'm just sort of imagining, why will the technology not go to this next level?
Mike Orkin
Well, yeah, there's talk of that. That's not just a statistician's dream. They actually have plenty of technical people who work for the NFL. And, as you say, there have been huge advances in technology. I mean, the next thing they're going to be saying is, oh, we're going to use AI to make this decision, which is really just a catchphrase for, you know, software, but new software that does stuff like that. So, yeah, I think that there will be a slow shift, or maybe not so slow, to making this decision more often by the NFL itself, using technology. There's talk of doing that, of course, in baseball too, and they're starting to do things like that, like with the strike zone. Well, it sort of depends on the umpire, and there are certainly ways to use technology to help with that.
John Bailer
Well, I'm afraid that's all the time we have for this episode of Stats and Stories. Mike, thank you so much for joining us today, for being here.
Mike Orkin
My pleasure, anytime.
John Bailer
Stats and Stories is a partnership between Miami University’s Departments of Statistics, and Media, Journalism and Film, and the American Statistical Association. You can follow us on Twitter, Apple podcasts, or other places you can find podcasts. If you’d like to share your thoughts on the program send your email to statsandstories@miamioh.edu or check us out at statsandstories.net, and be sure to listen for future editions of Stats and Stories, where we discuss the statistics behind the stories and the stories behind the statistics.