As the Super Bowl approaches, let’s have a look at an interesting mathematical problem in the game of American football.

Before jumping straight in, a brief description of the gameplay is needed to formulate the mathematical problem, for those of you who may not be familiar with American football:

- The gameplay consists mostly of plays or “downs” starting at the line of scrimmage.
- The team in possession of the ball has four attempts to advance it at least ten yards in order to gain a “first down”, which resets the number of attempts to four for the next advance.
- The attacking team can score points by reaching the end zone with the ball in their possession, which is called a touchdown, or by kicking a field goal.
- However, if the first three downs have not gained the required yards, the team faces a decision on what to do with their final attempt. This situation is referred to as “4th-and-X”, where X is the number of yards still required.

The options are:

- Playing the fourth down like any other down, to try to gain the missing yards. If successful, the team gains a new first down and can continue their attack (or scores a touchdown if the play started close enough to the end zone). If unsuccessful, the opposing team starts their attacking turn from the spot where the fourth down ended.
- Attempting a field goal. If successful, the team scores 3 points, the attacking turn ends and the opposing team receives a kickoff to start their own attack. If missed, possession passes to the opposing team.
- Punting the ball. If the team is still too far away to attempt a field goal, one option is to punt the ball as far as possible from their own end zone, so that the opposition starts their attacking turn deep in their own territory.

To formulate the mathematical problem, the input data needs to be defined. For a single play, at least the following probabilities need to be determined:

- What’s the probability of a successful fourth down when the yards required are known?
- If the fourth down is successful, how likely is the following attacking drive to result in points scored by either a touchdown or a field goal?
- If the fourth down is unsuccessful, what is the probability distribution of the starting point of the opposing offensive drive?
- How likely is a field goal from the current position (i.e. from the current distance to the goal posts)?
- If the field goal is missed, what is the probability distribution of the starting point of the opposing offensive drive?
- If the ball is punted, what is the probability distribution of the starting point of the opposing offensive drive?

Additionally, data from previous seasons can be used to study the expected outcomes of an attacking drive when the starting location is known. The location probabilities above can then be turned into a probability distribution for the number of points the opponent is expected to score after gaining possession.
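As a minimal sketch of that last step, the snippet below mixes a distribution of starting positions with a distribution of drive outcomes per position. Every number and every name in it (`start_position_probs`, `points_given_start`, `points_distribution`) is invented for illustration and is not a real NFL statistic; in practice both tables would be estimated from previous seasons.

```python
# All numbers are hypothetical placeholders, not real NFL statistics.

# Probability that the opponent's drive starts at a given spot,
# measured in yards from the opponent's own goal line (e.g. after a punt).
start_position_probs = {10: 0.2, 20: 0.5, 30: 0.3}

# Probability of each points outcome for a drive, given its starting spot.
points_given_start = {
    10: {0: 0.80, 3: 0.10, 7: 0.10},
    20: {0: 0.70, 3: 0.15, 7: 0.15},
    30: {0: 0.60, 3: 0.20, 7: 0.20},
}


def points_distribution(position_probs, points_by_position):
    """Combine the two tables with the law of total probability."""
    dist = {}
    for position, p_position in position_probs.items():
        for points, p_points in points_by_position[position].items():
            dist[points] = dist.get(points, 0.0) + p_position * p_points
    return dist


def expected_points(dist):
    """Mean of a points distribution given as {points: probability}."""
    return sum(points * p for points, p in dist.items())


opponent_points = points_distribution(start_position_probs, points_given_start)
opponent_expected = expected_points(opponent_points)
print(opponent_points, opponent_expected)
```

The same function can be reused for a failed fourth down or a missed field goal by swapping in the matching starting-position table.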

But instead of focusing only on the outcome of a single play, the main objective is winning the match. Thus, a metric that measures the impact of the selected play on the probability of winning the match is required.

One way to do this would be to use the data from previous seasons as input to a machine learning algorithm, creating a function that returns the winning probability given e.g. the current scoreline, the time left on the clock and the starting location of the attacking drive.

Combining this function with the probabilities of the single play then allows the match winning probabilities to be calculated without populating the entire decision tree for the plays remaining in the game.
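The combination can be sketched as follows. This is only an illustration under loud assumptions: `win_probability` is a toy logistic formula standing in for a model that would really be fitted to historical data, and all outcome probabilities and field positions are made up. The point is the structure: each option is a list of (probability, resulting game state) pairs, and its value is the probability-weighted winning probability.

```python
import math


def win_probability(score_diff, seconds_left, ball_position, we_have_ball):
    """Toy placeholder for a fitted model (hypothetical coefficients).

    score_diff: our points minus theirs.
    ball_position: yards from OUR goal line (0..100), whoever has the ball.
    """
    field = (ball_position - 50) / 50.0
    possession = 1.0 if we_have_ball else -1.0
    urgency = 1.0 + (3600 - seconds_left) / 3600.0  # score matters more late
    z = 0.15 * score_diff * urgency + 0.3 * field + 0.2 * possession
    return 1.0 / (1.0 + math.exp(-z))


def expected_win_probability(outcomes):
    """outcomes: list of (probability, state dict) pairs summing to 1."""
    assert abs(sum(p for p, _ in outcomes) - 1.0) < 1e-9
    return sum(p * win_probability(**state) for p, state in outcomes)


def state(score_diff, ball_position, we_have_ball, seconds_left=1790):
    return dict(score_diff=score_diff, seconds_left=seconds_left,
                ball_position=ball_position, we_have_ball=we_have_ball)


# Hypothetical 4th-and-2 at the opponent's 40-yard line, scores level.
options = {
    "go for it": [(0.6, state(0, 63, True)),     # converted, new first down
                  (0.4, state(0, 60, False))],   # stopped, opponent takes over
    "punt": [(0.5, state(0, 90, False)),         # downed at their 10
             (0.5, state(0, 80, False))],        # touchback to their 20
    "field goal": [(0.3, state(3, 75, False)),   # made, then kickoff
                   (0.7, state(0, 53, False))],  # missed, opponent takes over
}

results = {name: expected_win_probability(o) for name, o in options.items()}
best = max(results, key=results.get)
print(results, best)
```

Only one step of the tree is enumerated; everything after the play is summarised by the winning probability of the resulting state.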

So what kind of choices should a team make on the fourth down and how do the suggestions given by the mathematical model compare to the choices made by coaches?

Most of the models suggest that on 4th-and-1, anywhere on the field, the team should just go for it.

Similarly, on 4th-and-2, only a team very near to their own end zone should even consider punting. This is justified by the reward of reaching a first down.

Another example where the decisions suggested by the models differ from the actual plays is the situation where the fourth down starts in the attacking half but near the halfway line of the field, just out of range for a field goal.

Then even in situations like 4th-and-7 it would be favourable to attempt to reach the first down, as a punt from there risks carrying into the end zone for a touchback. The opposition would then start their attack from their own 20-yard line, which is not that much different, risk-wise, from their 40-yard line, where a failed fourth down might have ended.

**Pride or Math?**

Researchers such as Ben Baldwin have run simulations on plays from past seasons and found that coaches tend to be too conservative when choosing their plays.

The more cautious options of kicking the ball are favoured even in situations where playing the down would be the better option.

This is intuitively understandable, as a failed fourth down makes the coaches look like they have made a bad decision in the eyes of the general public, even when going for it was the correct decision.

As their jobs tend to be quite volatile, there is an element of risk management in these decisions by the coaches. In a way this is the sporting analogy of “no-one has been fired for buying from IBM”.

However, management by data is penetrating the sports world too, and there is a visible trend of play choices shifting in a more opportunity-seizing direction.

Coaches still very rarely decide to go for it when kicking is the more suitable option, but playing the fourth down is becoming more common when the circumstances favour it.