Nash Equilibrium for Terminal Maneuvers

2026-04-02T13:52:16Z

Last year Ethan Heilman wrote about a simple game he calls Terminal Maneuvers. This game simulates a missile attacking an interstellar ship. The ship has a laser defence system. One player controls the missile, and the other player controls the laser. If the missile hits the ship, Missile wins. If the laser hits the missile, the missile is destroyed and Laser wins.

The complicating factor is that, due to the relative motion of the laser and the ship being a significant fraction of the speed of light, Laser has to aim not at the missile but where the missile will be. This distance allows the missile to perform erratic manoeuvers to prevent Laser from knowing what its future position will be. However, Missile must expend fuel to perform these manoeuvers.

The Terminal Maneuvers game proceeds in five rounds, giving the laser five opportunities to hit the missile. In each round, Missile secretly commits to an amount of fuel they will expend. Laser must “aim” by guessing the amount of fuel expended by Missile. If they guess correctly, there is some probability of destroying the missile, which depends on how far away the missile is and how much fuel the missile expended. The table below shows the probabilities of the missile being destroyed in the various rounds.

Probability of Laser destroying the missile when correctly guessing Missile’s fuel expenditure
Fuel Cost Round 1 Round 2 Round 3 Round 4 Round 5
0 Fuel 100% 100% 100% 100% 100%
1 Fuel 1/6 2/6 3/6 4/6 5/6
2 Fuel 0% 1/6 2/6 3/6 4/6
3 Fuel 0% 1/6 2/6 3/6
4 Fuel 0% 1/6 2/6
5 Fuel 0% 1/6
6 Fuel 0%

The missile has a limited amount of fuel at the start of the game. Fuel spent earlier in the game means less fuel available later in the game when it is most needed. The amount of starting fuel selects the difficulty of the game. Ethan suggests starting with seven fuel, which empirically gives Missile about a 25% chance of winning.

Laser knows how much fuel the missile has at the start of each round, so it is imperative that Missile does not run out of fuel in the middle of the game. If Laser knows the missile is out of fuel, Laser will predict zero fuel used and will always successfully destroy the missile. That said, as long as Missile has some fuel, choosing to burn zero fuel is still a legitimate option.

Starting with seven fuel, one strategy for Missile would be to burn one fuel on the first four rounds and burn the remaining three fuel on the last round. However, if Laser realizes this is Missile’s strategy, Laser can always predict the correct amount of fuel that will be used by the missile. Taking the product of all the probabilities of Missile’s survival in each round, Missile only has a 4.6% chance of winning. Clearly, Missile’s optimal strategy should be non-deterministic.

I figured this game would be a fun exercise in learning about mixed-strategy (i.e., non-deterministic) Nash equilibrium. This game is a small finite game, so it is reasonably easy to analyze, but it is significantly more complicated than trivial games often used in Nash equilibrium examples.

For these calculations, it is best to find the Nash equilibrium strategy at the endgame and work backward from there. To that end, let us start with the simplest non-trivial endgame. Missile has survived to Round 5 and has 1 fuel left. Missile can choose to burn their last fuel or not, and Laser can choose to aim at no fuel burned or not. This yields the following, game-theoretic payoff matrix, listing the probabilities of missile or laser winning:

Payoff matrix for Round 5 with 1 fuel remaining
Predict 0 Predict 1
Burn 0 0, 1 1, 0
Burn 1 1, 0 1⁄6, 5⁄6

This is a constant-sum game, because the total score of all players is always the same, no matter the outcome. Constant-sum games are also known as zero-sum games since they can be translated into games where the sum of each outcome is zero without affecting any strategy.

The definition of Nash equilibrium is a pair of strategies, one for each player, where neither player individually can change strategies to improve their outcome. Therefore, one potential way for Missile to devise a strategy is to find one where Laser’s chance of winning is the same regardless of the move that they make. Such a strategy is not necessarily going to be possible, but we can give it a try.

Let p be the probability that Missile will burn 0, and let q be the probability that Missile will burn 1. If Laser predicts 0, the probability of them winning is p. If Laser predicts 1, the probability of them winning is 5⁄6 ⁢q. If Laser cannot make a choice between these two options to improve their odds, then p = 5⁄6 ⁢q. Missile’s probabilities must add up to 1, so we also require p + q = 1.

We have a linear system of two equations and two unknowns, so we can try to solve it. The solution is p = 5⁄11 and q = 6⁄11. Missile burns no fuel with probability 5⁄11 and burns its one fuel with probability 6⁄11. This provides Missile a 6⁄11 chance of winning, regardless of which prediction Laser makes.

On the flip side, Laser’s Nash equilibrium can be computed by choosing a set of probabilities so that Missile’s outcome is the same regardless of whether they choose to burn fuel or not. This time, let p be the probability that Laser will predict 0, and let q be the probability that Laser will predict 1. If Missile burns 0, the probability of them winning is q. If Missile burns 1, the probability of them winning is p + 1⁄6 ⁢q. Again, if Missile cannot make a choice between these two options to improve their odds, then q = p + 1⁄6 ⁢q. Laser’s probabilities also must add up to 1, so we also require p + q = 1.

Rearranging q = p + 1⁄6 ⁢q, we get 5⁄6 ⁢q = p, which happens to be the exact same equation Missile had. Thus, their solutions are identical. Laser predicts no fuel burned with a probability of 5⁄11 and predicts one fuel burned with a probability of 6⁄11. This provides Laser a 5⁄11 chance of winning no matter whether Missile has chosen to burn their fuel or not. This chance is the complement to Missile’s 6⁄11 chance of winning, as it has to be.

Payoff matrix for Round 5 with 2 fuel remaining
Predict 0 Predict 1 Predict 2
Burn 0 0, 1 1, 0 1, 0
Burn 1 1, 0 1⁄6, 5⁄6 1, 0
Burn 2 1, 0 1, 0 2⁄6, 4⁄6

If the game ends in Round 5 with Missile having 2 fuel left, we have the above payoff matrix. We can solve similar linear algebra problems on three variables to find strategies for each player so that the other player’s outcome is the same no matter which of their three choices they make. The solution has Missile burn 0 fuel with probability 10⁄37, burn 1 fuel with probability 12⁄37, and burn 2 fuel with probability 15⁄37, giving Missile a 27⁄37 chance of winning regardless of what Laser’s prediction is.

Laser makes the predictions with the same probability distribution, giving Laser a 10⁄37 chance of winning no matter how much fuel Missile chooses to burn. This sort of distribution is what we might expect: somewhat evenly distributed with a bias towards burning more fuel, which provides some evasion for Missile.

What I found surprising is how when Laser plays at their Nash equilibrium, they simply do not care how much fuel Missile has secretly chosen to burn. Their odds of winning are the same regardless of what Missile reveals. It is as if Laser is no longer playing against Missile at all. Missile’s choices no longer matter. This result is called the “indifference principle.” Later we will see that Missile’s choices sometimes can matter.

Missile feels the same when playing at their equilibrium. No matter what prediction Laser ultimately makes, Missile’s odds of winning have already been fixed by playing at the Nash equilibrium.

In theory, this is what playing poker at a Nash equilibrium should feel like. Based on the state of the board, you make a random selection of calls, folds, or raises according to some appropriate distribution, and your distribution has fixed your expected payout at that point, independent of the choices the other players are going to make. No need to stress over whether your bluff will be called or not.

Before moving on to analyzing Round 4, we can complete a chart of the probability of Missile winning when playing at their Nash equilibrium depending on how much fuel they have remaining.

Probability table for Round 5
Remaining Fuel Missile Win Probability Laser Win Probability
0 0% 100%
1 6⁄11 ≈ 54.5% 5⁄11 ≈ 45.5%
2 27⁄37 ≈ 73.0% 10⁄37 ≈ 27.0%
3 47⁄57 ≈ 82.5% 10⁄57 ≈ 17.5%
4 77⁄87 ≈ 88.5% 10⁄87 ≈ 11.5%
5 137⁄147 ≈ 93.2% 10⁄147 ≈ 6.8%
6+ 100% 0%

If Missile starts Round 4 with 1 fuel remaining, they are in big trouble. They can only burn fuel in at most one of the two remaining rounds. Therefore, Laser can win by predicting 0 fuel burned in both Round 4 and Round 5. Laser is guaranteed to destroy the missile on one of those two rounds. Missile must start Round 4 with at least 2 fuel remaining if they are to have a chance of winning.

Since the game in each round depends only on the state of Missile’s remaining fuel and not on the specific choices of how that state came to be, we can simplify the analysis of Round 4’s payoff matrix by using each player’s probability of winning Round 5 as their scores in Round 4. Note that Missile does not have the option of burning all their fuel in Round 4, since starting Round 5 with 0 fuel is a guaranteed loss for them, and Laser knows it.

Payoff matrix for Round 4 with 2 fuel remaining
Predict 0 Predict 1
Burn 0 0, 1 27⁄37, 10⁄37
Burn 1 6⁄11, 5⁄11 2⁄11, 9⁄11

To compute Missile’s strategy, we define p and q as before. This time Missile needs to solve the equations p + 5⁄11 ⁢q = 10⁄37 ⁢p + 9⁄11 ⁢q and p + q = 1. The solution has Missile burn 0 fuel with a probability of 148⁄445 and burn 1 fuel with a probability of 297⁄445, which is roughly a 1⁄3rd–2⁄3rd split. This provides Missile a chance of winning with a probability of 162⁄445, or about 36.4%.

Meanwhile, Laser needs to solve the equations 27⁄37 ⁢q = 6⁄11 ⁢p + 2⁄11 ⁢q and p + q = 1. The solution has Laser predict 0 fuel with probability 223⁄445 and predict 1 fuel with probability 222⁄445, which is nearly evenly split. This provides Laser a chance of winning of 283⁄445, or about 63.6%.

In Round 5, each player’s individual payoff matrix was symmetric, which led to Missile and Laser having identical strategies. In Round 4, the individual player’s payoff matrices are no longer symmetric, and Missile and Laser end up with different strategies. Laser picks a nearly 50–50 split because the differences of column scores, 5⁄11 − 1 vs. 9⁄11 − 10⁄37, are nearly equal in magnitude. Whereas Missile picks a 1⁄3rd–2⁄3rd split because the difference of row scores, 27⁄37 vs. 2⁄11 − 6⁄11, differs in magnitude by close to a factor of two.

We can proceed as before, using linear algebra to compute equilibrium strategies for Round 4 with various states of remaining fuel for the missile. However, we run into a problem when Missile has 4 fuel remaining.

Payoff matrix for Round 4 with 4 fuel remaining.
Predict 0 Predict 1 Predict 2 Predict 3
Burn 0 0, 1 77⁄87, 10⁄87 77⁄87, 10⁄87 77⁄87, 10⁄87
Burn 1 47⁄57, 10⁄57 47⁄171, 124⁄171 47⁄57, 10⁄57 47⁄57, 10⁄57
Burn 2 27⁄37, 10⁄37 27⁄37, 10⁄37 27⁄74, 47⁄74 27⁄37, 10⁄37
Burn 3 6⁄11, 5⁄11 6⁄11, 5⁄11 6⁄11, 5⁄11 4⁄11, 7⁄11

Let us try to solve for Laser’s equilibrium strategy, the probability distribution where Missile’s outcome is the same no matter what move they make. We let p, q, r, and s be the probabilities of predicting 0 through 3 fuel burned, respectively. In addition to having p + q + r + s = 1, we require

all be equal to each other. Solving this system of equations gives us

Apparently, predicting 3 fuel used is such a terrible move for Laser that our “optimal” solution wants us to predict it with a negative 19.5% probability! Unfortunately, Laser cannot actually select moves with negative probability. We have to add constrains to our acceptable solutions to ensure all probabilities are non-negative.

Adding linear constraints to our problem brings us into the realm of linear programming. Since we are entering this realm, we can take this opportunity to compute the minimax solution for each player. For Laser, the minimax solution is to compute a probability distribution that minimizes Missile’s score, i.e., their probability of winning, which we will denote by z, subject to the constraint that Missile will choose the move that maximizes their score for that distribution. This leads to the following system of linear constraints:

where we want to minimize z.

Using linear programming, we can optimize this system. The optimal solution is

That is, Laser’s strategy is to predict 0 fuel burned 30.5% of the time, predict 1 fuel burned 38.1% of the time, predict 2 fuel burned 31.4% of the time, and never predict 3 fuel burned. This lets Missile win at most 61.5% of the time, or equivalently, it lets Laser win at least 38.5% of the time.

Is this minimax strategy really an optimal strategy? Let us look at Missile’s minimax strategy. For Missile, we need to optimize the following system of linear constraints:

to minimize z.

The optimal solution is

That is, Missile’s strategy is to burn 0 fuel 19.9% of the time, burn 1 fuel 32.0% of the time, burn 2 fuel 48.2% of the time, and never burn 3 fuel. This lets Laser win at most 38.5% of the time, or equivalently, it lets Missile win at least 61.5% of the time.

This pair of strategies is optimal because Laser wins at least 38.5% of the time by their strategy, and Missile wins at least 61.5% of the time by their strategy, which adds up to 100%. It turns out that for zero-sum games, the minimax, maximin, and Nash equilibrium strategy sets are all identical, and furthermore, these strategies form a convex set. Using a general-purpose Nash equilibrium solver will produce the same pair of optimal strategies.

Still, these strategies surprised me. Laser is not even aiming at Missile burning 3 fuel. Shouldn’t Missile avoid being hit by Laser entirely by choosing to burn 3 fuel? But Missile’s optimal strategy also says to avoid burning 3 units of fuel. Why?

Upon closer examination, we see that with Missile’s computed optimal strategy, they have a 61.5% chance of winning. If Missile were to burn 3 fuel, yes, they would avoid being hit by Laser in Round 4. However, they would begin Round 5 with only 1 remaining fuel. In that state they would only have a 54.5% chance of winning, worse odds than their optimal strategy that avoids burning 3 fuel.

Laser is not aiming at Missile burning 3 fuel because Laser would love for Missile to burn 3 fuel. Doing so would increase Laser’s odds of winning from 38.5% to 45.5%. We see that Laser’s strategy does not entirely rule out all consequences of Missile’s choices. It only makes it indifferent to Missile’s choices within the support of Missile’s optimal mixed set of moves. Technically an opponent’s choices can affect the outcome of the game; they can still make moves that benefit the other player.

Continuing with linear programming, we can fill out the table for the probability of winning for Round 4.

Probability table for Round 4
Remaining Fuel Missile Win Probability Laser Win Probability
1- 0% 100%
2 ≈ 36.4% ≈ 63.6%
3 ≈ 50.5% ≈ 49.5%
4 ≈ 61.5% ≈ 38.5%
5 ≈ 69.9% ≈ 30.1%
6 ≈ 76.4% ≈ 23.6%
7 ≈ 81.6% ≈ 18.4%

Continuing this way, we can work backwards and compute probability tables for all the rounds until we reach round 1.

Probability table for Round 1
Starting Fuel Missile Win Probability Laser Win Probability
7 1005005076075⁄3110959445024 ≈ 32.3% 2105954368949⁄3110959445024 ≈ 67.7%

In conclusion, we found that playing optimally, Missile has an approximately 32.3% chance of winning, which is a little higher than the 25% estimate given by Ethan. I leave it as an exercise to determine the most fair amount of starting for Missile to start with.

Tags

,

Russell O’Connor: contact me