Background of Game Theory
Game theory is the study of strategic decision making; that is, “the study of mathematical models of conflict and cooperation between intelligent rational decision-makers” ([Myerson1991]). Game theory has applications in many areas of business, military, government, networks, and industry. For more information on applications of game theory in these areas, see [CS2001], [Cantwell2003], [EK2010], and [Aiginger1999]. Additionally, [MF2009] discusses game theory in warlord politics, which blends military and diplomatic decisions.
The study of game theory began with total conflict games, also known as zero-sum games, in which one player’s gain exactly equals the net losses of the other player(s). Game theory continues to grow, with applications to a wide range of applied problems: a “Google Scholar” search returns over 3 million items.
The Nash equilibrium for a two-player, zero-sum game can be found by solving a linear programming problem and its dual ([Dantzig1951], [Dantzig2002], and [Dorfman1951]). In their work, Dantzig and Dorfman, respectively, assume that every element Mij of the payoff matrix containing outcomes or payoffs to the row player is positive. More current approaches (e.g., [Fox2008] and [Fox2010]) show the payoff matrix entries can be positive or negative.
7.2.1 Two-Person Total Conflict Games
We begin with characteristics of the two-person total conflict game following [Straffinl993]:
There are two participants: the first, Rose, is the row player, and the other, Colin, is the column player.
Rose must choose from among her strategies 1 to m, and Colin must choose from among his strategies 1 to n.
If Rose chooses her ith strategy and Colin his jth strategy, then Rose receives a payoff of aij and Colin loses the amount aij. In Table 7.2, this is shown as a payoff pair where Rose receives a payoff of Mij and Colin receives a payoff of Nij.
Games are played simultaneously and may be repeated.
There are two types of possible solutions. A pure strategy solution is one in which each player achieves their best outcome by always choosing the same strategy in repeated games. A mixed strategy solution is one in which players play a random selection of their strategies in order to obtain the best outcomes in simultaneous repeated games.
Although we do not address them in this chapter, in sequential games, the players look ahead and reason back.
Table 7.2 shows a generic payoff matrix for simultaneous games.
TABLE 7.2: General Payoff Matrix of a Two-Person Total Conflict Game
A game is a total conflict game if and only if the sum of the pairs Mij + Nij always equals either 0 or the same constant c for all strategies i and j. If the sum equals zero, then we list only the row payoff Mij.
For example, if a player wins x when the other player loses x, then the sum Mij + Nij = x − x = 0. In a business marketing strategy, if one player gets x% of the market, then the other player gets y% = 100% − x% based upon 100% of the market. We list only x% as the outcome because when the row player receives x%, the column player loses x%.
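The constant-sum condition is easy to check computationally. The following Python sketch is our illustration (not from the text); the payoff values other than (65, 35) are hypothetical stand-ins in the spirit of Table 7.3:

```python
def is_constant_sum(M, N):
    """Return (True, c) if M[i][j] + N[i][j] equals the same
    constant c for every strategy pair, else (False, None)."""
    c = M[0][0] + N[0][0]
    for row_M, row_N in zip(M, N):
        for m, n in zip(row_M, row_N):
            if m + n != c:
                return (False, None)
    return (True, c)

# Hypothetical market-share matrices: every pair sums to 100.
M = [[65, 70], [55, 40]]     # row player's shares (illustrative)
N = [[35, 30], [45, 60]]     # column player's shares
print(is_constant_sum(M, N))   # (True, 100)
```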
A movement diagram has a vertical arrow in each column and a horizontal arrow in each row, pointing from the smaller payoff to the larger payoff. If there exist one or more payoffs toward which all arrows point, then those payoffs constitute pure strategy Nash equilibria.
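A movement diagram amounts to a best-response check: a cell is a pure Nash equilibrium exactly when its row payoff is maximal in its column and its column payoff is maximal in its row. The Python sketch below is our illustration with hypothetical payoffs, not the book's code:

```python
def pure_nash(M, N):
    """Return all pure-strategy Nash equilibria (i, j): the cells
    toward which all movement-diagram arrows point."""
    m, n = len(M), len(M[0])
    eq = []
    for i in range(m):
        for j in range(n):
            # Rose cannot improve by switching rows ...
            row_best = all(M[i][j] >= M[k][j] for k in range(m))
            # ... and Colin cannot improve by switching columns.
            col_best = all(N[i][j] >= N[i][k] for k in range(n))
            if row_best and col_best:
                eq.append((i, j))
    return eq

# Hypothetical constant-sum matrices in the spirit of Table 7.3.
M = [[65, 70], [55, 40]]
N = [[35, 30], [45, 60]]
print(pure_nash(M, N))   # [(0, 0)] -- the (65, 35) cell
```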
Example 7.1. Baseball Franchises.
Several minor league baseball teams want to enter the market in a new area. The teams can choose to locate in a more densely populated area, or in a less densely populated town surrounded by other towns. Assume that both the National and American Leagues are interested in the new franchise. Suppose the National League will locate a franchise in either a densely populated area or a less densely populated area. The American League is making the same decision: it will locate either in a densely populated area or a less dense area. This situation is similar to that of the Cubs and the White Sox both being in Chicago, or the Yankees and Mets both being in New York City. Analysts have estimated the market shares. We place both sets of payoffs in a single game matrix. Listing the row player’s payoff, the American League’s payoff, first in the ordered pair, we have the payoff matrix shown in Table 7.3. The payoff matrix represents a constant-sum total-conflict game. Arrows are added to the table to create the movement diagram.
TABLE 7.3: New Baseball Franchise Payoff Matrix and Movement Diagram
The payoff (65,35), where both players choose the densely populated area for their franchises, has only arrows pointing in; no arrow exits that outcome. The movement diagram indicates that neither player can unilaterally improve their solution, giving a Nash equilibrium ([Straffin1993]).
Linear Programming in Total Conflict Games
Von Neumann’s minimax theorem ([vNeumann1928]) states that for every two-person, zero-sum game with finitely many strategies, there exists an outcome value V and a set of strategies for each player, such that

max_x min_y x^T A y = min_y max_x x^T A y = V,

where A is the payoff matrix and x and y range over the players’ mixed strategies.
Equivalently, Player 1’s strategy guarantees him a payoff of V regardless of Player 2’s strategy, and similarly Player 2 can guarantee a payoff of −V regardless of Player 1’s strategy. The name minimax arises from each player minimizing the maximum payoff possible for the other; since the game is zero-sum, each player also minimizes his own maximum loss; i.e., maximizes his minimum payoff.
Every total conflict game may be formulated as a linear programming problem ([Dantzig1951] and [Dorfman1951]). Consider a total-conflict two-person game in which maximizing Player X has m strategies and minimizing Player Y has n strategies. The entry (Mij, Nij) in the ith row and jth column of the payoff matrix represents the payoff for those strategies. The following formulation, using only the elements Mij for the maximizing player, provides the value of the game and the probabilities xi of playing the strategies ([GFH2014], [Fox2011Maple], and [Winston2002]).
If there are negative entries in the payoff matrix, a slight modification to the linear programming formulation is necessary, since all variables must be non-negative when using the simplex method. To obtain a possibly negative value for the game, use the method described in [Winston2002]: replace any variable that could take on negative values with the difference of two non-negative variables. Since only V, the value of the game, can be positive or negative, replace V with the difference V = V1 − V2 of two new non-negative variables. The other values we are looking for are probabilities, which are always non-negative. In these games, players want to maximize the value of the game that they receive. The Linear Program (7.1) is a linear programming formulation for finding the optimal strategies and value of the game.
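As a concrete sketch of this kind of formulation, the Python code below (our illustration, not the book's Maple code) solves the maximizing player's LP with scipy. Note that scipy's linprog accepts free variables directly, so the substitution of V by a difference of two non-negative variables is only needed for a tableau-based simplex implementation:

```python
import numpy as np
from scipy.optimize import linprog

def solve_zero_sum(M):
    """Maximize V subject to sum_i x_i * M[i][j] >= V for every column j,
    sum_i x_i = 1, x_i >= 0.  Decision variables: x_1..x_m, then V (free)."""
    M = np.asarray(M, dtype=float)
    m, n = M.shape
    c = np.zeros(m + 1)
    c[-1] = -1.0                            # linprog minimizes, so minimize -V
    # V - sum_i x_i M[i][j] <= 0 for each column j
    A_ub = np.hstack([-M.T, np.ones((n, 1))])
    b_ub = np.zeros(n)
    A_eq = np.hstack([np.ones((1, m)), np.zeros((1, 1))])   # sum x_i = 1
    b_eq = np.array([1.0])
    bounds = [(0, None)] * m + [(None, None)]               # V may be negative
    res = linprog(c, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=b_eq, bounds=bounds)
    return res.x[:m], res.x[-1]             # row player's weights, value V

# Matching pennies: value 0, each strategy played with weight 0.5.
x, V = solve_zero_sum([[1, -1], [-1, 1]])
print(np.round(x, 3), round(abs(V), 3))
```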
The weights xi yield Rose’s strategy; V is the value of the game to Rose. When the solution to this total conflict game is obtained, we also have the solution to Colin’s game through the solution of the dual linear program ([Winston2002]). As an alternative to the dual, we can formulate Colin’s game directly as shown in (7.2) using the original Nij’s. We call the value of the game for Colin v to distinguish it from Rose’s value V. Colin’s linear program is
The weights yj yield Colin’s strategy; v is the value of the game to Colin.
Put Example 7.1 into the two formulations and solve to obtain the solution
The overall solution is that each league should place its new franchise in a densely populated area giving the solution of a (65,35) market share split.
The primal-dual simplex method only works in the zero-sum game format ([Fox2010]). We may convert this game to a zero-sum form to obtain the solution via linear programming. Since this is a constant-sum game, whatever the American League gains, the National League loses. For example, out of 100%, if the American League franchise gains 65%, then the National League franchise loses 65% of the market, as in Table 7.4.
TABLE 7.4: Zero-Sum Game Payoff Matrix for a New Baseball Franchise
For a zero-sum game, we only need a single formulation of the linear program. The Row Player maximizes and the Column Player minimizes, the two problems constituting a primal and dual relationship. The linear program used in zero-sum games is equivalent to the formulation in (7.1), with Mij designating the zero-sum outcomes for the Row Player; the linear program is shown in (7.3).
where V is the value of the game, the Mij are the payoff-matrix entries, and the xi’s are the weights (probabilities of playing the strategies).
For the baseball franchise example, place the payoffs into (7.3), letting V be the value of the game to the Row Player, the American League, giving
Maple solves this LP easily.
We applied fnormal to the result to eliminate numerical error artifacts.
Now for the other player.
Note that the solutions are not exact; this is due to numerical methods used internally by Maple and why we applied fnormal to the result.
The optimal solution strategies found are identical, as before, with both players choosing a more densely populated area as their best strategy. The use of linear programming is quite efficacious for large games between two players each having many strategies ([Fox2010], [Fox2014], and [GFH2014]). We note that the solution to the National League franchise game is found either as the dual solution (see [Winston2002], Section 11.3) or by simply re-solving the linear program from the Column Player’s perspective. In Chapter 4 of Advanced Problem Solving with Maple™: A First Course, we showed how Maple can be used to solve both the primal and dual linear programs.
The Partial Conflict Game
Partial conflict games are games in which one player may win without the other player having to lose: both players could win something, or both could lose something. Solution methods for partial conflict games include looking for dominance, analyzing movement diagrams, and finding equalizing strategies. Here we present an extension from the total conflict game to the partial conflict game as an application of linear programming; see [Fox2010] and [Fox2014]. Because of the nature of partial conflict games, where both players are trying to maximize their outcomes, we can model all players’ strategies as their own maximizing linear programs. We treat each player as a separate linear programming maximization problem.
Again, use the payoff matrix of Table 7.2. Now assume that Mij + Nij is not always equal to zero or the same constant for all i and j. In non-cooperative partial-conflict games, we first look for a pure-strategy solution using a movement diagram.
The Row Player, Rose, maximizes her payoffs, so in each column she prefers the row with the highest payoff; draw a vertical arrow in each column pointing to Rose’s highest value. Similarly, the Column Player, Colin, maximizes his payoffs, so in each row he prefers the column with the highest payoff; draw a horizontal arrow in each row pointing to Colin’s highest value. If all arrows point into a cell from every direction, then that cell is a pure Nash equilibrium.
If all the arrows do not point at a value or values, i.e., there is no pure Nash equilibrium, then we must use equalizing strategies to find the weights (probabilities) for each player. For a game with two players having two strategies each, proceed as follows:
Rose’s game: Rose maximizing and Colin “equalizing” is a total-conflict game that yields Colin’s equalizing strategy.
Colin’s game: Colin maximizing and Rose “equalizing” is a total-conflict game that yields Rose’s equalizing strategy.
Note: If either side plays its equalizing strategy, the other side cannot unilaterally improve its own situation; the other player is stymied.
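For a 2 × 2 game the equalizing strategy has a closed form: Colin plays his first column with the probability q that makes Rose's two rows pay the same. The sketch below is our illustration with a hypothetical payoff matrix (the symmetric computation, with Rose equalizing Colin's payoffs, uses Colin's matrix transposed):

```python
def equalizing_column_strategy(M):
    """Probability q that Colin plays column 1 so that Rose's expected
    payoff is the same for both of her rows (M is Rose's 2x2 payoff matrix).
    Solving q*a + (1-q)*b = q*c + (1-q)*d gives q = (d - b)/(a - b - c + d)."""
    (a, b), (c, d) = M
    q = (d - b) / (a - b - c + d)    # assumes no dominance (denominator != 0)
    value = q * a + (1 - q) * b      # Rose's resulting expected payoff
    return q, value

# Hypothetical Rose payoff matrix.
q, v = equalizing_column_strategy([[2, 4], [6, 0]])
print(q, v)   # 0.5 3.0
```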
This analysis translates into two maximizing linear programming formulations shown in (7.4) and (7.5) below. The LP formulation in (7.4) provides the Nash equalizing solution for Colin with strategies played by Rose, while the LP formulation in (7.5) provides the Nash equalizing solution for Rose with strategies played by Colin.
If there is a pure strategy solution, it is found through movement diagrams or dominance. The linear programming formulations will not find pure strategy results; LPs only provide the Nash equilibrium using equalizing strategies ([Straffin1993]).
For games with two players and more than two strategies each, Bazaraa et al. ([BSC2013]) presented a nonlinear optimization approach. Consider a two-person game with a standard payoff matrix. Separate the payoff matrix into two matrices M and N for Players I and II. Then solve the nonlinear optimization formulation given in expanded form in (7.6).
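The book's formulation (7.6) is not reproduced here; as a sketch, the standard quadratic program of this type (often attributed to Mangasarian and Stone, and the kind of formulation [BSC2013] presents) may be written, with payoff matrices M and N, mixed strategies x and y, and scalars p and q, as:

```latex
\begin{aligned}
\max_{x,\,y,\,p,\,q}\quad & x^{\mathsf T}(M+N)\,y \;-\; p \;-\; q \\
\text{subject to}\quad & M y \le p\,\mathbf{1}_m, \qquad N^{\mathsf T} x \le q\,\mathbf{1}_n, \\
& \mathbf{1}^{\mathsf T} x = 1, \qquad \mathbf{1}^{\mathsf T} y = 1, \qquad x \ge 0,\; y \ge 0 .
\end{aligned}
```

At a global maximum the objective value is zero, and the corresponding (x, y) is a Nash equilibrium with payoffs p to Player I and q to Player II.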
Example 7.2. A Partial Conflict Equalizing Strategy Game Solution.
Table 7.5 shows a partial-conflict game with revised payoff estimates of market shares. The arrows in the movement diagram indicate that the game has no pure strategy solution.
TABLE 7.5: Partial-Conflict Movement Diagram
We use (7.4) and (7.5) to formulate and solve these partial conflict games for their Nash equalizing strategies.
The solutions to this partial conflict game are
This game results in Colin playing Large City and Small City each half the time, ensuring a value of 15.00 for Rose. Rose plays a mixed strategy of 4/7 Large City and 3/7 Small City which yields a value of the game of 22.857 for Colin.
To be a solution, the Nash equilibrium must be Pareto optimal (no other outcome is better for both players, i.e., lies to its northeast) as defined by Straffin ([Straffin1993]). Visually, we can plot the coordinates of the outcomes and connect them into a convex set (the convex hull). The Nash equilibrium (15, 22.86) is an interior point, and so it is not Pareto optimal; see Figure 7.1.
FIGURE 7.1: Payoff Polygon with Nash Equilibrium
When the results are mixed strategies, the implication is that the game is repeated many times in order to achieve that outcome in the long run.
Example 7.3. A 3 x 3 Nonzero Sum Game.
Rose and Colin each have three strategies with payoffs shown in Table 7.6.
TABLE 7.6: Rose and Colin Strategies
First, we use a movement diagram to find two Nash equilibrium points. They are R2C1 = (2,1) and R3C3 = (1,2). These pure strategy solutions are not equivalent, and trying to achieve them might lead to other results. We might employ the nonlinear method described earlier to look for other equilibrium solutions, if they exist. We find the nonlinear method does produce another solution, with p = q = 0.667, where x1 = 0, x2 = 0.667, x3 = 0.333, y1 = 0.333, y2 = 0, and y3 = 0.667. The Maple statements to obtain this solution are:
Since the objective function is quadratic, use Maple’s quadratic program solver, QPSolve, from the Optimization package.
Communications and Cooperation in Partial Conflict Games
Allowing for communication may change the game’s solution. A player may consider combinations of moves, threats, and promises to attempt to obtain better outcomes. The strategy of moves is explored in several sources listed in References and Further Reading (pg. 336).
Nash Arbitration Method
When a solution acceptable to the players has not been achieved by other methods, the game may move to arbitration. The Nash Arbitration Theorem (Nash, 1950) states that
There is a unique arbitration solution which satisfies the following axioms:
Rationality: The solution point is feasible.
Linear Invariance: Changing scale does not change the solution.
Symmetry: The solution does not discriminate against any player.
Independence of Irrelevant Alternatives: Eliminating solutions that would not be chosen does not change the solution.
Prudential Strategy (Security Levels)
The security levels are the payoffs in a partial conflict game that each player can guarantee themselves regardless of the other player’s strategy. We can solve for these payoffs using a separate linear program for each security level.
The LP formulations are (7.9) and (7.10).
The weights xi yield Rose’s prudential strategy for the security level V.
The weights yj yield Colin’s prudential strategy for the security level v.
Revisit Example 7.2 to illustrate finding security levels. Let SLR and SLC represent the security levels for Rose and Colin, respectively. We use linear programming to find these values using (7.9) and (7.10) yielding
The solution yields both how the game is played and the security levels. Rose always plays R1, and Colin plays C1 with probability 4/7 and C2 with probability 3/7. The security level is (10, 22.86).
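A security level is a maximin value, so it can be computed with the same zero-sum LP applied to each player's own payoff matrix (Rose uses M; Colin uses the transpose of N). The Python sketch below is our illustration with a hypothetical payoff matrix, since Table 7.5 is not reproduced here:

```python
import numpy as np
from scipy.optimize import linprog

def security_level(A):
    """Maximin value and prudential strategy for the row player of A:
    maximize V subject to sum_i x_i * A[i][j] >= V for all j, sum_i x_i = 1."""
    A = np.asarray(A, dtype=float)
    m, n = A.shape
    c = np.zeros(m + 1)
    c[-1] = -1.0                                  # linprog minimizes, so minimize -V
    A_ub = np.hstack([-A.T, np.ones((n, 1))])     # V <= sum_i x_i A[i][j]
    res = linprog(c, A_ub=A_ub, b_ub=np.zeros(n),
                  A_eq=np.hstack([np.ones((1, m)), [[0.0]]]), b_eq=[1.0],
                  bounds=[(0, None)] * m + [(None, None)])
    return res.x[:m], res.x[-1]                   # prudential strategy, security level

# Hypothetical Rose payoff matrix; Colin would call security_level(N.T).
x, SLR = security_level([[2, 4], [6, 0]])
print(np.round(x, 3), round(SLR, 2))   # Rose guarantees 3.0 by playing (0.75, 0.25)
```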
Using this security level, (10,22.86), as our status quo point, we can formulate the Nash arbitration scheme. We restate more formally the four axioms stated above that are met using the Nash arbitration scheme.
Axiom 1: Rationality. The solution must be in the negotiation set.
Axiom 2: Linear Invariance. If either Player l’s or Player 2’s utility functions are transformed by a positive linear function, the solution point should be transformed by the same function.
Axiom 3: Symmetry. If the polygon happens to be symmetric about the line of slope +1 through the status quo point, then the solution should be on this line. No player is favored, no player is discriminated against.
Axiom 4: Independence of Irrelevant Alternatives. Suppose N is the solution point for a polygon P with status quo point SQP. Suppose Q is another polygon which contains both SQP and N, and is totally contained in P. Then N should also be the solution point to Q with status quo point SQP, i.e., the solution point is not changed by non-solution points being eliminated from consideration.
Theorem. Nash’s Arbitration Theorem (Nash, 1950).
There is one and only one arbitration scheme which satisfies the four axioms of rationality, linear invariance, symmetry, and independence of irrelevant alternatives. The arbitration scheme is: if the status quo point (SQP) is (x0, y0), then the arbitrated solution point N is the point (x, y) in the polygon with x ≥ x0, y ≥ y0 which maximizes the product Z = (x − x0)(y − y0).
We apply Nash’s theorem in a nonlinear optimization framework (Kuhn-Tucker conditions). The formulation for our example is
Maple finds the solution easily using QPSolve. In this example, for the status quo point (10,22.86), QPSolve gives our Nash arbitration point as (17.86,46.43).
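As a sketch of the arbitration computation (our Python illustration with a hypothetical Pareto segment, not the chapter's payoff polygon or its Maple code), parametrize the relevant Pareto boundary segment and maximize the product (x − x0)(y − y0):

```python
from scipy.optimize import minimize_scalar

def nash_arbitration(p1, p2, sq):
    """Maximize (x - x0)(y - y0) along the Pareto segment from p1 to p2,
    with status quo point sq = (x0, y0)."""
    (x1, y1), (x2, y2) = p1, p2
    x0, y0 = sq

    def neg_product(t):
        # Point on the segment at parameter t in [0, 1].
        x = (1 - t) * x1 + t * x2
        y = (1 - t) * y1 + t * y2
        return -(x - x0) * (y - y0)

    res = minimize_scalar(neg_product, bounds=(0.0, 1.0), method='bounded')
    t = res.x
    return ((1 - t) * x1 + t * x2, (1 - t) * y1 + t * y2)

# Hypothetical segment from (10, 0) to (0, 10) with status quo (0, 0):
# the product x*y is maximized at the midpoint (5, 5).
print(nash_arbitration((10, 0), (0, 10), (0, 0)))
```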