The concept of game models.

19.03.2019

A two-person zero-sum game is called, in which each of them has a finite set of strategies. The rules of the matrix game are determined by the payoff matrix, whose elements are the payoffs of the first player, which are also the losses of the second player.

Matrix game is an antagonistic game. The first player receives the maximum guaranteed (not dependent on the behavior of the second player) payoff equal to the price of the game, similarly, the second player achieves the minimum guaranteed loss.

Under strategy is understood as a set of rules (principles) that determine the choice of a variant of actions for each personal move of a player, depending on the current situation.

Now about everything in order and in detail.

Payoff matrix, pure strategies, game price

IN matrix game its rules are determined payoff matrix .

Consider a game in which there are two participants: the first player and the second player. Let the first player have m pure strategies, and at the disposal of the second player - n pure strategies. Since a game is being considered, it is natural that there are wins and losses in this game.

IN payment matrix the elements are numbers expressing the gains and losses of the players. Wins and losses can be expressed in points, money or other units.

Let's create a payoff matrix:

If the first player chooses i-th pure strategy, and the second player j-th pure strategy, then the payoff of the first player is aij units, and the loss of the second player is also aij units.

Because aij + (- a ij ) = 0, then the described game is a zero-sum matrix game.

The simplest example of a matrix game is tossing a coin. The rules of the game are as follows. The first and second players toss a coin and the result is heads or tails. If heads and heads or tails or tails are rolled at the same time, then the first player will win one unit, and in other cases he will lose one unit (the second player will win one unit). The same two strategies are at the disposal of the second player. The corresponding payoff matrix would be:

The task of game theory is to determine the choice of the strategy of the first player, which would guarantee him the maximum average gain, as well as the choice of the strategy of the second player, which would guarantee him the maximum average loss.

How is a strategy chosen in a matrix game?

Let's look at the payoff matrix again:

First, we determine the payoff of the first player if he uses i th pure strategy. If the first player uses i-th pure strategy, then it is logical to assume that the second player will use such a pure strategy, due to which the payoff of the first player would be minimal. In turn, the first player will use such a pure strategy that would provide him maximum win. Based on these conditions, the payoff of the first player, which we denote as v1 , is called maximin win or bottom at the cost of the game .

At for these values, the first player should proceed as follows. From each line, write out the value of the minimum element and choose the maximum from them. Thus, the payoff of the first player will be the maximum of the minimum. Hence the name - maximin win. The line number of this element will be the number of the pure strategy chosen by the first player.

Now let's determine the loss of the second player if he uses j-th strategy. In this case, the first player uses his own pure strategy, in which the loss of the second player would be maximum. The second player must choose such a pure strategy in which his loss would be minimal. The loss of the second player, which we denote as v2 , is called minimax loss or top game price .

At solving problems on the price of the game and determining the strategy to determine these values ​​for the second player, proceed as follows. From each column, write out the value of the maximum element and choose the minimum from them. Thus, the loss of the second player will be the minimum of the maximum. Hence the name - minimax gain. The column number of this element will be the number of the pure strategy chosen by the second player. If the second player uses "minimax", then regardless of the choice of strategy by the first player, he will lose at most v2 units.

Example 1

.

The largest of the smallest elements of the rows is 2, this is the lower price of the game, the first row corresponds to it, therefore, the maximin strategy of the first player is the first. The smallest of the largest elements of the columns is 5, this is the upper price of the game, the second column corresponds to it, therefore, the minimax strategy of the second player is the second.

Now that we have learned how to find the lower and upper price of the game, the maximin and minimax strategies, it's time to learn how to designate these concepts formally.

So, the guaranteed payoff of the first player is:

The first player must choose a pure strategy that would provide him with the maximum of the minimum payoffs. This gain (maximin) is denoted as follows:

.

The first player uses his pure strategy so that the loss of the second player is maximum. This loss is defined as follows:

The second player must choose his pure strategy so that his loss is minimal. This loss (minimax) is denoted as follows:

.

Another example from the same series.

Example 2 Given a matrix game with a payoff matrix

.

Determine the maximin strategy of the first player, the minimax strategy of the second player, the lower and upper price of the game.

Solution. To the right of the payoff matrix, we write out the smallest elements in its rows and mark the maximum of them, and from the bottom of the matrix - largest elements in columns and choose the smallest of them:

The largest of the smallest elements of the rows is 3, this is the lower price of the game, the second row corresponds to it, therefore, the maximin strategy of the first player is the second. The smallest of the largest elements of the columns is 5, this is the upper price of the game, the first column corresponds to it, therefore, the minimax strategy of the second player is the first.

Saddle point in matrix games

If the upper and lower price of the game are the same, then the matrix game is considered to have a saddle point. The converse is also true: if a matrix game has a saddle point, then the upper and lower prices of the matrix game are the same. The corresponding element is both the smallest in the row and the largest in the column, and equal to the price games.

Thus, if , then is the optimal pure strategy of the first player, and is the optimal pure strategy of the second player. That is, equal lower and upper prices of the game are achieved on the same pair of strategies.

In this case the matrix game has a solution in pure strategies .

Example 3 Given a matrix game with a payoff matrix

.

Solution. To the right of the payoff matrix, we write out the smallest elements in its rows and mark the maximum of them, and from the bottom of the matrix - the largest elements in the columns and select the minimum of them:

The lower price of the game is the same as the upper price of the game. Thus, the price of the game is 5. That is . The price of the game is equal to the value of the saddle point. The maximin strategy of the first player is the second pure strategy, and the minimax strategy of the second player is the third pure strategy. This matrix game has a solution in pure strategies.

Solve the matrix game problem yourself, and then see the solution

Example 4 Given a matrix game with a payoff matrix

.

Find the lower and upper price of the game. Does this matrix game have a saddle point?

Matrix games with optimal mixed strategy

In most cases, the matrix game does not have a saddle point, so the corresponding matrix game has no pure strategy solutions.

But it has a solution in optimal mixed strategies. To find them, it must be assumed that the game is repeated enough times that, based on experience, one can guess which strategy is preferable. Therefore, the decision is associated with the concept of probability and average (expectation). In the final solution, there is both an analog of the saddle point (that is, the equality of the lower and upper prices of the game), and an analog of the strategies corresponding to them.

So, in order for the first player to get the maximum average gain and for the second player to have the minimum average loss, pure strategies should be used with a certain probability.

If the first player uses pure strategies with probabilities , then the vector is called the mixed strategy of the first player. In other words, it is a "mixture" of pure strategies. The sum of these probabilities is equal to one:

.

If the second player uses pure strategies with probabilities , then the vector is called the mixed strategy of the second player. The sum of these probabilities is equal to one:

.

If the first player uses a mixed strategy p, and the second player - a mixed strategy q, then it makes sense expected value the first player wins (the second player loses). To find it, you need to multiply the first player's mixed strategy vector (which will be a one-row matrix), the payoff matrix, and the second player's mixed strategy vector (which will be a one-column matrix):

.

Example 5 Given a matrix game with a payoff matrix

.

Determine the mathematical expectation of the first player's gain (the second player's loss), if the mixed strategy of the first player is , and the mixed strategy of the second player is .

Solution. According to the formula for the mathematical expectation of the first player's gain (loss of the second player), it is equal to the product of the first player's mixed strategy vector, the payoff matrix, and the second player's mixed strategy vector:

The first player is called such a mixed strategy that would provide him with the maximum average payoff if the game is repeated a sufficient number of times.

Optimal mixed strategy The second player is called such a mixed strategy that would provide him with the minimum average loss if the game is repeated a sufficient number of times.

By analogy with the notation of maximin and minimax in the cases of pure strategies, optimal mixed strategies are denoted as follows (and are associated with mathematical expectation, that is, the average of the gain of the first player and the loss of the second player):

,

.

In this case, for the function E there is a saddle point , which means equality.

In order to find the optimal mixed strategies and saddle point, i.e. solve the matrix game in mixed strategies , you need to reduce the matrix game to a linear programming problem, that is, to an optimization problem, and solve the corresponding linear programming problem.

Reduction of a matrix game to a linear programming problem

In order to solve a matrix game in mixed strategies, you need to compose a straight line linear programming problem And its dual task. In the dual problem, the augmented matrix, which stores the coefficients of variables in the constraint system, constant terms, and coefficients of variables in the goal function, is transposed. In this case, the minimum of the goal function of the original problem is associated with the maximum in the dual problem.

Goal function in direct linear programming problem:

.

The system of constraints in the direct problem of linear programming:

Goal function in the dual problem:

.

The system of constraints in the dual problem:

Denote the optimal plan of the direct linear programming problem

,

and the optimal plan of the dual problem is denoted by

Linear shapes for relevant optimal plans denote and ,

and you need to find them as the sum of the corresponding coordinates of the optimal plans.

In accordance with the definitions of the previous section and the coordinates of optimal plans, the following mixed strategies of the first and second players are valid:

.

Mathematicians have proven that game price is expressed in terms of linear forms of optimal plans as follows:

,

that is, it is the reciprocal of the sums of the coordinates of the optimal plans.

We, practitioners, can only use this formula to solve matrix games in mixed strategies. Like formulas for finding optimal mixed strategies respectively the first and second players:

in which the second factors are vectors. Optimal mixed strategies are also vectors, as we already defined in the previous paragraph. Therefore, multiplying the number (the price of the game) by the vector (with the coordinates of the optimal plans), we also get a vector.

Example 6 Given a matrix game with a payoff matrix

.

Find the price of a game V and optimal mixed strategies and .

Solution. We compose the linear programming problem corresponding to this matrix game:

We get the solution of the direct problem:

.

We find the linear form of optimal plans as the sum of the found coordinates.

Consider a paired finite game. Let the player A has T personal strategies, which we denote

Let the player IN available P personal strategies, let's denote them. They say that the game has a dimension T X P.

As a result of the choice by the players of any pair of strategies, the outcome of the game is uniquely determined, i.e. win A;. player A(positive or negative) and losing (-ay) player IN. Let's assume that the values A.. known for any pair of strategies (A:, B;.). Matrix P =(a..), i == 1, 2, ..., mj = 1, 2, ..., P, whose elements are the payoffs corresponding to the strategies A. And bj, called payment matrix, or game matrix. General form such a matrix is ​​presented in Table. 12.1. The rows of this table correspond to the player's strategies A, and the columns are the player's strategies IN.

Table 12.1

Let's make a payoff matrix for the next game.

12.1. Search game.

Player A can hide in one of two shelters (I and II); player IN looking for a player A, and if he finds, he receives a fine of 1 den. units from A, otherwise pays the player A 1 day units It is necessary to build the payoff matrix of the game.

D e s h e n i s. For compiling payment matrix should analyze the behavior of each of the players. Player A can hide in shelter I – we denote this strategy by A v either in shelter II - strategy A. g Player IN can look for the first player in the shelter I - strategy IN(or in shelter II - strategy IN.,. If the player A is in hideout I and is discovered there by the player IN, those. a couple of strategies are being implemented ν IN{), then the player A pays a fine, i.e. A n = -1. Similarly, we get A. n = -1 (A 2, IN.,). Obviously, the strategies (A, IN.,) and (R2, /1,) give the player A win 1, so A P = a. n = I. Thus, for the game "search" of size 2x2, we get the payoff matrix:

Consider the game T X P with matrix P = a j) , i = 1,2, ..., τη; j= 1, 2, ..., and and determine the best among the strategies A at A v..., A m. Choosing a strategy A jy player A should expect the player IN will answer it with one of the strategies IN., for which the payoff for the player A minimal (player IN seeks to "harm" the player A).

Denote by a; player's lowest payoff A when he chooses strategy L; for all possible player strategies IN(smallest number in i-th line payoff matrix), i.e.

Among all numbers a (r = 1,2,..., T) choose the largest: . Let's call and the lower price of the game, or maximum payoff (maximin). This Guaranteed payoff of player A for any strategy of player B. Hence,

(12.2)

The strategy corresponding to the maximin is called maximin strategy. Player IN interested in reducing the player's payoff A; choosing a strategy IN., it takes into account the maximum possible payoff for A. Denote

Among all numbers β. choose the smallest

and call β top game price, or minimax payoff (minimax). This guaranteed loss of player B. Hence,

(12.4)

The minimax strategy is called minimax strategy.

The principle that dictates to players the choice of the most "cautious" minimax and maximin strategies is called the principle minimax. This principle follows from the reasonable assumption that each player seeks to achieve the opposite goal of the opponent. Let us determine the lower and upper prices of the game and the corresponding strategies in problem 12.1. Consider the payoff matrix

from problem 12.1. When choosing a strategy A, (the first row of the matrix) minimum win is equal to a, =min(-l; 1) = -1 and corresponds to the strategy β1 of the player IN. When choosing a strategy L 2 (second row of the matrix) the minimum payoff is A 2 = min(l; -1) = -1, it is achieved with the strategy IN.,.

Guaranteeing the maximum win for any strategy of the player IN, i.e. the lower price of the game a = max(a, a2) = max(-l; -1) = -1, player A can choose any strategy: Aj or A 2, i.e. any of his strategies is maximin.

Choosing strategy B, (column 1), the player IN understands that the player A will respond with a strategy A 2 to maximize your gain (loss IN). Therefore, the player's maximum loss IN when he chooses strategy B, is equal to β, = max(-1; 1) = 1.

Similarly, the maximum loss of player B (gain A) when he chooses strategy B2 (column 2) is equal to β2 = max(l; -1) = 1.

Thus, for any strategy of the player A the guaranteed minimum loss of player B is equal to β = πιίη(β1, β2) = min(l; 1) = 1- the top price of the game.

Any strategy of player B is minimax. By adding Table. 12.1 line β; and column a;, we get the table. 12.2. At the intersection of additional rows and columns, we will record the upper and lower prices of the games.

Table 12.2

In Problem 12.1 above, the upper and lower costs of the game are different: a f β.

If the upper and lower prices of the game are the same, then general meaning the upper and lower prices of the game α = β = υ is called the net price of the game, or the price of the game. The minimax strategies corresponding to the price of the game are optimal strategies and their totality the optimal solution or decision games. In this case the player A receives the maximum guaranteed (independent of the player's behavior) IN) the payoff is υ, and the player IN achieves the minimum guaranteed (regardless of the behavior of player L) loss υ. The solution to the game is said to have stability, those. if one of the players sticks to his optimal strategy, then it cannot be advantageous for the other to deviate from his optimal strategy.

A couple of pure strategies A. and V. gives optimal solution game if and only if the corresponding element r is both the largest in its column and the smallest in its row. Such a situation, if it exists, is called saddle point(similar to the surface of a saddle, which curves up in one direction and down in the other).

Denote A* And IN* are a pair of pure strategies on which the solution of the game in the problem with a saddle point is achieved. Let us introduce the payoff function of the first player on each pair of strategies: P(A:, IN-) = and at. Then the optimality condition at the saddle point satisfies the double inequality: P(Aj, B*)<Р(А*, В*)<Р(А", В ), which is true for everyone i = 1, 2, ..., m;j = 1, 2, ..., P. Indeed, the choice of strategy A* the first player under the optimal strategy IN" the second player maximizes the minimum possible payoff: P(A*, B")> P(A G IN"), and the choice of strategy B" the second player, with the optimal strategy of the first one, minimizes the maximum loss: P(D , IN*)<Р(А", В).

12.2. Determine the lower and upper price of the game given by the payoff matrix

Does the game have a saddle point?

Table 12 3

Solution. All calculations are conveniently carried out in a table in which, in addition to the matrix R, entered column a; and line )

Similar articles