Games in pure strategies. Game Theory and Statistical Decisions

01.03.2019

Description of the bimatrix game. All games considered so far belonged to the class of zero-sum games. However, a number of conflict situations that arise in practice are characterized by the fact that the gain of one side is not exactly equal to the loss of the other. The game-theoretic models of such situations are non-cooperative non-zero-sum games. Such games are called bimatrix games, because specifying such a game reduces to specifying two payoff matrices A and B of the same dimension.

A play of the bimatrix game consists in the independent choice of a row number i by player I and a column number j by player II, after which player I receives the payoff a_ij and player II receives the payoff b_ij.

The row numbers of the matrices A and B will be called the pure strategies of player I, and the column numbers of these matrices the pure strategies of player II. Then pairs of the form (i, j) are the situations of the bimatrix game in pure strategies, and the numbers a_ij and b_ij are the payoffs of players I and II in the situation (i, j). Accordingly, a probability distribution p = (p_1, ..., p_m) over the pure strategies of player I and a probability distribution q = (q_1, ..., q_n) over the pure strategies of player II will be called mixed strategies. Then pairs of the form (p, q) are the situations of the bimatrix game in mixed strategies, and the numbers E_A(p, q) and E_B(p, q) are the expected payoffs of players I and II.

An equilibrium situation of a bimatrix game in mixed strategies is a pair (p*, q*) such that

E_A(p, q*) ≤ E_A(p*, q*) for every mixed strategy p,
E_B(p*, q) ≤ E_B(p*, q*) for every mixed strategy q,        (8.2)

where E_A(p*, q*) is the expected payoff of player I;

E_B(p*, q*) is the expected payoff of player II;

p* is the optimal mixed strategy of player I;

q* is the optimal mixed strategy of player II.
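Condition (8.2) can be checked numerically: because the expected payoff is linear in each player's own mixed strategy, it is enough to compare the equilibrium payoff with the payoffs of deviations to pure strategies. Below is a minimal Python sketch; the 2 x 2 matrices A and B in it are hypothetical and serve only as an illustration.

import numpy as np

def expected_payoffs(A, B, p, q):
    """Expected payoffs E_A(p, q) and E_B(p, q) of players I and II."""
    p, q = np.asarray(p, float), np.asarray(q, float)
    return p @ A @ q, p @ B @ q

def is_equilibrium(A, B, p, q, tol=1e-9):
    """Check condition (8.2): no player gains by a unilateral deviation.
    By linearity it suffices to compare with deviations to pure strategies."""
    p, q = np.asarray(p, float), np.asarray(q, float)
    eA, eB = expected_payoffs(A, B, p, q)
    best_I = (A @ q).max()     # best payoff player I could get against q
    best_II = (p @ B).max()    # best payoff player II could get against p
    return best_I <= eA + tol and best_II <= eB + tol

# Hypothetical 2x2 bimatrix game used only for illustration
A = np.array([[4.0, 0.0], [0.0, 2.0]])   # payoffs of player I
B = np.array([[2.0, 0.0], [0.0, 4.0]])   # payoffs of player II
print(is_equilibrium(A, B, [1, 0], [1, 0]))   # True: an equilibrium situation
print(is_equilibrium(A, B, [1, 0], [0, 1]))   # False: player I would deviate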

Task

Construction and solution of a bimatrix game. Suppose that a country's anti-submarine submarine is searching for a state's missile submarine that is maneuvering in a strictly defined part of a combat patrol area. The state's anti-submarine submarine operates in the rest of the area and searches for the country's anti-submarine submarine. To detect the enemy, each anti-submarine boat can use its hydroacoustic (sonar) station either in the active mode, switching it on periodically, or only in the passive mode, carrying out a continuous search.

Both the anti-submarine submarine and the missile submarine can evade the enemy once they detect its sonar signals. However, the periodic switching on of the sonar makes detection possible, though unreliable.

In such a conflict situation one of the players is the country's anti-submarine submarine, and the other is the state's anti-submarine submarine. Obviously, the missile submarine cannot be a player, since it has only one mode of action: to maneuver covertly and evade when it detects sonar signals.

What is characteristic here is that each of the players pursues different, but not opposite, goals. Indeed, the goal of the country's anti-submarine submarine is to detect the missile submarine, while the goal of the state's anti-submarine submarine is to detect the country's anti-submarine submarine. Therefore, to assess how well each player achieves its goal, depending on the chosen methods of action (strategies), it is necessary to have two efficiency criteria and, accordingly, two payoff functions. The model of such a conflict situation is then a finite non-zero-sum game described by two matrices A and B of the same form, i.e. a bimatrix game.

Let us take as the efficiency criterion of the anti-submarine submarine (player I) the probability of detecting the missile submarine, and as the efficiency criterion of the opposing anti-submarine submarine (player II) the probability of detecting the country's anti-submarine submarine. Then the bimatrix game is given by a matrix A (Fig. 9a) and a matrix B (Fig. 9b).


Fig. 9a. Matrix A.

Fig. 9b. Matrix B.

Here, for each player, the first strategy is the use of the active sonar mode and the second strategy is the use of the passive mode.

The two-player zero-sum matrix game can be viewed as the following abstract two-player game.

The first player has m strategies i = 1, 2, ..., m, the second has n strategies j = 1, 2, ..., n. Each pair of strategies (i, j) is assigned a number a_ij expressing the payoff of player 1 at the expense of player 2 when the first player uses his i-th strategy and the second player his j-th strategy.

Each player makes one move: player 1 chooses his i-th strategy (i = 1, ..., m), player 2 his j-th strategy (j = 1, ..., n), after which player 1 wins a_ij at the expense of player 2 (if a_ij < 0, this means that player 1 pays the second player |a_ij|). This ends the play of the game.

Each individual strategy i = 1, ..., m of player 1 and j = 1, ..., n of player 2 is often called a pure strategy.

If we consider the matrix

A = (a_ij), i = 1, ..., m; j = 1, ..., n,

then each play of the matrix game with the matrix A comes down to player 1 choosing the i-th row and player 2 the j-th column, after which player 1 (at the expense of player 2) receives the payoff a_ij.

The central concept in the study of games is that of the players' optimal strategies. Intuitively, a player's strategy is optimal if applying it guarantees him the greatest possible payoff against all possible strategies of the other player. From this standpoint, player 1 examines the payoff matrix A as follows: for each value of i (i = 1, ..., m) he determines the minimum payoff over the strategies of player 2,

α_i = min_j a_ij   (i = 1, ..., m),

i.e. the minimum payoff guaranteed to player 1 if he uses his i-th pure strategy; then among these minimum payoffs he finds a strategy i = i_0 for which this minimum payoff is maximal, i.e.

α = max_i min_j a_ij = α_{i_0}.        (1)

Definition. The number α defined by formula (1) is called the lower pure value of the game; it shows what minimum payoff player 1 can guarantee himself with his pure strategies against all possible actions of player 2.

Player 2, behaving optimally, should strive by means of his strategies to minimize the payoff of player 1 as much as possible. Therefore, for player 2 one determines

β_j = max_i a_ij,

i.e. the maximum payoff of player 1 when player 2 applies his j-th pure strategy; then player 2 finds a strategy j = j_0 for which player 1 obtains the minimum of these payoffs, i.e.

β = min_j max_i a_ij = β_{j_0}.        (2)

Definition. The number β determined by formula (2) is called the upper pure value of the game; it shows the largest payoff that player 2, by means of his pure strategies, allows player 1 to obtain.

In other words, by applying his pure strategies player 1 can secure a payoff of no less than α, and player 2, by applying his pure strategies, can prevent player 1 from winning more than β.

Definition. If in the game with matrix A the equality α = β holds, then the game is said to have a saddle point in pure strategies, and the pure value of the game is

v = α = β.

A saddle point is a pair of pure strategies (i_0, j_0) of players 1 and 2, respectively, at which the equality α = β is achieved. The concept has the following meaning: if one of the players adheres to the strategy corresponding to the saddle point, then the other player can do no better than to adhere to the strategy corresponding to the saddle point as well. Mathematically, this can be written as

a_{i j_0} ≤ a_{i_0 j_0} ≤ a_{i_0 j},        (3)

where i, j are any pure strategies of players 1 and 2, respectively, and (i_0, j_0) are the strategies forming the saddle point.

Thus, by (3), the saddle element a_{i_0 j_0} is the minimum in the i_0-th row and the maximum in the j_0-th column of the matrix A. A saddle point of the matrix A is found as follows: in the matrix A, take the minimum element of each row in turn and check whether this element is the maximum of its column. If it is, it is a saddle element, and the pair of strategies corresponding to it forms a saddle point. The pair of pure strategies (i_0, j_0) of players 1 and 2 forming a saddle point, together with the saddle element a_{i_0 j_0}, is called a solution of the game. Here i_0 and j_0 are called the optimal pure strategies of players 1 and 2, respectively.
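As an illustration, here is a minimal Python sketch of this search procedure (it also computes the lower and upper values); the matrix used is the 4 x 4 payoff matrix from Task 1 below.

import numpy as np

def solve_in_pure_strategies(A):
    """Lower value, upper value and saddle points of a matrix game,
    following the row-minimum / column-maximum procedure described above."""
    A = np.asarray(A, dtype=float)
    row_min = A.min(axis=1)          # guaranteed payoffs of player 1
    col_max = A.max(axis=0)          # guaranteed losses of player 2
    alpha = row_min.max()            # lower value of the game, formula (1)
    beta = col_max.min()             # upper value of the game, formula (2)
    saddle = [(i, j) for i in range(A.shape[0]) for j in range(A.shape[1])
              if A[i, j] == row_min[i] == col_max[j]]
    return alpha, beta, saddle

# The 4x4 payoff matrix from Task 1 below
A = [[3, 1, 2, 5],
     [2, 0, 0, 3],
     [-3, -5, -5, -2],
     [0, -2, -2, 1]]
alpha, beta, saddle = solve_in_pure_strategies(A)
print(alpha, beta, saddle)   # 1.0 1.0 [(0, 1)] -> saddle point (A1, B2), value 1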

Example 1

The saddle point is the pair (i_0 = 3; j_0 = 1), at which v = α = β = 2.

Note that although the payoff in situation (3; 3) is also equal to 2 = α = β, it is not a saddle point, because this payoff is not the maximum among the payoffs of the third column.

Example 2

From the analysis of the payoff matrix it can be seen that α < β, i.e. this matrix has no saddle point. If player 1 chooses his pure maximin strategy i = 2, then player 2, by choosing his minimax strategy j = 2, will lose only 20. In that case it becomes profitable for player 1 to choose strategy i = 1, i.e. to deviate from his pure maximin strategy, and win 30. Then it becomes profitable for player 2 to choose strategy j = 1, i.e. to deviate from his pure minimax strategy, and lose only 10. In turn, player 1 should then choose his 2nd strategy to win 40, player 2 responds by choosing his 2nd strategy, and so on.

Consider an example. Let the game matrix (4) be given. It is required to find the lower value of the game α, the upper value of the game β and the minimax strategies, and to check whether they are stable.

Solution. From the analysis of the auxiliary column and row we obtain α = 5, β = 5. The maximin equals the minimax! This is a special case. What follows from it? Take the pair of minimax strategies K2 and C3. If both players stick to these strategies, the payoff will be 5. Now suppose we learn how the opponent is going to behave. What do we do? Nothing! We keep to strategy K2, because any deviation from it is unprofitable for us. Whether or not we know the opponent's behaviour, we will still stick to K2. The same applies to the "blues": there is no reason for them to change strategy C3.

In this example the pair of strategies K2 and C3 is stable, i.e. it constitutes an equilibrium position and yields a solution of the game. Why is that? Because the matrix contains a special element, 5: it is the minimum of its row and at the same time the maximum of its column. Such an element is called a saddle point. If the matrix has a saddle point (i.e. the lower value of the game equals the upper one), then the game has a solution in pure strategies: it is the pair of strategies that intersect at the saddle point. The saddle point itself gives the value of the game; in our example it equals 5.

The class of games that have a saddle point is of great importance in game theory. In particular, it has been proved that if by the rules of the game each player knows the result of all previous moves, both his own and the opponent's (a so-called game with complete information), then the game has a saddle point and hence a solution in pure strategies. Examples of games with complete information are chess, checkers, tic-tac-toe, etc.

Let us give an example of a game with complete information whose solution is easy to find. Two players, K and C, alternately place identical coins on a round table. The position of each coin is chosen arbitrarily, as long as it does not overlap the others. The winner is the player who places the last coin (when there is no room left for any more). A little thought shows that the outcome of this game is always a foregone conclusion and that there is a well-defined strategy guaranteeing the win to the player who places a coin first (let it be K). Namely, K must place the first coin at the centre of the table and then answer each move of C with a move exactly symmetric with respect to the centre of the table. Poor C may behave as he pleases; there is no salvation for him. Obviously, such a game makes sense only for those who do not know its solution.

It is curious that the same is true of so popular a game as chess. Chess makes sense only as long as its solution has not been found. It has been proved theoretically that a solution exists and that the outcome of a chess game is, in principle, predetermined: if each side uses its optimal strategy, then the game will either always end in a win for White, or always in a win for Black, or always in a draw. But which of these? We do not yet know, since the number of possible strategies is far too large to construct the matrix of the chess game and find a saddle point in it. Chess lovers are probably pleased that chess will not be solved any time soon.

In conclusion, we note that a matrix may have not one but several saddle points; then the game has as many solutions in pure strategies as there are saddle points, and each of them yields a payoff equal to the value of the game.

A pure strategy of player I is the choice of one of the m rows of the payoff matrix A, and a pure strategy of player II is the choice of one of the n columns of the same matrix.

Optimal pure strategies differ from mixed strategies in that one of their components is necessarily equal to one: p_i = 1 or q_j = 1. For example: P = (1, 0), Q = (1, 0); here p_1 = 1, q_1 = 1.

Task 1
Based on the payoff matrix, find the optimal pure strategies using the principle of strict dominance. As an answer, write down the vectors P*, Q*.



        R1   R2   R3   R4
S1       3    1    2    5
S2       2    0    0    3
S3      -3   -5   -5   -2
S4       0   -2   -2    1

Solution:

We solve all problems using the Matrix Game calculator.

We assume that player I chooses his strategy in such a way as to maximize his payoff, and player II chooses his strategy in such a way as to minimize the payoff of player I.

Players       B1   B2   B3   B4   a = min(Ai)
A1             3    1    2    5        1
A2             2    0    0    3        0
A3            -3   -5   -5   -2       -5
A4             0   -2   -2    1       -2
b = max(Bi)    3    1    2    5
We find the guaranteed payoff determined by the lower price of the game a = max(a_i) = 1, which corresponds to the maximin pure strategy A1.
The upper price of the game is b = min(b_j) = 1.
The saddle point (1, 2) indicates the solution given by the pair of alternatives (A1, B2). The value of the game is 1.
2. Check the payoff matrix for dominant rows and dominant columns.
Sometimes, based on a simple consideration of the game matrix, it can be said that some pure strategies can enter the optimal mixed strategy only with zero probability.
We say that the i-th strategy of player 1 dominates his k-th strategy if a_ij ≥ a_kj for all j ∈ N and a_ij > a_kj for at least one j. In this case the i-th strategy (row) is called dominant and the k-th dominated.
We say that the j-th strategy of player 2 dominates his l-th strategy if a_ij ≤ a_il for all i ∈ M and a_ij < a_il for at least one i. In this case the j-th strategy (column) is called dominant and the l-th dominated.
Strategy A1 dominates strategy A2 (all elements of row 1 are greater than or equal to the corresponding elements of row 2), therefore we exclude the 2nd row of the matrix. Probability p_2 = 0.
Strategy A1 dominates strategy A3 (all elements of row 1 are greater than or equal to the corresponding elements of row 3), therefore we exclude the 3rd row of the matrix. Probability p_3 = 0.
3 1 2 5
0 -2 -2 1

From the standpoint of player B's losses, strategy B2 dominates strategy B1 (all elements of column 1 are greater than the corresponding elements of column 2), therefore we exclude the 1st column of the matrix. Probability q_1 = 0.
From the standpoint of player B's losses, strategy B2 also dominates strategy B4 (all elements of column 4 are greater than the corresponding elements of column 2), therefore we exclude the 4th column of the matrix. Probability q_4 = 0.
1 2
-2 -2

We reduced the 4 x 4 game to a 2 x 2 game.
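Note that, by the definitions above, strategy A1 also dominates A4, so full iterated elimination would reduce the game even further, to the single cell (A1, B2); the partial reduction above is nevertheless sufficient, and the solution continues below with the 2 x 2 matrix obtained. A minimal Python sketch of the elimination procedure (function and variable names are ours):

import numpy as np

def reduce_by_dominance(A):
    """Iteratively delete dominated rows (for player A) and dominated
    columns (for player B), as in step 2 of the solution above.
    Returns the reduced matrix and the surviving row/column indices."""
    A = np.asarray(A, dtype=float)
    rows = list(range(A.shape[0]))
    cols = list(range(A.shape[1]))
    changed = True
    while changed:
        changed = False
        M = A[np.ix_(rows, cols)]
        # a row is dominated if another row is >= everywhere and > somewhere
        for k in range(len(rows)):
            if any(i != k and (M[i] >= M[k]).all() and (M[i] > M[k]).any()
                   for i in range(len(rows))):
                del rows[k]; changed = True; break
        if changed:
            continue
        M = A[np.ix_(rows, cols)]
        # a column is dominated if another column is <= everywhere and < somewhere
        for l in range(len(cols)):
            if any(j != l and (M[:, j] <= M[:, l]).all() and (M[:, j] < M[:, l]).any()
                   for j in range(len(cols))):
                del cols[l]; changed = True; break
    return A[np.ix_(rows, cols)], rows, cols

A = [[3, 1, 2, 5], [2, 0, 0, 3], [-3, -5, -5, -2], [0, -2, -2, 1]]
M, rows, cols = reduce_by_dominance(A)
print(rows, cols)   # [0] [1]: only A1 and B2 survive, i.e. P=(1,0,0,0), Q=(0,1,0,0)
print(M)            # [[1.]]: the surviving element, equal to the value of the game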



We solve the resulting (2 x n) game from the position of player A, who adheres to his maximin strategy; the reduced matrix has a saddle point, so the solution is in pure strategies:
p_1 = 1
p_2 = 0
Price of the game: y = 1
Now we can find the minimax strategy of player B by writing the corresponding system of equations
q 1 = 1
q1 +q2 = 1
Solving this system, we find:
q1 = 1.
Answer:
Game price: y = 1, players strategy vectors:
Q(1, 0), P(1, 0)

∑_j a_ij q_j ≤ v
∑_i a_ij p_i ≥ v
M(P1; Q) = (1·1) + (2·0) = 1 = v
M(P2; Q) = (-2·1) + (-2·0) = -2 ≤ v
M(P; Q1) = (1·1) + (-2·0) = 1 = v
M(P; Q2) = (2·1) + (-2·0) = 2 ≥ v

Since rows and columns were removed from the original matrix, the found probability vectors can be written as:
P(1,0,0,0)
Q(0,1,0,0)
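The same optimality check can be carried out on the original 4 x 4 matrix with the full probability vectors; a minimal Python sketch:

import numpy as np

# Verify the optimality criterion for the answer to Task 1:
# sum_j a_ij*q_j <= v for every row i, and sum_i a_ij*p_i >= v for every column j.
A = np.array([[3, 1, 2, 5], [2, 0, 0, 3], [-3, -5, -5, -2], [0, -2, -2, 1]], dtype=float)
P = np.array([1, 0, 0, 0], dtype=float)
Q = np.array([0, 1, 0, 0], dtype=float)
v = 1.0

print(A @ Q)   # payoffs of player I's pure strategies against Q: [ 1.  0. -5. -2.] (all <= v)
print(P @ A)   # losses of player II's pure strategies against P: [3. 1. 2. 5.] (all >= v)
print((A @ Q <= v).all() and (P @ A >= v).all())   # True -> the solution is optimal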

Task 2
Using the payoff matrix, find the lower and upper price of the game. In the presence of a saddle point, write down the vectors of optimal pure strategies P*, Q*.



        R1   R2   R3
S1      -6   -5    0
S2      -8   -3   -2
S3      -3   -2    3

Solution:
1. Check if the payoff matrix has a saddle point. If yes, then we write out the solution of the game in pure strategies.
Players       B1   B2   B3   a = min(Ai)
A1            -6   -5    0       -6
A2            -8   -3   -2       -8
A3            -3   -2    3       -3
b = max(Bi)   -3   -2    3

We find the guaranteed payoff determined by the lower price of the game a = max(a_i) = -3, which corresponds to the maximin pure strategy A3.
The upper price of the game is b = min(b_j) = -3.
The saddle point (3, 1) indicates the solution given by the pair of alternatives (A3, B1). The price of the game is -3.
Answer: P(0,0,1), Q(1,0,0)

Task 3
Using the payoff matrix, find the vectors of optimal strategies P*, Q* and the price of the game. Which player is the winner?



        R1   R2   R3   R4
S1      -6   -6    2    4
S2       2   -2    7   -1

Solution:
1. Check if the payoff matrix has a saddle point. If yes, then we write out the solution of the game in pure strategies.
We assume that player I chooses his strategy in such a way as to maximize his payoff, and player II chooses his strategy in such a way as to minimize the payoff of player I.
Players       B1   B2   B3   B4   a = min(Ai)
A1            -6   -6    2    4       -6
A2             2   -2    7   -1       -2
b = max(Bi)    2   -2    7    4

We find the guaranteed payoff determined by the lower price of the game a = max(a_i) = -2, which corresponds to the maximin pure strategy A2.
The upper price of the game is b = min(b_j) = -2.
The saddle point (2, 2) indicates the solution given by the pair of alternatives (A2, B2). The price of the game is -2.
3. We find a solution to the game in mixed strategies.
We solve the problem by a geometric method, which includes the following steps:
1. In the Cartesian coordinate system, a segment is plotted along the abscissa axis, the length of which is equal to 1. The left end of the segment (point x = 0) corresponds to strategy A 1, the right end - to strategy A 2 (x = 1). Intermediate points x correspond to the probabilities of some mixed strategies S 1 = (p 1 ,p 2).
2. The payoffs of strategy A 1 are plotted on the left y-axis. On the line parallel to the y-axis, the payoffs of strategy A 2 are plotted from point 1.
We solve the (2 x n) game from the position of player A, who adheres to his maximin strategy. Neither player has dominating or duplicated strategies.

The maximin optimal strategy of player A corresponds to the point N, for which we can write the following system of equations:
p1 = 0
p2 = 1
Game price, y = -2
Now we can find the minimax strategy of player B by writing the corresponding system of equations, eliminating the strategies B1, B3, B4, which give player B a clearly greater loss, so that q_1 = 0, q_3 = 0, q_4 = 0.
-2q 2 = -2
q2 = 1
Solving this system, we find:
q2 = 1.
Answer:
Game price: y = -2, players strategy vectors:
Q(0, 1, 0, 0), P(0, 1)
4. Check the correctness of the game solution using the strategy optimality criterion.
∑a ij q j ≤ v
∑a ij p i ≥ v
M(P1; Q) = (-6·0) + (-6·1) + (2·0) + (4·0) = -6 ≤ v
M(P2; Q) = (2·0) + (-2·1) + (7·0) + (-1·0) = -2 = v
M(P; Q1) = (-6·0) + (2·1) = 2 ≥ v
M(P; Q2) = (-6·0) + (-2·1) = -2 = v
M(P; Q3) = (2·0) + (7·1) = 7 ≥ v
M(P; Q4) = (4·0) + (-1·1) = -1 ≥ v
All inequalities are satisfied as equalities or strict inequalities, therefore, the solution of the game is found correctly.
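Although Task 3 ends with a pure-strategy solution, the lower-envelope computation behind the graphical method for 2 x n games can also be reproduced numerically. A grid-based Python sketch (the function name and grid size are our own choices), applied to the matrix of Task 3:

import numpy as np

def solve_2xn(A, grid=100001):
    """Graphical (lower-envelope) method for a 2 x n game, evaluated on a grid:
    for each p = P(strategy A1) compute the worst-case payoff over B's columns
    and take the p that maximizes it."""
    A = np.asarray(A, dtype=float)
    p = np.linspace(0.0, 1.0, grid)
    # payoff against column j as a function of p: p*a_1j + (1-p)*a_2j
    payoffs = np.outer(p, A[0]) + np.outer(1 - p, A[1])
    envelope = payoffs.min(axis=1)           # guaranteed payoff of player A
    k = envelope.argmax()                    # point N of the maximin strategy
    return p[k], 1 - p[k], envelope[k]

# Matrix from Task 3
A = [[-6, -6, 2, 4],
     [2, -2, 7, -1]]
p1, p2, value = solve_2xn(A)
print(round(p1, 3), round(p2, 3), round(value, 3))   # 0.0 1.0 -2.0: pure strategy A2, price -2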

Task 4
Give a detailed answer to the question

A pure strategy is a deterministic (randomness-free) plan of action. In the previous chapter we considered only pure strategies. Mixed strategies will be discussed in Section 2.2; for now, unless otherwise stated, by a strategy we always mean a pure strategy.

Very often in the process of presentation we will illustrate the concepts of solution with examples of bimatrix games, so we will give the corresponding definitions.

Definition 2.1. A finite game is a game in which the set of players and the set of strategies of each player contain a finite number of elements. A finite two-person game is called a bimatrix game.

The latter name comes from a convenient form of recording the payoffs in such a game: a double matrix.

For further analysis it is convenient to split an arbitrary strategy profile s into the strategy of some i-th player, s_i, and the strategies of all the other players, s_{-i}. Formally, s = (s_i, s_{-i}). This does not mean that we permute the coordinates of the strategy profile; we merely introduce another way of denoting it.

The first concept of solving the game that we will consider is equilibrium in dominant strategies.

Definition 2.2. A strategy s_i of the i-th player strictly dominates his strategy s'_i if u_i(s_i, s_{-i}) > u_i(s'_i, s_{-i}) for any set s_{-i} of the strategies of the remaining players. In this case the strategy s'_i is called strictly dominated.

Essentially, this means that for any fixed set of strategies of the remaining players, the i-th player obtains a strictly greater payoff by choosing strategy s_i than by choosing strategy s'_i. It is logical to assume that a rational player should not choose strictly dominated strategies. In the simplest games this assumption may already be sufficient to find a solution of the game.

Definition 2.3. A strategy profile s* = (s*_1, s*_2, ..., s*_n) is called an equilibrium in (strictly) dominant strategies if for every i-th player the strategy s*_i strictly dominates any other of his strategies.

It may seem that this solution concept can only lead to trivial conclusions. Each player has among his strategies one that gives him a larger payoff than any other, no matter how his opponents act, and it is exactly this strategy that he plays in equilibrium. Everything looks quite obvious. Yet precisely this situation is typical of perhaps the most famous game, one very important for the analysis of a number of practical situations: the prisoners' dilemma.

Example 2.1 (prisoners' dilemma). Two criminals are held in custody in different cells and cannot communicate. The investigation has enough evidence to convict each of them of a minor crime carrying one year in prison. For a major crime, which carries ten years in prison, the investigation does not have enough evidence. The investigators offer each criminal a deal: his term will be reduced by one year if he testifies against his partner, and this testimony will be enough to charge the partner with the major crime. Assume that the criminals care only about the number of years they will spend in prison, each additional year being minus one unit of utility. Then the payoffs of the criminals can be represented by the following double matrix (rows: strategies of the first prisoner, columns: strategies of the second):

                   testify        stay silent
testify           (-9, -9)         (0, -10)
stay silent      (-10, 0)          (-1, -1)

In the case when the participants in the game are not named, we will assume that the different strategies of the first participant correspond to the rows of the double matrix, and the strategies of the second participant correspond to the columns. If in our example the first prisoner testifies and the second does not testify, then the first will be released, and the second will receive ten years in prison.

It is easy to see that, no matter how the other prisoner acts, the payoff is greater (the prison term is shorter) if one testifies (for the first player, the first coordinates in the first row of the double matrix are strictly greater than those in the second row; for the second player, the second coordinates in the first column are strictly greater than those in the second column). Hence the equilibrium in dominant strategies is the strategy profile (testify, testify).
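This dominance check can be written out explicitly. A minimal Python sketch, assuming the payoffs are minus the prison terms implied by the description above (both silent: one year each; exactly one testifies: he goes free, the other serves ten years; both testify: nine years each):

import numpy as np

# Double matrix of the prisoners' dilemma; strategy 0 = "testify", 1 = "stay silent".
U1 = np.array([[-9,  0],
               [-10, -1]])   # payoffs of the first prisoner (rows = his strategies)
U2 = np.array([[-9, -10],
               [0,  -1]])    # payoffs of the second prisoner (columns = his strategies)

# "Testify" strictly dominates "stay silent" for player 1 ...
print((U1[0] > U1[1]).all())          # True: -9 > -10 and 0 > -1
# ... and for player 2:
print((U2[:, 0] > U2[:, 1]).all())    # True
# Hence (testify, testify) is the equilibrium in dominant strategies.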

What is interesting about this example is that the players, choosing behavior that increases their payoff, end up in a situation where their payoffs are low compared to the opposite situation where both choose to remain silent. The explanation lies in the presence of a strong external effect, i.e. the strong influence of the actions of one player on the payoffs of another player. As a result, the equilibrium profile of strategies turns out to be the only Pareto inefficient one in this game. Note that Pareto efficiency, desirable from the point of view of the participants in the game, may not be desirable from a social point of view, as in this case.

Situations like the Prisoner's Dilemma often occur in the analysis of economic situations. Consider, for example, a competition between two stores selling a similar set of products. For simplicity, let's assume that stores can charge only two price levels - high or low. Consumers naturally prefer to buy from a store with lower prices. Then the payoffs of stores, characterized by their profits, may look, for example, as follows:


From the point of view of equilibrium, the situation here is analogous to the prisoners' dilemma: the equilibrium in dominant strategies (low prices, low prices) is the only Pareto-inefficient profile (though socially it is the desirable one).

The wide popularity of the prisoners' dilemma already mentioned is the reason why it was used to test experimentally whether the predictions of game theory are correct. The test consisted in offering two strangers a game for money, with prizes (for example, in dollars) close to those indicated for the game of the two stores. Each participant made his decision separately (often anonymously) and did not learn the decision of the other player before receiving his winnings. It turned out that under such conditions, in many plays of the game, the players did not arrive at the equilibrium outcome, if we assume that the cash prizes correctly measure their payoffs. Of course, it does not follow from these experiments that the predictions of game theory are wrong, but only that, in evaluating their payoffs, the players took into account non-monetary factors: considerations of altruism, fairness, and so on. If the players' payoffs are estimated correctly, then the players should prefer the dominant strategy and hence choose it (in the spirit of revealed preference in microeconomics). Therefore, the value of experiments of this kind lies not in testing game-theoretic predictions but in assessing the role of non-material motivation in the actions of individuals.

Game theory uses the concept of weak dominance considerably less often than the concept of strict dominance.

Definition 2.4. A strategy s_i of the i-th player weakly dominates his strategy s'_i if u_i(s_i, s_{-i}) ≥ u_i(s'_i, s_{-i}) for any set s_{-i} of the other players' strategies and, moreover, the inequality is strict for at least one such set. The strategy s'_i is then called weakly dominated.

In the case of non-strict inequalities it can no longer be asserted that a rational player will not choose a weakly dominated strategy, although such behavior seems quite logical. There is also, although it is rarely used, a definition of equilibrium in weakly dominant strategies analogous to the strict-dominance case.

Definition 2.5. A strategy profile s* = (s*_1, s*_2, ..., s*_n) is called an equilibrium in weakly dominant strategies if for every i-th player the strategy s*_i weakly dominates any other of his strategies.

Example 2.2 (sealed-bid second-price auction). A sealed-bid second-price auction is held between two persons. The auction is organized as follows. Each participant indicates a non-negative bid without knowing the bids of the other participants (in an envelope). The participant who made the highest bid pays the maximum among the other participants' bids (i.e. the amount of the second-highest bid) and receives the item. If, for example, the players' bids are 100 and 90, then the participant who bid 100 wins the auction and acquires the item for 90, the size of the second bid. Let each participant have a valuation of the item, expressed in monetary units, v_1, v_2 > 0. These valuations are known to all participants. For simplicity of describing the game, let the item go to the first participant if both participants indicate the same bid.

In this game the strategy of the first player is the size s_1 of his bid. Since bids are non-negative, the set of all his possible strategies is S_1 = [0, +∞), and similarly for the second player. Compare the bid s_1 = v_1 with any other bid s'_1, for an arbitrary fixed bid s_2 of the opponent. If the bid v_1 wins, the first player's payoff is u_1(v_1, s_2) = v_1 - s_2 ≥ 0, which is at least as large as the payoff from any other bid (that bid either also wins and gives the same v_1 - s_2, or loses and gives 0). If the bid v_1 loses, then s_2 > v_1, and any winning bid would give the negative payoff v_1 - s_2 < 0, so losing is no worse. Hence the strategy s_1 = v_1 weakly dominates any other strategy s'_1.

We have shown that for the first player the strategy of naming his own valuation as his bid weakly dominates any other strategy. It is easy to check that a similar statement is true for the second player as well. Note that in our reasoning we never used the fact that a player knows the valuation of the other player; hence, even in a game with incomplete information, in a sealed-bid second-price auction naming one's own valuation is no less profitable than making any other bid.
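The weak-dominance claim of Example 2.2 can be illustrated numerically. A minimal Python sketch; the valuation v1 = 10 and the bid grids are hypothetical, and ties are resolved in favor of player 1 as in the text:

v1 = 10.0
own_bids = [i * 0.5 for i in range(41)]     # 0.0, 0.5, ..., 20.0: alternative bids of player 1
opp_bids = [i * 0.25 for i in range(81)]    # 0.0, 0.25, ..., 20.0: possible bids of player 2

def payoff1(b1, b2):
    """Payoff of player 1 (valuation v1): he wins ties and pays the opponent's bid b2."""
    return v1 - b2 if b1 >= b2 else 0.0

# Bidding v1 is never worse than any other bid, against every opposing bid ...
never_worse = all(payoff1(v1, b2) >= payoff1(b1, b2)
                  for b1 in own_bids for b2 in opp_bids)
# ... and, for every other bid, strictly better against at least one opposing bid.
sometimes_better = all(any(payoff1(v1, b2) > payoff1(b1, b2) for b2 in opp_bids)
                       for b1 in own_bids if b1 != v1)
print(never_worse, sometimes_better)   # True True: truthful bidding weakly dominates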

It may seem that arranging a second-price auction is unprofitable for the seller, since he could arrange a first-price auction and receive the value of the first bid rather than the second. However, in a first-price auction the equilibrium bids will be lower. We will say more about the revenue of auctions in Chapter 5. For now we note that the second-price auction is very popular and is widely used, for example, by Google and Yandex when selling contextual advertising on the Internet.

An equilibrium in dominant strategies exists only in a small class of games. Typically players do not have a single strategy that dominates all the others. But the concept of dominance allows one to find solutions in a wider class of games. To do this, one has to reason consistently about the actions of the players. We have already noted that a rational player will not choose a strictly dominated strategy. But this means that the other player can analyze the game ignoring the possibility of the opponent choosing such a strategy. Perhaps such an analysis will reveal that the other player has a dominated strategy that was not dominated in the original game. And so on. Let us give a formal definition.

The process of sequential (iterated) elimination of strictly dominated strategies is defined as follows. We exclude all strictly dominated strategies of the players from consideration, i.e. we consider a new game in which all dominated strategies are removed from the players' sets of possible strategies. Then in this new game we again eliminate all strictly dominated strategies, and so on.

It is possible that this process ends with several strategies left to each player, but it may also happen that each player is left with only one non-excluded strategy; in that case it is natural to regard the profile of these strategies as a solution of the game.

Definition 2.6. If, as a result of the sequential elimination of strictly dominated strategies, each player is left with a single strategy, then the profile of these strategies is called a dominance equilibrium.
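A minimal Python sketch of this process for a finite two-player game given by its two payoff matrices (only domination by pure strategies is checked); applied to the prisoners' dilemma of Example 2.1 it leaves the single profile (testify, testify):

import numpy as np

def iterated_elimination(U1, U2):
    """Iterated elimination of strictly dominated strategies in a bimatrix game.
    U1[i, j], U2[i, j] are the payoffs of players 1 and 2.
    Returns the surviving strategy indices of both players."""
    U1, U2 = np.asarray(U1, float), np.asarray(U2, float)
    rows = list(range(U1.shape[0]))
    cols = list(range(U1.shape[1]))
    changed = True
    while changed:
        changed = False
        A, B = U1[np.ix_(rows, cols)], U2[np.ix_(rows, cols)]
        for k in range(len(rows)):          # strictly dominated strategies of player 1
            if any(i != k and (A[i] > A[k]).all() for i in range(len(rows))):
                del rows[k]; changed = True; break
        if changed:
            continue
        A, B = U1[np.ix_(rows, cols)], U2[np.ix_(rows, cols)]
        for l in range(len(cols)):          # strictly dominated strategies of player 2
            if any(j != l and (B[:, j] > B[:, l]).all() for j in range(len(cols))):
                del cols[l]; changed = True; break
    return rows, cols

# Prisoners' dilemma from Example 2.1: strategy 0 = "testify", 1 = "stay silent"
U1 = [[-9, 0], [-10, -1]]
U2 = [[-9, -10], [0, -1]]
print(iterated_elimination(U1, U2))   # ([0], [0]): only (testify, testify) survives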

In Example 1.1, we have obtained just such an equilibrium. Let's consider one more example.


The strategy profile (N, P) is the only Nash equilibrium in this game. But note that in order to choose P, the second player must be sure that the first player does not choose B. But the payoff of the first player is the same if the second player chooses II. In addition, by choosing B, the first player may not be afraid that the second player will choose L. Perhaps the rational second player will think about choosing strategy C.

The second question, for which no unambiguous answer has yet been found: how do players come to Nash equilibrium?

The ideal theoretical scenario is as follows. Players independently form expectations about the actions of other players, and then choose the actions that maximize their payoff given the given expectations. If, in this case, the expectations correspond to the actions actually chosen by the players, then we obtain the Nash equilibrium. This line of reasoning allows us to call the Nash equilibrium a situation with self-fulfilling expectations. But where do expectations come from? And which of the Nash equilibria, if there are several, will be chosen as a result of the described process? In the framework of the considered scenario, these questions remain unanswered.

Another approach involves learning by the players. Players either learn theoretically how to play a given game (imagine students of an economics faculty) or have experience of similar interactions (for example, an experienced worker joining a new team), which allows them to form expectations correctly and choose optimal behavior. This scenario helps explain the formation of expectations, but, first, it restricts the scope of game models to standard, well-studied and frequently occurring situations of interaction, and, second, it can blur the distinction between single and repeated interaction, which differ significantly in terms of strategies and solution methods within game theory, as will be discussed in more detail in Chapter 4.

The third scenario is that there is a prior agreement between the players, or customs, or laws, or instructions from third parties that govern the interaction of the players. In this case, agreements or instructions may not be binding, but if it is recommended to play the Nash equilibrium, then none of the players has a desire (alone) to deviate from the prescribed behavior. It is clear that such a scenario is not possible in every situation. In addition, the very process of forming an agreement or involving third parties can become part of the game.

Finally, the third natural question that arises when studying the concept of Nash equilibrium is the following: is there any empirical evidence that real players usually choose equilibrium strategies? Here again it is extremely difficult to give a short and unambiguous answer. At the same time, the nature of the problems that arise is more consistent with the subject matter of experimental economics. Therefore, we confine ourselves to the recommendation to turn to specialized literature, for example, the book, where the questions of experimental methodology are excellently analyzed and a number of results are presented.

There are games that do not have an equilibrium in pure strategies (see Example 3.1), so the question arises: what conditions are sufficient for the existence of such an equilibrium? Let us formulate and prove the assertion about the existence of a Nash equilibrium in pure strategies in games that are not finite.

Statement 2.3. If the strategy set S_i of each player is a non-empty convex compact set in a Euclidean space, and the payoff function u_i of each player is continuous in s and quasi-concave in s_i, then the game has a Nash equilibrium in pure strategies.

Proof. Recall the statement of Kakutani's theorem, which we will use in the proof. Let X be a non-empty convex compact set in R^n, let 2^X be the set of its subsets, and let f be an upper semicontinuous set-valued mapping from X to 2^X such that for every point x ∈ X the set f(x) is non-empty, closed and convex. Then the mapping f has a fixed point.

The idea of the proof is to construct a mapping satisfying the conditions of Kakutani's theorem. To this end we slightly redefine the best-response correspondence. Purely technically, we will assume that the best response depends not only on the strategies of the other players but also on the player's own strategy: ŝ_i(s). With the strategies of the other players fixed, a change in the player's own strategy does not, of course, change the best response. Now introduce a notation for the best-response correspondence of all players as the Cartesian product ŝ(s) = ŝ_1(s) × ŝ_2(s) × ... × ŝ_n(s). This correspondence assigns to each profile the set of profiles in which every player responds in the best way to the strategies of the other players. A fixed point of the mapping ŝ, i.e. a profile s such that s ∈ ŝ(s), is by definition a Nash equilibrium. Let us show that the mapping ŝ satisfies the conditions of Kakutani's theorem. The verification of each condition constitutes a separate step of the proof.

  • 1. Let us show that the set S of all profiles is a convex compact set. Since, by the hypothesis of the statement, the strategy sets S_i of the players are non-empty convex compact sets, the Cartesian product S = S_1 × S_2 × ... × S_n is a convex compact set.
  • 2. The mapping ŝ has non-empty images. By the Weierstrass theorem, the continuous function u_i attains its maximum on the closed bounded set S_i. Hence ŝ has non-empty images.
  • 3. The images of the mapping ŝ are closed and convex. Since the payoff function u_i of each player is quasi-concave in s_i, by the property of quasi-concave functions the set {s_i | u_i(s_i, s_{-i}) ≥ k} for fixed s_{-i} and k is closed (the domain of definition being closed) and convex if it is non-empty. Since this holds for any k, the set ŝ_i(s) = {s_i | u_i(s_i, s_{-i}) ≥ max_{s'_i} u_i(s'_i, s_{-i})} is also closed and convex. But then the Cartesian product ŝ(s) = ŝ_1(s) × ŝ_2(s) × ... × ŝ_n(s) is closed and convex.

4. Let us show that the mapping ŝ is upper semicontinuous. Here we use the continuity of the functions u_i in s and argue by contradiction. Suppose that the mapping ŝ is not upper semicontinuous. Then there exist sequences of strategy profiles s^m and ŝ^m, where m is the number of a sequence element, such that for every m we have s^m ∈ S, ŝ^m ∈ ŝ(s^m), lim_{m→∞} s^m = s^0 ∈ S, lim_{m→∞} ŝ^m = ŝ^0, but ŝ^0 ∉ ŝ(s^0). This means that there is a player i for whom the strategy ŝ_i^0 is not a best response to s^0, i.e. there is a strategy s'_i such that u_i(s'_i, s^0_{-i}) > u_i(ŝ_i^0, s^0_{-i}). Then one can find ε > 0 such that

u_i(s'_i, s^0_{-i}) > u_i(ŝ_i^0, s^0_{-i}) + 3ε.        (2.8)

Since, by assumption, the function u_i is continuous and lim_{m→∞} s^m = s^0, lim_{m→∞} ŝ^m = ŝ^0, for sufficiently large m we have

u_i(s'_i, s^m_{-i}) > u_i(s'_i, s^0_{-i}) - ε,        (2.9)

u_i(ŝ_i^0, s^0_{-i}) > u_i(ŝ_i^m, s^m_{-i}) - ε.        (2.10)

Combining inequalities (2.8)-(2.10) into one chain, we obtain

u_i(s'_i, s^m_{-i}) > u_i(ŝ_i^m, s^m_{-i}) + ε.        (2.11)

It follows from relation (2.11) that s'_i gives player i a strictly greater payoff than ŝ_i^m in response to s^m_{-i}, which contradicts the condition ŝ^m ∈ ŝ(s^m). We have arrived at a contradiction; hence our original assumption that ŝ is not upper semicontinuous was wrong.

We have shown that the mapping ŝ satisfies all the conditions of Kakutani's theorem and hence has a fixed point. This fixed point is a Nash equilibrium. Statement 2.3 is proved.

Statement 2.3, in particular, guarantees the existence of a Nash equilibrium in Example 2.7, but not in Example 2.8, where the payoff functions of the players are discontinuous.

"Example from work.

Mathematical Methods and Models in Economics

Matrix games

Introduction

In economic practice situations often arise in which different parties pursue different goals. Examples are the relationships between a seller and a buyer, a supplier and a consumer, a bank and a depositor, and so on. Such conflict situations arise not only in the economy but also in other activities, for example in playing chess, checkers, dominoes, lotto, etc.

A game is a mathematical model of a conflict situation involving at least two persons who use several different ways to achieve their goals. A game is called a two-person game if two players take part in it. A game is called antagonistic if the gain of one player equals the loss of the other. In that case, to define the game it suffices to specify the payoffs of one of the players in the various situations.

Any possible way a player can act depending on the current situation is called a strategy. Each player has a certain set of strategies. If the number of strategies is finite, the game is called finite, otherwise infinite. Strategies are called pure if each player chooses exactly one strategy in a deterministic way, not at random.

To solve a game means to choose strategies that satisfy the optimality condition. This condition means that one player obtains the maximum payoff when the other keeps to his strategy, and conversely the second player suffers the minimum loss when the first keeps to his strategy. Such strategies are called optimal. Thus the goal of solving the game is to determine an optimal strategy for each player.

Playing in pure strategies

Consider a game with two players A and B. Assume that player A has m strategies A1, A2, ..., Am, and player B has n strategies B1, B2, ..., Bn. We assume that the choice of strategy Ai by player A and strategy Bj by player B uniquely determines the outcome of the game, i.e. the payoff a_ij of player A and the payoff b_ij of player B. Here i = 1, 2, ..., m and j = 1, 2, ..., n.

The simplest two-player game is the antagonistic game, i.e. a game in which the interests of the players are directly opposed. In this case the payoffs of the players are related by the equality

b_ij = -a_ij.

This equality means that the gain of one of the players is equal to the loss of the other. It then suffices to consider only the payoffs of one of the players, for example player A.

To each pair of strategies Ai and Bj there corresponds a payoff a_ij of player A. It is convenient to write all these payoffs in the form of the so-called payoff matrix

A = (a_ij), i = 1, ..., m; j = 1, ..., n.

The rows of this matrix correspond to the strategies of player A, and the columns to the strategies of player B. In general such a game is called an (m×n)-game.


Example 1. Two players A and B each toss a coin. If the sides of the coins coincide, player A wins, i.e. player B pays player A an amount equal to 1; if they do not coincide, player B wins, i.e. player A pays player B the same amount, equal to 1. Construct the payoff matrix.

Solution. According to the conditions of the problem, each player has two strategies: "heads" and "tails". The payoff matrix of player A is

A = (  1  -1
      -1   1 ).
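A quick check (a small Python sketch) confirms that this game has no saddle point and therefore no solution in pure strategies:

# Payoff matrix of player A in the coin-matching game of Example 1
A = [[1, -1], [-1, 1]]
row_min = [min(row) for row in A]
col_max = [max(col) for col in zip(*A)]
alpha = max(row_min)   # lower value of the game
beta = min(col_max)    # upper value of the game
print(alpha, beta)     # -1 1: alpha < beta, so there is no saddle point
                       # and no solution in pure strategies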


