2.Decision Theory - Decision Tables and Decision Trees, Game Theory

2.1 Introduction and Basic Terms

Decision theory represents a generalized approach to decision making. It enables the decision maker

Decision theory problems are characterized by the following: There is a wide range of management decision problems. Among them there are capacity and order planning, product and service design, equipment selection, location planning, and so on.

Some examples of alternatives and possible events for these alternatives are shown in Table 2.1.
 

Alternatives
Events
To order 10, 11, … units of a product Demand for the product may be 0, 1, … units
To make or to buy a product The cost of making may be 20, 22, …$thousands
To buy or not to buy accident insurance An accident may occur, or may not occur

Table 2.1 “Examples of Alternatives and Events”

Various properties of decision problems enable a classification of decision problems.

Solution to any decision problem consists of these steps:

  1. Identify the problem
  2. Specify objectives and the decision criteria for choosing a solution
  3. Develop alternatives
  4. Analyze and compare alternatives
  5. Select the best alternative
  6. Implement the chosen alternative
  7. Verify that desired results are achieved
Decisions problems that involve a single decision are usually best handled through payoff tables, whereas problems that involve a sequence of decisions, are usually best handled using decision trees (see Section 2.9).

Rows of a payoff table (called also decision matrix) relate to the alternatives, columns relate to the states of nature. Elements of a decision matrix represent the payoffs for each alternative under each possible event.

We will construct a payoff table for the following simple example.

A grocer solves a problem of how much pastry to order every day. His profit depends on a demand that can be low, moderate, or high. Values of the profit per day (in $) for these situations and for small, medium or large order are shown in Table 2.2. The body of this table represents a decision matrix.
 

Demand®

Order 

Low
Moderate
High
Small
50
50
50
Medium
42
52
52
Large
34
44
54

Table 2.2 “Payoff Table for Order Planning”

The environments in which decisions are made can be classified according to the degree of certainty. There are three basic categories: certainty, risk, and uncertainty. The importance of these three decision environments is that they require different techniques of analysis.

2.2 Decision Making under Certainty

When the decision maker knows for certain which state of nature will occur, he will choose the alternative that has the highest payoff (or the smallest loss) under that state of nature.

In the above grocer´s decision problem, suppose that it is known with certainty that demand will be low. The highest profit for low demand is $50 (see Table 2.2) and therefore the small order should be elected.

2.3 Decision Making under Uncertainty

Under complete uncertainty, either no estimates of the probabilities for the occurrence of the different states of nature are available, or the decision maker lacks confidence in them. For that reason, probabilities are not used at the choice of the best alternative.

Most of the rules for decision making under uncertainty express a different degree of decision maker´s optimism.

We will describe the most used rules for the choice of the best alternative supposing that the decision criterion expresses the requirement of maximization (the modification of these rules for minimization criteria is not difficult). We will use all the described rules to solution of the order planning problem given in Table 2.2.

The maximax rule is appropriate for extreme optimists that expect the most favourable situation (they choose the alternative that could result in the maximum payoff). Under this rule, the decision maker will find the largest payoff in the decision matrix and select the alternative associated with it (the largest payoff is determined for each alternative and then the largest payoff from these values is selected; therefore “maximax”). This procedure for the order planning problem is shown in Table 2.3.
 

Demand®

Order 

Low
Moderate
High
Row Maximum
 
Small
50
50
50
50
 
Medium
42
52
52
52
 
Large
34
44
54
54
¬ Maximum

Table 2.3 “Maximax Solution for Order Planning Problem”

The best overall profit is $54 in the third row. Hence, the maximax rule leads to the large order (the grocer hopes that the demand will be high).

The maximin rule (Wald criterion) represents a pessimistic approach when the worst decision results are expected. The decision maker determines the smallest payoff for each alternative and then chooses the alternative that has the best (maximum) of the worst (minimum) payoffs (therefore “maximin”).

In Table 2.2, the smallest numbers in the rows are 50, 42, 34. Since 50 is the largest, the low order should be chosen (if order is low, the $50 grocer‘s profit is guaranteed).

The Hurwicz-criterion represents a compromise between the optimistic and the pessimistic approach to decision making under uncertainty. The measure of optimism and pessimism is expressed by an optimism - pessimism index <0,1> . The more this index is near to 1, the more the decision maker is optimist. By means of the index , a weighted average of the best payoff (its weight = ) and the worst payoff (its weight = 1-) is computed for each alternative and the alternative with the largest weighted average should be chosen (see formula for Hurwicz criterion).

If =1, the above rule is the maximax criterion, whereas if =0, it is the maximin rule.

If we choose = 0.7 at determining the best size of the order in Table 2.2, the weighted average (WA) of the largest and the smallest profit for each size of the order has the following values:

WA (small) = 0.7* 50 + 0.3 * 50 = 50

WA (medium) = 0.7* 52 + 0.3 * 42 = 49

WA (large) = 0.7* 54 + 0.3 *34 = 48

Maximizing the weighted average of the largest and the smallest profit, the small order should be selected.

The maximax and maximin rules and the Hurwicz criterion can be criticized because they focus only on extreme payoffs and exclude the other payoffs. An approach that does take all payoffs into account is the minimax regret rule (Savage criterion). This rule represents a pessimistic approach used for an opportunity loss table. The opportunity loss reflects the difference between each payoff and the best possible payoff in a column (it can be defined as the amount of profit foregone by not choosing the best alternative for each state of nature). Hence, opportunity loss amounts are found by identifying the greatest payoff in a column and, then, subtracting each of the other values in the column from that payoff. The values in an opportunity loss table can be viewed as potential ”regrets” that might be suffered as the result of choosing various alternatives. Minimizing the maximum possible regret requires identifying the maximum opportunity loss in each row and, then, choosing the alternative that would yield the minimum of those regrets (this alternative has the “best worst”).

To illustrate the described procedure, we will recall the decision problem given in Table 2.2 and first construct the corresponding regret table.

Payoff table:

Demand®

Order 

Low
Moderate
High
Small
50
50
50
Medium
42
52
52
Large
34
44
54

In the first column of the payoff matrix, the largest number is 50, so each of the three numbers in that column must be subtracted from 50. In the second column, we must subtract each payoff from 52 and in the third column from 54. The results of these calculations are summarized in Table 2.4. A column with the maximum loss in each row is added to this table.
 

Demand®

Order 

Low
Moderate
High
Maximum Loss
 
Small
0
2
4
4
¬ Minimum
Medium
8
0
2
8
 
Large
16
8
0
16
 

Table 2.4 “Regret Table for Order Planning”

Minimizing the maximum loss, the small order should be chosen.

The minimax regret criterion disadvantage is the inability to factor row differences. It is removed in the further rule that incorporates more of information for the choice of the best alternative.

The principle of insufficient reason (Laplace criterion) assumes that all states of nature are equally likely. Under this assumption, the decision maker can compute the average payoff for each row (the sum of the possible consequences of each alternative is divided by the number of states of nature) and, then, select the alternative that has the highest row average. This procedure is illustrated by the following calculations with the data in Table 2.2.

EMV (small) = (50 + 50 + 50)/3 = 50

EMV (medium) = (42 + 52 + 52)/3 = 48 2/3

EMV (large) = (34 + 44 + 54)/3 = 44

Since the profits at the small order have the highest average, that order should be realized.

2.4 Decision Making under Risk

In this case, the decision maker doesn´t know which state of nature will occur but can estimate the probability of occurrence for each state. These probabilities may be subjective (they usually represent estimates from experts in a particular field), or they may reflect historical frequencies.

A widely used approach to decision making under risk is expected monetary value criterion.

The expected monetary value (EMV) of an alternative is calculated by multiplying each payoff that the alternative can yield by the probability for the relevant state of nature and summing the products. This value is computed for each alternative, and the one with the highest EMV is selected.

In the discussed grocer´s decision problem, suppose that the grocer can assign probabilities of low, moderate and high demand on the basis of his experience with sale of pastry. The estimates of these probabilities are 0.3, 0.5, 0.2, respectively. We will recall the payoff table for the considered problem.

Payoff table:

Demand®

Order 

Low
Moderate
High
Small
50
50
50
Medium
42
52
52
Large
34
44
54

The EMV for various sizes of the order are as follows.

EMV (small) = 0.3*50 + 0.5*50 + 0.2*50 = 50

EMV (medium) = 0.3*42 + 0.5*52 + 0.2*52 = 49

EMV (large) = 0.3*34 + 0.5*44 + 0.2*54 = 43

Therefore, in accordance with the EMV criterion, the small order should be chosen.

Note that the EMV of $50 will not be the profit on any one day. It represents an expected or average profit. If the decision were repeated for many days (with the same probabilities), the grocer would make an average of $50 per day by ordering the small amount of pastry. Even if the decision were not repeated, the action with the highest EMV is the best alternative that the decision maker has available.

The EMV criterion remains as the most useful of all the decision criteria for decision making under risk. For risky decisions, a sensible approach is first to calculate the EMV, and then to make a subjective adjustment for the risk in making the choice.

Another method for incorporating probabilities into the decision making process is to use expected opportunity loss (EOL). The approach is nearly identical to the EMV approach, except that a table (or matrix) of opportunity losses (or regrets) is used. The opportunity losses for each alternative are weighted by the probabilities of their respective states of nature and these products are summed. The alternative with the smallest expected loss is selected as the best choice.

We will use this procedure in the regret matrix shown in Table 2.4. We will recall this table.

Regret table:

Demand®

Order 

Low
Moderate
High
Small
0
2
4
Medium
8
0
2
Large
16
8
0

Supposing that the probabilities of various sizes of the demand are 0.3, 0.5, 0.2, we can determine the EOL for each size of the order.

EOL (small) = 0.3*0 + 0.5*2 + 0.2*4 = 1.8

EOL (medium) = 0.3*8 + 0.5*0 + 0.2*2 = 2.8

EOL (large) = 0.3*16 + 0.5*8 + 0.2*0 = 8.8

Since the small order is connected with the smallest EOL, it is the best alternative.

The EOL approach resulted in the same alternative as the EMV approach. The two methods always result in the same choice, because maximizing the payoffs is equivalent to minimizing the opportunity losses.

Another possibility for decision making under risk is using the maximum - likelihood decision criterion. According to this criterion, we consider only the state of nature with the highest probability and choose the best alternative for that state of nature.

If we suppose in the decision problem given in Table 2.2 that the probabilities of various sizes of the demand are 0.3, 0.5, 0.2, the moderate demand is most likely. Under this situation, the medium order is the best.

Since the maximum-likelihood criterion takes only one uncertain state of nature into account it may lead to bad decisions.

The expected value approach (the calculation of the EMV or the EOL) is particularly useful for decision making when a number of similar decisions must be made; it is a “long-run” approach. For one-shot decisions, especially major ones, approaches for decision making under uncertainty may be preferable.

2.5 Dominance Criteria

In some payoff tables, dominance criteria can be used at the choice of the best alternative. There are three forms of dominance. All these forms are illustrated in the example with the following payoff table.
 

States of Nature ®

Alternatives 

S1
S2
S3
A1
5
-1
2
A2
2
1
0
A3
4
2
5
A4
-1
1
4
Probability
0.1
0.4
0.5

Table 2.5 “Payoff Table for Illustration of Dominance Criteria”

In outcome dominance one alternative dominates some second alternative, if the worst payoff for this alternative is at least as good as the best payoff for the second alternative.

In Table 2.5 the worst payoff outcome for alternative A3 is 2 (when S2 occurs), the best payoff for A2 is also 2 (when S1 occurs). Hence, alternative A3 dominates A2 by outcome dominance.

Event dominance occurs if one alternative has a payoff equal to or better than that of a second alternative for each state of nature (regardless of what event occurs, the first alternative is better than the second).

In Table 2.5 for each state of nature, the payoff for alternative A3 is greater than that for A2and A4. Hence, A3 dominates A2 and A4 by event dominance.

The third form of dominance is called probabilistic dominance.

Suppose that payoffs are discrete random variables with a known probability distribution. Then we can count the probability that a payoff will obtain the value X or more. This probability will be designated P(X or more).

One alternative probabilistically dominates a second, if P(X or more) for the first is at least as large as P(X or more) for the second for all values of X. In other words, if a fixed amount money is given, the decision maker prefers an alternative that has a greater chance of obtaining that amount or more. If, for any amount of money, one alternative has a uniformly equal or better chance of obtaining that amount or more, then that alternative dominates by probabilistic dominance.

To demonstrate probabilistic dominance in Table 2.5, we reorganize the data of this table into the form shown in Table 2.6. Here, the payoff values X are ordered from lowest (-1) to highest (5) and the P(X) and the P(X or more) columns for each alternative are shown.
 

Payoff
A1
A2
A3
A4
X
P(X)
P(X or more)
P(X)
P(X or more)
P(X)
P(X or more)
P(X)
P(X or more)
-1
0.4
1
0
1
0
1
0.1
1
0
0
0.6
0.5
1
0
1
0
0.9
1
0
0.6
0.4
0.5
0
1
0.4
0.9
2
0.5
0.6
0.1
0.1
0.4
1
0
0.5
4
0
0.1
0
0
0.1
0.6
0.5
0.5
5
0.1
0.1
0
0
0.5
0.5
0
0

Table 2.6 “Probabilistic Dominance Table”

Since for each value of payoffs, the P(X or more) probability for alternative A3 is at least as large as that for the other alternatives, A3 dominates all the remaining alternatives by probabilistic dominance.

Of the three forms of dominance, outcome dominance is the strongest, event dominance next, and probabilistic dominance the weakest. In other words, if an alternative dominates by outcome dominance, it will also dominate by event dominance and probabilistic dominance; but the reverse is not true (for example, in Table 2.5 alternative A3 dominates A1 by probabilistic dominance, but does not dominate by either event dominance or outcome dominance).

If some alternative dominates all the others, it is the best (in Table 2.5 it is the alternative A3). Unfortunately, in many cases no single alternative dominates all the others by any of the three forms of dominance (for instance, it is the decision problem with payoffs in Table 2.2 and with probabilities 0.3, 0.5, 0.2 for states of nature). The dominance criterion fails in these cases. However, it can be useful in eliminating some alternatives and thus narrowing down the decision problem.

If one alternative dominates a second, the expected monetary value of the first alternative is greater than the EMV of the second. The reverse, however, is not true. We can verify this statement on the above example:

EMV (A1) = 1.1 EMV (A3) = 3.7

EMV (A2) = 0.6 EMV (A4) = 2.3

Alternative A3 dominates the remaining alternatives by probability dominance and therefore its EMV is greater than the EMV of the other alternatives. On the other hand, EMV (A1) > EMV(A2 ) and in spite of it alternative A1 does not dominate alternative A2 by any dominance criterion.

2.6 Utility as a Basis for Decision Making

The expected monetary value criterion for choice of the best alternative does not take into account the decision maker´s attitude toward risk. If the amounts of money involved in the risky decision problem are small, or if the decision is a repetitive one, risk is usually not important, and expected monetary value is a reasonable criterion to use. On the other hand, when the amounts of money involved in the risky decision problem are large and especially if it is a one-of-a-kind decision, the expected value criterion is not adequate. In these cases, utility theory gives a possibility to incorporate risk into decision analysis.

2.6.1 Utility Function

Utility of a payoff is a measure of the personal satisfaction associated with that payoff. It is evident that the same amount of money has not the same value (for example, the value of a dollar added to $10 is larger than that of one added to $1,000).

Utility in the von Neumann sense is associated with risky situations.

Utility function represents the subjective attitude of the individual to risk. Hence, a utility function of a person can be used to evaluate decision alternatives involving uncertain payoffs.

Utility function makes possible to determine a person´s utility for various amounts of money. Its form depends on the individual´s attitude toward risk taking. If a person prefers risk (is risk taker), the utility function is concave. If a person is risk averter, the utility function is convex. If a person is risk neutral, the utility function is linear.

Many utility functions express decreasing risk aversion (the individual becomes less risk averse as the money amount increases). This property seems reasonable for most business and many personal decisions. The possibility of a loss of a given size becomes less significant the more wealth the decision maker possesses.

An important term of utility theory is the certainty equivalent (CE). This number represents a subjective valuation of the risky situation by a certain or sure amount of money which is equivalent to uncertain outcomes of that situation. In other words, utility of the certainty equivalent is equal to expected utility of outcomes in the risky situation.

The certainty equivalent can be interpreted as the maximum insurance that the decision maker would pay to be freed of an undesirable risk.

The relationship between CE and EMV of outcomes in a risky situation determines the attitude of the decision maker towards risk.

CE = EMV => neutral attitude

CE < EMV => risk aversion

CE > EMV => risk preference

The difference between EMV and CE is called the risk premium (RP). For a risk averter, the RP is positive, for a risk taker, the RP is negative. Risk premium can be interpreted as the amount of money that the decision maker is willing to give up to avoid the risk of loss.

The relationship between money and utility can be described graphically with the help of a curve termed the utility curve.

A utility curve can be constructed by measuring the attitude of the decision maker toward risk. Four such curves are shown in Picture 2.1. The shape of each curve is a function of the individual´s attitude toward risk (the s-shaped curve b describes a person who is risk taker when is poor, but once he has accumulated wealth, will become risk averse).

Picture 2.1 ”Different Utility Curves”

Once a person´s utility curve is known, it is possible to replace any monetary value by its utility equivalent for that person.

To construct a utility curve, we need several points (then we draw the curve through these points). For convenience, we find the largest and the smallest monetary value involved in the decision problem and assign utility values of zero and one to these monetary values. Further points of the utility curve can be constructed so that the decision maker visualizes a hypothetical gambling situation and determines the certainty equivalents for various gambles. It is illustrated by the following example.

The extreme outcomes for an entrepreneur is a loss of $10 million and a gain of $100 million. Suppose that the entrepreneur considered a hypothetical gambling situation involving two possible outcomes and found the certainty equivalents for the gambles shown in Table 2.7. Using these data draw a utility curve.
 

Gamble
Certainty Equivalent
0.5 chance at -10 mil. and
0.5 chance at 100 mil.
15 mil.
0.5 chance at 15 mil. and
0.5 chance at 100 mil.
47 mil.
0.5 chance at -10 mil. and
0.5 chance at 15 mil.
-2.5 mil.

Table 2.7 ”Certainty Equivalents for Gambles”

The definition of the certainty equivalent results in the following identities.

Picture 2.2 ”Utility Curve for the Entrepreneur with Risk Aversion”

The utility curve passes through the following points: [-10;0], [-2.5;0.25], [15;0.5], [47;0.75], [100;1]. It is drawn in Picture 2.2. The curve is concave which indicates that the entrepreneur is risk averter.

There is empirical evidence that the most of decision makers are risk averter (therefore they insure against potential losses). On the other hand, risk preferring individuals exist, too (for example they like to gamble).

2.6.2 Expected Utility Criterion

If we translate the monetary values in a payoff table to their utilities, we can replace the EMV criterion by the expected utility criterion. We calculate the expected utility (EU) for each alternative and choose the alternative with the highest EU.

Refer to Table 2.2, suppose that the grocer expressed the utilities of the profit values with the numbers shown in Table 3.8.
 

Profit
34
42
44
50
52
54
Utility
0
0.7
0.8
0.9
0.95
1

Table 2.8 “Utility Values of Profit”

If the probabilities of low, moderate, and high demand are 0.3, 0.5, 0.2, respectively, expected utilities of the alternatives have the following values:

EU (small) = 0.9

EU (medium) = 0.3*0.7 + 0.5*0.95 + 0.2*0.95 = 0.875

EU (large) = 0.3*0 + 0.5*0.8 + 0.2*1 = 0.6

The greatest expected utility is connected with the small order and therefore this order should be realized. The same result has been obtained by means of the EMV criterion (see section 2.4) but generally, the best alternatives selected according to the EU and the EMV criterion must not be identical due to the nonlinear relationship between monetary values and their utilities.

2.7 Decision Making with Additional Information

In some situations it is possible to reduce the outcome´s uncertainty and to ascertain which state of nature will actually occur. In these cases, the decision maker doesn´t know whether to decide immediately or delay a decision until it is clear which state of nature will occur. The important question for the decision maker is whether the cost of waiting (e.g., the cost of an option, storage costs) or of additional information (e.g., using marketing research or a better forecasting technique) outweighs the potential gain that more information would permit.

Possible ways of obtaining additional information depend on the nature of the decision problem. For example, information about consumer preferences might come from market research; additional information about a product could come from product testing; legal experts might be called on; and so on.

The value of the additional information can be quantified by means of the expected value of the perfect information or the imperfect information.
 
 

2.7.1 Expected Value of Perfect Information

The expected value of perfect information (EVPI) represents the increase of the best expected payoff in consequence of obtaining a free, perfect prediction. It can be interpreted as an upper bound on the amount of money the decision maker would be justified in spending to obtain perfect information.

The EVPI is equal to the difference between the expected payoff under certainty (EPC), that is with the perfect information, and the best expected payoff without the perfect information.

The following formulas hold:

EVPI = EPC – max(EMV) for a maximization criterion

EVPI = min(EMV) – EPC for a minimization criterion

There is an important relationship between the EVPI and the EOL. The EVPI is equal to the expected opportunity loss for the best alternative, i. e. to the minimum value of the EOL computed for each alternative. The mentioned relationship can be explained in this way: The perfect information should reduce the opportunity loss that exists under uncertainty due to imperfect information to zero. Hence, the expected opportunity loss of the optimal alternative measures the EVPI and conversely, the EVPI is a measure of opportunities forgone. In this sense, if the EVPI is large, it should be a signal to the decision maker to seek another alternatives.

Refer to Table 2.2, we will determine the EVPI in the grocer´s decision problem.

The best payoff under each state of nature is 50, 52, 54. The probabilities of these states are 0.3, 0.5, 0.2. The expected payoff under certainty is

EPC = 0.3*50 + 0.5*52 + 0.2*54 = 51.8.

The largest expected payoff without the perfect information, as computed in Section 2.4, is

max(EMV) = 50.

Hence,

EVPI = EPC – max(EMV) = 51.8 - 50 = 1.8

The value 1.8 is equal to the lowest expected regret, as computed in Section 2.4.

A useful check on computations is provided by the following identity:

EMV (of any alternative) + EOL (for the same alternative) = EPC

This identity is verified for the grocer´s problem in Table 2.9. The EMV and the EOL were calculated in Section 2.4.
 

 
Order
 
Small
Medium
Large
EMV
50
49
43
EOL
1.8
2.8
8.8
EPC
51.8
51.8
51.8

Table 2.9 “Check on Computations of EMV, EOL, EPC”

2.7.2 Expected Value of Sample Information

Most information that we can obtain on states of nature is imperfect in the sense that it doesn´t tell us exactly which event will occur. Even so, imperfect (sample) information will still have value if it will improve the expected profit.

The expected value of sample information (EVSI) represents the increase of the best expected payoff in consequence of obtaining a free imperfect information. It can be interpreted as an upper bound on the amount of money the decision maker is willing to spend to obtain imperfect information.

The EVSI is equal to the difference between the best expected value of payoff with sample information (EVMS) and the best expected value of payoff without sample information.

The following formulas hold:

EVSI = EVMS – max(EMV) for a maximization criterion

EVSI = min(EMV) – EVMS for a minimization criterion

Sometimes it is useful to express the degree of increase in information from a sample relative to perfect information. One way to judge this degree is the ratio of EVSI to EVPI. This is known as the efficiency of sample information (EFSI). This number can range from 0 to 1. The closer the number is to 1, the closer the sample information is to being perfect; the closer the number is to 0, the less the amount of information there is in the sample.

To demonstrate a calculation of the EVSI, we use the following example.

A manager decides about the size of the future store. His payoffs depend on the size of the store and the strength of demand. These payoffs (in thousands of dollars) are shown in Table 2.10.
 

Demand®

Store 

Low
High
Small
30
50
Large
10
80

Table 2.10 “Payoff Table for Store Planning”

The manager estimates that the probabilities of low demand and high demand are equal (0.5). These probabilities are called prior probabilities. The manager could request that a research firm conducts a survey (cost $2,000) that would better indicate whether the demand will be low or high. In discussions with the research firm, the manager has learned about the reliability of surveys conducted by the firm. This information is shown in Table 2.11 in the form of conditional probabilities.
 

Survey Prediction
Actual Demand
Low
High
Low demand
0.9
0.3
High demand         0.1         0.7

Table 2.11 “Conditional Probabilities of Survey Predictions, Given Actual Demand”

The numbers in Table 2.11 indicate that in past cases when demand actually was low, the survey correctly indicated this information 90 percent of the time, and incorrectly indicated a high demand 10 percent of the time. Moreover, when a high demand existed, the survey incorrectly indicated a low demand 30 percent of the time and correctly 70 percent of the time.

In the further probability calculations we will use these symbols:

event S1 …… low demand
event S2 …… high demand
forecast F1 … information that the demand will be low
forecast F2 … information that the demand will be high

The prior probabilities P(S1) = P(S2) = 0.5 and Table 2.11 contains the conditional probabilities .

The joint probability of both a forecast Fj and event Si is equal to the product . These probabilities are shown in Table 2.12. The last column in this table contains the marginal probabilities of demand level P(Si) for i = 1,2 (it is the sum of the joint probabilities in each row). The last row in Table 2.12 contains the marginal probabilities of survey forecast level P(Fj) for j= 1,2 (it is the sum of the joint probabilities in each column).
 

Potential Level 
Survey Prediction
Marginal Probabilities
of Demand
Low Demand
High Demand
of Demand Level
Low
0.45
0.05
0.5
High
0.15
0.35
0.5
Marginal Probabilities 
of Survey Prediction
0.6
0.4
1.0

Table 2.12 “Joint and Marginal Probabilities Table for Store Planning”

Using the Bayes rule, we can calculate the posterior (revised) probabilities of low and high demand. These values reflect the information from a survey and are given by the following formula:

The posterior probabilities based on survey prediction of “low demand” are:

If the survey shows “high demand”, the posterior probabilities of low and high demand are:

In order to find EVSI, first find the expected payoffs for small store and large store for these situations: survey shows “low demand”, survey shows “high demand”, no survey. These calculations are shown in Table 2.13.
 

Store
Survey Prediction
No survey
Low Demand
High Demand
Small 0.75*30+0.25*50=35 0.125*30+0.875*50=47.5 0.5*30+0.5*50=40
Large
0.75*10+0.25*80=27.5
0.125*10+0.875*80=71.25
0.5*10+0.5*80=45
Table 2.13 “Expected Monetary Values for Store Planning”

Table 2.13 gives these results:

If the survey shows “low demand” (with the probability 0.6), small store is the best alternative (with the expected payoff $35,000). If the survey shows “high demand” (with the probability 0.4), large store is the best alternative (with the expected payoff $71,250). Hence, the expected payoff for using survey is

EVMS = 0.6*35 + 0.4*71.25 = 49.5

The maximum expected payoff for no survey is max (40;45) = 45. The expected value of sample information is

EVSI = 49.5 - 45 = 4.5 thousand, or $4,500.

Because the EVSI is $4,500, whereas the cost of the survey is $2,000, it would seem reasonable for the manager to use the survey. The expected profit would be $2,500.

If we want to determine the efficiency of the survey in the above mentioned decision problem, we must calculate the expected value of perfect information.

EPC = 0.5*30 + 0.5*80 = 55

max(EMV) = 45

Hence,

EVPI = 55 – 45 = 10,

EFSI = 

The EVSI is always less than the EVPI because the survey can give inconclusive or incorrect information as indicated in Table 2.11.

Taking a sample always represents a means of obtaining only an imperfect information, since the sample is not likely to represent exactly the population from which it is taken.

2.8 Sensitivity Analysis

Analyzing decisions under risk requires working with estimated values of the payoffs and the probabilities for the states of nature. Inaccuracies in these estimates can have an impact on the choice of the best alternative, and ultimately, on the outcome of a decision.

Sensitivity analysis examines the degree to which a decision depends on (is sensitive to) assumptions or estimates of payoffs and probabilities. If this analysis shows that a certain decision will be optimal over a wide range of values, the decision maker can proceed with relative confidence. Conversely, if analysis indicates a low tolerance for errors in estimation, additional efforts to pin down values may be needed.

The aim of sensitivity analysis is to increase the decision maker´s understanding of the problem and to show the effect of different assumptions.

In this section, sensitivity of the choice of the best alternative to probability estimates is examined. The aim of this analysis is to identify a range of probability over which a particular alternative is optimal. The decision maker then need only decide if a probability is within a range, rather than decide on a specific value for the probability of a state of nature.

When there are only two states of nature, a graphical way can be used for sensitivity analysis. A graph provides a visual indication of the range of probability over which the various alternatives are optimal. This procedure is illustrated by the following example.

In the decision problem given in Table 2.14, determine the ranges for the probability of state #2, for which each alternative is optimal under the expected value approach (these ranges can easily be converted into ranges for the probability of state #1).
 

Alternatives
States of Nature
 
#1
#2
A
4
3
B
6
1
C
1
4

Table 2.14 “Payoff Table for Sensitivity Analysis”

Picture 2.3 ”Graphical Analysis Sensitivity”

The probability range from 0 to 1 can be depicted on the horizontal axis and the payoffs for each alternative can be plotted on the vertical axes that are constructed in the points 0 and 1 on the horizontal axis (see Picture 2.3). The payoffs for state #1 are plotted on the left axis, for state #2 on the right axis. If we connect the responsive points on both vertical axes, for each alternative we obtain the graph of the expected value of payoff in dependence on the probability of state #2 (let´s designate it p2; then p1 = 1 – p2). The equations of the expected value lines can be easily derived.
 

The graphical sensitivity analysis requires all the expected value lines to be plotted on the same graph. The highest line for any given value of p2 represents the optimal alternative for that probability (it is the heavy line on Picture 2.3). Thus, referring to Picture 2.3, for low values of p2 (and thus high values of p1), alternative B will have the highest expected value; for intermediate values of p2, alternative A is best; and for higher values of p2, alternative C is best.

If we want to ascertain these ranges of probability exactly, we need to determine where the upper parts of the lines intersect. It is an algebraical problem: to solve together two equations that represent the respective lines. These equations have the form y = a + bx, where a is the y-intercept value at the left axis, b is the slope of the line, and x is p2. Because the distance between the two vertical axes is 1, the slope of each line is equal to the difference of payoffs under state #1 and state #2 for the respective alternatives. The slopes and equations of the lines are shown in Table 2.15.
 

Alternative
Slope
Equation
A
3 - 4 = -1
y = 4 - p2
B
1 - 6 = -5
y = 6 - 5p2
C
4 - 1 = 3
y = 1 + 3p2

Table 2.15 “Equations of the Expected Value Lines”

To determine the intersections of the lines that belong to the alternatives A, B, C, we set the right-hand-sides of the respective equations equal to each other and solve for p2.

Alternatives
Equation
p2
p1 = 1 - p2
B, A
6 - 5p2 = 4 - p2
0.5
0.5
A, C
4 - p2 = 1 + 3p2
0.75
0.25

Table 2.16 “Calculation of Intersections of Expected Value Lines”

The results of the above sensitivity analysis are summarized in Table 2.17.
 

 
Range of p1
Range of p2
A is optimal
<0.25;0.5>
<0.5;0.75>
B is optimal
<0.5;1>
<0;0.5>
C is optimal
<0;0.25>
<0.75;1>

At the intersection of the lines that represent the expected values of the alternatives, the two alternatives are equivalent in terms of expected value. Hence, the decision maker would be indifferent between the two alternatives at these points.

A graphical sensitivity analysis can be used also for a minimization criterion. The graph is equal as for a maximization criterion. However, the smallest expected value of the criterion is plotted by the lowest parts of the lines that represent expected values of the alternatives.
 

2.9 Decision Trees

2.9.1 Characteristic and Format of Decision Trees

Decision tree is a graphic tool for describing the actions available to the decision maker, the events that can occur, and the relationship between these actions and events. Decision trees are particularly useful for analyzing situations that involve sequential decisions.

The term ”decision tree” gets its name from the treelike appearance of the diagram. A decision tree can be deterministic or stochastic. As a prototype of decision tree, a tree with deterministic and stochastic elements is considered.

A decision tree is composed of nodes and branches (arcs).The terminology of nodes and arcs comes from network models which have a similar pictorial representation.

A decision tree has three types of nodes: decision nodes, chance event nodes, and terminating nodes.

Decision nodes are denoted by squares. Each decision node has one or more arcs beginning at the node and extending to the right. Each of those arcs represents a possible decision alternative at that decision point.

Chance event nodes are denoted by circles. Each chance event node has one or more arcs beginning at the node and extending to the right. Each of those arcs represents a possible event at that chance event point. The decision maker has no control over these chance events. The events associated with branches from any chance event node must be mutually exclusive and all events included. The probability of each event is conditional on all of the decision alternatives and chance events that precede it on the decision tree. The probabilities for all of the arcs beginning at a chance event node must sum to 1.

A terminating node represents the end of the sequence of decisions and chance events. No arcs extend to the right from a terminating node. No geometric picture is used to denote terminating nodes. Terminating nodes are the starting points for the computations needed to analyze the decision tree.To construct a decision tree, we must list the sequence of decision alternatives and events that can occur and that can affect consequences of decisions

Format of a decision tree is evident from Picture 2.4.

Picture 2.4 ”Decision Tree Format”

2.9.2 Analysis of Decision Trees

After the tree has been drawn, it is analyzed from right to left. The aim of this analysis is to determine the best strategy of the decision maker, that means an optimal sequence of the decisions.

To analyze a decision tree, we must know a decision criterion, probabilities that are assigned to each event, and revenues and costs for the decision alternatives and the chance events that occur.

There are two possibilities how to include revenues and costs in a decision tree. One possibility is to assign them only to terminating nodes where they are included in the conditional value of the decision criterion associated with the decisions and events along the path from the first part of the tree to the end. However, it can be more convenient to assign revenues and costs to branches. This reduces the required arithmetic for calculating the values of the decision criterion for terminating nodes and focuses attention on the parameters for sensitivity analysis.

Analyzing a decision tree, we begin at the end of the tree and work backwards. We carry out two kinds of calculations.

For chance event nodes we calculate certainty equivalents related to the events emanating from these nodes. Under the assumption that the decision maker has a neutral attitude toward risk, certainty equivalent of uncertain outcomes can be replaced by their expected value (see Section 2.6).

At decision nodes, the alternative with the best expected value of the decision criterion is selected.

The analysis of a decision tree is illustrated by the following example.

A firm is deciding between two alternatives: to introduce a new product or to keep the existing product. Introducing a new product has uncertain outcomes in dependence on the demand. If the demand is high, the resulting profit of the firm will be 140. The low demand will be result in the profit 80. The firm estimates the probabilities of a high and low demand 0.7 and 0.3, respectively. If the firm keeps the existing product, its profit will be 110.

The decision tree for this decision problem is drawn in Picture 2.5.

Picture 2.5 ”Analysis of a Decision Tree”

The estimated profit is written at the end of the chance branches. The probabilities of a high and a low demand for the new product are written below the branches leaving the chance node. The nodes are numbered.

For the chance node 2, we calculate the expected value of the profit (0.7*140 + 0.3*80 = 122) and write this value over the node 2.

At the decision node 1, we select the decision alternative with the higher expected profit. Because max (122;110) = 122, introducing the new product is profitable. We write the maximum expected profit over the node 1 and draw double lines through the branch representing the inferior (worse) decision alternative.

Every decision problem that is modelled by means of a decision table can be structured and pictured as a decision tree (in general, the reverse statement is not true). It is shown in Picture 2.6 that represents a decision tree for the order planning problem given in Table 2.2. We recall this table.
 



Picture 2.6 ”Decision Tree for Order Planning”

For the above order planning problem, the use of a decision table in comparison with the use of a decision tree may seem easier and simpler. However, as the decision problem becomes more complex, the decision tree becomes more valuable in organizing the information needed to make the decision. This is especially true if the decision maker must make a sequence of decisions, rather than a single decision, as the next example illustrates.

Suppose the marketing manager of a firm is trying to decide whether or not to market a new product and at what price to sell it. The profit to be made depends on whether or not a competitor will introduce a similar product and on what price the competitor charges.

Note that there are two decisions: (1) introduce the product or not, and (2) the price to charge. Likewise, there are two events: (1) competition introduces a competitive product (or not), and (2) the competitor’s price. The timing or sequence of these decisions and events is very important in this decision. If the marketing manager must act before knowing whether or not the competitor has a similar product, the price may be different than with such knowledge. A decision tree is particularly useful in this type of situation, since it displays the order in which decisions are made and events occur.

Suppose that the firm must decide whether or not to market its new product shortly. However, the price decision can be made later. If the competitor is going to act, it will introduce its product within a month. In three months, the firm will establish and announce its price. After that, the competitor will announce its price.

Note that the given problem is a sequential decision problem. The firm must make a decision now about introduction and subsequently set price, after learning about the competitor’s action. This structure of the problem is diagrammed in the decision tree in Picture 2.7. Estimated profits for every path through the tree are shown at the ends of the tree. The probabilities for each event are shown under the event branches. Note that the probabilities for the competitor’s price behavior are different when the firm’s price is high than when the firm’s price is medium or low.

After the analysis of the drawn decision tree the following strategy can be recommended to the firm: Introduce the new product and charge a high price if there is no competitive entry; but charge a medium price if there is competition. For this strategy, the expected profit is $156,000.

Picture 2.7 ”Decision Tree for Product Introduction Example”

2.9.3 Comment on Decision Trees

Decision trees have many advantages. They make possible to obtain a visual portrayal of sequential decisions, i. e. they picture a series of chronological decisions. They are universal, they make more accurate the structure of the decision process (they have an ”arranging” function), they facilitate a communication among solvers of the decision problem, they force the decision maker to appreciate all consequences of his decisions. Construction and analysis of decision trees by means of computers makes possible to experiment with decision trees and quickly to establish the impact of changes in the input parameters of the tree on the choice of the best strategy.

On the other side, decision trees have some limitations.

Only one decision criterion can be considered.

Like all models, the decision tree is an abstraction and simplification of the real problem. Only the important decisions and events are included; otherwise, the tree becomes too ”bushy” and the number of the calculations – in spite of that they are not difficult – is too large.

We cannot use decision trees if the chance event outcomes are continuous. Instead, we must redefine the outcomes so that there is a finite set of possibilities.

If we are not risk indifferent, it is more appropriate to use utility values instead of monetary values.

The most important result of the analysis of a decision tree is a selection of the best alternative in the first stage of the decision process. After this stage, some changes in the decision situations can come, an additional information can be obtained, and usually, it is necessary to actualize the decision tree and to determine a new optimal strategy. This procedure is necessary before every further stage.

2.10 Solution to Conflict Decision Problems by Use of Game Theory

2.10.1 The Nature of Game Theory Problems and Basic Terms

Game theory deals with decision making under conflict or competition. Decision making of this type appears in parlour games (from this area some terms in game theory were adopted), nevertheless, there are many examples of real-life game theory problems: international military conflicts, choice of marketing strategies, labour-management negotiations, potential mergers and so on.

Game theory has its beginning in the 1920’s, but its greatest advance occurred in 1944, when John von Neumann and Oskar Morgenstern, both at Princeton University, published their landmark book ”Theory of Games and Economic Behavior”.

The main characteristic of games is that two or more decision makers with conflicting objectives are involved and the consequences of the decisions (payoffs) to each depend on the courses of action taken by all. Each decision maker is usually trying to maximize his welfare at the expense of the others.

The following basic terms are used in game theory:

Play is a one-shot decision in a conflict situation.

Game is a series of repetitive decisions (plays).

Player is an active participant of the game (it may be a single person or a group of persons with the same interests). A player can be rational, intelligent (if he aims at the best result of the game) or non-intelligent, indifferent towards the result of the game (usually such a player represents a chance mechanism).

Strategy is a predetermined plan for selecting a course of action. A set of strategies for a player forms a space of strategies for this player.

Payoff is a numerically expressed consequence of the decisions of the players. The payoff depends on the choice of the strategies of all players and therefore we speak about payoff function.

Value of a game is an average payoff per play. A game whose value is zero is called a fair game.

There are some criteria for classification of games.

Games are presented either in normal form or in tree form depending on the type of the game and on the aim of its modelling. Normal form is the common form of games and only it will be discussed in this text.

A solution to game problems provides us with answers to these two questions:

2.10.2 Matrix Games and Their Solution

The simplest type of games are two-person zero-sum games with a finite number of strategies for each player. In these games, the sum of payoffs for both players after each play is zero (a win of one player is a loss of the other player). Since interests of both players with regard to the outcome of the game are diametrically opposed, these games are often called games of pure conflict or antagonisticgames.

Let us consider player A with strategies A1, A2, …, Am and player B with strategies B1, B2, …, Bn. Let us denote the payoff function of player A for strategies Ai, Bj by M1 (Ai, Bj)and the payoff function of player B by M2(Ai, Bj). The equation M1 (Ai, Bj) + M2 (Ai, Bj) = 0 results in the relationship M2(Ai, Bj) = - M1(Ai, Bj). Therefore only one payoff function marked M(Ai, Bj)is sufficient. Its values can be arranged in a m x n payoff matrix (table) with elements aij = M(Ai,Bj), i=1,2,…,m; j=1,2,…,n. Therefore finite two-person zero-sum games are called matrix games.

Rows of a payoff matrix relate to player A, columns relate to player B. Payoff matrices are usually constructed from player A’s standpoint. Under this assumption, a positive number aij indicatesa win for player A and a loss for player B, whereas a negative number aij indicates a loss for player A and a win for player B.

A game payoff table is similar to a decision payoff table (see Section 2.1). The difference between the two is that in a decision table, there is only one decision maker who makes decisions under conditions expressed as “states of nature”. In a game table, there are two decision makers, one on the left and the other at the top of the table. For this reason, a decision table is sometimes viewed as a “one-player game against nature” (see Section 2.10.7).

A game payoff table is illustrated by the following example (it is a variation of the three-finger Morra game).

Players A and B show simultaneously 1 or 2 or 3 fingers. If the sum of the shown fingers is an even number, player A receives from player B so many coins how many fingers the player B showed. If the sum of the fingers is an odd number, the player B receives from player A so many coins how many fingers player A showed.

The payoff matrix of this game is contained in Table 2.18.
 


Table 2.18 ”Payoff Table of the Morra Game”

An important term of game theory is dominance of strategies. Successive elimination of dominated strategies results in reduction of payoff matrix size.

A strategy is said to dominate another when all payoffs in the row (or the column) of that strategy are as good as and at least one is better than the corresponding payoffs of the other row (or column). It is apparent that optimal behavior of players will never require the use of dominated strategies.

In Table 2.18, strategy “1 finger” is for player A as good as or better than strategy “3 fingers”, no matter what player B does (numbers in first row of the payoff matrix are equal or greater than the corresponding numbers in third row). Therefore strategy “3 fingers” is dominated by strategy “1 finger” and player A, being rational, will not select it. A similar dominance of strategies exists for player B (numbers in first column of the payoff matrix are equal or smaller than the corresponding numbers in third column and therefore player B will not choose his third strategy).

Table 2.19 contains the payoff matrix for the above Morra game with remaining strategies after elimination of dominated strategies.


Table 2.19 ”Reduced Payoff Table of the Morra Game”

A matrix game can be solved in pure or in mixed strategies.

Solution in pure strategies means that only one strategy is repeatedly recommended to each player, in contrast to solution in mixed strategies when players change from strategy to strategy if the game is repeated.

Pure strategy of a player represents one of his possible strategies.

If player A has m pure strategies A1, A2,…, Am, then his mixed strategy is a probability distribution over m points, i.e., it is a set of m numbers x1, x2, …, xm such that

A similar definition holds for a mixed strategy of player B, except that the probability distribution is over n points.

It is evident that every pure strategy is a special kind of mixed strategy (for example strategy Ar is a mixed strategy with xr = 1, xi= 0 for i = 1,2,…,m, i? r).

Optimal pure strategy of player A (let us mark it A0) ensures this player a maximal win regardless of what the other player does. Similarly, optimal pure strategy of player B (marked B0) ensures this player a minimal loss regardless of what the other player does. In other words, any player deviating from the optimal strategy will find either no improvement or (usually) a worsening of payoffs, if the other player chose his optimal strategy. This statement can be expressed by the following inequalities:

(2.1)

The value of payoff function M(A0,B0) is called value of the game and is marked v.

The triplet (A0,B0,v) represents solution of the game in pure strategies.

If both players know their optimal pure strategies, playing the game using these strategies is rather boring.

If a matrix game has no optimal pure strategies, we solve it in mixed strategies.

If player A selects a mixed strategy x= (x1, x2,…, xm) and player B independently on player A selects a mixed strategy y = (y1 ,y2,…, yn) in a game with payoff matrix G=(aij), the payoff function represents the expected value of payoff for player A and is given by

. (2.2)

The above formula is identical with the equation E(x,y) = xGyT, where yT is the transposed (column) vector of the row vector y.

After introducing mixed strategies and payoff function related to any selected mixed strategies of both players, we can state the fundamental theorem of game theory:

For every m x n matrix game, there exist strategies x0 and y0 (not necessarily unique) for player A and B, respectively, and a unique number v  such that

E(x0,y) ?v   for every strategy y of B (2.3)

E(x,y0) ?v  for every strategy x of A (2.4)

Any strategies x0, y0that satisfy (3.3) and (3.4) are called optimal mixed strategies for player A and B, respectively. The number v  is called value of the game.

The triplet (x0,y0,v) represents solution of the game in mixed strategies.

It can be shown that game value v is the largest guaranteed expected payoff that player A can obtain, irrespective of B’s play, and the smallest expected payoff that player B can allow for A, irrespective of A’s play.

An immediate consequence of the fundamental theorem of game theory is the fact that the expected payoff for A, when both A and B use optimal strategies, is v. Then, if in the relationships (2.3) and (2.4) the number v is substituted by E(x0,y0), an analogy of the inequality (2.1) is valid:

Since the fundamental theorem of game theory applies to every m x n matrix game, in particular, it applies to games with optimal pure strategies. For these games, x0, y0are pure strategies A0, B0 and E(x0, y0) = M(A0, B0).

2.10.2 Solution to Games with a Saddle Point

Rational players take the minimax approach which maximizes the minimum possible gain (for player A) and minimizes the maximum possible loss (for player B). In other words, both players determine the worst possible payoff associated with each of their strategies and then they select that strategy that yields the best of these worst payoffs.

The minimax approach is illustrated by a game with the payoff matrix involved in the following table.

Player A will choose the largest of the row minimums (in the above case strategy A1), and player B will choose the smallest of the column maximums (in the above case strategy B2). Since the maximin (the maximum of the minimum values of the rows) is equal to the minimax (the minimum of the maximum values of the columns), the game has solution in pure strategies (it is a pure strategy game or a strictly determined game).

Optimal pure strategies in the considered game are A0=A1, B0=B2.

The value of payoff function M(A0, B0) is a saddle point of the payoff table (it is the number that is the smallest in its row and at the same time the greatest in its column). The value of the saddle point represents the value of the pure strategy game (in the solved game v = 1).

It can be proved that any game of perfect information has a saddle point and therefore it can be solved in pure strategies. This theorem guarantees the existence of a saddle point for example for a game of chess that belongs to games of perfect information. Evidently, the value of this saddle point may be either 1 (white must always win), -1 (white must be always defeated), or 0 (each game ends by a draw). The reason why the game of chess is still played consists in the fact that in practice there is no chance to construct the matrix game for the game of chess. The number of all strategies for each player in this game is estimated to be of order 1020, therefore the corresponding matrix would have 1020 *1020 elements.

Payoff matrices can contain several saddle points. For example, the game presented in Table 2.20 has two saddle points with the value 1. These points determine the following optimal pure strategies for both players: (A1,B2), (A3,B2). For both these strategies the value of the game is 1. It is valid generally that if more than one saddle point exists in a matrix game the corresponding numbers are equal and represent the value of the game. An optimal strategy of player A (B) is determined by any row (column) containing a saddle point.

Table 2.20 ”Multiple Solution Game”

It can be shown that any convex combination of alternative optimal strategies of a player represents also an optimal strategy for this player. For example in the game given in Table 2.20, a convex combination of strategies A1 and A3 in the form k1*(1,0,0)+k2*(0,0,1), where k1 ?0, k2? 0, k1+k2=1, represents an optimal mixed strategy x0=(k1,0,k2). According to the formula (2.2), the value of the considered game is E(x0,B2)=1*k1+0*0+1*k2=1.

Not every game payoff table has a saddle point and therefore not every game has solution in pure strategies. For example, in Table 2.19 the condition for existence of a saddle point is not fulfilled, since maximin = -1, minimax = 1. The maximin strategy of player A is A1 (“1 finger”), the minimax strategy of player B is B1 (“1 finger”).

Assume that on the first play, player A selects strategy A1 and player B selects strategy B1. As soon as B finds out that A is consistently choosing A1, player B´s next choice will be strategy B2 (“2 fingers”), since he will win 1 coin (choosing B1 he will lose 1 coin). When A finds out that B selects B2, he will choose A2. Then, as B finds out about this change, he will return back to B1, and so on. Both players will soon find out that it is better to shift from strategy to strategy (mix the strategies) rather than play the same one all the time as with a pure strategy approach. The best strategy is a random selection of all strategies in certain proportions that are designed by a probability distribution over all strategies. It is said that such a game has solution in mixed strategies (it is a mixed strategy game or a nonstrictly determined game).

Solutions to mixed strategy games are based on the assumption of repetition (or multiplicity of conflicts) so that an expected value approach is justified. For that reason, a mixed strategy is not attractive for a business decision maker facing a onetime decision.

3.10.4 Analytical Solution to a 2 x 2 Mixed Strategy Game

If a game with two strategies of each player and with a payoff matrix

has no saddle point, we can determine the optimal mixed strategy of player A using optimal strategy formulas for a 2 x 2 game:

(2.5)

Under the assumption that the game has no saddle point, it can be shown that the denominator of the above fractions will never be zero.

Similarly, the optimal mixed strategy of player B is given by

The value of the game can be found using one of the following four formulas:

either (2.6)

or 

or 

or 

The above formulas can be used for a calculation of optimal mixed strategies in the game with the payoff matrix in Table 2.19.

Since the probability of the dominated strategies A3 and B3 is zero, the optimal strategies of players in the Morra game given in Table 2.18 are

Since the value of the game is zero, the game is fair. Zero is a guaranteed expected gain to player A (it is more than maximin=-1) and a guaranteed expected loss to player B (it is less than minimax=1).

The obtained optimal mixed strategies can be interpreted so that player A will choose randomly strategies A1 and A2 in the proportion 2:1 and player B will change randomly strategies B1 and B2. One way to assure random choice of strategies A1 and A2 is to take 3 pieces of paper and write A1 on 2 of them and A2 on 1 of them. Then mix them up in a container and, without looking, draw one piece of paper for the first play. Return the paper and draw a second piece for the next play, and so on. Random choice of strategies B1 and B2 in the proportion 1:1 can be assured by tossing a coin (heads and tails appear with the equal probability 0.5).

2.10.5 Graphical Solution to a 2 x n or m x 2 Game

Let us consider a game with a payoff matrix

Any mixed strategy x=(x1,x2) of player A in this game can be identified with a point x1,x2 on the line segment of length 1, as in Picture 2.8.

Picture 2.8 ”Graphical Representation of Strategy (x1,x2) on a Unit Interval”

If player A chooses mixed strategy x = (x1,x2) and player B chooses pure strategy B1, the expected payoff to player A is

.

After the substitution x1 =1-x2, the right side of the above equation is a11+(a21-a11)x2. This expression represents a line that connects points a11 on the left axis and a21 on the right axis, if these vertical axes are constructed in the end points of the unit interval (see Picture 2.9).

Picture 2.9 ”Expected Payoff Line for Player A in a 2 x n Game”

In Picture 2.9, the B1 line represents the expected payoff to A from the strategy (x1,x2), when x1 varies from one to zero (x2 varies from zero to one), assuming B uses only B1.

If player B chooses B2, then

,

and a different, but similar diagram results. Superposing these two lines we obtain a drawing of the type shown in Picture 2.10.

The heavy line in Picture 2.10 represents the minimal expected payoff to player A (A‘s gain floor) no matter what player B does. The top of this heavy line determines an optimal solution to the considered game. The horizontal coordinate of this point is equal x02 (then x01=1-x02) and the vertical coordinate represents the value of the game.


Picture 2.10 ”Graphical Solution to a 2 x 2 Game for Player A”

In a similar manner, we can graphically determine an optimal strategy for player B in the considered game.

If player B chooses a mixed strategy y = (y1,y2) and player A chooses his pure strategies A1, A2, the expected payoffs to player A (losses for player B) are given by the following expressions:

The above expressions are geometrically represented by lines, as Picture 2.11 shows. The heavy line represents the maximal expected loss for player B (B‘s loss ceiling) no matter what player A does. The lowest point of the heavy line determines an optimal solution to the considered game. The horizontal coordinate of this point is equal y02 (then y01=1-y02) and the vertical coordinate represents the value of the game.

Picture 2.11 ”Graphical Solution to a 2 x 2 Game for Player B”

The above described proceeding for graphical solution to 2 x 2 games can be used for 2 x n or m x 2 games in which n >2, m >2 and no reduction of the number of strategies is possible.

In a 2 x n game we can graphically determine an optimal strategy x0 = (x01,x02). To find an optimal strategy for B, we determine (by means of the corresponding graph) his pure strategies that influence the optimal strategy of player A and the value of the game (the lines that correspond to these strategies intersect in the top of the graph of the minimal expected payoff of player A). So we can solve a 2 x 2 subgame using analytical or graphical way of solution.

A similar procedure is used solving m x 2 games.

To illustrate some special cases that can occur if a game is graphically solved, we will solve the following games.

Example 1

Solve a game with the payoff matrix

The game is graphically solved in Picture 2.12.

Picture 2.12 ”Graphical Representation of Example 1”

An optimal strategy for player A is x0 the value of the game is 

The top of the graph of the minimal expected payoff of player A is an intersection of three lines that correspond to strategies B1, B2, B3, and therefore three 2 x 2 subgames should be solved.

The subgame with the payoff table

has a saddle point 1 that represents the value of the considered subgame. This value is different from the value of the game 2/3 and therefore strategies B1 and B2 cannot be combined.

In the subgame with the payoff table

an optimal strategy of player B is , the value of the game is 

In the subgame with the payoff table

an optimal strategy of player B is , the value of the game is 

In the original 2 x 3 game, for player B either the strategy  is optimal.

Any convex combination of the above vectors is also an optimal strategy for B. To verify this statement we will express a convex combination of the considered vectors in the form

y0,

where 0 ? l? 1 , and we will show the following properties of the vector y0:
 

The components of the vector y0 have the following values:  It is easy to verify that these values are nonnegative for lÎá 0,1? and their sum is 1.

According to the formula (2.2)

E(x0,y0)= 
 
 

Example 2

Solve a game with the payoff matrix

The game is graphically solved in Picture 2.13.

Picture 2.13 ”Graphical Representation of Example 2”

From the graph it is apparent that an optimal strategy to player A is his second strategy and that the value of the game is 1. The minimum on the graph of the expected maximal loss of player B represents the whole line segment  with end points P=[1/5,1], R= [3/5,1]. Hence, player B has infinitely many optimal strategies (1-y02, y02), where y02Îá1/5,3/5? (his optimal strategy is any convex combination of strategies (4/5,1/5) and (2/5,3/5)). Assuming player A can deviate from his optimal strategy (he chooses strategy A1 or A3) the most advantageous strategy of player B is (3/5, 2/5). The expected loss of player B for this strategy is zero.

2.10.6 Conversion a m x n Game into a Linear Programming Problem

Every two-person zero-sum game can be formulated and solved as a linear programming problem (see Section 2.1).

Let us consider a m x n game with positive elements of game payoff matrix . This requirement does not entail any loss of generality since adding the same positive quantity to all payoff values does not alter the strategic structure of the game (a strategically equivalent game is created).

Conversion of a matrix game into a linear programming problem yields the following problems:

Problem (1) for player A:


 
 

From the solution to this problem we can derive the value of the corresponding game v (with positive values of payoff!) and the optimal strategy for player A x0=(x01, x02, …, x0m). The following relationships hold: 

Problem (2) for player B:

.

.

It is evident that problem (2) is a dual problem (see Section 2.4) to the problem (1).

Therefore fmax = zmin. Between the solution to the problem (2) and the solution to the game from player B’s point of view the following relationships hold: 

If a positive quantity has been added to all payoff values of the original payoff matrix, this quantity must be subtracted from the established value of the game in order to determine the value of the game with the original payoff matrix.

It is evident that the problems (1) and (2) have always solutions, i.e. optimal strategies of both players always exist (it is in accordance with the fundamental theorem of game theory).

Conversion of a m x n game into a linear programming problem will be illustrated by the well-known game ”scissors - stone – paper”. In this game two players simultaneously present a hand in one of three positions: an open hand (paper), a closed fist (stone), or two open fingers (scissors). The payoff is 1 unit according to the rule ”Paper covers stone, stone breaks scissors, and scissors cut paper”. If both players present a hand in the same position, the payoff is 0.

The payoff matrix for this game is contained in the following table.

The payoff matrix has no saddle point and no dominated strategies. The only way how to solve this game is to convert it into a linear programming problem. We proceed as follows:

Step 1. We convert the payoff matrix into a positive matrix by adding 2 to each element in the matrix. The new matrix (let us mark it K) is

K

Step 2. We set up two corresponding linear programming problems:

A) Minimize z = t1 + t2 + t3
Subject to 2t1 + 3t2 + t3 ? 1
t1 + 2t2 + 3t3 ? 1
3t1 + t2 + 2t3? 1
t1 ? 0t2 ?0 t3 ?0

B) Maximize f= s1 + s2 + s3
Subject to 2s1 + s2 + 3s3 ? 1
3s1 + 2s2 + s3? 1
s1 + 3s2 + 2s3 ? 1
s1 ? 0s2 ?0 s3 ?0
 
 

Step 3. We solve the maximization problem (for player B) using the simplex method. We introduce slack variables s4, s5, s6 and write the initial simplex tableau. We use rules for the entering variable and the leaving variable till the test of optimality is fulfilled (see Section 2.2.2). The initial and final simplex tableau of the considered linear programming problem is contained in Table 2.21 and Table 2.22.

Table 2.21 ”Initial Simplex Tableau”
 

Table 2.22 ”Final Simplex Tableau”

Maximum f= s1 + s2 + s3 occurs at s1 = s2 = s3 =and is .

The optimal solution to the minimization problem (for player A) can be read from the bottom row of the final simplex tableau for the dual problem above. Thus, from this row we conclude that the optimal solution for player A is t1 = t2 = t3 =

Step 4. We use the solutions in step 3 to find the value v1 for the game with the payoff matrix K  and to establish optimal strategies, and we determine the value v  for the original game.

According to the obtained results, each player should choose strategies ”scissors”, ”stone”, ”paper” randomly in the same proportion. The game is fair, since its value is zero.

2.10.7 Games against Nature

Games against nature represent matrix games with only one rational player. The second player (nature, fate, or an institution) is indifferent to the result of the game. We can say that nature may be treacherous, but it is not malicious.

Strategies of nature represent non-controllable and unpredictable situations that can substantially influence payoffs of the decision maker. One cannot recommend an optimal strategy to nature and therefore selecting optimal strategies is reasonable only for the decision maker.

Solving games against nature can represent choosing the most advantageous alternative of decision by means of rules described in Sections 3.2, 3.3, 3.4. Only if the decision maker is a great pessimist expecting an enemy behavior of nature he can solve a game against nature using the minimax approach. In the comparison with the mentioned rules for selecting the best alternative, the minimax approach makes possible to combine several advantageous alternatives. Moreover, value of the game against nature is important for the decision maker, since if nature deviates from its computed optimal strategy (nature always chooses a pure strategy), he can gain more than the value of the game.

Minimax approach to solving a game against nature is illustrated by the following example.

A large farm grows both wheat and rice. Rice does better in wetter years and wheat does better in normal or drier years. Based on twenty-year records and current market prices, a consulting firm provides the farm management with the following payoff matrix, where the entries represent a yearly gain (in millions of dollars).


Using the minimax approach, we can eliminate second strategy of nature (it is dominated by third strategy). The subgame with the payoff matrix

has no saddle point and therefore an optimal strategy for the farm and the value of the ”game” can be determined by means of the formulas (2.5) and (2.6). These formulas yield the following results:

x01 =, x02=, v =.

The above numbers can be interpreted so that for each year the payoff matrix holds, the farm should choose the area of wheat and rice in the proportion 7:2. Its expected yearly gain cannot be worse than  $mil. For example, if the weather is normal, the expected yearly gain of the farm will be

.