# Using a recursive algorithm to find the value of a game

Tags:
1. Jun 28, 2017

### fignewtons

1. The problem statement, all variables and given/known data
Imagine you are playing a game with me, of drawing balls from a box. There are two blue balls and two red balls. They are picked with equal probability, and are drawn without replacement. If you draw a blue ball, I give you $1. If you draw a red ball, you pay me$1.25. What is the expected value of this game to you if you are allowed to stop at any point?

2. Relevant equations
I drew a lattice diagram that shows the value of the game in all the possible states of the balls.
(b,r) means that at this point, there are b blue balls and r red balls.

That is, something like this:

The boxed values indicate the value at each point in time, calculated recursively.

3. The attempt at a solution
I was able to figure out the value of the game if you were required to play until the box was completely empty using a recursive method.
the expected value at any point (b,r), V(b,r) = max((b/(b+r))(1 + V(b-1, r)) + (r/(b+r))(-1.25 + V(b, r-1)),0)
where V(b,r) is the method of value of the game at the situation with b blue balls and r red balls.
So V(2,2) is the value at the beginning of the complete game, so it was 0.4583 approximately if we were required to play until all the balls had been drawn.

I'm not sure how to compute the value of the game to you if you have the option of stopping whenever you want. I was told it was (1/3) but I don't know how this number comes about. Can someone explain how the way you value the game given this option is different from the one above, and how it is calculated? Thanks.

2. Jun 28, 2017

### andrewkirk

Your tree is a 're-combining tree' in which a node can have more than one edge coming in from the left. This problem is easier to analyse with a non-recombining tree in which each node has either one or two edges coming out of it to the right, but only ever one edge coming in from the left.

Let each node be a draw of a ball, except for the first node (root node), which is the starting position. At each node write five things (give yourself plenty of space on the paper to do this):
• B or R, being the colour drawn at that node (this is empty for the root node)
• a pair (b,r) being the number of blues and reds remaining after that draw (this is (2,2) for the root node)
• the payoff arising from that draw, which is 1 for B and -1.25 for R (this is empty for the root node)
• the cumulative payoff (this is 0 for the root node)
• the probability of reaching that node
The leaf nodes are the nodes at which all blue balls have been drawn, ie the second datum is (0,2), (0,1) or (0,0). 'Leaf node' means a node from which you draw no outgoing lines to the right. The reason we stop when all blues have been drawn is that any further draws (if any are possible) can only give losses.

For each leaf node, multiply the cumulative payoff by the probability (ie the last two items in the above list). Add that across leaves, and there's your expected value.

I only had fifteen nodes in my final tree, and the expected value was indeed 1/3.

3. Jun 29, 2017

### Ray Vickson

If you are allowed to stop or not stop at any time, the expected payoff cannot be worse than if you are forced to play to the end! The reason is that if you have more options available you cannot do worse. So, the maximum expected payoff when you can stop or not as you see fit must be at least as large as your 0.4583 figure, and this is larger than 1/3.

$$V(b,r) = \max \left(0,\frac{b}{b+r} [1 + V(b-1,r)] + \frac{r}{b+r} [-(5/4) + V(b, r-1)] \right) ,$$
with the boundary condition that $V(b,0) = b$ (just pick all $b$ remaining balls and get +\$1 each) and $V(0,r) = 0$ (stop picking balls). Doing this I get $V(2,2) = 31/48 \doteq 0.64583$.

Last edited: Jun 29, 2017
4. Jun 29, 2017

### willem2

I also get 0.4583 = 11/24, so I think you must have made a mistake somewhere.
V(1,1) = 1/2 * (V(0,1)+1) + 1/2 (V(1,0) - 5/4) = 1/2 + 1/2(1 - 5/4) = 3/8
V(2,1) = 2/3(V(1,1) + 1) + 1/3(V(2,0)-5/4) = 2/3(3/8 + 1) + 1/3(2-5/4) = 7/6
V(1,2) = 1/3(V(0,2) + 1_ + 2/3(V(1,1) -5/4) = 1/3(0+1) + 2/3(3/8 - 5/4) = 1/3 - 14/24 .it will become negative, so V(1,2) = 0.
V(2,2) = 1/2(V(1,2) + 1) + 1/2 (V(2,1) -5/4) = 1/2(0+1) + 1/2 (7/6 - 5/4) = 1/2 - 1/2(1/12) = 11/24

#### Attached Files:

File size:
17.8 KB
Views:
23
5. Jun 29, 2017

### Ray Vickson

Sorry: you are correct; I mixed up the 1/3 and 2/3 when computing V(1,2).
Anyway, the main point is that the OP has already taken care of the possibility of stopping, because that is built in to his recursion. Also: the answer is definitely > 1/3.

6. Jun 29, 2017

### andrewkirk

The attached diagram demonstrates the basis on which I calculated the answer to be 1/3. The payment to the player at each node is s. The probability of reaching a node is P. The leaves are shaded green. We add up Σs across those leaves, multiplying by the probabilities P, which are all 1/6, and we get 1/3.

EDIT: HOWEVER, I'm now convinced the value is what Ray says. See below.

#### Attached Files:

• ###### ball draw diagram 2.png
File size:
26.9 KB
Views:
21
Last edited: Jun 29, 2017
7. Jun 29, 2017

### andrewkirk

I've just realised that the value is indeed more than 1/3.
The algorithm that gives 1/3 only stops when there are no more blues, on the simple expectation that one can only lose from then on by further draws.

However, it can be advantageous to stop before then, even if there are remaining blues, because the expectation of payoffs from continued draws may be less than the expectation of stopping, which is 0.

That will be the case when there are 1 blue and 2 reds remaining, because the expected payoff from continuing to draw until the blue is drawn is negative.

In the diagram, that's the top node in the second column labelled

Blue
(1,2)
P=0.5
s=1
Σs=1

If we remove the descendant nodes of that node, making it a leaf, the expectation over the leaves becomes $0.458\dot3$

If anybody's interested, here's R code for calculating V for any combination of reds and blues. It gives $0.458\dot3$ for $V(2,2)$.

Code (Text):

V <- function (b,r){print(paste('b=',b,'r=',r))
if (r==0)
b else
if (b==0)
0 else
max(0, b* (1 + V(b-1,r)) + r*(-(5/4) + V(b, r-1)) )/(b+r)
}
V(2,2)

and attached is a revised diagram.

#### Attached Files:

• ###### ball draw diagram 2a.png
File size:
29.1 KB
Views:
23
Last edited: Jun 29, 2017
8. Jun 29, 2017

### SammyS

Staff Emeritus
You do have an extraneous arrow.

After initially picking two Reds, you have two arrows from that; one to a Blue (fine), one to a Red (not possible).

9. Jun 30, 2017

### Ray Vickson

A corresponding Maple procedure is:

V:=proc(a::nonnegint,b::nonnegint)
> option remember;
> if a = 0 then 0
> elif b = 0 then a
> else max(0,a/(a+b)*(1+V(a-1,b))+b/(a+b)*(-5/4+V(a,b-1)))
> end if;
> end proc;

This gives V(2,2) = 11/24 = 0.46786 and V(5,3) = 449/224 = 2.00446, etc.