Using a recursive algorithm to find the value of a game

In summary, the expected value of the game is 0.4583 if you are required to play until all the balls have been drawn. However, if you have the option to stop at any point, the expected value increases to 0.64583. This is because having more options available can only improve the expected payoff.
  • #1
fignewtons
28
0

Homework Statement


Imagine you are playing a game with me, of drawing balls from a box. There are two blue balls and two red balls. They are picked with equal probability, and are drawn without replacement. If you draw a blue ball, I give you $1. If you draw a red ball, you pay me $1.25. What is the expected value of this game to you if you are allowed to stop at any point?

Homework Equations


I drew a lattice diagram that shows the value of the game in all the possible states of the balls.
(b,r) means that at this point, there are b blue balls and r red balls.

That is, something like this:
FullSizeRender.jpg

The boxed values indicate the value at each point in time, calculated recursively.

The Attempt at a Solution


I was able to figure out the value of the game if you were required to play until the box was completely empty using a recursive method.
the expected value at any point (b,r), V(b,r) = max((b/(b+r))(1 + V(b-1, r)) + (r/(b+r))(-1.25 + V(b, r-1)),0)
where V(b,r) is the method of value of the game at the situation with b blue balls and r red balls.
So V(2,2) is the value at the beginning of the complete game, so it was 0.4583 approximately if we were required to play until all the balls had been drawn.

I'm not sure how to compute the value of the game to you if you have the option of stopping whenever you want. I was told it was (1/3) but I don't know how this number comes about. Can someone explain how the way you value the game given this option is different from the one above, and how it is calculated? Thanks.
 
Physics news on Phys.org
  • #2
Your tree is a 're-combining tree' in which a node can have more than one edge coming in from the left. This problem is easier to analyse with a non-recombining tree in which each node has either one or two edges coming out of it to the right, but only ever one edge coming in from the left.

Let each node be a draw of a ball, except for the first node (root node), which is the starting position. At each node write five things (give yourself plenty of space on the paper to do this):
  • B or R, being the colour drawn at that node (this is empty for the root node)
  • a pair (b,r) being the number of blues and reds remaining after that draw (this is (2,2) for the root node)
  • the payoff arising from that draw, which is 1 for B and -1.25 for R (this is empty for the root node)
  • the cumulative payoff (this is 0 for the root node)
  • the probability of reaching that node
The leaf nodes are the nodes at which all blue balls have been drawn, ie the second datum is (0,2), (0,1) or (0,0). 'Leaf node' means a node from which you draw no outgoing lines to the right. The reason we stop when all blues have been drawn is that any further draws (if any are possible) can only give losses.

For each leaf node, multiply the cumulative payoff by the probability (ie the last two items in the above list). Add that across leaves, and there's your expected value.

I only had fifteen nodes in my final tree, and the expected value was indeed 1/3.
 
  • Like
Likes scottdave
  • #3
figNewtons said:

Homework Statement


Imagine you are playing a game with me, of drawing balls from a box. There are two blue balls and two red balls. They are picked with equal probability, and are drawn without replacement. If you draw a blue ball, I give you $1. If you draw a red ball, you pay me $1.25. What is the expected value of this game to you if you are allowed to stop at any point?

Homework Equations


I drew a lattice diagram that shows the value of the game in all the possible states of the balls.
(b,r) means that at this point, there are b blue balls and r red balls.

That is, something like this:
View attachment 206255
The boxed values indicate the value at each point in time, calculated recursively.

The Attempt at a Solution


I was able to figure out the value of the game if you were required to play until the box was completely empty using a recursive method.
the expected value at any point (b,r), V(b,r) = max((b/(b+r))(1 + V(b-1, r)) + (r/(b+r))(-1.25 + V(b, r-1)),0)
where V(b,r) is the method of value of the game at the situation with b blue balls and r red balls.
So V(2,2) is the value at the beginning of the complete game, so it was 0.4583 approximately if we were required to play until all the balls had been drawn.

I'm not sure how to compute the value of the game to you if you have the option of stopping whenever you want. I was told it was (1/3) but I don't know how this number comes about. Can someone explain how the way you value the game given this option is different from the one above, and how it is calculated? Thanks.

If you are allowed to stop or not stop at any time, the expected payoff cannot be worse than if you are forced to play to the end! The reason is that if you have more options available you cannot do worse. So, the maximum expected payoff when you can stop or not as you see fit must be at least as large as your 0.4583 figure, and this is larger than 1/3.

Using your recursive scheme
$$V(b,r) = \max \left(0,\frac{b}{b+r} [1 + V(b-1,r)] + \frac{r}{b+r} [-(5/4) + V(b, r-1)] \right) , $$
with the boundary condition that ##V(b,0) = b ## (just pick all ##b## remaining balls and get +$1 each) and ##V(0,r) = 0## (stop picking balls). Doing this I get ##V(2,2) = 31/48 \doteq 0.64583##.
 
Last edited:
  • #4
Ray Vickson said:
Using your recursive scheme
$$V(b,r) = \max \left(0,\frac{b}{b+r} [1 + V(b-1,r)] + \frac{r}{b+r} [-(5/4) + V(b, r-1)] \right) , $$
with the boundary condition that ##V(b,0) = b ## (just pick all ##b## remaining balls and get +$1 each) and ##V(0,r) = 0## (stop picking balls). Doing this I get ##V(2,2) = 31/48 \doteq 0.64583##.

I also get 0.4583 = 11/24, so I think you must have made a mistake somewhere.
V(1,1) = 1/2 * (V(0,1)+1) + 1/2 (V(1,0) - 5/4) = 1/2 + 1/2(1 - 5/4) = 3/8
V(2,1) = 2/3(V(1,1) + 1) + 1/3(V(2,0)-5/4) = 2/3(3/8 + 1) + 1/3(2-5/4) = 7/6
V(1,2) = 1/3(V(0,2) + 1_ + 2/3(V(1,1) -5/4) = 1/3(0+1) + 2/3(3/8 - 5/4) = 1/3 - 14/24 .it will become negative, so V(1,2) = 0.
V(2,2) = 1/2(V(1,2) + 1) + 1/2 (V(2,1) -5/4) = 1/2(0+1) + 1/2 (7/6 - 5/4) = 1/2 - 1/2(1/12) = 11/24
 

Attachments

  • upload_2017-6-29_11-27-32.png
    upload_2017-6-29_11-27-32.png
    12.7 KB · Views: 412
  • #5
willem2 said:
I also get 0.4583 = 11/24, so I think you must have made a mistake somewhere.
V(1,1) = 1/2 * (V(0,1)+1) + 1/2 (V(1,0) - 5/4) = 1/2 + 1/2(1 - 5/4) = 3/8
V(2,1) = 2/3(V(1,1) + 1) + 1/3(V(2,0)-5/4) = 2/3(3/8 + 1) + 1/3(2-5/4) = 7/6
V(1,2) = 1/3(V(0,2) + 1_ + 2/3(V(1,1) -5/4) = 1/3(0+1) + 2/3(3/8 - 5/4) = 1/3 - 14/24 .it will become negative, so V(1,2) = 0.
V(2,2) = 1/2(V(1,2) + 1) + 1/2 (V(2,1) -5/4) = 1/2(0+1) + 1/2 (7/6 - 5/4) = 1/2 - 1/2(1/12) = 11/24
Sorry: you are correct; I mixed up the 1/3 and 2/3 when computing V(1,2).
Anyway, the main point is that the OP has already taken care of the possibility of stopping, because that is built into his recursion. Also: the answer is definitely > 1/3.
 
  • #6
The attached diagram demonstrates the basis on which I calculated the answer to be 1/3. The payment to the player at each node is s. The probability of reaching a node is P. The leaves are shaded green. We add up Σs across those leaves, multiplying by the probabilities P, which are all 1/6, and we get 1/3.

EDIT: HOWEVER, I'm now convinced the value is what Ray says. See below.
 

Attachments

  • ball draw diagram 2.png
    ball draw diagram 2.png
    22 KB · Views: 408
Last edited:
  • #7
I've just realized that the value is indeed more than 1/3.
The algorithm that gives 1/3 only stops when there are no more blues, on the simple expectation that one can only lose from then on by further draws.

However, it can be advantageous to stop before then, even if there are remaining blues, because the expectation of payoffs from continued draws may be less than the expectation of stopping, which is 0.

That will be the case when there are 1 blue and 2 reds remaining, because the expected payoff from continuing to draw until the blue is drawn is negative.

In the diagram, that's the top node in the second column labelled

Blue
(1,2)
P=0.5
s=1
Σs=1

If we remove the descendant nodes of that node, making it a leaf, the expectation over the leaves becomes ##0.458\dot3##

If anybody's interested, here's R code for calculating V for any combination of reds and blues. It gives ##0.458\dot3## for ##V(2,2)##.

Code:
V <- function (b,r){print(paste('b=',b,'r=',r))
  if (r==0)
    b else
      if (b==0)
        0 else
        max(0, b* (1 + V(b-1,r)) + r*(-(5/4) + V(b, r-1)) )/(b+r)
}
V(2,2)
and attached is a revised diagram.
 

Attachments

  • ball draw diagram 2a.png
    ball draw diagram 2a.png
    24.4 KB · Views: 426
Last edited:
  • #8
andrewkirk said:
The attachment process didn't cooperate so it didn't appear when I expected to. It should be visible now though.
You do have an extraneous arrow.

After initially picking two Reds, you have two arrows from that; one to a Blue (fine), one to a Red (not possible).

Your numbers are fine.
 
  • #9
andrewkirk said:
I've just realized that the value is indeed more than 1/3.
The algorithm that gives 1/3 only stops when there are no more blues, on the simple expectation that one can only lose from then on by further draws.

However, it can be advantageous to stop before then, even if there are remaining blues, because the expectation of payoffs from continued draws may be less than the expectation of stopping, which is 0.

That will be the case when there are 1 blue and 2 reds remaining, because the expected payoff from continuing to draw until the blue is drawn is negative.

In the diagram, that's the top node in the second column labelled

Blue
(1,2)
P=0.5
s=1
Σs=1

If we remove the descendant nodes of that node, making it a leaf, the expectation over the leaves becomes ##0.458\dot3##

If anybody's interested, here's R code for calculating V for any combination of reds and blues. It gives ##0.458\dot3## for ##V(2,2)##.

Code:
V <- function (b,r){print(paste('b=',b,'r=',r))
  if (r==0)
    b else
      if (b==0)
        0 else
        max(0, b* (1 + V(b-1,r)) + r*(-(5/4) + V(b, r-1)) )/(b+r)
}
V(2,2)
and attached is a revised diagram.

A corresponding Maple procedure is:

V:=proc(a::nonnegint,b::nonnegint)
> option remember;
> if a = 0 then 0
> elif b = 0 then a
> else max(0,a/(a+b)*(1+V(a-1,b))+b/(a+b)*(-5/4+V(a,b-1)))
> end if;
> end proc;

This gives V(2,2) = 11/24 = 0.46786 and V(5,3) = 449/224 = 2.00446, etc.
 

1. How does a recursive algorithm work?

A recursive algorithm is a programming technique in which a function calls itself repeatedly until a certain condition is met. It involves breaking down a problem into smaller subproblems and using the same function to solve each subproblem, eventually leading to the solution of the original problem.

2. What is the significance of using a recursive algorithm to find the value of a game?

A recursive algorithm can be useful in finding the value of a game because it allows for a systematic approach to evaluating all possible outcomes and choosing the best one. It also allows for more complex and strategic decision-making in games with multiple moves and outcomes.

3. What are the advantages of using a recursive algorithm over other methods?

One advantage of using a recursive algorithm is that it can often result in a more elegant and concise solution to a problem. It also allows for efficient use of memory, as the function calls are stored in a stack rather than creating new variables for each subproblem.

4. Are there any limitations or drawbacks to using a recursive algorithm?

One limitation of using a recursive algorithm is that it can be less efficient than other methods, particularly when dealing with large data sets. It can also be more difficult to debug and understand, as the function calls can become nested and complex.

5. How do you know when to use a recursive algorithm to find the value of a game?

The decision to use a recursive algorithm to find the value of a game depends on the complexity and structure of the game. If the game involves multiple moves and outcomes that can be broken down into smaller subproblems, a recursive algorithm may be a good approach. It is also important to consider the time and memory constraints of the problem before deciding on a solution.

Similar threads

  • Calculus and Beyond Homework Help
Replies
29
Views
1K
  • Programming and Computer Science
Replies
22
Views
3K
  • Set Theory, Logic, Probability, Statistics
Replies
14
Views
917
  • Set Theory, Logic, Probability, Statistics
Replies
3
Views
955
Replies
9
Views
2K
  • Set Theory, Logic, Probability, Statistics
Replies
1
Views
826
  • Calculus and Beyond Homework Help
Replies
4
Views
2K
Replies
9
Views
961
  • Calculus and Beyond Homework Help
Replies
12
Views
1K
Back
Top