Programming a recursive expected value card game

In summary, the conversation discusses the expected value of drawing cards randomly from a deck of 52 cards, with 26 red and 26 black, without replacement. If the card drawn is red, the player gains a dollar, and if it is black, the player loses a dollar. The expected value is calculated using a recursive formula, and the conversation then explores different strategies for programming this formula. Ultimately, the conversation ends with a discussion of using machine learning to optimize outcomes in this game.
  • #1
member 428835
Hi PF!

Take a deck of cards. 26 are red and 26 are black. You draw cards randomly without replacement: if the card is red you gain a dollar, if black you lose a dollar. What's the expected value?

My thought process was to look at simple cases and build to a 52 card deck. We know if ##r=0## expected payout is 0, and if ##b=0## expected payout is ##r##. Otherwise the following rule should govern the expected payout ##E(r,b)## of the process:

$$E(r,b) = \frac{r}{r+b}(1 + E(r-1,b) ) + \frac{b}{r+b}(-1 + E(r,b-1) )$$

Obviously we draw another card when ##E(r,b)>0##, else we stop. My question is how to program this? I've tried the following but am stumped as to what's next. Can you help?

Python:
def cards(r, b):
    def expVal(r, b):
        if r == 0:
            return 0
        elif b == 0:
            return r
        else:
            return r/(r + b) * (1 + expVal(r-1, b)) + b/(r + b) * (-1 + expVal(r, b-1))
    expVal(r, b)
if __name__ == "__main__":
    r = 26
    b = 26
    cards(r, b)
 
  • Like
Likes PeroK and FactChecker
Technology news on Phys.org
  • #2
In your problem statement, are you allowed to stop when there are no more red cards left? Are you allowed to keep going when there are no more black cards left? You have not defined the end-game rules.
1) Make sure that your logic in lines 3-6 match the end-game rules.
2) You need to print the result.
 
  • #3
joshmccraney said:
Take a deck of cards. 26 are red and 26 are black. You draw cards randomly without replacement: if the card is red you gain a dollar, if black you lose a dollar. What's the expected value?
If these are the only rules, you would always win if you quit while ahead or at worse, have a draw. I like those odds. It would make an interesting machine learning problem to determine the optimal stopping point when you're ahead.
 
  • Like
Likes FactChecker
  • #4
Borg said:
If these are the only rules, you would always win if you quit while ahead or at worse, have a draw. I like those odds. It would make an interesting machine learning problem to determine the optimal stopping point when you're ahead.
You don't need anything that complicated. You can just change the program to let retval return 0, if it would ever return a negative number.
The other problem with the program is that it is too slow, because it calls expval many times with the same arguments. You'll need memoization.
if you put
Python:
rom functools import cache
@cache
before the definition of expval (and drop the redundant cards(b,r) definition) this will compute instantaneously, instead of never.
 
  • Love
Likes PeroK
  • #5
willem2 said:
The other problem with the program is that it is too slow, because it calls expval many times with the same arguments.
Does it? It seems that every call will have one fewer red or one fewer black -- never repeating the same two arguments.
 
  • #6
One principle is that as more cards are drawn the winnings that you will settle for must reduce. The first strategy to test is the quit as soon as you are 1 card ahead. You could calculate how much that wins.

The next strategy is to quit when you are two cards ahead until some point and then quit when you are one card ahead after that point. You have one variable here. If you optimise that then does that fix the point at which you drop to one card?

Then, you quit three cards ahead, then two then one. That means two variable points. But, we already have the optimum point for the switch from 2 to 1. Now we calculate the optimum point for the switch from 3 to 2.

We continue until the optimum point is as early as possible.

My guess is that we would quit at 3 cards ahead after 3 cards, then 2 cards ahead at some point and 1 card ahead at some final point. I doubt it would be worth gambling if you 3 cards ahead, possibly 4 at the very maximum and perhaps a maximum of 2 would be optimal.
 
  • #7
That's why I thought it would be an interesting ML program. It would also take into account how many cards of each type are remaining when it decides whether to quit.
 
  • #8
As soon as you are ahead, without replacement, the odds turn against you, and if you continue to draw all the cards, then you will only break even. The converse is that if you are behind, then the odds turn in your favor and you know that you can always break even by continuing to draw all the cards.
So I think that the optimum strategy is to continue until you are ahead and then stop immediately (or until there are no more cards)

UPDATE: @Borg has pointed out the error of this thinking. See posts #15 and #16.
 
Last edited:
  • #9
I don't think that it's that simple. The ultimate goal in machine learning is to maximize outcomes. Since there are cases where you can earn more than $1, there is the possibility of earning more if there are still good cards left. Over thousands of games, it will learn when to try for more in order to increase the total amount won per game.
 
  • #10
Borg said:
I don't think that it's that simple. The ultimate goal in machine learning is to maximize outcomes. Since there are cases where you can earn more than $1, there is the possibility of earning more if there are still good cards left. Over thousands of games, it will learn when to try for more in order to increase the total amount won per game.
There is no need for machine learning when the analysis gives an answer. Once you are ahead, it means that there are more black cards remaining than red cards. Therefore, the expected return for continuing the game is negative. So you should stop as soon as you get ahead. Do not continue when the odds are against you.
 
  • #11
So you would quit if your first draw was a winner? I could definitely see being able to beat that strategy over the long term.
 
  • #12
Borg said:
So you would quit if your first draw was a winner? I could definitely see being able to beat that strategy over the long term.
Yes, I would quit. It would not be a good bet to continue the game with 25 red cards and 26 black cards. I would expect, on average, to lose the following play.
 
  • #13
However, you have a chance to double your money. Even if the next card was a loser, you would still have a very signicant chance of eventually getting back to +1 again. The odds of that happening for this example would be higher than being stuck with zero.
 
  • #14
Borg said:
However, you have a chance to double your money. Even if the next card was a loser, you would still have a very signicant chance of eventually getting back to +1 again. The odds of that happening for this example would be higher than being stuck with zero.
If the next card is a loser, then there are 25 red and 25 black cards remaining. I would expect, on average, to break even by continuing play. But that is worse than the +1 advantage that I had before I picked the second card. So it was not wise to pick the second card.
 
  • #15
Only if you never get above zero in 50 draws. That's little different than the 52 cards in the starting scenario. Statistically speaking, the odds are very high that you would get in positive territory at some point for either situation. Over time, the attempt to go for a second dollar (with just under 50-50 odds) would win me more money.
 
  • Like
Likes FactChecker
  • #16
Borg said:
Only if you never get above zero in 50 draws. That's little different than the 52 cards in the starting scenario. Statistically speaking, the odds are very high that you would get in positive territory at some point for either situation. Over time, the attempt to go for a second dollar (with just under 50-50 odds) would win me more money.
AHa! I see your point. I stand corrected. If there is a high enough variance so that you can expect some greater advantage at some point later on, then it is better to play for that. You should stop when your current advantage is greater or equal to the expected maximum advantage if you continue the game.
That is really worth some thought. Thanks.
 
  • Like
Likes Borg
  • #17
Early in the game, I would expect that there might even be favorable odds of going for a third dollar. What to go for and when in a problem like this is something that machine learning excels in.
 
  • Like
Likes PeroK and FactChecker
  • #18
FactChecker said:
Does it? It seems that every call will have one fewer red or one fewer black -- never repeating the same two arguments.
expval (2,2) calls expval(1,2) and expval(2,1) and both call expval(1,1)
If you memoize, you have to compute a maxium of r*b nodes, and if you don't
With r=b=10, you only have to to 120 function calls, opposed to 369511 without memoization.
r=b=26 will just never get done without memoization
 
  • Like
Likes PeroK and FactChecker
  • #19
FactChecker said:
AHa! I see your point. I stand corrected. If there is a high enough variance so that you can expect some greater advantage at some point later on, then it is better to play for that. You should stop when your current advantage is greater or equal to the expected maximum advantage if you continue the game.
That is really worth some thought. Thanks.
If we take a simple example. You draw a red card first.

If your strategy is to quit, then you win $1 every time this happens.

If your strategy is to play on to the second card, then you win $2 almost 50% of the time and are back to even about 50% of the time. That's an average win of almost the same as above, but you can still win $1 on most of the remaining games.

My first simulation gives an average win of ##0.963## for the first strategy; ##1.200## if you try to get two cards ahead only until the second card (then quit whenever you are one card ahead after that); ##1.318## if you try to get two cards ahead until the 4th card; etc.

And, in fact, the optimum strategy is to keep playing until the 48th (or 50th) card looking to get two cards ahead.

You can confirm analytically that it's better to play on from 46 to 48 as follows.

The test case is where you are ##+1## after 47 cards. With the "46" strategy, you would quit now and take the $1 whenever this happens. With the "48" strategy, you would play on and:

With probability ##\frac 2 5## win $2.

With probability ##\frac 3 5## be level after 48 cards.

With probability ##\frac 3 {10}## win $1 by quitting after 49 cards.

With probability ##\frac 3 {10}## be one card down after 49 cards. And, so have a final probability of ##\frac 1 {10}## of winning $1 after 51 cards.

That all adds up to an expected win of $1.2 dollars for the "48" strategy against $1 for the "46" strategy in these key cases. All other cases being equal.

Finally, it's just as good to play on if you are one card ahead after 49 cards from the "48" to the "50" strategy as your expected winning would be:

With probability ##\frac 1 3## you win $2.

With probability ##\frac 2 3## you are level after 50 cards, so with probability ##\frac 1 3## you win $1.

Both strategies win $1 on average in this key case.

I'll look at simulating playing on trying to get 3 cards ahead now. This must be better again.

PS The expected win is about $1.756 per game with the "48" or "50" strategy.
 
  • Like
Likes FactChecker
  • #20
The optimum strategy is to look for a win of $6 early on and then settle for $5, $4 etc. as more cards have gone.

The full optimum strategy is:

Look for a win of $6 until the 6th, 8th or 10th cards (I can't detect much difference in these cases). I got the best result using the 6th card.

Settle for a win of $5 from the 7th until the 25th card

Settle for a win of $4 win from the 26th until the 32nd card.

Settle for a $3 win from the 33rd until the 43rd card.

Settle for a $2 win from the 44th card until the 50th card.

Settle for $1 on the 51st card.

The average win is approx $2.62 per game.

There is only a marginal difference if you don't look for the $6 wins and just settle for a $5 win (from the start).
 
  • Like
Likes Borg and FactChecker
  • #21
willem2 said:
expval (2,2) calls expval(1,2) and expval(2,1) and both call expval(1,1)
If you memoize, you have to compute a maxium of r*b nodes, and if you don't
With r=b=10, you only have to to 120 function calls, opposed to 369511 without memoization.
r=b=26 will just never get done without memoization
I see. I stand corrected.
 
  • #22
PS I ran some simulations with 10 million trials. The best result was looking for +6 until the 8th card. That gave a return of $2.622 per game.

PPS Here's a breakdown for the 10 million games in terms of the various wins:

$6: 0.0283
$5: 0.2213
$4: 0.1325
$3: 0.1539
$2: 0.1314
$1: 0.0912
$0: 0.3327

It's amazing how often you can win $5-$6. It's 25% of the time.
 
  • Like
Likes FactChecker and Borg
  • #23
I fixed the OP's program and it produced the answer of $2.6245.

I also used it to confirm the cutoff points (which resolved the marginal cases from my simulation):

There is no point in looking for +7

Settle for +6

Settle for +5 from the 11th card (adjustment from the 9th)

Settle for +4 from the 24th card (adjustment from the 26th)

Settle for +3 from the 35th card (adjustment from the 33rd)

Settle for +2 from the 44th card

Settle for +1 from either the 49th or 51st card

I reran the simulation over 10 million trials and the adjustments pushed the average up to 2.6243.
 
Last edited:
  • Like
Likes FactChecker

FAQ: Programming a recursive expected value card game

1. What is a recursive expected value card game?

A recursive expected value card game is a card game where the outcome of each round depends on the outcome of previous rounds. It involves calculating the expected value of each possible move, taking into account the potential outcomes of future rounds.

2. How is recursion used in programming a recursive expected value card game?

Recursion is used in programming a recursive expected value card game in order to calculate the expected value of each possible move. This involves breaking down the problem into smaller subproblems and using the results of those subproblems to solve the larger problem.

3. What are the benefits of using recursion in programming a recursive expected value card game?

The benefits of using recursion in programming a recursive expected value card game include a more efficient and organized approach to solving complex problems, as well as the ability to handle a larger number of possible outcomes.

4. Can you give an example of a recursive expected value card game?

An example of a recursive expected value card game is the popular card game "War". In this game, players compare the top card of their deck and the player with the higher card wins the round. However, if the two cards are equal, then the players must continue playing until one player has a higher card. This recursive aspect of the game adds an element of strategy and probability to the game.

5. How can I improve my skills in programming a recursive expected value card game?

To improve your skills in programming a recursive expected value card game, it is important to practice and familiarize yourself with different problem-solving techniques and algorithms. It may also be helpful to study and analyze existing recursive expected value card games to gain a better understanding of how they work.

Similar threads

Replies
3
Views
4K
Replies
18
Views
1K
Replies
1
Views
1K
Replies
8
Views
1K
Replies
1
Views
3K
Replies
1
Views
1K
Replies
4
Views
1K
Back
Top