# Homework Help: Chance of winning in a spin-off of the game "Black Jack"

1. Jul 28, 2015

### Nathanael

1. The problem statement, all variables and given/known data
We have a game in which a player plays against a dealer. The goal is to get closest to 100 without going over.
The game goes like this:
A random integer is generated from 1 to 100 (including 1 and 100) which is given to the player. The player can either "hit" or "stay." If the player hits, they will get another random integer (from 1 to 100) which is added on to their number. The player can do this as many times as they wish, but they instantly lose ("bust") if they go over 100.
Once the player stays, the dealer will keep rolling a random integer (from 1 to 100) for themselves until they either beat the player (get closer to 100) or bust (go over 100).

I am curious to find the chance that the player will win if the player chooses to stay at a value of "m"
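(As a sanity check on whatever formula we derive, here is a quick Monte Carlo sketch of the dealer's phase in Python; the function names are invented for illustration.)

```python
import random

def player_wins(m, n_sides=100, rng=random):
    """Simulate the dealer's phase, assuming the player has stayed at m.
    The dealer rolls until it beats m or busts; return True on a bust."""
    dealer = 0
    while True:
        dealer += rng.randint(1, n_sides)
        if dealer > n_sides:   # dealer busts -> player wins
            return True
        if dealer > m:         # dealer beats the player
            return False

def estimate_p_win(m, trials=100_000, seed=0):
    """Monte Carlo estimate of P(player wins | stays at m)."""
    rng = random.Random(seed)
    return sum(player_wins(m, rng=rng) for _ in range(trials)) / trials
```

One easy check: `estimate_p_win(100)` should come out exactly 1, since the dealer can never get closer to 100 than 100 and must eventually bust.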

2. Relevant equations
$\sum\limits_{n=1}^{m}\bigg (\frac{n}{100}\sum\limits_{s=1}^{n}\Big (100^{-s}{n-1 \choose s-1}\Big )\bigg )$

3. The attempt at a solution
The equation in "2. Relevant equations" is my guess at the answer.

The reason I guessed this is that the problem reminded me of Pascal's triangle. So let us identify an entry in Pascal's triangle by two numbers, n and s.
n is the row (starting from n=1 being "1"; n=2 being "1, 1"; n=3 being "1, 2, 1"; etc.) and s is the number of entries from the left (starting from s=1 being the leftmost entry, which is always 1; s=2 being n-1; etc.).
(So s takes values from 1 to n.)

The sth entry of the nth row will be given by ${n-1\choose s-1}$

Now this is what I'm thinking (sort of a guess)... The sth entry of the nth row represents the number of paths for the dealer to get a value of n in s steps.
For example, there is only 1 way to get n in 1 step: roll an n; and there is only 1 way to get an n in n steps: roll all ones.

The probability for a specific path which takes s steps will be $100^{-s}$ and so the probability that the dealer gets to a value of n in s steps will be $100^{-s}{n-1\choose s-1}$

Therefore the chance of the dealer getting a value of n is $\sum\limits_{s=1}^n100^{-s}{n-1\choose s-1}$
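(For what it's worth, this claim can be brute-force checked for small n in Python by summing $100^{-\text{length}}$ over every roll sequence that totals n; the function names here are made up for the check.)

```python
from math import comb

def p_reach(n):
    """P(dealer's running total ever equals n), via the binomial guess."""
    return sum(comb(n - 1, s - 1) / 100**s for s in range(1, n + 1))

def p_reach_brute(n):
    """Recursively sum 100**(-number of rolls) over every sequence of
    rolls in 1..100 whose total is exactly n."""
    if n == 0:
        return 1.0  # the empty sequence
    return sum(p_reach_brute(n - i) / 100 for i in range(1, min(n, 100) + 1))
```

For n up to a handful the two agree to machine precision, which supports the Pascal's-triangle counting argument.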

Now, if the dealer has a value of n, then the chance of you winning is n/100 on the next roll.
For example, if the dealer has a 2, there is a 0.02 chance you will win on the next roll, because the dealer busts by rolling either 99 or 100.

So we should multiply the chance you will win if the dealer has n by the chance of the dealer having n, and then sum over all n from 1 to your value, m. That should be the probability of you winning by staying on m. Thus we get:

$\sum\limits_{n=1}^{m}\bigg (\frac{n}{100}\sum\limits_{s=1}^{n}\Big (100^{-s}{n-1 \choose s-1}\Big )\bigg )$

Can anyone confirm if this is correct or incorrect?
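(A direct numerical evaluation of the guessed double sum in Python; `p_win_stay` is just an illustrative name.)

```python
from math import comb

def p_win_stay(m):
    """P(player wins | player stays at m), per the guessed double sum."""
    return sum(
        (n / 100) * sum(comb(n - 1, s - 1) / 100**s for s in range(1, n + 1))
        for n in range(1, m + 1)
    )
```

Two easy checks: `p_win_stay(100)` should be 1 (the dealer can never beat 100), and `p_win_stay(1)` should be $1/100^2$ (the dealer would have to roll a 1 and then a 100).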

2. Jul 29, 2015

### geoffrey159

I don't feel comfortable with probabilities but here is what I think.
Player wins if the dealer scores less than $m$ or more than 100. If you call $S_d$ the score of the dealer, then in terms of events you have:

\begin{align} \{ \text{player wins} \} &= \{ 1\le S_d \le m-1 \} \cup \{ S_d > 100 \} \\ &= \cup_{ k = 1}^{m-1} \{ S_d = k \} \cup \cup_{ k = 1}^{99} \{ S_d = 100+k \} \end{align}

The main problem is to find the law of $S_d$.

3. Jul 29, 2015

### Nathanael

The player can only win if the dealer scores more than 100.
If the dealer's score is less than the player's score (m) then the dealer will keep getting a random integer [1,100] which is added on to the dealer's score.

4. Jul 29, 2015

### geoffrey159

Oh sorry, I read your post too fast! So we have
$P(\{ \text{player wins} \} ) = 1- P( \{ \text{dealer wins} \} ) = 1 - \sum_{k = m+1}^{100} P(S_d = k)$

Do you agree?

If you call $S_d^{(t)} := \{ \text{score of the dealer after } t \text{ deals} \}$ and $D_t := \{ \text{output of deal } t \}$, then

\begin{align} \{S_d = k \} &= \cup_{1 \le t \le k} \{ S_d^{(t)} = k \}\\ &= \cup_{1 \le t \le k} \{ (D_1 = k_1, ..., D_t = k_t), \ k_1 + ... + k_t = k \} \\ &= \cup_{1 \le t \le k} \cup_{(k_1,...,k_t) \in t\text{-compositions of } k }\{ D_1 = k_1, ..., D_t = k_t \} \end{align}

If each deal is independent, then

$P(S_d = k ) = \sum_{t = 1}^kC_{k-1}^{t-1}{100^{-t}}$

So finally, I get

$P(\{ \text{player wins} \} ) = 1 - \sum_{k = m+1}^{100} \sum_{t = 1}^k C_{k-1}^{t-1} {100^{-t}}$

5. Jul 29, 2015

### Nathanael

Is $C_b^a={a\choose b}$ ? To be honest I have no idea where you got that from, but it is slightly different from what I guessed.
Unless it's $C_a^b={a\choose b}$ ? Then we agree on $P(S_d=n)$.
(Or else I think you made a typo.)

Anyway that was the part of my formula I was unsure about (I am the one uncomfortable with probabilities and I basically pulled that one out of thin air).

If you agree with that much then it shouldn't take much effort for you to confirm or refute the rest of my logic:
I like your way, $P(\text{player wins}) =1-\sum\limits_{a=m+1}^{100}P(S_d=a)$, better, and it makes me think my answer is wrong, but I am curious why.

Last edited: Jul 29, 2015
6. Jul 29, 2015

### Ray Vickson

I think you have the correct expression, but maybe not for what you think it is. Your expression is the conditional $P(\text{win}|\text{stay at} \; m)$, the probability that the player wins, given that he reaches $m$ and stays.

A more complete analysis might be to pick a value $m_0$, and the player stays if he reaches $m \in \{ m_0, m_0+1, \ldots, 100 \}$ (that is, $m_0$ or better). For each such $m$ the player may not reach it (having gone over 100), but if we call $P_m$ the probability that the player safely reaches $m$ (and thus stays) then
$$P\{ \text{player wins} \} = \sum_{m=m_0}^{100} P_m \times P(\text{win}|\text{stay at} \; m)$$
It might be interesting to evaluate this for various values of $m_0$ and see whether there is an optimal value.

Note: the probability $P_m$ can be evaluated in a similar way to what you did above, because it involves paths for the player rather than the dealer.

Note added in edit: as written, all terms in the summation above are > 0 and do not depend on $m_0$, so taking $m_0 = 1$ would maximize the sum. Obviously, this cannot be correct: we need something inside the sum to be dependent on $m_0$ in order to have a non-trivial problem. So, instead of $P_m$ we should use $P(m|m_0) =$ probability of reaching $m$ when we are required to go to $m_0$ or beyond. The correct coefficient to use is
$$P(m|m_0) = \sum_{j=1}^{m_0-1} \sum_{s=1}^j 100^{-s-1} {j-1 \choose s-1}.$$
This depends on $m_0$ but not explicitly on $m$!

Last edited: Jul 29, 2015
7. Jul 29, 2015

### Nathanael

That is exactly what I thought it was.
The reason is, I wanted to find the minimum value you should stay at if you want the chance of winning to be greater than 50%.

I was going to think about optimizing after I made sure I was correct so far.
(Also, I was going to use this first result in optimization; see below.)

Hmm, thanks, but I was thinking about optimization differently.

I was thinking the optimal value of $m_0$ would be where the expected increase in the chance of winning (by staying at your new value) is equal to (or slightly less than) the chance of busting by rolling again (which of course is just m/100).
Does that seem like a sensible definition of the optimal minimum value to stay at?

8. Jul 30, 2015

### geoffrey159

Yes, $C_a^b = {a\choose b}$.

$C_{k-1}^{t-1}$ is the number of ways you can find integers $k_1,...,k_t > 0$ such that $k_1+...+k_t = k$, otherwise known as a $t$-composition of $k$ (https://en.wikipedia.org/wiki/Composition_(combinatorics)).

*******

But now you have to explain why I find $P( 1\le S_d \le 70) \approx 1.006 > 1$! Either my calculations are wrong, or you don't have a law of probability, in which case the model is wrong and you have to start over!

I get (check for errors):
$P(S_d = k ) = \sum_{t = 1}^kC_{k-1}^{t-1}{100^{-t}} = \frac{1}{100} \sum_{\ell=0}^{k-1} C_{k-1}^\ell 100^{-\ell} = \frac{1}{100} (1+ \frac{1}{100})^{k-1} = \frac{1}{100} (\frac{101}{100})^{k-1}$
So

$P( 1\le S_d \le M) = 1.01^M - 1$. Therefore, $M \ge \frac{\ln 2}{\ln 1.01} \approx 70 \Rightarrow P( 1\le S_d \le M) > 1$, which is absurd!
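(Both steps are easy to confirm numerically; a Python sketch:)

```python
from math import comb

# The binomial-theorem collapse: sum_t C(k-1, t-1) * 100^(-t) = 0.01 * 1.01^(k-1)
for k in (1, 5, 30):
    lhs = sum(comb(k - 1, t - 1) / 100**t for t in range(1, k + 1))
    assert abs(lhs - 0.01 * 1.01**(k - 1)) < 1e-12

# Summing the proposed "law" over k = 1..70 already exceeds 1
total = sum(0.01 * 1.01**(k - 1) for k in range(1, 71))  # = 1.01^70 - 1
print(total)
```

So the closed form itself is right; it is the interpretation as a probability law over k that breaks down.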

9. Jul 31, 2015

### haruspex

Nice problem.
Of course, it's not really like Blackjack. A Blackjack dealer doesn't know what the other players hold exactly, and has no volition.

I would generalise it to N instead of 100 (it makes the working easier) and think of the players as starting at N and subtracting values to get close to, or equal to, 0.
I didn't quite follow your argument for the formula in the OP, though it may well be right. I find it more persuasive to develop a recurrence relation.
Suppose a player/dealer is at $x$ and elects to stop as soon as its value is below $y$, where $y\le x$. Let $f(x,y,z)$ be the probability of ending up at $z$, with $0\le z < y$.
Then we have $f(x,y,z)=\frac 1N(1+\Sigma_{i=1}^{x-y}f(x-i,y,z))$.
Solving, $f(x,y,z)=\frac 1N(1+\frac 1N)^{x-y}$. Note that z doesn't feature.
So suppose player starts at N and stops when below y. $f(N,y,z)=\frac 1N(1+\frac 1N)^{N-y}$. The probability that the player does not go negative is $\frac yN(1+\frac 1N)^{N-y}$
Dealer must now beat z, so player's prob of winning is $\Sigma_{z}f(N,y,z)(1-\frac zN(1+\frac 1N)^{N-z})$.
When you have summed that, I recommend assuming $z=\alpha N$ for some magic constant $\alpha$, making approximations for large N, and differentiating wrt $\alpha$ to find the optimum value.
(I get $\alpha(1+2e^{1-\alpha})=1$.)
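(The claimed closed form for $f$ can be checked against the recurrence by direct dynamic programming; a Python sketch with invented names, taking $N=100$:)

```python
from functools import lru_cache

def f_dp(x, y, z, N=100):
    """Probability of ending exactly at z, starting from x and stopping
    as soon as the value drops below y (going negative is a bust)."""
    @lru_cache(maxsize=None)
    def f(v):
        if v < y:                      # stopped: did we land on z?
            return 1.0 if v == z else 0.0
        # roll 1..N; rolls that take us negative are busts (contribute 0)
        return sum(f(v - i) for i in range(1, N + 1) if v - i >= 0) / N
    return f(x)

# Check f(x,y,z) = (1/N)(1 + 1/N)^(x-y), independent of z
N = 100
for x, y, z in [(100, 60, 10), (80, 50, 0), (60, 60, 59)]:
    assert abs(f_dp(x, y, z, N) - (1 / N) * (1 + 1 / N)**(x - y)) < 1e-12
```

The z-independence is the striking part: every surviving stopping value below y is equally likely.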

10. Aug 2, 2015

### Nathanael

Strange

But, if I take this part of your result (which I don't know how to find myself, so I'm stealing it from you):
$\sum\limits_{t=1}^{k}100^{-t}{k-1 \choose t-1} = \frac{1}{100} (\frac{101}{100})^{k-1}$

Then I can simplify the result in my OP:
$\sum\limits_{n=1}^{m}\bigg (\frac{n}{100}\sum\limits_{s=1}^{n}\Big (100^{-s}{n-1 \choose s-1}\Big )\bigg )=\sum\limits_{n=1}^{m}\Big( \frac{n}{100^2}\big(\tfrac{101}{100}\big)^{n-1}\Big )=1-\Big(\frac{101}{100}\Big)^m+\frac{m\,101^m}{100^{m+1}}$

Now I can actually calculate results! (Thank you... I did not expect to be able to simplify my formula!)

As you can see, the chance of winning when you have 100 is 100%, as expected!
And the chance of winning when you have a 1 is $1/100^2$, as expected (the dealer would have to roll a 1 and then a 100)

So I have no idea why your method is acting up, but I am now even more confident that the result in my OP is correct
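(The simplification checks out numerically for every $m$ from 1 to 100; a quick Python verification:)

```python
from math import comb

max_err = 0.0
for m in range(1, 101):
    # the original double sum from the OP
    double_sum = sum(
        (n / 100) * sum(comb(n - 1, s - 1) / 100**s for s in range(1, n + 1))
        for n in range(1, m + 1)
    )
    # the simplified closed form
    closed = 1 - 1.01**m + m * 101**m / 100**(m + 1)
    max_err = max(max_err, abs(double_sum - closed))
print(max_err)
```

The largest discrepancy over all m is at the level of floating-point round-off.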

11. Aug 2, 2015

### Nathanael

@haruspex
Thank you for the brilliant reply, I appreciate it. Sorry I did not see it until today. I need to study your reply in more detail before I fully understand it.

12. Aug 3, 2015

### geoffrey159

The formula is coming from the binomial theorem (https://en.wikipedia.org/wiki/Binomial_theorem).

I am not as confident as you are for the following reason: even though we find the same 'law' for the score of the dealer, an important verification is to make sure that the probability of all possible outcomes is equal to 1.
Unless you can find an error in my calculations and/or reasoning, the probability of all possible outcomes is greater than 1, and therefore, it cannot be a law of probability.

To put it plainly, I think the calculations are based on a wrong model. I have tried to think a little bit more about it, because it is interesting, but I am stuck.

13. Aug 3, 2015

### Ray Vickson

Here is (essentially) what the OP claims (correctly): if $H_k$ is the event that the player holds at $k$, and DW, DL are the events that the dealer wins or loses, then for an $m$-policy (i.e., a policy where the player holds at $m$ or more) we have:
$$P\{ \text{player wins} \} = \sum_{k=m}^{100} P(H_k) P(DL | H_k),$$
where
$$P(H_m) = P(H_{m+1}) = \cdots = P(H_{100}) = \frac{1}{100} \left( \frac{101}{100} \right)^{m-1}$$
and $P(DL | H_k) = 1 - P(DW | H_k)$ with the dealer applying a $(k+1)$-policy:
$$P(DW | H_k) = P(H_{k+1}) + P(H_{k+2}) + \cdots + P(H_{100}) \\ = P(H_{k+1}) + P(H_{k+1}) + \cdots + P(H_{k+1}) = (100 - k)\, P(H_{k+1}) = (100-k) \frac{1}{100} \left( \frac{101}{100} \right)^k$$
Altogether,
$$P\{ \text{player wins} \} = 0.01 \times 1.01^{m-1} \sum_{k=m}^{100} [ 1 - (100-k)\, 0.01 \, \times 1.01^k ]$$
for a player's $m$-policy. The sum above is also doable, because sums like $\sum_{k=a}^b r^k$ and $\sum_{k=a}^b k r^k$ are well-known.

In fact, following Haruspex, and using $N$ instead of 100, the final result for a player's $m$-policy is
$$P\{ \text{player wins} \} = \frac{1}{N+1} \left[ (N+1) (r^m - r^{m+N}) -m r^m + (2N + 1 - m) r^{2m} \right],$$
where $r = (N+1)/N$.

For N = 100 the player's win probability is maximized at $m = 58$, with win-probability = 0.431693630.
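(A quick Python evaluation of this closed form; `p_win_policy` is an invented name. Scanning $m = 1, \dots, 100$ locates the maximum.)

```python
def p_win_policy(m, N=100):
    """Player's overall win probability under an m-policy (hold at m or more),
    per the closed form above, with r = (N+1)/N."""
    r = (N + 1) / N
    return ((N + 1) * (r**m - r**(m + N))
            - m * r**m
            + (2 * N + 1 - m) * r**(2 * m)) / (N + 1)

best_m = max(range(1, 101), key=p_win_policy)
print(best_m, p_win_policy(best_m))
```

The value at $m=58$ agrees with the quoted win probability of about 0.4317, and the curve is quite flat near the optimum.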

Last edited: Aug 3, 2015