Optimal Stopping Strategy for Winning Game with Two Bells

In summary, the thread discusses a game with two bells: Bell A rings as a Poisson process at a constant rate r, and Bell B rings once at a uniformly distributed random time within the hour. The player earns $1 each time A rings, but must return all earnings if B rings before they quit. The strategy derived in the thread is to continue as long as the expected net earnings over the next small time interval are positive, which gives the rule: continue while x < r(1-t), where x is the current earnings, r is the rate of Bell A, and t is the elapsed time. The largest amount that can be won under this strategy does not depend on t: it is set by the threshold at t = 0, which is r when r is an integer.
  • #1
fignewtons

Homework Statement


You are playing a game with two bells. Bell A rings according to a homogeneous Poisson process at a rate r per hour, and Bell B rings once at a time T that is uniformly distributed from 0 to 1 hr (inclusive). You get $1 each time A rings and can quit at any time, but if B rings before you quit, you must return all the money you have received thus far.

Homework Equations


P[A rings] = rΔt
P[B rings] = Δt/(1-t)

The Attempt at a Solution


At time t, with x in earnings, the optimal stopping strategy for this game is to continue if E[W] > 0, where W is the net earnings over a small period of time (call it Δt):
E[W] = E[W|A rings]P[A rings] + E[W|B rings]P[B rings] + E[W|no rings]P[no rings] + E[W|many rings]P[many rings]
E[W|A rings] = $1
E[W|B rings] = -$x
E[W|no rings] = $0
and P[many rings] ≈ 0 (since Δt is a small time interval)

So E[W] = E[W|A rings]P[A rings] + E[W|B rings]P[B rings]

substituting in the relevant equations...

the strategy is to continue only if x < r(1-t), and stop otherwise.
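
For reference, the substitution step can be written out as follows. This is a sketch, assuming that Δt is small enough to neglect multiple rings and that B has not yet rung at time t, so that T is uniform on the remaining interval (t, 1):

```latex
\begin{align*}
P[\text{B rings in } (t, t+\Delta t) \mid T > t] &= \frac{\Delta t}{1-t} \\
E[W] &= (+1)\, r\,\Delta t + (-x)\,\frac{\Delta t}{1-t}
      = \left( r - \frac{x}{1-t} \right) \Delta t \\
E[W] > 0 &\iff x < r(1-t) \qquad (0 \le t < 1)
\end{align*}
```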

I'm not sure how to find the largest amount of money I can win under this strategy. I thought it meant I should take the derivative of this winnings strategy, but taking the derivative doesn't make sense for the inequality.

I.e., the derivative wrt x of x < r(1-t) is d/dx(x-r(1-t)) < 0, which yields 1 < 0 and doesn't make sense.
 
  • #2
figNewtons said:
I thought it means I should take the derivative of this winnings strategy but taking the derivative doesn't make sense for the inequality.
Why would you take the derivative of the winnings? Instead, I suggest thinking of what is the maximal x at which you will quit based on your quitting criterion.
 
  • #3
So the maximal x is just r(1-t)? If you decide to stop at any time t.
 
  • #4
Assuming that your ##x## really means expected earnings, you have ##x(t + \Delta t) = x(t)## if you stop at ##t## and ##x(t + \Delta t) = x(t) + (r - x(t)/(1-t)) \Delta t## if you wait until ##t + \Delta t##. Therefore, as long as you wait you have a differential equation ##dx(t)/dt = r - x(t)/(1-t)## with initial condition ##x(0) = 0##.
 
  • #5
Ray Vickson said:
Assuming that your ##x## really means expected earnings, you have ##x(t + \Delta t) = x(t)## if you stop at ##t## and ##x(t + \Delta t) = x(t) + (r - x(t)/(1-t)) \Delta t## if you wait until ##t + \Delta t##. Therefore, as long as you wait you have a differential equation ##dx(t)/dt = r - x(t)/(1-t)## with initial condition ##x(0) = 0##.
We are not talking about expected winnings. The question was:
figNewtons said:
what's the most amount of money I can win under this strategy.
This has nothing to do with the expected earnings.
 
  • #6
Orodruin said:
We are not talking about expected winnings. The question was:

This has nothing to do with the expected earnings.

Yes, in a way it does. His criterion is to continue whenever the expected earnings over the next interval ##\Delta t## are ##> 0##; that is exactly where his condition ##x < r(1-t)## comes from.

That being said, my response to him was ill-advised: ##x## is not a continuous variable, but can change only in increments of ##+1## or ##-x##, and solutions of differential equations do not do that.
 
  • #7
Ray Vickson said:
Yes, in a way it does. His criterion is to continue whenever the expected earnings over the next interval ##\Delta t## are ##> 0##; that is exactly where his condition ##x < r(1-t)## comes from.

That being said, my response to him was ill-advised: ##x## is not a continuous variable, but can change only in increments of ##+1## or ##-x##, and solutions of differential equations do not do that.
No, the maximal possible winning depends only on the criterion for quitting, not on the discrete increment. It is true that the criterion itself is based on the infinitesimal increment, but the maximal winnings can be deduced directly from the quitting criterion.
 
  • #8
Sorry for being thick, but I am confused, is the maximal winning r(1-t) as in the quitting criterion verbatim?
or is it r(1-t) -1 since x<r(1-t) means it can never reach r(1-t) exactly, and since x is discrete and increases only in increments of $1.
or am I completely off?
 
  • #9
figNewtons said:
Sorry for being thick, but I am confused, is the maximal winning r(1-t) as in the quitting criterion verbatim?
or is it r(1-t) -1 since x<r(1-t) means it can never reach r(1-t) exactly, and since x is discrete and increases only in increments of $1.
or am I completely off?
Almost there. You need to think a bit about when the quitting criterion will actually tell you to quit. Also, the maximal winning should not depend on t.
 
  • #10
Orodruin said:
Almost there. You need to think a bit about when the quitting criterion will actually tell you to quit. Also, the maximal winning should not depend on t.

Thanks for the hint. I think the quitting criterion tells you whether to quit or continue at the end of Δt, given that you start at t. We want the largest Δt, because playing longer allows a chance of winning more dollars, so the least t can be is 0 (which allows for the max Δt). In this case, t=0, so r(1-0)=r... the maximum winnings is $r...
It kind of makes sense intuitively (if you start at time 0 and play for 1 hour, and quit just before Bell B rings, you can get at most $r because A rings at a rate r/hour), but I'm not sure if the logic makes sense... please verify/correct.
 
  • #11
figNewtons said:
In this case, t=0. so r(1-0)=r...the maximum winnings is $r...
This essentially assumes that ##r## is an integer. You probably should also discuss the case when ##r## is not an integer.

Essentially, the quitting criterion tells you when it is no longer profitable to continue. It is no longer profitable when the rate at which ##A## rings is lower than the rate at which ##B## rings weighted by the gain/loss -- which is what you discussed already in the first post. If the possible loss is already ##r##, you will never have ##B## ringing at a small enough rate to justify continuing.

The maximal winning occurs if you get really lucky and essentially ##A## rings ##r## times in quick succession essentially at ##t = 0##.

Edit: Another interesting question is how much a casino should charge you for playing this game ...
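
The game under the rule from post #1 (continue while x < r(1-t)) can also be simulated, both to check the largest payout and to estimate the average payout, which is the quantity relevant to the casino question. The following is a minimal Python sketch and not part of the original thread: the function names `play_once` and `summarize` are made up for illustration, and it assumes the player follows the threshold rule exactly.

```python
import random


def play_once(r, rng):
    """Play one game with the rule 'continue while x < r * (1 - t)'.

    Returns the amount the player walks away with (0 if Bell B rings first).
    """
    bell_b = rng.random()  # ring time of Bell B, uniform on [0, 1)
    t, x = 0.0, 0
    while True:
        # Between rings of A, x is fixed while the threshold r * (1 - t)
        # falls, so the player quits at t_quit = 1 - x / r at the latest.
        t_quit = 1.0 - x / r
        if t_quit <= t:
            return x  # threshold already reached: quit now
        t_next = t + rng.expovariate(r)  # next ring of Bell A
        if bell_b < min(t_next, t_quit):
            return 0  # B rang first: all earnings are returned
        if t_quit <= t_next:
            return x  # quit before A rings again
        t, x = t_next, x + 1  # A rang: collect $1 and keep playing


def summarize(r, n_games=100_000, seed=0):
    """Return (average payout, largest payout) over n_games simulated games."""
    rng = random.Random(seed)
    payouts = [play_once(r, rng) for _ in range(n_games)]
    return sum(payouts) / n_games, max(payouts)


if __name__ == "__main__":
    for rate in (3.0, 3.5):
        mean, largest = summarize(rate)
        print(f"r = {rate}: average payout {mean:.3f}, largest payout {largest}")
```

In this sketch a dollar is only collected while x < r, so the largest payout can never exceed the smallest integer at or above r. The average payout is a Monte Carlo estimate of the fair entry price under this particular rule, not a closed-form answer.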
 
  • #12
Ok, so we can take Δt again to be a small number, just that t=0. And as Δt -> 0, P[B ring] -> 0
 
  • #13
figNewtons said:
Ok, so we can take Δt again to be a small number, just that t=0. And as Δt -> 0, P[B ring] -> 0
I would not see it this way. The probability that A rings in the time interval ##\Delta t## also goes to zero. The question is how they relate to each other and therefore whether the expectation value is positive or negative.
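
One way to make this relation explicit, as a sketch using the probabilities from post #1: both probabilities are proportional to ##\Delta t##, so ##\Delta t## cancels from the sign of the expectation value.

```latex
\begin{align*}
\frac{E[W]}{\Delta t} &= r - \frac{x}{1-t} \\
\left. \frac{E[W]}{\Delta t} \right|_{t=0} &= r - x
\end{align*}
```

Read this way, the rule at ##t = 0## is to continue while ##x < r##, so the last dollar that can be collected is the one that brings ##x## to the smallest integer at or above ##r## (that is, ##r## itself when ##r## is an integer). The threshold ##r(1-t)## only falls after ##t = 0##, so no later time allows a larger ##x##.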
 
