Financial Physics - Probability of Winning

Homework Help Overview

The discussion revolves around a probability problem related to a game where the outcome at a given time depends on the results of the previous two time steps. Participants are tasked with finding expressions for certain probabilities in a steady state and determining conditions under which a player loses on average.

Discussion Character

  • Exploratory, Conceptual clarification, Mathematical reasoning, Problem interpretation

Approaches and Questions Raised

  • Participants explore modeling the problem as a Markov chain, discussing transition probabilities and steady-state distributions. Some express confusion regarding the underlying probability concepts and seek clarification on how to approach the problem.

Discussion Status

Several participants have offered insights into modeling the problem, including the use of transition matrices and steady-state equations. There is ongoing exploration of the relationships between different states and the probabilities associated with them, but no consensus has been reached on a complete solution.

Contextual Notes

Some participants mention a lack of prior knowledge in probability and related concepts, which may affect their understanding of the problem. The discussion includes references to coursework and previous exposure to related topics like binomial trees and Markov chains.

physicsoxford

Homework Statement



Question:
In game A the probability of winning at time t is determined by success (in any
game) at the previous two timesteps t-2 and t-1. A win (W) earns one unit of cash,
and a loss (L) results in paying one unit of cash. Following a sequence of outcomes (L;L)
at time steps (t - 2, t - 1), the probability of winning at timestep t is p1. Following
(L;W) it is p2, following (W;L) it is p3 and following (W;W) it is p4. Let D1(t) be
the probability of the sequence (L;L) at timesteps (t - 1, t), D2(t) be the probability
of (L;W), D3(t) be the probability of (W;L), and D4(t) be the probability of (W;W).
Find expressions for the Di in the steady state, for i = 1 to 4. Show that a player loses
on average when
p1p2 < (1 - p3)(1 - p4)

Homework Equations



No other equations are given!

The Attempt at a Solution



I'm taking a class on Financial Physics and have no previous knowledge of probability. I have not taken statistical mechanics or quantum yet. I am completely lost on this one. I've been learning more about it, but this is just over my head. Can someone help? I don't know where to start!
 
While it may not help (because you have no previous exposure to probability) you can model the system as a Markov chain, where the state at time t consists of the outcomes at times t and t-1. There are four states:
state 1 = (W,W), state 2 = (W,L), state 3 = (L,W) and state 4 = (L,L)
If we are in state i (= 1, 2, 3 or 4) at time t, what are the probabilities that we will be in state j at time t+1? These are the so-called one-step transition probabilities, typically denoted p_ij. We have p_ij ≥ 0 for all i, j and Σ_j p_ij = 1 for i = 1, 2, 3, 4.

In the present case:
\begin{array}{l}
P(LL \to LW) = p_1 \, , \; P(LL \to LL) = 1-p_1 \\
P(LW \to WW) = p_2 \, , \; P(LW \to WL) = 1-p_2 \\
P(WL \to LW) = p_3 \, , \; P(WL \to LL) = 1-p_3 \\
P(WW \to WW) = p_4 \, , \; P(WW \to WL) = 1-p_4
\end{array}
with all other transitions having P(i → j) = 0.

The reward r at time t is r = +1 in states LW and WW, and is r = -1 in states WL and LL. The expected long-run reward per unit time is
\text{average reward } = \bar{r} = (+1)( \pi_{WW} + \pi_{LW}) + (-1)(\pi_{WL} + \pi_{LL}),
where \pi_{WW}, etc., are the steady-state probabilities of states WW, WL, LW and LL. These can be found using standard methods for Markov chains, and you can find all the needed material through Google, for example.
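The recipe above is easy to check numerically. The sketch below is an editorial aside, not part of the thread, and the probability values in it are made up for illustration: it builds the one-step transition matrix for the four states (ordered as in the post: 1 = (W,W), 2 = (W,L), 3 = (L,W), 4 = (L,L)), approximates the steady-state distribution by repeatedly multiplying a starting distribution by P, and evaluates the average reward.

```python
# A minimal numerical sketch (not from the thread; the p-values are invented).
# States ordered WW, WL, LW, LL, matching the post above.

def steady_state(p1, p2, p3, p4, iters=5000):
    """Approximate the steady-state distribution by iterating pi <- pi * P."""
    P = [
        [p4, 1 - p4, 0,  0     ],  # from WW: win -> WW, lose -> WL
        [0,  0,      p3, 1 - p3],  # from WL: win -> LW, lose -> LL
        [p2, 1 - p2, 0,  0     ],  # from LW: win -> WW, lose -> WL
        [0,  0,      p1, 1 - p1],  # from LL: win -> LW, lose -> LL
    ]
    pi = [0.25] * 4                # arbitrary starting distribution
    for _ in range(iters):
        pi = [sum(pi[i] * P[i][j] for i in range(4)) for j in range(4)]
    return pi

def average_reward(p1, p2, p3, p4):
    """Long-run reward per step: +1 in WW and LW (a win at time t), -1 in WL and LL."""
    pi = steady_state(p1, p2, p3, p4)
    return (pi[0] + pi[2]) - (pi[1] + pi[3])

# Example values: p1*p2 = 0.12 < 0.25 = (1-p3)*(1-p4), so the reward should be negative.
r = average_reward(0.3, 0.4, 0.5, 0.5)
```

Power iteration converges here because the chain has self-loops (LL stays LL with probability 1-p1), so it is aperiodic; an exact linear solve works too.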

RGV
 
Out of curiosity, are you currently covering Markov chains in your class? Without that theory, I'm not aware of how you'd find long-term averages for a system like this, though I am pretty inexperienced with probability.
 
The only thing we did in class that could relate to this is the binomial tree model. After looking up Markov chains and reading a bit, it makes more sense, but I am still struggling. Here is an attempt:

So the underlying reasoning is that S0 P = S1, where P is the transition probability matrix, S0 is the initial state distribution (a row vector), and S1 is the distribution one step later.


P = Matrix:
p4, 1-p4, 0, 0
0, 0, p3, 1-p3
p2, 1-p2, 0, 0
0, 0, p1, 1-p1

As you showed in your response.

And S1 = [π_WW, π_WL, π_LW, π_LL]
and S0 = [0.25, 0.25, 0.25, 0.25]?

Plug this in and solve for π_WW, ...

Is this even close?
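One way to sanity-check any steady-state answer is to simulate the game directly and watch the bankroll. The sketch below is an editorial aside, not from the thread, and its parameter values are invented; it plays many rounds, conditioning each round's win probability on the previous two outcomes as the problem statement prescribes.

```python
import random

def simulate(p1, p2, p3, p4, rounds=100_000, seed=1):
    """Play the game for many rounds and return the average reward per round."""
    rng = random.Random(seed)
    # p_win[(outcome at t-2, outcome at t-1)], with W = True and L = False.
    p_win = {(False, False): p1, (False, True): p2,
             (True, False): p3, (True, True): p4}
    prev2, prev1 = False, False   # arbitrary initial history (L, L)
    cash = 0
    for _ in range(rounds):
        win = rng.random() < p_win[(prev2, prev1)]
        cash += 1 if win else -1  # a win earns one unit, a loss pays one unit
        prev2, prev1 = prev1, win
    return cash / rounds

# Made-up example where p1*p2 = 0.12 < 0.25 = (1-p3)*(1-p4): the average should be negative.
avg = simulate(0.3, 0.4, 0.5, 0.5)
```

The empirical average should match the exact steady-state reward up to Monte Carlo noise of order 1/sqrt(rounds).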
 
The steady-state probabilities πi depend on the transition matrix P = (pij). For an n-state chain with transition matrix P they are solutions of a set of linear equations:
\pi_j = \sum_{i} \pi_i p_{ij}, j=1,2, \ldots, n, \;\text{ and } \sum_{j} \pi_j = 1.
The first n equations above can be summarized as \pi = \pi P, where \pi = (\pi_1, \pi_2, \ldots, \pi_n) is a row vector. Because each row of P sums to 1, one of the equations \pi_j = \sum_{i} \pi_i p_{ij} is redundant (if n-1 of them hold, the nth holds automatically), so we omit any one of those equations and replace it by the normalization condition \sum_j \pi_j = 1. For the type of chain you have here (one with a single "recurrent class") the system has a provably unique solution. Never mind for now if you don't know exactly what I am referring to; it is enough to solve the equations and see what happens.

Let's do a little example, with three states:
P = \left[ \matrix{1/2 & 0 & 1/2 \\ 0 & 1/4 & 3/4 \\ 1/4 & 1/2 & 1/4} \right].
The steady-state equations are:
\begin{array}{rcl}
\pi_1 &=& \frac{1}{2} \pi_1 + \frac{1}{4} \pi_3 \\
\pi_2 &=& \frac{1}{4} \pi_2 + \frac{1}{2} \pi_3 \\
\pi_3 &=& \frac{1}{2} \pi_1 + \frac{3}{4} \pi_2 + \frac{1}{4} \pi_3
\end{array}
and \pi_1 + \pi_2 + \pi_3 = 1.
We leave out one of the first three equations (say the third one, but any one of them would do) and replace it by the sum condition. That gives the linear system
\begin{array}{ccl}
\pi_1 &=& \frac{1}{2} \pi_1 + \frac{1}{4} \pi_3 \\
\pi_2 &=& \frac{1}{4} \pi_2 + \frac{1}{2} \pi_3 \\
1 &=& \pi_1 + \pi_2 + \pi_3
\end{array}
The solution is \pi_1 = 3/13, \pi_2 = 4/13, \pi_3 = 6/13.

The theory behind all this can be found in textbooks and web pages.
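The replace-one-equation recipe can be sketched in code. The block below is an editorial aside, not from the thread; it solves the three-state example exactly using rational arithmetic, reproducing the 3/13, 4/13, 6/13 answer above.

```python
from fractions import Fraction as F

def solve(A, b):
    """Tiny Gauss-Jordan elimination with pivoting (exact with Fractions)."""
    n = len(A)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        piv = next(r for r in range(col, n) if M[r][col] != 0)
        M[col], M[piv] = M[piv], M[col]
        for r in range(n):
            if r != col and M[r][col] != 0:
                f = M[r][col] / M[col][col]
                M[r] = [a - f * c for a, c in zip(M[r], M[col])]
    return [M[i][n] / M[i][i] for i in range(n)]

# Transition matrix from the worked example above.
P = [[F(1, 2), F(0),    F(1, 2)],
     [F(0),    F(1, 4), F(3, 4)],
     [F(1, 4), F(1, 2), F(1, 4)]]
n = 3
# Balance equations pi_j = sum_i pi_i P_ij for j = 0, 1;
# the redundant third equation is replaced by sum(pi) = 1.
A = [[P[i][j] - (1 if i == j else 0) for i in range(n)] for j in range(n - 1)]
A.append([F(1)] * n)
b = [F(0)] * (n - 1) + [F(1)]
pi = solve(A, b)   # -> [3/13, 4/13, 6/13]
```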

RGV
 
Alright, let's see if I got this. The notation was killing me, so I changed it: π_WW = π1, π_WL = π2, π_LW = π3, π_LL = π4. Using these equations:

π1 = π1 p4 + π3 p2 -- Equation 1

π2 = π1(1-p4) + π3(1-p3) -- Equation 2

π4 = π2(1-p3) + π4(1-p1) -- Equation 3

π1 + π2 + π3 + π4 = 1 -- Equation 4
------

Equation 1: π1 = π3 p2/(1-p4) (A)

Equation 2 (subbing in (A)): π2 = π3 [p2 + (1-p3)] (B)

Equation 3 (subbing in (B)): π4 = π3 [1 - p2 p3 - (1-p3) p3] / p1 (C)

Equation 4 (subbing in (A), (B), (C)):

π3 = p1(1-p4)/δ

where δ = (1-p4)(1 + 2 p1 + p1 p2 - p1 p3 - p2 p3 - p3 + p3²) + p2 p3

We can then plug back in and find π1, π2, and π4.

Average reward = [p1 p2 + p1(1-p4) - p1 p2(1-p4) - p1(1-p4)(1-p3) - (1-p4)(1 - p2 p3 - (1-p3) p3)] / δ

Sure is messy. What am I doing wrong? Should it not simplify?
 
When I did it I used states 1 = LL, 2 = LW, 3 = WL, 4 = WW, giving a matrix
P = \left[ \matrix{1-p_1 & p_1 & 0 & 0 \\
0 & 0 & 1-p_2 & p_2 \\
1-p_3 & p_3 & 0 & 0 \\
0 & 0 & 1-p_4 & p_4} \right]
Letting v1 = π_1, v2 = π_2, etc., and writing q_i = 1 - p_i, the steady-state equations are:
v1 = q1*v1 + q3*v3, v2 = p1*v1 + p3*v3, v3 = q2*v2 + q4*v4, v1+v2+v3+v4 = 1.
Solving these (using Maple) gives expressions similar to yours. The long-run win probability is Pwin = v2 + v4:
Pwin = p1*(p2+1-p4)/D, where D = 2*p1 - 2*p4*p1 + p2*p1 + 1 - p3 - p4 + p4*p3.
We want Pwin < 1/2, i.e. 2*p1*(p2+1-p4) < D, or 2*p1*(p2+1-p4) - D < 0. That last form simplifies to exactly the condition you need, p1*p2 < (1-p3)*(1-p4).
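The claimed simplification can be spot-checked numerically. This sketch is an editorial aside, not part of the original post; it draws random p-values and verifies that 2*p1*(p2+1-p4) - D equals p1*p2 - (1-p3)*(1-p4), so that Pwin < 1/2 exactly when the losing condition holds.

```python
import random

# Spot-check of the identity 2*p1*(p2 + 1 - p4) - D == p1*p2 - (1 - p3)*(1 - p4),
# with D = 2*p1 - 2*p4*p1 + p2*p1 + 1 - p3 - p4 + p4*p3 as in the post above.
rng = random.Random(0)
for _ in range(1000):
    p1, p2, p3, p4 = (rng.uniform(0.05, 0.95) for _ in range(4))
    D = 2*p1 - 2*p4*p1 + p2*p1 + 1 - p3 - p4 + p4*p3
    lhs = 2*p1*(p2 + 1 - p4) - D
    rhs = p1*p2 - (1 - p3)*(1 - p4)
    assert abs(lhs - rhs) < 1e-12
    # D = (1-p3)(1-p4) + 2*p1*(1-p4) + p1*p2 > 0, so dividing by D is safe:
    Pwin = p1*(p2 + 1 - p4) / D
    assert (Pwin < 0.5) == (p1*p2 < (1 - p3)*(1 - p4))
```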

I have not checked your solution in detail, because your row/column ordering is different from mine.

RGV
 
