Is this a correct observation about Markov chains and stochastic processes?

bobby2k · Jan 13, 2014

This question is only about formal definitions, so it is only about how we define things. It is probably very easy even though it is kind of long.

Most of the time when I have seen a random variable, it is used as follows. We have a probability space (Ω,F,P) and a measurable space (E,ε), and the random variable X is a measurable function X: Ω→E.
Now, Ω usually does not contain any information about previous values of X. For example, if we flip a coin and say that you win 10 USD if we get tails, the probability space is

({heads,tails},
{{heads},{tails},{heads,tails},ø},
{({heads},0.5),({tails},0.5),({heads,tails},1),(ø,0)})

And the measurable space where X takes its values is
({0,10},{{0},{10},{0,10},ø})

And X = {(heads,0),(tails,10)}, since X maps outcomes (not events) to values.
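To make the coin example concrete, here is a minimal Python sketch of the same objects. The names `p_outcome`, `P` and `law_of_X` are only illustrative choices, not standard notation. It shows that the distribution of X is obtained by pulling sets of values back to events in Ω.

```python
from fractions import Fraction

# Sample space: the two outcomes of one coin flip.
omega = ["heads", "tails"]

# Probability of each single outcome (a fair coin, as in the example).
p_outcome = {"heads": Fraction(1, 2), "tails": Fraction(1, 2)}


def P(event):
    """Probability measure on events, i.e. on subsets of omega."""
    return sum((p_outcome[w] for w in event), Fraction(0))


# The random variable maps each outcome to a payoff in USD.
X = {"heads": 0, "tails": 10}


def law_of_X(value_set):
    """Distribution of X: P of the preimage {w : X(w) in value_set}."""
    preimage = {w for w in omega if X[w] in value_set}
    return P(preimage)


print(law_of_X({10}))     # probability of winning 10 USD
print(law_of_X({0, 10}))  # probability of the whole value space
```

Nothing in `omega` refers to X here; X is just a function defined on top of it.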

The point here is that in the sample space Ω there is no information about X.

But in Markov chains, is this not the case? I have not seen in any book or on Wikipedia how they define it.

Wikipedia defines a stochastic process as follows: we assume we have a probability space
(Ω,F,P) and a measurable space (E,ε), where E is called the state space. The stochastic process is a collection of random variables [itex]\{X_t : t \in T\}[/itex], where T is an index set.
Now finally I can explain my problem. In a Markov chain the probability in the next step depends on what state you are in. How does this coincide with the formal definition? This really confuses me, since we only have one common probability function for all the states. Is it solved by letting the sample space Ω contain a lot more information? So in this case, do we have to have values from the measurable state space E in the sample space Ω?

Stephen Tashi · Jan 13, 2014

bobby2k said:

In a Markov chain the probability in the next step depends on what state you are in. How does this coincide with the formal definition?

The formal definition says a stochastic process is an indexed collection of random variables. Some applied math books always assume the "index" is time. The term "index" does not imply that the index variable must be discrete (like 0,1,2...). For example, one may think of a process that produces a random radio signal. At each value of time t, the signal has a value X(t). The values of the signal at two different times, such as X(1.035) and X(17.6), are different random variables. These two different random variables each have an associated probability space, but they need not have the same probability space.

Very little can be proven about a stochastic process using such a general definition. The processes that have interesting results are special cases of the general stochastic process.

A sequence of independent coin flips can be fit into the definition of a stochastic process. It is a special case because each of the random variables has the same probability space.

A discrete Markov process can be fit into the definition of a stochastic process. The random variables in a Markov process need not have the same probability space and in typical problems they don't. The random variables in a discrete Markov chain have the special property that the conditional distribution of the random variable X(i) given the realized value of the random variable X(i-1) is the same as the conditional distribution of X(i) given all previous realized values X(i-1), X(i-2), ... (i.e. it is only a function of the realized value of X(i-1)). To make that statement using the language of measure theory may be complicated! It's complicated just to define what a conditional probability is in terms of probability spaces.

A generality to keep in mind is that most problems in probability and statistics that mention "a random variable" end up involving several different random variables and several different probability spaces. For example, if X is "a random variable" and M is the mean value of a sample of N realizations of X then M does not have the same probability space as X.
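As a rough illustration of the sample-mean remark, the sketch below assumes X is the payoff of a fair coin flip (0 or 10) and takes N = 2. Both choices are made up for the example. The mean M is a function on the product space of N flips, and its distribution differs from that of X.

```python
from collections import defaultdict
from fractions import Fraction
from itertools import product

# Assumed example: X is the payoff of one fair coin flip.
payoff = {"heads": 0, "tails": 10}
p_flip = {"heads": Fraction(1, 2), "tails": Fraction(1, 2)}
N = 2  # sample size, chosen only for illustration

# M is defined on the product space of N independent flips.
law_of_M = defaultdict(Fraction)
for w in product(p_flip, repeat=N):
    prob = Fraction(1)
    for flip in w:
        prob *= p_flip[flip]  # independence: multiply the flip probabilities
    mean = Fraction(sum(payoff[flip] for flip in w), N)
    law_of_M[mean] += prob

for value in sorted(law_of_M):
    print(value, law_of_M[value])
```

M can take the value 5, which X never does, so the two random variables cannot share a distribution.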

bobby2k · Jan 13, 2014

Stephen Tashi said:

A discrete Markov process can be fit into the definition of a stochastic process. The random variables in a Markov process need not have the same probability space and in typical problems they don't.

Thanks, from what you say it seems that Wikipedia then should modify their definition of a stochastic process:
http://en.wikipedia.org/wiki/Stochastic_process#Definition

Instead of (Ω,F,P) and (S,[itex]\Sigma[/itex]), they should talk about ([itex]\Omega_{t},F_{t},P_{t}[/itex]) and ([itex]S_{t},\Sigma_{t}[/itex])? Since each of these objects can be different for each variable?

bobby2k · Jan 13, 2014

This was a little more confusing.
In these links they talk about an underlying probability space
http://www.math.uah.edu/stat/prob/Processes.html
pdf: http://www.google.no/url?sa=t&rct=j&q=&esrc=s&source=web&cd=1&cad=rja&ved=0CCwQFjAA&url=http%3A%2F%2Fwww.springer.com%2Fcda%2Fcontent%2Fdocument%2Fcda_downloaddocument%2F9780817683450-c1.pdf%3FSGWID%3D0-0-45-1340717-p174546584&ei=P1nUUrWGMsrAtQa7wIDQBw&usg=AFQjCNFfEz_sqboEA-hjN0QvZ6TEOz2vmQ&bvm=bv.59378465,d.Yms

And in these links, they talk about the same probability space.
https://wiki.engr.illinois.edu/download/attachments/185991190/hajekchapter4.pdf?version=1&
http://www.idsc.ethz.ch/Courses/stochasticsystems/Stochastic_Processes.pdf

It does seem that formally one may be able to define the events and simple outcomes in such a way that we only need one probability space?

Stephen Tashi · Jan 13, 2014

If we consider a random variable X with a uniform distribution on [0,1] and a random variable Y with a ramp distribution on [0,1] then we can speak of an "underlying probability space" where the set is [0,1] and the measure of a set such as [0,1/2] is its length. Then X and Y are defined on the same "underlying probability space" even though the probability of Y being in [0,1/2] is not the same as the probability of X being in [0,1/2].

You could also consider Y to be defined by a probability space where the set is [0,1] and the measure of [0,1/2] is the probability that Y is in that interval. From that point of view X and Y are not defined on the same probability space because the spaces for X and Y have different measures.

To say that the indexed random variables in a stochastic process X() have the same "underlying" probability space does not imply that they have the same probability distribution on the space.
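A small simulation may help here. It is only a sketch under one assumed construction: ω is drawn uniformly from [0,1], X(ω) = ω, and Y(ω) = sqrt(ω), which gives Y the ramp density 2y on [0,1]. The square-root map is one possible choice of Y, not the only one.

```python
import math
import random

rng = random.Random(0)  # fixed seed so the run is repeatable


def X(w):
    """Uniform on [0,1]: the identity map on the underlying space."""
    return w


def Y(w):
    """Ramp density 2y on [0,1], since P(sqrt(w) <= y) = y**2."""
    return math.sqrt(w)


n = 100_000
samples = [rng.random() for _ in range(n)]  # points w of the underlying space

# Both variables are evaluated on the same sampled points w.
freq_X = sum(X(w) <= 0.5 for w in samples) / n
freq_Y = sum(Y(w) <= 0.5 for w in samples) / n

print(freq_X)  # should be near 1/2
print(freq_Y)  # should be near 1/4
```

Both frequencies come from the same sampled points ω, yet they estimate different probabilities (1/2 for X and 1/4 for Y).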

bobby2k · Jan 13, 2014

I think we may have been speaking about the wrong sample space. I think that an outcome in Ω in the definition actually gives the entire realisation of the stochastic process, and not just an event for X at a given t-value. And the event space F contains events, and these events are collections of entire realisations of the stochastic process.

So when we have a Markov chain, and using the definition from Wikipedia (http://en.wikipedia.org/wiki/Stochastic_process#Definition), Ω is the set of all (infinite) chains. F is a subset of the power set of Ω (a σ-algebra); I do not know if it can be the entire power set? P is the probability function evaluated on F. S is all the values the X's can have; I think that S consists of single, real numbers (no vectors or chains). I do not know why we need [itex]\Sigma[/itex], maybe because of something complicated in measure theory.

So even though the different X's are not defined on the same probability space, the probability-space (Ω,F,P) together with the function X and (S,[itex]\Sigma[/itex]) can probably be used to find the probability space for each X.

Do you think that this is what they meant in the definition on Wikipedia? That an event in the sample-space Ω does not represent an outcome for a single X(t), but a realisation for all X's where all are given values? Then we only have one probability-space, but in a way, this probability-space describes all the X's?

Stephen Tashi · Jan 13, 2014

bobby2k said:

Do you think that this is what they meant in the definition on Wikipedia? That an event in the sample-space Ω does not represent an outcome for a single X(t), but a realisation for all X's where all are given values? Then we only have one probability-space, but in a way, this probability-space describes all the X's?

That Wikipedia article speaks of the "state space" [itex]S[/itex] and "finite dimensional distributions" defined on [itex]S^k[/itex]. So I don't think a point in [itex]S[/itex] is a trajectory of the process, i.e. it isn't a specific set of values for each of the random variables in the indexed collection of random variables.

(It's interesting to look at the controversies on the "talk" page for the Wikipedia article on "random variable". I think most contradictions come from the fact that applied math books define things differently than measure theory books.)

bobby2k · Jan 13, 2014

Stephen Tashi said:

That Wikipedia article speaks of the "state space" [itex]S[/itex] and "finite dimensional distributions" defined on [itex]S^k[/itex]. So I don't think a point in [itex]S[/itex] is a trajectory of the process, i.e. it isn't a specific set of values for each of the random variables in the indexed collection of random variables.

(It's interesting to look at the controversies on the "talk" page for the Wikipedia article on "random variable". I think most contradictions come from the fact that applied math books define things differently than measure theory books.)

I agree that a point in S is not a trajectory of the process; in S a point is actually a single value (not a vector or a chain, as I wrote above what you quoted). But an element of Ω can be a representative of a trajectory, since X can be seen as a function of t and an element of Ω? That was my point, since then we have a "common" probability space?

PS: I assume that what you meant with trajectory is a realisation of the entire process.

Stephen Tashi · Jan 13, 2014

Suppose [itex]\Omega = \{0,1,2\}[/itex] is the set of "states" for a discrete stochastic process [itex]X[/itex] whose random variables are indexed by the set of integers {0,1,2,3}. The realization of a vector of values such as (X[0],X[1],X[2],X[3]) is not a point in [itex]\Omega[/itex]. It is a point in the space that is the Cartesian product [itex]\Omega^4[/itex].

I agree that it's possible to conceive of a probability space [itex]\Omega[/itex] such that each point in [itex]\Omega[/itex] is an ordered set of values that define a trajectory of a random process. However, this is not how a stochastic process is defined.

Notice that if we assume a stochastic process exists, in the sense of an indexed collection of random variables defined on the same "underlying" probability space, this does not assert that there exists a probability measure defined on the space of trajectories - i.e. the space where we define a "point" to be a realization of a specific trajectory. Proving that the indexed collection of random variables can be used to define a measure on the space of trajectories requires a theorem. It isn't asserted by the definition of a stochastic process. (Maybe it can't always be done!)

bobby2k · Jan 13, 2014

Stephen Tashi said:

Suppose [itex]\Omega = \{0,1,2\}[/itex] is the set of "states" for a discrete stochastic process [itex]X[/itex] whose random variables are indexed by the set of integers {0,1,2,3}. The realization of a vector of values such as (X[0],X[1],X[2],X[3]) is not a point in [itex]\Omega[/itex]. It is a point in the space that is the Cartesian product [itex]\Omega^4[/itex].

I agree that it's possible to conceive of a probability space [itex]\Omega[/itex] such that each point in [itex]\Omega[/itex] is an ordered set of values that define a trajectory of a random process. However, this is not how a stochastic process is defined.

Notice that if we assume a stochastic process exists, in the sense of an indexed collection of random variables defined on the same "underlying" probability space, this does not assert that there exists a probability measure defined on the space of trajectories - i.e. the space where we define a "point" to be a realization of a specific trajectory. Proving that the indexed collection of random variables can be used to define a measure on the space of trajectories requires a theorem. It isn't asserted by the definition of a stochastic process. (Maybe it can't always be done!)

But we can do the "converse"? If we already have a measure on the space of trajectories then this measure is the probability measure used in the definition?
When we have a Markov chain, we have a way of defining the probability measure on the space of trajectories, and then this can be used to define a stochastic process, because if an element σ in Ω represents a trajectory, then Xt(σ) is just the value of this trajectory at t?
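Here is a minimal Python sketch of that construction for a finite time horizon. The two states, the initial distribution and the transition probabilities are all made-up numbers for illustration. Ω is the set of trajectories, the measure of a trajectory is the initial probability times the product of transition probabilities, and X(t) reads off the coordinate at time t.

```python
from fractions import Fraction
from itertools import product

# Assumed toy chain: two states, times 0..3, made-up probabilities.
states = [0, 1]
horizon = 3
initial = {0: Fraction(1, 2), 1: Fraction(1, 2)}
transition = {
    0: {0: Fraction(3, 4), 1: Fraction(1, 4)},
    1: {0: Fraction(1, 3), 1: Fraction(2, 3)},
}

# Omega: every trajectory (w_0, ..., w_3) is one sample point.
omega = list(product(states, repeat=horizon + 1))


def weight(w):
    """Probability of a single trajectory w."""
    p = initial[w[0]]
    for a, b in zip(w, w[1:]):
        p *= transition[a][b]
    return p


def P(event):
    """Measure of an event, given as a predicate on trajectories."""
    return sum((weight(w) for w in omega if event(w)), Fraction(0))


def X(t):
    """Coordinate random variable: X(t)(w) is the value of w at time t."""
    return lambda w: w[t]


def cond(target, given):
    """Elementary conditional probability P(target | given)."""
    return P(lambda w: target(w) and given(w)) / P(given)


total = P(lambda w: True)

# Markov property check: conditioning on the full history (1, 1, 0)
# gives the same answer as conditioning on the last state only.
full_history = cond(
    lambda w: X(3)(w) == 1,
    lambda w: (X(0)(w), X(1)(w), X(2)(w)) == (1, 1, 0),
)
last_state_only = cond(lambda w: X(3)(w) == 1, lambda w: X(2)(w) == 0)

print(total, full_history, last_state_only)
```

All the X(t) live on the one space `omega` with the one measure `P`, and the dependence on the current state shows up only in conditional probabilities. For infinite or continuous time the sum in `P` is no longer available, so this sketch does not cover that case.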

Stephen Tashi · Jan 14, 2014

bobby2k said:

But we can do the "converse"? If we already have a measure on the space of trajectories then this measure is the probability measure used in the definition?

"The" measure used in the definition of a stochastic process is the measure on the "underlying" probability space for the indexed random variables. So a measure defined on the space of trajectories isn't the "same" as this measure - it isn't defined on the same space.

When we have a markov chain, we have a way of defining the probability measure on the space of trajectories

The method you have uses probability measures of the random variables in the indexed collection. So if you assume you can apply that method, you have assumed these measures exist.

bobby2k · Jan 14, 2014

Stephen Tashi said:

"The" measure used in the definition of a stochastic process is the measure on the "underlying" probability space for the indexed random variables. So a measure defined on the space of trajectories isn't the "same" as this measure - it isn't defined on the same space.

But are we really sure that Ω in the definition has to contain the same objects as the sample space for the different X's? It seems that Ω can be something more abstract, and as long as we for each t have a function from these abstract objects in Ω to the values that the X's can obtain, we are ok?

bobby2k · Jan 14, 2014

I think I found a very precise answer. It is in a book on stochastic analysis, which is extremely difficult it seems (it is used in master's studies), but the author mentions something leading up to the proof of a theorem called the Kolmogorov extension theorem.

There it is pretty clear why it all works, and why it is called the "underlying" probability space. We can just view a point ω in Ω as a function of t, whose value at t is X[itex]_{t}[/itex](ω).

However, I see that going from this to actually getting probabilities for specific trajectories is probably not that straightforward. It requires advanced measure theory, I imagine (in order to add/integrate all the possibilities)?

Stephen Tashi · Jan 17, 2014

bobby2k said:

However, I see that going from this to actually getting probabilities for specific trajectories is probably not that straightforward. It requires advanced measure theory, I imagine (in order to add/integrate all the possibilities)?

It would require some sophisticated measure theory to go from that definition, which is given for the case [itex]R^n[/itex], to the case of a process whose random variables were trajectories that were continuous in time.

To actually compute probabilities of a given set of trajectories requires that you specialize to specific types of stochastic processes because very little can be proven about a general stochastic process.

bobby2k · Jan 18, 2014

Stephen Tashi said:

It would require some sophisticated measure theory to go from that definition, which is given for the case [itex]R^n[/itex], to the case of a process whose random variables were trajectories that were continuous in time.

I don't see how that definition would exclude trajectories of a continuous-time process. Even though we have [itex]R^{n}[/itex], I do not see that it says that a trajectory has to be a countable number of points. I just think he means that each random variable at a given time can be a vector and not just a number.
