Regarding mixing time of a Markov chain

In summary, the thread discusses a Markov chain $X=X_0, X_1, X_2, \ldots$ on a finite state space $S$ with transition matrix $P$. It is assumed that there exists an $\varepsilon>0$ such that the total variation distance between the one-step distributions started from any two point distributions $\mu_0$ and $\nu_0$ is at most $\varepsilon$. Letting $Y=Y_0, Y_1, Y_2, \ldots$ be an independent copy of $X$, the question is how large $P[X_1\neq Y_1\mid X_0=x_0, Y_0=y_0]$ can be.
  • #1
caffeinemachine
Let $X=X_0, X_1, X_2, \ldots$ be a Markov chain on a finite state space $S$, and let $P$ denote the transition matrix.
Assume that there is an $\varepsilon>0$ such that whenever $\mu_0$ and $\nu_0$ are point distributions on $S$ (in other words, $\mu_0$ and $\nu_0$ are Dirac masses) we have
$$\|\mu_0P-\nu_0P\|_{TV}\leq \varepsilon$$

Now let $Y=Y_0, Y_1, Y_2, \ldots$ be an independent copy of $X$.

Question. What can we say about the magnitude of $P[X_1\neq Y_1|X_0=x_0, Y_0=y_0]$, where $x_0$ and $y_0$ are two distinct states in $S$?

I intuitively think that we should be able to bound the above quantity by $\varepsilon$ up to multiplication by an absolute constant, but I am unable to prove it.

We have that
$$P[X_1\neq Y_1|X_0=x_0, Y_0=y_0] = \sum_{x\in S}\sum_{y\in S, y\neq x}P[X_1=x, Y_1=y|X_0=x_0, Y_0=y_0]$$
which is equal to
$$\sum_{x\in S} p(x_0, x)(1-p(y_0, x))$$
but I am unable to make progress.
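As a numerical sanity check, the sum above can be evaluated directly for any concrete transition matrix; the sketch below (function names are my own, not from the thread) computes both $\sum_{x} p(x_0,x)(1-p(y_0,x))$ and the one-step total variation distance for a small two-state chain.

```python
import numpy as np

def mismatch_prob(P, x0, y0):
    """P[X_1 != Y_1 | X_0 = x0, Y_0 = y0] for independent copies:
    sum over x of p(x0, x) * (1 - p(y0, x))."""
    return float(np.sum(P[x0] * (1.0 - P[y0])))

def tv_distance(mu, nu):
    """Total variation distance as half the L1 norm of the difference."""
    return 0.5 * float(np.abs(mu - nu).sum())

# Illustrative two-state chain.
P = np.array([[0.9, 0.1],
              [0.2, 0.8]])
p_mismatch = mismatch_prob(P, 0, 1)        # 0.9*0.8 + 0.1*0.2 = 0.74
tv = tv_distance(P[0], P[1])               # 0.5*(0.7 + 0.7)   = 0.70
```

Already in this tiny example the mismatch probability (0.74) exceeds the TV distance (0.70), which hints at the direction of the coupling inequality discussed below.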
 
  • #2
Really a standard exercise/result for countable-state random variables (with emphasis on Markov chains) is to show the following, where, for notational reasons, it is understood in your particular case that $X_0 = x_0$ and $Y_0 = y_0$:

$P[X_1\neq Y_1|X_0=x_0, Y_0=y_0] = P[X_1\neq Y_1] \geq d_{TV}(X_1,Y_1)= \|\mu_0P-\nu_0P\|_{TV}$
with $\mu_0 := \mathbf e_{x_0}$ and $\nu_0 := \mathbf e_{y_0}$.

and the inequality becomes an equality in the case of a maximal coupling. This is somewhat trivial here, but useful for other couplings, e.g. if $\nu_0$ were the steady-state vector.

(nit: there are a couple of different formulations of total variation distance; I am using the 1/2 L1 norm of the difference between probability vectors, for countable-state r.v.s)

The inequality you're trying to prove "points in the wrong way", so to speak. More context/background information is needed to make sense of what you're trying to do. E.g. if you don't know what a coupling is, that is a big deal and my post won't make sense.
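The coupling inequality mentioned here can be checked concretely: for any coupling of two distributions, $P[X \neq Y] \geq \|\mu - \nu\|_{TV}$, and a maximal coupling achieves equality by matching the variables on $\sum_i \min(\mu_i, \nu_i)$ of the mass. A small sketch, with made-up distributions:

```python
import numpy as np

mu = np.array([0.6, 0.3, 0.1])   # law of X_1
nu = np.array([0.2, 0.5, 0.3])   # law of Y_1

# TV distance as half the L1 norm.
tv = 0.5 * float(np.abs(mu - nu).sum())

# Independent coupling: P[X != Y] = 1 - sum_i mu_i * nu_i.
p_neq_indep = 1.0 - float(mu @ nu)

# Maximal coupling: matches X and Y on min(mu_i, nu_i) mass at each state,
# so P[X != Y] = 1 - sum_i min(mu_i, nu_i), which equals the TV distance.
p_neq_maximal = 1.0 - float(np.minimum(mu, nu).sum())
```

Here `p_neq_maximal` equals `tv` exactly (0.4), while the independent coupling gives the larger value 0.7, consistent with the inequality in post #2.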
 
  • #3
steep said:
Really a standard exercise/result for countable-state random variables ... the inequality you're trying to prove "points in the wrong way", so to speak.

I realize now that the statement I want to prove does not hold for the random walk on a big complete graph: there the one-step distributions from any two distinct starting vertices are close in total variation, yet two independent copies of the walk almost never land on the same vertex.
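The complete-graph counterexample is easy to verify numerically. For the simple random walk on $K_n$ (an illustrative choice of $n$ below), the one-step TV distance between walks started at two distinct vertices is $1/(n-1)$, while the probability that two independent copies disagree after one step is close to 1:

```python
import numpy as np

n = 100  # number of vertices in the complete graph K_n (illustrative)

# Simple random walk on K_n: from any vertex, jump uniformly to one of
# the other n - 1 vertices.
P = (np.ones((n, n)) - np.eye(n)) / (n - 1)

# One-step TV distance between walks started at vertices 0 and 1 is tiny.
tv = 0.5 * float(np.abs(P[0] - P[1]).sum())       # equals 1 / (n - 1)

# But two independent walks started at 0 and 1 almost surely disagree.
p_neq = float(np.sum(P[0] * (1.0 - P[1])))        # close to 1 for large n
```

So $\varepsilon = 1/(n-1)$ can be made arbitrarily small while $P[X_1 \neq Y_1 \mid X_0 = x_0, Y_0 = y_0]$ stays near 1, ruling out a bound of the form $C\varepsilon$.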
 

1. What is a Markov chain?

A Markov chain is a mathematical model used to study the behavior of a system that changes over time. It is a sequence of random events or states in which the probability of transitioning from one state to another is dependent only on the current state and not on any previous states.

2. How is mixing time defined for a Markov chain?

Mixing time is the number of steps required for a Markov chain to reach a point where the distribution over states is within a prescribed distance (typically in total variation) of the stationary distribution, regardless of the starting state. In simpler terms, it is the time it takes for the chain to be essentially no longer influenced by its initial state.

3. What factors affect the mixing time of a Markov chain?

The mixing time of a Markov chain is affected by the structure of the chain, the initial distribution of states, and the transition probabilities between states. It can also be influenced by the choice of algorithm used to simulate the chain.

4. How is the mixing time of a Markov chain calculated?

The mixing time of a Markov chain is typically estimated either by simulation or by mathematical techniques such as spectral methods. Simulation involves running the chain for many steps and measuring how close the resulting distribution is to the stationary distribution; spectral methods instead bound the rate of convergence using the eigenvalues of the transition matrix (in particular, the spectral gap).
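For a small chain whose transition matrix is known explicitly, the mixing time can be computed directly by powering the matrix and tracking the worst-case TV distance to the stationary distribution. A minimal sketch, assuming the standard threshold $\varepsilon = 1/4$ (the helper name is my own):

```python
import numpy as np

def mixing_time(P, eps=0.25, max_steps=10_000):
    """Smallest t such that, over all starting states x,
    max TV(P^t[x, :], pi) <= eps, where pi is the stationary distribution."""
    # Stationary distribution: left eigenvector of P for eigenvalue 1.
    vals, vecs = np.linalg.eig(P.T)
    pi = np.real(vecs[:, np.argmax(np.real(vals))])
    pi = pi / pi.sum()

    Pt = np.eye(P.shape[0])
    for t in range(1, max_steps + 1):
        Pt = Pt @ P
        # Worst-case TV distance over starting states.
        d = 0.5 * np.abs(Pt - pi).sum(axis=1).max()
        if d <= eps:
            return t
    return None  # did not mix within max_steps

P = np.array([[0.9, 0.1],
              [0.2, 0.8]])
t_mix = mixing_time(P)
```

For this two-state chain the worst-case distance after $t$ steps is $\tfrac{2}{3}(0.7)^t$ (the second eigenvalue is $0.7$), which first drops below $1/4$ at $t = 3$.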

5. Why is the mixing time of a Markov chain important?

The mixing time of a Markov chain is important because it is a measure of how quickly the chain reaches a steady state. A longer mixing time means that the chain takes longer to converge to its stationary distribution, which can impact the accuracy and efficiency of simulations or predictions made using the chain.
