# Gauss-Markov process covariance function

1. Oct 29, 2015

Hi all. I've been getting up to speed with Gaussian processes (https://en.wikipedia.org/wiki/Gaussian_process), and was interested to know what properties a Gaussian process must satisfy for it to also be a Markov process (https://en.wikipedia.org/wiki/Markov_process).

Briefly, a Gaussian process is an ordered collection of Gaussian random variables (e.g., a time series) whose structure can be characterised by the covariance function between variables representing different time points t and t'. A Markov process is one in which the state at some time t > t' can be predicted just as well from the state at time t' alone as it could be using additional information from earlier times t'' < t'.

According to the wiki page on a Gauss-Markov process (https://en.wikipedia.org/wiki/Gauss–Markov_process), a condition a Gaussian process must satisfy in order to be a Markov process is that it have an exponential covariance function $$\sigma^2 e^{-\beta |t - t'|}$$. Does anyone have any idea why this should be the case? Why not $$\sigma^2 e^{-\beta |t - t'|^2}$$ for example?
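For intuition about how different these two kernels are, here is a minimal sketch (Python/NumPy, with illustrative values $\sigma = \beta = 1$) that draws one sample path from a zero-mean Gaussian process under each covariance function. The exponential kernel gives rough, Ornstein-Uhlenbeck-like paths, while the squared-exponential kernel gives very smooth ones:

```python
import numpy as np

rng = np.random.default_rng(0)
t = np.linspace(0.0, 5.0, 200)
tau = np.abs(t[:, None] - t[None, :])  # matrix of time delays |t - t'|

# sigma = beta = 1 for illustration
K_exp = np.exp(-tau)       # exponential kernel: rough, Ornstein-Uhlenbeck-like paths
K_sq = np.exp(-tau ** 2)   # squared-exponential kernel: very smooth paths

def sample(K, jitter=1e-8):
    """Draw one zero-mean sample path with covariance K.

    The small jitter on the diagonal keeps the Cholesky factorisation
    numerically stable for the near-singular squared-exponential matrix.
    """
    L = np.linalg.cholesky(K + jitter * np.eye(K.shape[0]))
    return L @ rng.standard_normal(K.shape[0])

path_exp = sample(K_exp)
path_sq = sample(K_sq)
```

Comparing the mean squared increments of the two paths makes the roughness difference obvious: the exponential-kernel path jitters at every step, the squared-exponential one barely moves locally.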

Thanks!

2. Oct 29, 2015

### andrewkirk

I think it's because in a Gauss-Markov process we want the correlation coefficients between different times to be multiplicative. That is, for $t_1<t_2<t_3$ we want $corr(X(t_1),X(t_3))=corr(X(t_1),X(t_2))\cdot corr(X(t_2),X(t_3))$.

That will be the case for the first covariance function but not the second one.

3. Oct 30, 2015

Thanks for the reply. Unfortunately I don't think the covariance function satisfies the property you state (note that the time delay appears inside an absolute value).

I've actually come to the conclusion that the Wikipedia page is wrong: when I looked into its reference, I found that it derives the stated covariance function for an AR(1) process only, which is a special case of a Gauss-Markov process.

If that is the case, then I'm still left wondering what general constraints the Markov property imposes on the covariance function of a Gaussian process.

4. Oct 30, 2015

### MarneMath

If something is a stationary Gaussian Markov process, then the covariance must have the decaying exponential form. (Prove it, it's fun!)

5. Oct 30, 2015

I have seen that the covariance function for an AR(p) process (a form of Gaussian Markov process) is $$\sum_{k=0}^{p-1} \beta_k |t - t'|^k e^{-\alpha|t - t'|}.$$ This is not a simple decaying exponential, so I'm not entirely sure what class of functions you are describing.

Edit: Disregard that. I just read that an autoregressive model AR(p) with p > 1 is in fact not a Markov process!

Last edited: Oct 30, 2015
6. Oct 30, 2015

### andrewkirk

Yes, I think the wikipedia page is not a good source. It is sketchy, unreliable and contains errors. For instance it says the autocorrelation function is $\sigma^2e^{-|\tau|}$ whereas I'm pretty sure that is an autocovariance, not an autocorrelation. An autocorrelation should not have $\sigma$ in it, assuming that $\sigma$ is supposed to be instantaneous volatility (which is what $\sigma$ usually represents when one is talking about continuous stochastic processes).

The statement $E(X(t)^2)=\sigma^2$ looks wrong too, as the RHS should have $t$ in it. Further, the expectation on the LHS is not meaningful without giving a reference point. They may mean $E(X(t)^2|X(0)=0)$ but it's unclear.

I expect there are better sources out there that present Gauss-Markov processes in a more rigorous and complete manner, which would be more rewarding. If you read them, then you can fix up the Wikipedia article!

7. Oct 30, 2015

I agree. I think it is a covariance function between random variables separated by the delay $\tau = t - t'$. The expression $E(X(t)^2)=\sigma^2$ should refer to the covariance at $t = t'$ (which is fixed under stationarity).

I think I'll need a proper textbook for this, as a Google search isn't helping. I have learned that the exponential is the only memoryless continuous distribution, which I'm sure must relate to why the covariance has to be exponential. However, that doesn't tell me why the time delay has to appear inside an absolute value, for example. Perhaps I'll be able to derive the result myself with a little thought.

As an aside, I've found a lot of the available material to be quite advanced mathematically, involving usage of sigma algebras, filtrations etc. I wonder whether it's worth my while to really learn these more foundational concepts, given that I'll generally be interested more in practical applications and data analysis rather than pure mathematics (I'm working in neuroscience but trained in physics).

8. Oct 30, 2015

### andrewkirk

Personally, I think it's very worthwhile to develop a firm understanding of the theoretical basis if you're going to be working with continuous stochastic processes. The reason is that, even when dealing only with applications, the concepts become very confusing and in some cases seem impossible, meaningless, or both. Understanding the mathematical foundations really helps clear a way through the fog. Or at least, it did for me. Without it, one just doesn't have a working vocabulary to talk clearly about the various things one needs to talk about with stochastic processes. As an example, the textbook 'Financial Calculus' by Baxter and Rennie, which is an applied book aimed at mathematical finance professionals ('quants'), introduces the mathematical framework and constantly returns to it, even though the aim of the book is to be able to price options, swaps and other derivatives.

Unlike the wiki article on Gauss-Markov processes, I found the wiki articles on the theoretical underpinnings of continuous stochastic processes quite good. A good sequence for reading would be the articles entitled 'Probability Space' > 'Sigma-algebra' > 'Filtration (mathematics)' > 'Stochastic process'.

Or you could start by reading the note I just wrote for somebody else on physicsforums about sigma algebras for Markov processes, which is post #4 in this thread.

I've done plenty of work with continuous stochastic processes, but have not come across the term 'Gauss-Markov' process before so I can't comment on whether the definitions given in the wiki article give the usual, accepted meaning of that word. I can only comment on whether they make sense - which they don't quite manage to do. The processes I work with are mostly Ito processes or Poisson processes. Most (but not all) of these are Markov, and the Ito processes are also Gaussian, but whether they conform to somebody's definition of 'Gauss-Markov' I have never needed to know.

9. Oct 30, 2015

### MarneMath

The Wikipedia article isn't wrong per se; it's just a special case.

A couple of notes, though. It isn't unusual for the terms autocorrelation and autocovariance to be used interchangeably, especially in electrical engineering. With that in mind, you should be aware that the example posted is a common shaping filter (think Kalman filter). Essentially, if you have discrete-time Gaussian white noise, a few properties make this clearer: each x(t) has an identical Gaussian distribution; there is no correlation between the random variables, so the autocorrelation R_x(tau) equals sigma^2 at tau = 0 and 0 otherwise; and lastly, the power spectral density equals sigma^2 regardless of frequency.

The same general properties apply in the continuous case too. So why does this matter? The last property tells us that Gaussian white noise is evenly distributed over an infinite frequency range. However, in the real world we need a finite range; hence the shaping filter. Let us have a stationary process (which occurs if the process is given sufficient time for transients to fade), and let H(s) be a stable transfer function. We can use the properties of stationary processes to show that the spectral densities satisfy S_x(jw) = |H(jw)|^2 S_w(jw). Since we are talking about Gaussian white noise, clearly S_w(jw) = sigma^2 (because of the properties I listed above). Now all you have to do is apply the constraint to this result and you'll recover the wiki result.
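The discrete-time version of this shaping-filter idea can be sketched numerically (Python/NumPy; the filter coefficient φ = 0.9 and unit noise variance are illustrative choices, not from the thread). Passing white noise through the first-order recursion x_t = φ·x_{t-1} + w_t yields an empirical autocovariance that decays like φ^|τ|, i.e. exponentially in the lag, matching the theoretical stationary variance σ²/(1 − φ²):

```python
import numpy as np

rng = np.random.default_rng(42)
phi = 0.9        # filter coefficient; exponential decay rate beta = -ln(phi)
n = 200_000      # long run so transients fade and the estimates settle

# Drive the first-order recursive filter with Gaussian white noise.
w = rng.standard_normal(n)
x = np.empty(n)
x[0] = w[0]
for i in range(1, n):
    x[i] = phi * x[i - 1] + w[i]

x = x[1000:]     # drop the initial transient so the process is stationary

def autocov(series, lag):
    """Empirical autocovariance at a given lag (zero-mean series assumed)."""
    if lag == 0:
        return np.mean(series ** 2)
    return np.mean(series[:-lag] * series[lag:])

var = 1.0 / (1.0 - phi ** 2)   # theoretical stationary variance sigma^2/(1 - phi^2)
for lag in (0, 1, 5, 10):
    print(lag, autocov(x, lag), var * phi ** lag)
```

Each printed line pairs the empirical autocovariance with its theoretical value var·φ^lag; the two columns agree closely, illustrating the exponential decay in the lag.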

10. Oct 31, 2015

We can forget about the Wikipedia page and just focus on the question: "what constraints does the Markov property impose on the covariance function of a Gaussian process?". Without loss of generality, a Gaussian process can be modelled as having zero mean, so that the full structure is contained in the covariance function. We can also assume stationarity to make things simpler.

I tried to derive the result myself, but had some difficulty implementing the Markov property for a continuous Gaussian process. I know I need to condition the probability distribution on past or present states, but I'm unsure how to do this. Apparently the conditional distributions of multivariate Gaussians are themselves multivariate Gaussians, but whether this extends to the continuous case I'm not sure.
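One concrete way to probe this on a finite set of time points is to use the Gaussian conditioning formula directly (a sketch in Python/NumPy; β = σ = 1 and the times are illustrative). For a zero-mean Gaussian vector, E[X(t3) | X(t1), X(t2)] = a1·X(t1) + a2·X(t2) with coefficients given by the covariance matrix; the Markov property requires a1 = 0, which holds for the exponential kernel but not for the squared-exponential one:

```python
import numpy as np

t1, t2, t3 = 0.0, 1.0, 2.5   # any t1 < t2 < t3

def cond_coeffs(kernel):
    """Coefficients (a1, a2) in E[X(t3) | X(t1), X(t2)] = a1*X(t1) + a2*X(t2).

    For a zero-mean Gaussian vector this is Sigma_{3,(1,2)} @ inv(Sigma_{(1,2)}),
    computed here via a linear solve on the symmetric 2x2 block.
    """
    times = np.array([t1, t2, t3])
    K = kernel(np.abs(times[:, None] - times[None, :]))
    return np.linalg.solve(K[:2, :2], K[:2, 2])

a_exp = cond_coeffs(lambda tau: np.exp(-tau))       # exponential kernel
a_sq = cond_coeffs(lambda tau: np.exp(-tau ** 2))   # squared-exponential kernel

print(a_exp)  # coefficient on X(t1) is ~0: knowing X(t2) suffices (Markov)
print(a_sq)   # coefficient on X(t1) is nonzero: the past still matters (not Markov)
```

For the exponential kernel the coefficient on X(t1) vanishes identically (it is proportional to e^{-(t3-t1)} - e^{-(t2-t1)}·e^{-(t3-t2)} = 0), so the prediction of X(t3) depends only on the most recent observed value, which is exactly the Markov property restricted to three time points.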