Proof request for best linear predictor

Summary
The discussion centers on proving Theorem 5.2, which states that the best linear predictor of Y based on X can be expressed in terms of means, variances, and covariance. Key definitions include the regression function E(Y|X=x) and the concept of linear predictors, which are functions of X. The expected quadratic prediction error is introduced as a measure to compare the effectiveness of different predictors. Theorem 5.1 establishes that the regression function is the optimal predictor when the second moment of Y is finite. The thread emphasizes the need for clarity in applying these definitions and theorems to prove the main theorem.
psie
TL;DR
In An Intermediate Course in Probability by Gut, there's a theorem stated without proof concerning best linear predictors. I was wondering if anyone knows how to prove it, or knows of other sources where it has been proved.
Maybe this is a simple exercise, but I don't see how to prove the theorem below with the tools given in the section (if it is possible at all).

Theorem 5.2. Suppose that ##EX^2<\infty## and ##EY^2<\infty##. Set \begin{align*}\mu_x&=EX, \\ \mu_y&=EY, \\ \sigma_x^2&=\operatorname{Var}X,\\ \sigma_y^2&=\operatorname{Var}Y, \\ \sigma_{xy}&=\operatorname{Cov}(X,Y), \\ \rho&=\sigma_{xy}/(\sigma_x\sigma_y).\end{align*} The best linear predictor of ##Y## based on ##X## is $$L(X)=\alpha+\beta X,$$where ##\alpha=\mu_y-\frac{\sigma_{xy}}{\sigma_x^2}\mu_x=\mu_y-\rho\frac{\sigma_y}{\sigma_x}\mu_x## and ##\beta=\frac{\sigma_{xy}}{\sigma_x^2}=\rho\frac{\sigma_y}{\sigma_x}##.

That's the theorem I'm looking to prove. Next I'll state some definitions and a theorem given in the section prior to the above theorem, and at the end I've included my own attempt. As in the book, we confine ourselves to conditioning on a random variable, although the definitions and theorems extend to conditioning on a random vector.

Definition 5.1. The function ##h(x)=E(Y\mid X=x)## is called the regression function of ##Y## on ##X##.

Definition 5.2. A predictor (for ##Y##) based on ##X## is a function ##d(X)## of ##X##. The predictor is called linear if ##d## is linear.

Definition 5.3. The expected quadratic prediction error is $$E(Y-d(X))^2.$$ Moreover, if ##d_1## and ##d_2## are predictors, we say that ##d_1## is better than ##d_2## if ##E(Y-d_1(X))^2\leq E(Y-d_2(X))^2##.
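(As a quick sanity check on this definition: the constant predictor ##d(X)=\mu_y## has expected quadratic prediction error ##E(Y-\mu_y)^2=\sigma_y^2##, so any predictor worth considering should do at least that well.)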

Theorem 5.1. Suppose that ##EY^2<\infty##. Then ##h(X)=E(Y\mid X)## (i.e., the regression function of ##Y## on ##X##) is the best predictor of ##Y## based on ##X##.
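If it helps, I believe Theorem 5.1 is proved via the decomposition $$E(Y-d(X))^2=E(Y-h(X))^2+E(h(X)-d(X))^2,$$ where the cross term vanishes because ##E[(Y-h(X))(h(X)-d(X))]=E\big[(h(X)-d(X))\,E[Y-h(X)\mid X]\big]=0##. I don't see how to adapt this orthogonality argument when ##d## is restricted to be linear, though.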
 
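Edit: For what it's worth, here is a direct attempt via completing the square (assuming ##\sigma_x^2>0##); it seems to give the result, but it doesn't use Theorem 5.1, so I'm not sure it's the argument the author intends. For a linear predictor ##d(X)=a+bX##, center everything: $$E(Y-a-bX)^2=E\big[(Y-\mu_y)-b(X-\mu_x)+(\mu_y-a-b\mu_x)\big]^2.$$ Expanding and using ##E(X-\mu_x)=E(Y-\mu_y)=0##, the cross terms involving the constant drop out, so \begin{align*}E(Y-a-bX)^2&=\sigma_y^2-2b\sigma_{xy}+b^2\sigma_x^2+(\mu_y-a-b\mu_x)^2\\ &=\sigma_x^2\Big(b-\frac{\sigma_{xy}}{\sigma_x^2}\Big)^2+(\mu_y-a-b\mu_x)^2+\sigma_y^2-\frac{\sigma_{xy}^2}{\sigma_x^2}.\end{align*} Both squares vanish exactly when ##b=\sigma_{xy}/\sigma_x^2## and ##a=\mu_y-b\mu_x##, which are the ##\beta## and ##\alpha## of Theorem 5.2, and the minimal error is ##\sigma_y^2(1-\rho^2)##.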