Undergrad Proof request for best linear predictor

The discussion centers on proving Theorem 5.2, which states that the best linear predictor of Y based on X can be expressed in terms of means, variances, and covariance. Key definitions include the regression function E(Y|X=x) and the concept of linear predictors, which are functions of X. The expected quadratic prediction error is introduced as a measure to compare the effectiveness of different predictors. Theorem 5.1 establishes that the regression function is the optimal predictor when the second moment of Y is finite. The thread emphasizes the need for clarity in applying these definitions and theorems to prove the main theorem.
psie
TL;DR
In An Intermediate Course in Probability by Gut, there is a theorem concerning best linear predictors that is stated without proof. I was wondering whether anyone knows how to prove it, or knows of other sources where it is proved.
Maybe this is a simple exercise, but I don't see how to prove the theorem below with the tools I've been given in the section (if it is possible at all).

Theorem 5.2. Suppose that ##EX^2<\infty## and ##EY^2<\infty##. Set \begin{align*}\mu_x&=EX, \\ \mu_y&=EY, \\ \sigma_x^2&=\operatorname{Var}X,\\ \sigma_y^2&=\operatorname{Var}Y, \\ \sigma_{xy}&=\operatorname{Cov}(X,Y), \\ \rho&=\sigma_{xy}/(\sigma_x\sigma_y).\end{align*} The best linear predictor of ##Y## based on ##X## is $$L(X)=\alpha+\beta X,$$where ##\alpha=\mu_y-\frac{\sigma_{xy}}{\sigma_x^2}\mu_x=\mu_y-\rho\frac{\sigma_y}{\sigma_x}\mu_x## and ##\beta=\frac{\sigma_{xy}}{\sigma_x^2}=\rho\frac{\sigma_y}{\sigma_x}##.
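
As a numerical sanity check of the statement (not a proof), here is a minimal Python sketch. It assumes simulated data of my own choosing, and the helper names `best_linear_coefficients` and `mean_squared_error` are hypothetical, not from the book. It computes the sample analogues of ##\alpha## and ##\beta## and compares the empirical quadratic prediction error against nearby lines.

```python
import random


def best_linear_coefficients(xs, ys):
    """Sample analogues of alpha and beta from Theorem 5.2 (hypothetical helper)."""
    n = len(xs)
    mu_x = sum(xs) / n
    mu_y = sum(ys) / n
    var_x = sum((x - mu_x) ** 2 for x in xs) / n
    cov_xy = sum((x - mu_x) * (y - mu_y) for x, y in zip(xs, ys)) / n
    beta = cov_xy / var_x          # sigma_xy / sigma_x^2
    alpha = mu_y - beta * mu_x     # mu_y - beta * mu_x
    return alpha, beta


def mean_squared_error(xs, ys, a, b):
    """Empirical version of E(Y - (a + bX))^2 (hypothetical helper)."""
    return sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys)) / len(xs)


# Simulated data (an assumption for illustration): Y = 3 - 0.5 X + noise.
rng = random.Random(0)
xs = [rng.gauss(1.0, 2.0) for _ in range(2000)]
ys = [3.0 - 0.5 * x + rng.gauss(0.0, 1.0) for x in xs]

alpha, beta = best_linear_coefficients(xs, ys)
base = mean_squared_error(xs, ys, alpha, beta)

# Error of every line obtained by nudging alpha and/or beta by 0.1.
perturbed = [
    mean_squared_error(xs, ys, alpha + da, beta + db)
    for da in (-0.1, 0, 0.1)
    for db in (-0.1, 0, 0.1)
    if (da, db) != (0, 0)
]

print(alpha, beta, base, min(perturbed))
```

In this simulation every perturbed line has a larger empirical error than the line given by the theorem's formulas, which is consistent with the statement but of course proves nothing about the general case.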

That's the theorem I'm looking to prove. Below I state the definitions and the theorem that are given in the section before it. As in the book, we confine ourselves to conditioning on a random variable, although the definitions and theorems extend to conditioning on a random vector.

Definition 5.1. The function ##h(x)=E(Y\mid X=x)## is called the regression function ##Y## on ##X##.

Definition 5.2. A predictor (for ##Y##) based on ##X## is a function, ##d(X)##. The predictor is called linear if ##d## is linear.

Definition 5.3. The expected quadratic prediction error is $$E(Y-d(X))^2.$$ Moreover, if ##d_1## and ##d_2## are predictors, we say that ##d_1## is better than ##d_2## if ##E(Y-d_1(X))^2\leq E(Y-d_2(X))^2##.
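
To see what Definition 5.3 amounts to for a linear predictor, here is a sketch of one possible expansion. It is not taken from the book; the symbols ##a## and ##b## are introduced here for illustration, and it assumes ##\sigma_x^2>0## and ##\sigma_y^2>0##. Using ##EZ^2=\operatorname{Var}Z+(EZ)^2## with ##Z=Y-a-bX##, \begin{align*}E(Y-a-bX)^2&=\operatorname{Var}(Y-bX)+(\mu_y-a-b\mu_x)^2\\ &=\sigma_y^2-2b\sigma_{xy}+b^2\sigma_x^2+(\mu_y-a-b\mu_x)^2\\ &=\sigma_x^2\left(b-\frac{\sigma_{xy}}{\sigma_x^2}\right)^2+\sigma_y^2(1-\rho^2)+(\mu_y-a-b\mu_x)^2.\end{align*} Both squared terms are nonnegative, and they vanish exactly when ##b=\sigma_{xy}/\sigma_x^2## and ##a=\mu_y-b\mu_x##, which would leave the error ##\sigma_y^2(1-\rho^2)##.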

Theorem 5.1. Suppose that ##EY^2<\infty##. Then ##h(X)=E(Y\mid X)## (i.e. the regression function ##Y## on ##X##) is the best predictor of ##Y## based on ##X##.
 