Proof request for best linear predictor

  • Context: Undergrad
  • Thread starter: psie
  • Tags: Probability theory
SUMMARY

The discussion centers on proving Theorem 5.2, which states that the best linear predictor of a random variable Y based on another random variable X can be expressed as L(X) = α + βX. Here, α and β are defined in terms of the expected values and variances of X and Y, specifically α = μ_y - ρ(σ_y/σ_x)μ_x and β = ρ(σ_y/σ_x), where ρ is the correlation coefficient. The discussion also references key definitions and Theorem 5.1, which asserts that the regression function h(X) = E(Y|X) is the optimal predictor of Y based on X when EY² < ∞.

PREREQUISITES
  • Understanding of random variables and their properties
  • Familiarity with concepts of expectation and variance
  • Knowledge of linear regression and predictors
  • Basic grasp of covariance and correlation coefficients
NEXT STEPS
  • Study the Gauss-Markov theorem and its implications for linear regression
  • Explore the derivation of the best linear predictor in statistical theory
  • Investigate the properties of regression functions and their applications
  • Learn about the expected quadratic prediction error and its significance in model evaluation
USEFUL FOR

Statisticians, data scientists, and researchers involved in predictive modeling and linear regression analysis will benefit from this discussion.

psie
TL;DR
In An Intermediate Course in Probability by Gut, there's a theorem stated without proof concerning best linear predictors. I was wondering if anyone knows how to prove it, or knows of other sources where it is proved.
Maybe this is a simple exercise, but I don't see how to prove the theorem below with the tools given in the section (if it is possible at all).

Theorem 5.2. Suppose that ##EX^2<\infty## and ##EY^2<\infty##. Set \begin{align*}\mu_x&=EX, \\ \mu_y&=EY, \\ \sigma_x^2&=\operatorname{Var}X,\\ \sigma_y^2&=\operatorname{Var}Y, \\ \sigma_{xy}&=\operatorname{Cov}(X,Y), \\ \rho&=\sigma_{xy}/(\sigma_x\sigma_y).\end{align*} The best linear predictor of ##Y## based on ##X## is $$L(X)=\alpha+\beta X,$$where ##\alpha=\mu_y-\frac{\sigma_{xy}}{\sigma_x^2}\mu_x=\mu_y-\rho\frac{\sigma_y}{\sigma_x}\mu_x## and ##\beta=\frac{\sigma_{xy}}{\sigma_x^2}=\rho\frac{\sigma_y}{\sigma_x}##.
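
For reference, here is one standard minimization argument, sketched under the assumption that "linear" means affine, ##d(x)=a+bx## (this is the usual textbook route, not necessarily the proof Gut intended). Expanding the quadratic error around the means,
$$E(Y-a-bX)^2=E\Big[\big((Y-\mu_y)-b(X-\mu_x)+(\mu_y-a-b\mu_x)\big)^2\Big]=\sigma_y^2-2b\sigma_{xy}+b^2\sigma_x^2+(\mu_y-a-b\mu_x)^2,$$
since the centered factors have mean zero, so their cross terms with the constant vanish. The last square is made zero by choosing ##a=\mu_y-b\mu_x##, and the remaining quadratic in ##b## is minimized at ##b=\sigma_{xy}/\sigma_x^2##. This yields ##\alpha## and ##\beta## as in the theorem, with minimal error ##\sigma_y^2(1-\rho^2)##.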

That's the theorem I'm looking to prove. Next I'll state the definitions and a theorem given in the section preceding it. As in the book, we confine ourselves to conditioning on a random variable, although the definitions and theorems extend to conditioning on a random vector.

Definition 5.1. The function ##h(x)=E(Y\mid X=x)## is called the regression function of ##Y## on ##X##.
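
For instance (a standard example, not taken from the quoted section), if ##(X,Y)## is bivariate normal, then ##h(x)=\mu_y+\rho\frac{\sigma_y}{\sigma_x}(x-\mu_x)##; the regression function is itself linear and coincides with the ##L(X)## of Theorem 5.2.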

Definition 5.2. A predictor (for ##Y##) based on ##X## is a function ##d(X)##. The predictor is called linear if ##d## is linear.

Definition 5.3. The expected quadratic prediction error is $$E(Y-d(X))^2.$$ Moreover, if ##d_1## and ##d_2## are predictors, we say that ##d_1## is better than ##d_2## if ##E(Y-d_1(X))^2\leq E(Y-d_2(X))^2##.
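
To make Definition 5.3 concrete, here is a small numerical sanity check (my own sketch, not from the book; the simulated distribution is arbitrary). It forms ##L(X)## with the ##\alpha,\beta## of Theorem 5.2 and confirms that perturbing the coefficients never lowers the empirical quadratic prediction error:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200_000
X = rng.normal(1.0, 2.0, n)                   # any non-degenerate X works
Y = 3.0 + 0.5 * X + rng.normal(0.0, 1.0, n)   # Y linear in X plus noise

# alpha, beta as in Theorem 5.2 (ddof=0 so cov and var use the same divisor)
beta = np.cov(X, Y, ddof=0)[0, 1] / np.var(X)
alpha = Y.mean() - beta * X.mean()

def epe(a, b):
    """Empirical quadratic prediction error E(Y - (a + b*X))^2."""
    return np.mean((Y - (a + b * X)) ** 2)

best = epe(alpha, beta)
for da, db in [(0.1, 0.0), (-0.1, 0.0), (0.0, 0.1), (0.0, -0.1)]:
    assert epe(alpha + da, beta + db) >= best  # any perturbation does worse
print(alpha, beta, best)                       # roughly 3.0, 0.5, 1.0
```

With matching divisors, ##(\alpha,\beta)## is the exact minimizer of the empirical error, so every perturbed pair scores strictly worse.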

Theorem 5.1. Suppose that ##EY^2<\infty##. Then ##h(X)=E(Y\mid X)## (i.e., the regression function of ##Y## on ##X##) is the best predictor of ##Y## based on ##X##.
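
For context, the usual argument behind Theorem 5.1 (again a sketch, under the stated moment condition) is an orthogonality decomposition: for any predictor ##d##,
$$E(Y-d(X))^2=E(Y-h(X))^2+2\,E\big[(Y-h(X))(h(X)-d(X))\big]+E(h(X)-d(X))^2,$$
where the cross term vanishes by first conditioning on ##X##: ##E\big[(Y-h(X))(h(X)-d(X))\mid X\big]=(h(X)-d(X))\,E\big[Y-h(X)\mid X\big]=0##. Hence ##E(Y-d(X))^2\geq E(Y-h(X))^2##, i.e. no predictor beats ##h(X)##.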
 
