Linear regression and bivariate normal, is there a relationship?


Discussion Overview

The discussion revolves around the relationship between linear regression and the bivariate normal distribution. Participants explore the assumptions underlying linear regression models and clarify the roles of the variables involved, particularly focusing on whether both X and Y should be considered random variables.

Discussion Character

  • Exploratory, Technical explanation, Debate/contested

Main Points Raised

  • One participant presents the linear regression model as formulated in "Applied Linear Models," noting that in this formulation, X is treated as a known constant while Y is a random variable influenced by a normally distributed error term.
  • Another participant challenges the assumption that both X and Y are random variables in the context of linear regression, suggesting that this may be a confusion with total least squares regression.
  • A later reply acknowledges the confusion regarding the assumption of both variables being random and thanks the responder for clarification.
  • There is a question raised about the nature of total least squares, with a participant seeking to understand if it is simply a generalization of linear regression that allows for polynomial fits.
  • One participant asserts that total least squares treats both X and Y as random variables.

Areas of Agreement / Disagreement

The original poster initially assumed that both X and Y are random variables, whereas the textbook formulation treats X as a known constant. After the first reply, the poster accepts that this assumption is not part of the standard regression model. The follow-up question about total least squares receives only a one-line answer, so its relationship to ordinary linear regression is not explored further.

Contextual Notes

The discussion highlights potential confusion regarding the definitions and assumptions of linear regression versus total least squares, as well as the implications of treating variables as constants versus random variables.

CantorSet
Hi everyone,

This is not a homework question. I just want to understand an aspect of linear regression better. The book "Applied Linear Models" by Kutner et al. states that a linear regression model is of the form

Y_i = B_0 + B_1 X_i + \epsilon_i

where
Y_i is the value of the response variable in the ith trial
B_0, B_1 are parameters
X_i is a known constant
\epsilon_i is a random variable, normally distributed.
Therefore, Y_i is also a random variable, normally distributed but X_i is a constant.
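
The formulation above can be illustrated with a short simulation. This is only a sketch under assumed values: the parameters `beta0`, `beta1`, `sigma`, the grid of X values, and the helper `simulate_and_fit` are hypothetical and do not come from the book. The X values are held fixed across replications, and only the error term (and therefore Y) is redrawn each time.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical parameter values, chosen for illustration only.
beta0, beta1, sigma = 2.0, 0.5, 1.0

# Fixed design: the same known constants X_i are used in every replication.
x = np.linspace(0.0, 10.0, 50)

def simulate_and_fit(x, rng):
    """Draw one response vector for the fixed x and fit a line by least squares."""
    eps = rng.normal(0.0, sigma, size=x.shape)   # epsilon_i ~ N(0, sigma^2)
    y = beta0 + beta1 * x + eps                  # Y_i is random only through epsilon_i
    slope, intercept = np.polyfit(x, y, deg=1)   # polyfit returns highest degree first
    return intercept, slope

# Repeat the experiment many times with the same x.
estimates = np.array([simulate_and_fit(x, rng) for _ in range(2000)])
mean_b0, mean_b1 = estimates.mean(axis=0)
print(mean_b0, mean_b1)
```

Averaged over replications, the fitted intercept and slope should land close to the assumed `beta0` and `beta1`, even though X was never treated as random.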

This confused me a bit because I always associated linear regression with the bivariate normal distribution. That is, I thought the underlying assumption of linear regression is that the data \{(x_1,y_1), (x_2,y_2),...,(x_n,y_n) \} are sampled from a bivariate normal distribution, in which case both X and Y are random variables. But in the formulation above, X is a known constant, while \epsilon and therefore Y are the random variables.

So in summary, what is the connection (if any) between linear regression as formulated by Kutner and the bivariate normal?
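
One standard link between the two formulations is the conditional distribution of a bivariate normal. The following is a sketch of that relationship; the symbols \mu_X, \mu_Y, \sigma_X, \sigma_Y and \rho (means, standard deviations and correlation) are notation assumed here and do not appear in the book's formulation quoted above.

```latex
(X, Y) \sim N_2\left(\mu_X, \mu_Y, \sigma_X^2, \sigma_Y^2, \rho\right)
\;\Longrightarrow\;
Y \mid X = x \;\sim\; N\left(\mu_Y + \rho\,\frac{\sigma_Y}{\sigma_X}\,(x - \mu_X),\; \sigma_Y^2\,(1 - \rho^2)\right)

B_1 = \rho\,\frac{\sigma_Y}{\sigma_X}, \qquad
B_0 = \mu_Y - B_1\,\mu_X, \qquad
\operatorname{Var}(\epsilon_i) = \sigma_Y^2\,(1 - \rho^2)
```

So if the pairs happen to come from a bivariate normal, conditioning on the observed X values gives a model of the form Y_i = B_0 + B_1 X_i + \epsilon_i with normal errors of constant variance. The converse does not hold: the regression model as stated places no distributional requirement on X at all.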
 
CantorSet said:
the underlying assumption of linear regression is the data \{(x_1,y_1), (x_2,y_2),...,(x_n,y_n) \} is sampled from a bivariate normal distribution. In which case, both X and Y are random variables.

I've never seen a treatment of regression that made that assumption. Are you confusing linear regression with some sort of "total least squares" regression?
http://en.wikipedia.org/wiki/Total_least_squares
 
Stephen Tashi said:
I've never seen a treatment of regression that made that assumption. Are you confusing linear regression with some sort of "total least squares" regression?
http://en.wikipedia.org/wiki/Total_least_squares

Thanks for responding, Stephen.

Yeah, that assumption was my own confusion. Thanks for clearing that up.

By the way, total least squares is just a generalization of linear regression in that the curve you're fitting the data points to can be polynomials with degrees higher than 1, right? Or is there more to total least squares?
 
Total least squares treats both X and Y as random variables.
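
To make the distinction concrete, here is a minimal sketch comparing ordinary least squares with total least squares for a straight line. Total least squares is not a move to higher-degree polynomials: it still fits a line here, but it minimizes perpendicular distances to the line, which suits the case where X is also observed with error. The data-generating setup (sample size, true slope and intercept, equal noise variance in both variables) is an assumption made up for this example, and the SVD-based fit is one common way to compute an orthogonal-distance line, not necessarily the method any participant had in mind.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical errors-in-variables setup: both coordinates are observed with noise.
n = 2000
true_slope, true_intercept = 2.0, 1.0
x_true = rng.uniform(0.0, 10.0, size=n)
x_obs = x_true + rng.normal(0.0, 1.0, size=n)   # noise in X
y_obs = true_intercept + true_slope * x_true + rng.normal(0.0, 1.0, size=n)  # noise in Y

# Ordinary least squares: minimizes vertical distances, treats x_obs as exact.
ols_slope, ols_intercept = np.polyfit(x_obs, y_obs, deg=1)

# Total least squares: minimizes perpendicular distances to the line.
# The right singular vector with the smallest singular value of the
# centered data is the normal vector of the fitted line.
centered = np.column_stack([x_obs - x_obs.mean(), y_obs - y_obs.mean()])
_, _, vt = np.linalg.svd(centered, full_matrices=False)
normal = vt[-1]
tls_slope = -normal[0] / normal[1]
tls_intercept = y_obs.mean() - tls_slope * x_obs.mean()

print("OLS slope:", ols_slope)
print("TLS slope:", tls_slope)
```

Under these assumptions the OLS slope is pulled toward zero by the noise in X, while the TLS slope stays close to the assumed true slope. That behaviour depends on the two noise variances being equal; with unequal variances the data would need to be rescaled first.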
 
