Graduate Error in (Multi)linear Regression

The discussion centers on the conditions necessary to justify the use of (multi)linear regression for modeling data, particularly regarding the assumptions of error distribution. There is debate over whether normality of errors is required, with some sources stating that only independent and identically distributed (i.i.d.) errors with a mean of zero are necessary. Normality is often linked to deriving the distribution of coefficients and establishing confidence intervals. While a multilinear model can be an unbiased estimator under certain conditions, it may still perform well in practice even when those conditions are not strictly met. The conversation also touches on scenarios where nonlinear models are more appropriate, such as modeling the height of a falling object.
WWGD
Hi,
I keep reading varying accounts of the conditions needed to "justify" the use of (multi)linear regression to model data.

Specifically, I have seen several authors require the errors to be normal and i.i.d., while others only require the errors to be i.i.d. with mean 0. Just where is the assumption of normality used to justify the use of linear models? I know of Gauss-Markov, but this seems too strong. I've heard that normality is used to find the distribution of the coefficients and to determine reliable confidence intervals for them. If so, can you suggest a source? If not, can you explain?
 
I guess the question is what "justify" means. If the errors are zero-mean, i.i.d., and normally distributed, then I think the multilinear model is an unbiased estimator of the actual values. If the errors are distributed otherwise, then a different model (i.e. different coefficients) might be a better choice.

But in practice, a multilinear model works pretty well in lots of other situations, and it's often hard to know what the errors actually look like.
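
As a rough illustration of that last point, here is a minimal simulation sketch, not something from the thread itself. It assumes a fixed design matrix, made-up true coefficients, and skewed (centered exponential) errors that are i.i.d. with mean 0 but clearly not normal. The function name `simulate_ols` and all parameter values are hypothetical. Averaged over many trials, the least-squares coefficients should land close to the true ones, which is consistent with unbiasedness needing only zero-mean errors; normality matters when you want exact t/F distributions for the coefficients in small samples.

```python
import numpy as np


def simulate_ols(n_trials=2000, n_obs=50, beta=(1.0, 2.0, -0.5), seed=0):
    """Average OLS estimates over many data sets with non-normal errors.

    All values here (sample size, coefficients, error law) are illustrative
    assumptions, not taken from the discussion.
    """
    rng = np.random.default_rng(seed)
    beta = np.asarray(beta, dtype=float)

    # Fixed design matrix: an intercept column plus two random regressors.
    X = np.column_stack([np.ones(n_obs), rng.normal(size=(n_obs, 2))])

    estimates = np.empty((n_trials, beta.size))
    for i in range(n_trials):
        # Centered exponential errors: mean 0, variance 1, strongly skewed.
        eps = rng.exponential(scale=1.0, size=n_obs) - 1.0
        y = X @ beta + eps
        # Ordinary least squares fit for this simulated data set.
        estimates[i] = np.linalg.lstsq(X, y, rcond=None)[0]

    # Mean of the estimates across trials; compare with the true beta.
    return estimates.mean(axis=0)


if __name__ == "__main__":
    print("true beta:     ", (1.0, 2.0, -0.5))
    print("mean estimate: ", simulate_ols())
```

This only checks the average of the estimates. It says nothing about whether the usual confidence intervals have the right coverage for a given sample size, which is where the normality assumption (or a large-sample argument) comes in.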
 
Thank you. Just curious, as an aside: do you know of situations that cannot be modeled (log)linearly, with linearity meaning linearity in the coefficients?
 
Do you just mean give a scenario where a nonlinear model is better? Sure. Suppose you drop an object and write down the time and the height as it falls. The height measurement has some noise in it. Then the right model for the height is going to be something like ##h(t)=-\frac{g}{2}t^2+h_0## if it starts at a height of ##h_0##.

I feel like there's a good chance I did not understand the question.
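
For what it's worth, the falling-object model above is nonlinear in ##t## but still linear in the coefficients ##h_0## and ##g##, so it can be fit with ordinary least squares. A minimal sketch, assuming synthetic data with made-up values (##h_0 = 25##, ##g = 9.81##, noise standard deviation 0.05) and a hypothetical helper `fit_fall`:

```python
import numpy as np


def fit_fall(t, h):
    """Least-squares fit of h(t) = h0 - (g/2) t^2, returning (h0, g)."""
    # Columns [1, -t^2/2]: the model is linear in the unknowns (h0, g).
    A = np.column_stack([np.ones_like(t), -0.5 * t**2])
    h0_hat, g_hat = np.linalg.lstsq(A, h, rcond=None)[0]
    return h0_hat, g_hat


# Synthetic measurements; every number here is an illustrative assumption.
rng = np.random.default_rng(1)
t = np.linspace(0.0, 2.0, 40)
h = 25.0 - 0.5 * 9.81 * t**2 + rng.normal(scale=0.05, size=t.size)

h0_hat, g_hat = fit_fall(t, h)
print("estimated h0:", h0_hat)
print("estimated g: ", g_hat)
```

A model that is not linear in the coefficients would need the unknown inside a nonlinear function, for example a decay rate inside an exponential with an additive offset, and that kind of fit needs an iterative nonlinear solver instead.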
 
