Normal assumption with least squares regression


Discussion Overview

The discussion revolves around the assumptions required for least squares regression, particularly the necessity of assuming normality for the response variable Y. Participants explore the implications of this assumption, its relevance to hypothesis testing, and alternatives to normality in statistical modeling.

Discussion Character

  • Debate/contested
  • Technical explanation
  • Conceptual clarification

Main Points Raised

  • One participant questions the necessity of assuming normality for Y in least squares regression, suggesting it may relate to the computation of standard errors for parameters.
  • Another participant asserts that normality is not required for least squares estimation itself, but is necessary for hypothesis testing based on parameter estimates and standard deviations.
  • A further inquiry is made about the possibility of assuming other distributions, such as gamma, and the implications of using generalized linear models (GLMs) instead of normality.
  • One participant advises against making distributional assumptions unless necessary, suggesting that maximum likelihood estimation could be a viable alternative if non-normal distributions are assumed.

Areas of Agreement / Disagreement

There is less disagreement than it might appear: the core distinction offered is that normality is not required for the least squares estimates or their standard errors, but is required when those quantities are used for hypothesis testing. What remains unresolved is what follows from assuming a different distribution, such as gamma, for the response.

Contextual Notes

The discussion does not settle when distributional assumptions about Y are warranted, nor the precise conditions under which least squares results can be carried over to hypothesis testing.

Guy Incognito
My Google search just turns up results telling me that one of the assumptions I have to make is that each Y is normal. My question is why do I have to assume it's normal. Why does it follow that it has to be normal as opposed to some other distribution? Hope that makes sense.

Edit: I thought about this some more. Is it just as simple as the standard errors for the parameters being computed under the assumption that each Y is normal? If you write it out, you can easily see that B1, for example, is a linear function of Y1, ..., Yn and thus will be normal.
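A minimal sketch of that argument for simple regression, assuming the Y_i are independent and normal with common variance σ² and fixed x_i:

$$\hat\beta_1 = \frac{\sum_i (x_i - \bar x)\,Y_i}{\sum_j (x_j - \bar x)^2} = \sum_i c_i Y_i, \qquad c_i = \frac{x_i - \bar x}{\sum_j (x_j - \bar x)^2},$$

so $\hat\beta_1$ is a linear combination of independent normals and is therefore itself normal, with $\operatorname{Var}(\hat\beta_1) = \sigma^2 \sum_i c_i^2 = \sigma^2 / \sum_j (x_j - \bar x)^2$.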
 
You do not need normality for least squares estimation. That includes the estimation of standard errors. The LS parameter estimates and the standard deviations are sample-based statistics; they do not require making assumptions about a distribution.
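A minimal numpy sketch of this point: every quantity below is computed from the sample alone, and the simulated noise is deliberately non-normal (all names and data here are illustrative):

```python
import numpy as np

# Simulate data with deliberately non-normal (heavy-tailed) noise to make the
# point: nothing below assumes a distribution for Y.
rng = np.random.default_rng(0)
x = rng.uniform(0, 10, size=50)
y = 2.0 + 0.5 * x + rng.standard_t(df=3, size=50)

X = np.column_stack([np.ones_like(x), x])     # design matrix with intercept
beta_hat = np.linalg.solve(X.T @ X, X.T @ y)  # least squares estimates

resid = y - X @ beta_hat
n, p = X.shape
sigma2_hat = resid @ resid / (n - p)          # residual variance (sample-based)
se = np.sqrt(sigma2_hat * np.diag(np.linalg.inv(X.T @ X)))  # standard errors

print("estimates:", beta_hat, "standard errors:", se)
```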

You do need normality when you are testing hypotheses based on the parameter estimates and the standard deviations. Since hypothesis testing means looking up probability values from a "probability table," you need to know which table to look at, and that means you have to make an assumption about the distribution.
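And a sketch of where the assumption enters, with hypothetical numbers: the final step refers the t statistic to a t table, and it is this lookup that relies on normality.

```python
from scipy import stats

# Hypothetical values: slope estimate 0.48 with standard error 0.11, from
# n = 50 observations and p = 2 fitted parameters. Under normal errors, the
# t statistic follows a t distribution with n - p degrees of freedom -- this
# table lookup is the step that requires the normality assumption.
t_stat = 0.48 / 0.11
p_value = 2 * stats.t.sf(abs(t_stat), df=50 - 2)  # two-sided test of H0: slope = 0
print(f"t = {t_stat:.2f}, two-sided p = {p_value:.2e}")
```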
 
Ok, so I guess my question is: why do I have to assume that it's normal? Why can't I assume it's gamma or anything else? I was under the impression that if I wanted to use anything other than normal, I had to use GLMs (which I'll admit I know nothing about).
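For reference, a gamma-response GLM of the sort mentioned is straightforward in statsmodels; a minimal sketch with hypothetical simulated data (link-class naming follows recent statsmodels versions):

```python
import numpy as np
import statsmodels.api as sm

# Hypothetical positive-valued responses whose mean grows with x.
rng = np.random.default_rng(1)
x = rng.uniform(1, 10, size=200)
y = rng.gamma(shape=2.0, scale=(1.0 + 0.3 * x) / 2.0)  # E[y] = 1 + 0.3 x

X = sm.add_constant(x)
# Gamma response with a log link (the default Gamma link is inverse power).
model = sm.GLM(y, X, family=sm.families.Gamma(link=sm.families.links.Log()))
result = model.fit()
print(result.params, result.bse)  # estimates and standard errors
```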
 
My advice is not to make distributional assumptions whenever you don't have to.

However, that would imply you cannot use ordinary LS results to test hypotheses.

If you wish to assume a non-normal distribution, then my advice is to use maximum likelihood estimation: http://en.wikipedia.org/wiki/Maximum_likelihood
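A minimal sketch of that route, assuming (hypothetically) a gamma-distributed response with a log-link mean, fit by numerically minimizing the negative log-likelihood with scipy:

```python
import numpy as np
from scipy import stats, optimize

# Hypothetical data: positive responses with mean increasing in x.
rng = np.random.default_rng(2)
x = rng.uniform(1, 10, size=200)
y = rng.gamma(shape=2.0, scale=np.exp(0.2 + 0.1 * x) / 2.0)

def neg_log_likelihood(params):
    b0, b1, log_shape = params
    shape = np.exp(log_shape)   # keep the gamma shape parameter positive
    mean = np.exp(b0 + b1 * x)  # log-link mean model
    scale = mean / shape        # gamma mean = shape * scale
    return -np.sum(stats.gamma.logpdf(y, a=shape, scale=scale))

fit = optimize.minimize(neg_log_likelihood, x0=[0.0, 0.0, 0.0],
                        method="Nelder-Mead")
print(fit.x)  # maximum likelihood estimates of (b0, b1, log_shape)
```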
 
