Normal assumption with least squares regression

Guy Incognito
My Google search just turns up results telling me that one of the assumptions I have to make is that each Y is normal. My question is: why do I have to assume it's normal? Why does it have to be normal as opposed to some other distribution? Hope that makes sense.

Edit: I thought about this some more. Is it as simple as the standard errors for the parameters being computed under the assumption that each Y is normal? If you write it out, you can easily see that B1, for example, is a linear function of Y1...Yn and thus will be normal if the Yi are.
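For reference, here is that linearity written out for simple linear regression. The notation (x_i as fixed predictors, S_xx, the weights c_i, and independent errors with common variance sigma^2) is assumed for illustration and does not come from the thread.

```latex
\hat\beta_1
  = \frac{\sum_{i=1}^{n} (x_i - \bar x)(Y_i - \bar Y)}{\sum_{i=1}^{n} (x_i - \bar x)^2}
  = \sum_{i=1}^{n} c_i Y_i,
\qquad
c_i = \frac{x_i - \bar x}{S_{xx}},
\qquad
S_{xx} = \sum_{i=1}^{n} (x_i - \bar x)^2 .

% The second equality uses \sum_i (x_i - \bar x) = 0, so the \bar Y term drops out.
% If the Y_i are independent with Y_i \sim N(\beta_0 + \beta_1 x_i, \sigma^2), then,
% because \sum_i c_i = 0, \sum_i c_i x_i = 1 and \sum_i c_i^2 = 1 / S_{xx},
\hat\beta_1 \sim N\!\left(\beta_1, \frac{\sigma^2}{S_{xx}}\right).
```

The mean and variance in the last line hold for any error distribution with finite variance; only the exact normal shape of the sampling distribution relies on the Y_i being normal.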
 
You do not need normality for least squares estimation. That includes the estimation of standard errors. The LS parameter estimates and their standard errors are sample-based statistics; computing them does not require assuming any particular distribution.
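As a rough illustration of this point, the sketch below computes LS estimates and the usual standard errors with plain linear algebra. The function name `ols_fit` and the simulated data are hypothetical, and the errors are deliberately drawn from a skewed (shifted exponential) distribution to show that nothing in the computation refers to normality. The standard-error formula used here does assume uncorrelated errors with constant variance.

```python
import numpy as np

def ols_fit(x, y):
    """Least squares fit of y = b0 + b1*x; returns (estimates, standard errors)."""
    X = np.column_stack([np.ones_like(x), x])      # design matrix with intercept
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)   # LS estimates
    resid = y - X @ beta
    n, p = X.shape
    sigma2 = resid @ resid / (n - p)               # residual variance estimate
    cov = sigma2 * np.linalg.inv(X.T @ X)          # assumes constant error variance
    return beta, np.sqrt(np.diag(cov))

# Hypothetical data with skewed, mean-zero, non-normal errors.
rng = np.random.default_rng(0)
x = np.linspace(0.0, 10.0, 50)
y = 2.0 + 0.5 * x + (rng.exponential(size=x.size) - 1.0)

beta_hat, se_hat = ols_fit(x, y)
print("estimates:", beta_hat)
print("standard errors:", se_hat)
```

The numbers come out of the sample alone. A distributional assumption only enters once these quantities are turned into p-values or confidence intervals.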

You do need normality when you are testing hypotheses based on the parameter estimates and the standard deviations. Since hypothesis testing means looking up probability values from a "probability table," you need to know which table to look at, and that means you have to make an assumption about the distribution.
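The "which table" point can be checked with a small simulation. The sketch below is only an illustration under assumed settings: n = 10 fixed x values, a true slope of zero, and the standard two-sided 5% critical value 2.306 from the Student t table with 8 degrees of freedom. The helper names are hypothetical. Under normal errors the test should reject about 5% of the time; with skewed errors the table is only an approximation, and the simulated rate shows how close it happens to be in this setup.

```python
import numpy as np

T_CRIT_DF8 = 2.306  # two-sided 5% point of Student's t with 8 df (table value)

def slope_t_stat(x, y):
    """t statistic for H0: slope = 0 in simple linear regression."""
    n = x.size
    sxx = np.sum((x - x.mean()) ** 2)
    b1 = np.sum((x - x.mean()) * y) / sxx
    b0 = y.mean() - b1 * x.mean()
    resid = y - b0 - b1 * x
    s2 = np.sum(resid ** 2) / (n - 2)
    return b1 / np.sqrt(s2 / sxx)

def rejection_rate(draw_errors, n_sims=20000, seed=0):
    """Fraction of simulated data sets (true slope 0) where |t| exceeds the table value."""
    rng = np.random.default_rng(seed)
    x = np.arange(10, dtype=float)
    rejections = 0
    for _ in range(n_sims):
        y = 1.0 + draw_errors(rng, x.size)
        if abs(slope_t_stat(x, y)) > T_CRIT_DF8:
            rejections += 1
    return rejections / n_sims

normal_rate = rejection_rate(lambda rng, n: rng.normal(size=n))
skewed_rate = rejection_rate(lambda rng, n: rng.exponential(size=n) - 1.0)
print("rejection rate, normal errors:", normal_rate)
print("rejection rate, skewed errors:", skewed_rate)
```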
 
OK, so I guess my question is: why do I have to assume that it's normal? Why can't I assume it's gamma or anything else? I was under the impression that if I wanted to use anything other than normal, I had to use GLMs (which I'll admit I know nothing about).
 
My advice is not to make distributional assumptions whenever you don't have to.

However, that would imply you cannot use ordinary LS results to test hypotheses.

If you wish to assume a non-normal distribution, then my advice is to use maximum likelihood estimation: http://en.wikipedia.org/wiki/Maximum_likelihood
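As a minimal sketch of what maximum likelihood looks like with a non-normal response, the example below assumes each Y is exponential (a gamma distribution with shape 1) with mean exp(b0 + b1*x). Both the distribution and the log link are assumptions chosen for illustration, not something specified in the thread, and `fit_exponential_regression` is a hypothetical helper. The log-likelihood is maximized with Newton's method.

```python
import numpy as np

def fit_exponential_regression(x, y, n_iter=25, tol=1e-10):
    """MLE for Y_i ~ Exponential(mean = exp(b0 + b1*x_i)) via Newton's method."""
    X = np.column_stack([np.ones_like(x), x])
    beta = np.array([np.log(y.mean()), 0.0])    # start from an intercept-only fit
    for _ in range(n_iter):
        w = y * np.exp(-(X @ beta))             # y_i / mu_i
        grad = X.T @ (w - 1.0)                  # gradient of the log-likelihood
        neg_hess = X.T @ (X * w[:, None])       # minus the Hessian (positive definite)
        step = np.linalg.solve(neg_hess, grad)
        beta = beta + step
        if np.max(np.abs(step)) < tol:
            break
    return beta

# Simulated data under the assumed model.
rng = np.random.default_rng(1)
x = rng.uniform(0.0, 1.0, size=5000)
true_beta = np.array([0.5, 1.0])
y = rng.exponential(scale=np.exp(true_beta[0] + true_beta[1] * x))

beta_hat = fit_exponential_regression(x, y)
print("true parameters:", true_beta)
print("ML estimates:   ", beta_hat)
```

This particular model is also a GLM (gamma family with a log link), so the two suggestions in the thread coincide here: GLMs are fitted by maximum likelihood.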
 