# Can y be negative in linear regression?

1. Jan 28, 2014

### xeon123

I am using linear regression to predict 'y' based on 8 variables.
In my example, most of the Betas that I got are negative. So y, the value to predict, is negative.
In my data, y is a time in seconds, so I think it shouldn't be negative.

My example is in Python, and I want to know whether y can legitimately be negative, even though y is a time in seconds, or whether my code is incorrect.

Is it possible that y can be negative?

2. Jan 28, 2014

### SteamKing

Staff Emeritus
It's not clear what you mean by a linear regression with 8 variables. Does this mean you are using 8 data points?

3. Jan 29, 2014

### xeon123

I mean that I use 8 independent variables to get y.

y = Beta1*x1 + Beta2*x2 + Beta3*x3 + Beta4*x4 + Beta5*x5 + Beta6*x6 + Beta7*x7 + Beta8*x8

And when I calculate the Betas to get a predicted y, \hat{y}, some of them are negative, making \hat{y} negative.
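To make the situation concrete, here is a minimal sketch in Python with NumPy. The data is synthetic and the coefficient values are hypothetical, chosen only for illustration; nothing here comes from my real data set. It shows that a correct least-squares fit on strictly positive times can still give a negative \hat{y} for an input where the terms with negative Betas dominate.

```python
import numpy as np

# Hypothetical coefficients, chosen only for illustration (not from the thread).
true_betas = np.array([3.0, -2.0, 1.5, -1.0, 0.5, -0.5, 2.0, -1.5])

rng = np.random.default_rng(0)
X = rng.uniform(0.0, 10.0, size=(500, 8))
X = X[X @ true_betas > 0]      # keep only rows whose "time" is positive
y = X @ true_betas             # noiseless synthetic times, all > 0

# Least-squares fit of y = Beta1*x1 + ... + Beta8*x8 (no constant term).
betas, *_ = np.linalg.lstsq(X, y, rcond=None)

# A new input where a variable with a negative Beta dominates.
x_new = np.array([0.0, 10.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0])
y_new = x_new @ betas

print("smallest training y:", y.min())
print("fitted Betas:", np.round(betas, 3))
print("prediction at x_new:", y_new)
```

So negative Betas by themselves do not prove the code is wrong; the linear model simply has nothing that keeps \hat{y} above zero.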

Last edited: Jan 29, 2014

4. Jan 29, 2014

### Office_Shredder

Staff Emeritus
Have you checked the statistical significance of those betas? If some of them are just noise then you would expect to get nonsense results.

Even then, statistical modeling with a linear fit is never going to be perfect. It is entirely possible that when x1 is larger, the time y is shorter, which causes a negative Beta1 to appear. At that point you might question whether a linear model is a good one to use for the x1 variable.
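One way to check significance by hand is to compute a t-statistic for each Beta (the estimate divided by its standard error). The sketch below uses only NumPy; the helper name `ols_t_stats` and the synthetic data are assumptions made for illustration. As a rough rule of thumb for large samples, |t| below about 2 means the coefficient is not distinguishable from zero at roughly the 5% level.

```python
import numpy as np

def ols_t_stats(X, y):
    """Return (betas, standard errors, t-statistics) for an ordinary least-squares fit.

    X must already contain every column of the model, including a column
    of ones if a constant term is wanted.
    """
    n, k = X.shape
    betas, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ betas
    sigma2 = resid @ resid / (n - k)          # unbiased residual variance
    cov = sigma2 * np.linalg.inv(X.T @ X)     # covariance matrix of the betas
    se = np.sqrt(np.diag(cov))
    return betas, se, betas / se

# Synthetic example: y depends on x1 but not on x2.
rng = np.random.default_rng(1)
n = 200
x1 = rng.uniform(0, 10, n)
x2 = rng.uniform(0, 10, n)                    # unrelated to y
y = 5.0 + 2.0 * x1 + rng.normal(0, 1, n)

X = np.column_stack([np.ones(n), x1, x2])
betas, se, t = ols_t_stats(X, y)
for name, b, s, tv in zip(["const", "x1", "x2"], betas, se, t):
    print(f"{name:>5}: beta={b: .3f}  se={s:.3f}  t={tv: .2f}")
```

A statistics package would normally report these values (and exact p-values) directly; the manual version is shown only to make the calculation visible.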

5. Jan 29, 2014

### FactChecker

A lot of statistics packages have stepwise regression algorithms. They start with a constant and the most significant independent variable (say Xm): Y = Beta0 + Betam * Xm. Then, one by one, add in the next most significant term, then the next, etc., till there are no statistically significant terms to add. That will allow you to include only those terms that are statistically significant.
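The procedure described above can be sketched as follows. This is a simplified forward-selection loop written with NumPy, not the algorithm of any particular statistics package; the function names and the |t| >= 2 entry threshold are assumptions made for illustration.

```python
import numpy as np

def t_stat_of_last_column(X, y):
    """t-statistic of the coefficient on the last column of X in an OLS fit."""
    n, k = X.shape
    betas, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ betas
    sigma2 = resid @ resid / (n - k)
    cov = sigma2 * np.linalg.inv(X.T @ X)
    return betas[-1] / np.sqrt(cov[-1, -1])

def forward_stepwise(X, y, t_threshold=2.0):
    """Greedy forward selection: start from a constant, then repeatedly add
    the candidate column with the largest |t| until none reaches the threshold.
    Returns the selected column indices in the order they were added."""
    n, p = X.shape
    selected, remaining = [], list(range(p))
    while remaining:
        # Current model: constant plus the columns chosen so far.
        base = np.column_stack([np.ones(n)] + [X[:, j] for j in selected])
        scores = {
            j: abs(t_stat_of_last_column(np.column_stack([base, X[:, j]]), y))
            for j in remaining
        }
        best = max(scores, key=scores.get)
        if scores[best] < t_threshold:
            break                              # nothing significant left to add
        selected.append(best)
        remaining.remove(best)
    return selected

# Synthetic example: only columns 0 and 3 actually influence y.
rng = np.random.default_rng(2)
X = rng.normal(size=(300, 8))
y = 4.0 + 3.0 * X[:, 0] - 2.0 * X[:, 3] + rng.normal(0, 1, 300)

selected = forward_stepwise(X, y)
print("columns added, in order:", selected)
```

Because each step tests several candidates, a pure-noise column can occasionally pass the threshold by chance, so the selected set is worth a sanity check.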

If you know that Y can never be negative, you might want to try a model that will never go negative, like Y = exp( Beta0 + Beta1 * X1). For that, do a stepwise linear regression using the natural log of the Y data. That will give an expression ln(Y) = Beta0 + Beta1 * X1. Many statistics packages have these types of regressions as options.
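A minimal sketch of that log-transform fit, again with NumPy and synthetic data (the function names `fit_log_linear` and `predict_log_linear` are hypothetical). It requires every observed Y to be strictly positive, since the logarithm is taken before fitting, and the back-transformed predictions are always positive because they come out of `exp`.

```python
import numpy as np

def fit_log_linear(X, y):
    """Fit ln(Y) = Beta0 + Beta1*X1 + ... by least squares.

    X has shape (n, k); a column of ones is added for Beta0.
    """
    y = np.asarray(y, dtype=float)
    if np.any(y <= 0):
        raise ValueError("all Y values must be strictly positive to take the log")
    A = np.column_stack([np.ones(len(y)), X])
    betas, *_ = np.linalg.lstsq(A, np.log(y), rcond=None)
    return betas

def predict_log_linear(betas, X):
    """Predict Y = exp(Beta0 + Beta1*X1 + ...); the result is always > 0."""
    A = np.column_stack([np.ones(X.shape[0]), X])
    return np.exp(A @ betas)

# Synthetic example: times that shrink as x1 grows, with multiplicative noise.
rng = np.random.default_rng(3)
x1 = rng.uniform(0, 10, 200)
y = np.exp(1.0 - 0.3 * x1 + rng.normal(0, 0.1, 200))

X = x1.reshape(-1, 1)
betas = fit_log_linear(X, y)

# Predict far outside the training range: still positive.
X_far = np.array([[0.0], [10.0], [100.0]])
y_far = predict_log_linear(betas, X_far)
print("Beta0, Beta1:", np.round(betas, 3))
print("predictions:", y_far)
```

Note that this fit minimizes the squared error of ln(Y), not of Y itself, so it weights relative errors rather than absolute ones.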