Can y be negative in linear regression?

Summary:
In linear regression, if the predicted value 'y' is negative while it represents time in seconds, it indicates a potential issue with the model or data. Negative Betas can arise from noise in the data or inappropriate model selection, suggesting that some independent variables may not be statistically significant. If 'y' should never be negative, alternative modeling approaches, such as using exponential functions or logarithmic transformations, may be more suitable. Stepwise regression can help identify and retain only statistically significant variables. Ultimately, ensuring the model aligns with the nature of the dependent variable is crucial for accurate predictions.
xeon123
I am using linear regression to predict 'y' based on 8 variables.
In my example, most of the Betas that I get are negative, so y, the value to predict, comes out negative.
In my data, y is a time in seconds, so I think it shouldn't be negative.

I wrote my example in Python, and I want to know whether y can legitimately be negative even though it represents seconds, or whether my code is incorrect.

Is it possible for y to be negative?
 
It's not clear what you mean by a linear regression with 8 variables. Does this mean you are using 8 data points?
 
I mean that I use 8 independent variables to get y.

y = Beta1*x1 + Beta2*x2 + Beta3*x3 + Beta4*x4 + Beta5*x5 + Beta6*x6 + Beta7*x7 + Beta8*x8

And when I calculate the Betas to get a predicted y, \hat{y}, some of them are negative, making \hat{y} negative.
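
For reference, nothing in ordinary least squares constrains the fitted betas or the predictions to be non-negative. A minimal sketch (not the actual code from this thread, which wasn't posted; it assumes scikit-learn and uses synthetic data) illustrates this:

```python
# Minimal sketch (synthetic data, scikit-learn assumed) showing that ordinary
# least squares places no sign constraint on the betas or on y_hat.
import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 8))            # 100 observations, 8 independent variables
beta_true = rng.normal(size=8)           # some of these are negative by construction
y = X @ beta_true + rng.normal(scale=0.5, size=100)

model = LinearRegression().fit(X, y)
y_hat = model.predict(X)

print("fitted betas:", model.coef_)
print("any negative predictions?", bool((y_hat < 0).any()))
```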
 
Have you checked the statistical significance of those betas? If some of them are just noise then you would expect to get nonsense results.

Even then, statistical modeling with a linear fit is never going to be perfect. It is entirely possible that when x1 is larger, the time y is shorter, which would legitimately produce a negative Beta1. At that point you might question whether a linear model is a good one to use for the x1 variable.
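
As a concrete illustration of checking the betas (a sketch with statsmodels and made-up data, not the dataset from this thread), the OLS summary reports a p-value for every coefficient:

```python
# Sketch (statsmodels assumed, synthetic data): the OLS summary gives a
# p-value for each beta, so coefficients that are just noise are easy to spot.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(1)
X = rng.normal(size=(100, 8))                              # 8 independent variables
y = 2.0 * X[:, 0] - 1.5 * X[:, 1] + rng.normal(size=100)   # only the first two matter

result = sm.OLS(y, sm.add_constant(X)).fit()               # include an intercept
print(result.summary())                                    # look at the P>|t| column
```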
 
A lot of statistics packages have stepwise regression algorithms. They start with a constant and the most significant independent variable (say Xm): Y = Beta0 + Betam * Xm. Then, one by one, add in the next most significant term, then the next, etc., till there are no statistically significant terms to add. That will allow you to include only those terms that are statistically significant.
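
A rough forward-stepwise sketch along those lines (statsmodels assumed; the threshold alpha = 0.05 and the synthetic data are only for illustration, and real packages implement this more carefully):

```python
# Forward-stepwise sketch based on p-values: add the most significant remaining
# predictor one at a time, stopping when nothing significant is left to add.
import numpy as np
import statsmodels.api as sm

def forward_stepwise(X, y, alpha=0.05):
    """Return the column indices selected by simple forward stepwise regression."""
    selected = []
    remaining = list(range(X.shape[1]))
    while remaining:
        # p-value each remaining candidate would have if it were added next
        pvals = {}
        for j in remaining:
            fit = sm.OLS(y, sm.add_constant(X[:, selected + [j]])).fit()
            pvals[j] = fit.pvalues[-1]          # p-value of the newly added term
        best = min(pvals, key=pvals.get)
        if pvals[best] >= alpha:
            break                               # no statistically significant term left
        selected.append(best)
        remaining.remove(best)
    return selected

rng = np.random.default_rng(2)
X = rng.normal(size=(200, 8))
y = 3.0 * X[:, 2] - 2.0 * X[:, 5] + rng.normal(size=200)
print("selected columns:", forward_stepwise(X, y))   # should keep roughly columns 2 and 5
```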

If you know that Y can never be negative, you might want to try a model that will never go negative, like Y = exp( Beta0 + Beta1 * X1). For that, do a stepwise linear regression using the natural log of the Y data. That will give an expression ln(Y) = Beta0 + Beta1 * X1. Many statistics packages have these types of regressions as options.
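
A sketch of that log-transform idea (again with synthetic data; it assumes all observed Y are strictly positive, since ln(Y) is undefined otherwise):

```python
# Sketch: regress ln(Y) on the predictors, then exponentiate the prediction,
# so the back-transformed y_hat can never be negative (requires Y > 0).
import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(3)
X = rng.normal(size=(100, 8))
y = np.exp(0.5 + 0.8 * X[:, 0] - 0.3 * X[:, 1] + rng.normal(scale=0.1, size=100))

log_model = LinearRegression().fit(X, np.log(y))    # fit ln(Y) = Beta0 + Beta . x
y_hat = np.exp(log_model.predict(X))                # back-transform: strictly positive

print("minimum prediction:", y_hat.min())
```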
 