Line of regression substitution

AI Thread Summary
Linear regression allows for substituting a value of x to estimate y, but not the reverse due to the nature of correlation and regression analysis. The product of the slopes from two linear regressions (y on x and x on y) equals the R² value, which indicates the strength of their correlation. If x and y are uncorrelated, both slopes can be zero, leading to poor estimates when trying to invert the regression. The minimization of errors differs between estimating y from x and vice versa, as each regression targets different axes. Understanding these principles is crucial for accurate predictions in statistical modeling.
Einstein44
Homework Statement
This is relatively straightforward, but I have somehow forgotten the reason:
Why is it that you cannot substitute y to find x? I remember that this was the case, but I can't seem to remember why.
Relevant Equations
$$y=ax+b$$
This is the equation you get for a line of regression of a data set using the GDC. I am not exactly sure what the context was, as I cannot remember much about this and I couldn't find anything on the internet that mentions it. I just hope someone understands what I mean :)
 
I don't understand your question. In general, the point of a linear regression is that you can substitute in a value of x to get a good guess for y. You won't get exactly the right answer, simply because you usually assume there is some noise in your prediction. Is that what you mean?
 
Office_Shredder said:
I don't understand your question. In general, the point of a linear regression is that you can substitute in a value of x to get a good guess for y. You won't get exactly the right answer, simply because you usually assume there is some noise in your prediction. Is that what you mean?
Never mind, I believe I phrased this wrong. I meant to ask why you cannot substitute y to estimate x. I remember the professor saying that you can substitute x to estimate y, but not the other way around. I forgot the reason and couldn't find anything on this on the internet, so I thought maybe someone knows what I mean.
 
Oh yeah. I think the way to think about this is to consider two linear regressions (I'm going to assume the constant term comes out to be zero for both):

##y=\beta_x x##
##x= \beta_y y##.

It's tempting to think that ##\beta_x \beta_y = 1##. But it isn't; in fact, in general the product of the betas is the ##R^2## value of the linear regression, and it only equals 1 when the two variables are perfectly correlated. As a simple example, suppose x and y are totally uncorrelated. Then ##\beta_x = \beta_y = 0##. If they are only slightly correlated, you might get that ##\beta_x## and ##\beta_y## are both small and close to zero. Then trying to invert your linear regression is going to give you a very bad estimate.
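
To check this numerically, here is a minimal Python sketch (using numpy; the data-generating model and variable names are only illustrative, not from the thread):

```python
import numpy as np

# Minimal sketch: verify numerically that the product of the two
# regression slopes equals R^2 (data is centered, so no intercepts are needed).
rng = np.random.default_rng(0)
x = rng.normal(size=1000)
y = 0.5 * x + rng.normal(size=1000)      # assumed noisy linear relation

x = x - x.mean()
y = y - y.mean()

beta_x = np.sum(x * y) / np.sum(x * x)   # slope of y regressed on x
beta_y = np.sum(x * y) / np.sum(y * y)   # slope of x regressed on y
r = np.corrcoef(x, y)[0, 1]              # correlation coefficient

print(beta_x * beta_y)  # product of the slopes
print(r ** 2)           # R^2 -- agrees with the product above
```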
 
The regression shown was calculated to minimize the sum of squared errors of the y estimates versus the y sample values. Those errors are the distances parallel to the y-axis. If you want to estimate x, you would want a regression line that minimizes the sum of squared errors of the x estimates versus the x sample values; those errors are the distances parallel to the x-axis. So the minimizations are different.
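
To put that in symbols (the coefficient names ##a, b, c, d## are just labels introduced here): the regression of ##y## on ##x## chooses ##a, b## to minimize
$$\sum_i (y_i - a x_i - b)^2,$$
while the regression of ##x## on ##y## chooses ##c, d## to minimize
$$\sum_i (x_i - c y_i - d)^2.$$
These are different objective functions, so the second fitted line is not, in general, the algebraic inverse of the first.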
 
As Office_Shredder said, the product of the slopes is ##R^2##, where ##R## is the correlation coefficient.

The slope of the regression of y on x is ##R \frac{s_y}{s_x}## and the slope of the regression of x on y is ##R \frac{s_x}{s_y}##, where ##s_x## and ##s_y## are the sample standard deviations. Barring cases where either standard deviation is ##0## (which means that variable's data is constant), the product is

$$R \frac{s_y}{s_x} \cdot R \frac{s_x}{s_y} = R^2.$$

Notice that for nonlinear relations, the relation between the two variables may be invertible only locally, e.g., a quadratic relation ##y = kx^2## can only be inverted on ##x \ge 0## or on ##x \le 0## separately.
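
A quick numerical illustration of the practical consequence (again a Python sketch with numpy; the simulated data is only an assumption for the example):

```python
import numpy as np

# Minimal sketch: algebraically inverting the y-on-x line is not the same
# as running the proper x-on-y regression, unless |R| = 1.
rng = np.random.default_rng(1)
x = rng.normal(size=1000)
y = 0.5 * x + rng.normal(size=1000)   # assumed noisy relation

sx, sy = x.std(), y.std()
r = np.corrcoef(x, y)[0, 1]

slope_y_on_x = r * sy / sx            # slope for estimating y from x
slope_x_on_y = r * sx / sy            # slope for estimating x from y

print(1 / slope_y_on_x)  # slope from algebraically inverting y = a*x + b
print(slope_x_on_y)      # slope of the x-on-y regression; smaller in magnitude unless |r| = 1
```

Solving ##y = ax + b## for ##x## gives a line with slope ##\frac{1}{a} = \frac{s_x}{R\, s_y}##, whereas the least-squares line for estimating ##x## has slope ##R \frac{s_x}{s_y}##; the two coincide only when ##R^2 = 1##.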
 