Linear Regression: reversing the roles of X and Y

May25-09, 04:20 AM
#1
kingwinner
Simple linear regression:
Y = β0 + β1*X + ε, where ε is random error
Fitted (predicted) value of Y for each X is:
Y hat = b0 + b1*X (e.g. Y hat = 7.2 + 2.6 X)
Consider
X hat = b0' + b1'*Y
[b0, b1, b0', and b1' are least-squares estimates of the β's]
Prove whether or not we can get the values of b0, b1 from b0', b1'. If not, why not?
Completely clueless... Any help is greatly appreciated!

May25-09, 06:12 AM
#2
HallsofIvy
Start with y = b0 + b1 x and solve for x.

May25-09, 03:06 PM
#3
statdad
Originally Posted by kingwinner
Simple linear regression:
Y = β0 + β1*X + ε, where ε is random error
Fitted (predicted) value of Y for each X is:
Y hat = b0 + b1*X (e.g. Y hat = 7.2 + 2.6 X)
Consider
X hat = b0' + b1'*Y
[b0, b1, b0', and b1' are least-squares estimates of the β's]
Prove whether or not we can get the values of b0, b1 from b0', b1'. If not, why not?
Completely clueless... Any help is greatly appreciated!

I'm a little confused about your question. Are you asking whether regressing X on Y will always give the same coefficients, or whether it is ever possible to get the same ones?

May25-09, 04:09 PM
#4
kingwinner
Originally Posted by statdad
I'm a little confused about your question. Are you asking whether regressing X on Y will always give the same coefficients, or whether it is ever possible to get the same ones?

The data are pairs (X_i, Y_i), i = 1, 2, ..., n.
Y hat is a fitted (predicted) value of Y based on fixed values of X.
Y hat = b0 + b1*X, with b0 and b1 being the least-squares estimates.
For X hat, we are predicting the value of X from values of Y, which would produce a different set of parameters, b0' and b1'. Is there any general mathematical relationship linking b0', b1' and b0, b1?
Thanks for answering!

May27-09, 08:23 AM
#5
kingwinner
Any help?
I think this is called "inverse regression"...

May27-09, 08:37 AM
#6
statdad
"Is there any general mathematical relationship linking b0', b1' and b0, b1?"
No. If you put some severe restrictions on the Ys and Xs you could come up with a situation in which the two sets are equal, but in general, no.
Also, note that in the situation where x is fixed (non-random), regressing x on Y makes no sense: the dependent variable in regression must be random.
This may be off-topic for you, but Graybill ("Theory and Application of the Linear Model"; my copy is from 1976, with a horrid green cover) discusses a similar problem on pages 275-283: if we observe a value of a random variable Y (say y0) in a regression model, how can we estimate the corresponding value of x?
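
For reference, here is a sketch of the standard closed forms for the two least-squares fits, which shows why the answer is "no" in general. The symbols S_xx, S_yy, S_xy (centered sums of squares and cross-products) and r (the sample correlation) are notation introduced for this note and do not come from the posts above.

```latex
\[
S_{xx} = \sum_i (x_i - \bar{x})^2, \qquad
S_{yy} = \sum_i (y_i - \bar{y})^2, \qquad
S_{xy} = \sum_i (x_i - \bar{x})(y_i - \bar{y})
\]
\[
\begin{aligned}
b_1  &= \frac{S_{xy}}{S_{xx}}, & b_0  &= \bar{y} - b_1\,\bar{x}, \\
b_1' &= \frac{S_{xy}}{S_{yy}}, & b_0' &= \bar{x} - b_1'\,\bar{y}, \\
b_1\, b_1' &= \frac{S_{xy}^2}{S_{xx}\,S_{yy}} = r^2 .
\end{aligned}
\]
```

The two slopes are tied together only through r^2. Recovering b0 and b1 from b0' and b1' would also require r^2 and one of the sample means, and neither is contained in b0', b1'. Simply solving the fitted equation for the other variable gives b1 = 1/b1', which agrees with the formulas above only when r^2 = 1.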

May27-09, 10:01 PM
#7
mXSCNT
Kingwinner: the easiest first step is to try an example. Start with a random set of (X,Y) pairs and regress Y on X and see what the coefficients b0,b1 are. Then regress X on Y and see what the coefficients b0',b1' are. Do you see any simple relationship between b0,b1 and b0',b1'? (i.e. can you get b0',b1' by solving the equation y=b0+b1x for x?)
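
The experiment suggested above can be sketched in a few lines of C++. This is a minimal illustration, not code from the thread: the names Fit, mean, centeredSum, regress and correlation are hypothetical helpers introduced here, and plain double precision is assumed to be adequate.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Coefficients of a fitted line: response = intercept + slope * predictor.
struct Fit {
    double intercept;
    double slope;
};

// Arithmetic mean of a non-empty sample.
double mean(const std::vector<double>& v) {
    assert(!v.empty());
    double sum = 0.0;
    for (double value : v) sum += value;
    return sum / static_cast<double>(v.size());
}

// Centered sum of cross-products: sum_i (a[i] - mean(a)) * (b[i] - mean(b)).
double centeredSum(const std::vector<double>& a, const std::vector<double>& b) {
    assert(a.size() == b.size());
    const double meanA = mean(a);
    const double meanB = mean(b);
    double s = 0.0;
    for (std::size_t i = 0; i < a.size(); ++i) {
        s += (a[i] - meanA) * (b[i] - meanB);
    }
    return s;
}

// Ordinary least squares of `response` on `predictor`:
// minimizes the squared errors measured along the response axis.
Fit regress(const std::vector<double>& predictor, const std::vector<double>& response) {
    const double sPP = centeredSum(predictor, predictor);
    assert(sPP > 0.0);  // the predictor must not be constant
    const double slope = centeredSum(predictor, response) / sPP;
    return {mean(response) - slope * mean(predictor), slope};
}

// Sample correlation coefficient r between x and y.
double correlation(const std::vector<double>& x, const std::vector<double>& y) {
    return centeredSum(x, y) / std::sqrt(centeredSum(x, x) * centeredSum(y, y));
}
```

Calling regress(X, Y) and regress(Y, X) on the four points (1,1), (2,3), (3,2), (4,5) gives b0 = 0, b1 = 1.1 for Y on X, and b0' = 27/35 (about 0.771), b1' = 22/35 (about 0.629) for X on Y. Solving y = 1.1x for x would give a slope of about 0.909, so inverting the first line does not reproduce the second. The product b1*b1' does equal the squared value returned by correlation(X, Y), about 0.691.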

May28-09, 06:49 AM
#8
HallsofIvy
It can be shown that the line that minimizes the sum of the vertical distances from the points to the line, the line that minimizes the sum of the horizontal distances, and the line that minimizes the sum of the perpendicular distances are all the same line. That says that reversing x and y will give the same line.

Jun9-09, 12:10 AM
#9
junglebeast
Originally Posted by HallsofIvy
It can be shown that the line that minimizes the sum of the vertical distances from the points to the line, the line that minimizes the sum of the horizontal distances, and the line that minimizes the sum of the perpendicular distances are all the same line. That says that reversing x and y will give the same line.

I would be interested to see that proof...
When linear regression is used to find the line in slope-intercept form, this is not the case. As a glaring example, consider that a vertical line cannot be represented, whereas a horizontal one can. If your data set is more vertical than horizontal, you will get a much better fit by swapping the X and Y series.
I quickly wrote a program to randomly generate some data points and visually compare the line that minimizes Y error (yellow), the line that minimizes X error (purple), and the line that minimizes point-to-line distance (red). As you can see from this example, they are not always the same line.
To rule out the possibility that these differences are simply due to rounding errors, I repeated the experiment using single precision, double precision, and 320 bits of floating-point precision using GMP bignum. The results are the same in all cases, indicating that precision is not a factor here.
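
As a sketch of why the three lines differ, the three criteria can be written out for a line y = a + m x, assuming S_xy is nonzero. The symbols a, m, S_xx, S_yy and S_xy (centered sums of squares and cross-products) are notation introduced for this note, not taken from the program.

```latex
\[
\begin{aligned}
\text{errors in } y:&\quad \min_{a,m}\ \sum_i (y_i - a - m x_i)^2
  &&\Rightarrow\quad m_{y|x} = \frac{S_{xy}}{S_{xx}}, \\
\text{errors in } x:&\quad \min_{a,m}\ \sum_i \Bigl(x_i - \frac{y_i - a}{m}\Bigr)^2
  &&\Rightarrow\quad m_{x|y} = \frac{S_{yy}}{S_{xy}}, \\
\text{perpendicular}:&\quad \min_{a,m}\ \sum_i \frac{(y_i - a - m x_i)^2}{1 + m^2}
  &&\Rightarrow\quad m_{\perp} = \frac{S_{yy} - S_{xx} + \sqrt{(S_{yy} - S_{xx})^2 + 4 S_{xy}^2}}{2\,S_{xy}} .
\end{aligned}
\]
```

All three lines pass through the point of means, but the ratio of the first slope to the second is S_xy^2 / (S_xx S_yy) = r^2, which is at most 1. The first two lines therefore coincide only when the points are exactly collinear.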
Here's my source code:
Code:
// Standard headers used below (printf, system, std::vector).
#include <cstdio>
#include <cstdlib>
#include <vector>
// Project-local headers (not part of the standard library).
#include "linalg/linear_least_squares.h"
#include "vision/drawing.h"
#include "stat/rng.h"
#include "linalg/null_space.h"
#include "bignum/bigfloat.h"

// Least-squares fit of Y = m*X + b, minimizing the squared errors in Y.
template<typename Real>
void linear_regression( const std::vector<Real> &X, const std::vector<Real> &Y,
                        Real &m, Real &b)
{
    // Accumulate the sums needed by the normal equations.
    Real sX=0, sY=0, sXY=0, sXX=0;
    for(unsigned i=0; i<X.size(); ++i)
    {
        Real x = X[i], y = Y[i];
        sX += x;
        sY += y;
        sXY += x*y;
        sXX += x*x;
    }
    Real n = X.size();
    m = (sY*sX - n*sXY)/( sX*sX - n*sXX );
    b = (sX*sXY - sY*sXX)/( sX*sX - n*sXX );
}

int main()
{
    using namespace heinlib;
    using namespace cimgl;
    bigfloat::set_default_precision(300);
    typedef bigfloat Real;
    bigfloat rr;
    printf("precision = %d\n", rr.precision() );
    CImg<float> image(250, 250, 1, 3, 0);

    // Generate N random points in the 250x250 image and draw them.
    std::vector<Real> X, Y;
    int N = 10;
    for(int i=0; i<N; ++i)
    {
        Real x = random<Real>::uniform(0, 250);
        Real y = random<Real>::uniform(0, 250);
        image.draw_circle( x, y, 3, color<float>::white() );
        X.push_back(x);
        Y.push_back(y);
    }

    // Fit y = m1*x + b1 (errors in Y) and x = m2*y + b2 (errors in X).
    Real m1, b1, m2, b2;
    linear_regression(X, Y, m1, b1 );
    linear_regression(Y, X, m2, b2 );

    // Flip the second line: rewrite x = m2*y + b2 as y = (1/m2)*x - b2/m2.
    // b2 must be updated first because it uses the old m2.
    b2 = -b2/m2;
    m2 = 1/m2;
    cimg_draw_line( image, m1, b1, color<float>::yellow() );
    cimg_draw_line( image, m2, b2, color<float>::purple() );

    // Find the means of X and Y.
    Real mX = 0, mY = 0;
    for(int i=0; i<N; ++i)
    {
        Real x = X[i], y = Y[i];
        mX += x; mY += y;
    }
    mX /= N;
    mY /= N;

    // Find the least-squares line by perpendicular distance to the line:
    // build the 2x2 scatter matrix of the centered points.
    Real sXX=0, sYY=0, sXY=0;
    for(int i=0; i<N; ++i)
    {
        Real x = X[i] - mX,
             y = Y[i] - mY;
        sXX += x*x;
        sYY += y*y;
        sXY += x*y;
    }
    static_matrix<2,2,Real> A = { sXX, sXY,
                                  sXY, sYY };

    // The line's normal is the direction of least scatter.
    static_matrix<2,1,Real> Norm;
    null_space_SVD(A, Norm);

    // General form: Norm[0]*x + Norm[1]*y + c = 0, passing through (mX, mY).
    static_matrix<3,1,Real> line = { Norm[0], Norm[1], -( mX*Norm[0] + mY*Norm[1] ) };
    cimg_draw_line( image, line, color<float>::red() );

    CImgDisplay disp(image);
    system("pause");
}

Jun9-09, 12:42 AM
#10
EnumaElish
Assuming no singularity (vertical or horizontal) exists in the data, the standardized slope coefficient b/s.e.(b) as well as the goodness of fit statistic (R squared) will be identical between a vertical regression (Y = b0 + b1 X + u) and the corresponding horizontal regression (X = a0 + a1 Y + v) .

Jun9-09, 01:01 AM
Last edited by junglebeast; Jun9-09 at 01:06 AM.
#11
junglebeast
Originally Posted by EnumaElish
Assuming no singularity (vertical or horizontal) exists in the data, the standardized slope coefficient b/s.e.(b) as well as the goodness of fit statistic (R squared) will be identical between a vertical regression (Y = b0 + b1 X + u) and the corresponding horizontal regression (X = a0 + a1 Y + v).

Well, you can say that... but you haven't given any formal proof or evidence of the claim, and it is contrary to the example I just showed, for which I have made the source available for you to see.
You can observe the same effect using Excel's built-in linear regression. The graphs are rotated and stretched, but notice that the lines go through different points in relation to each other.
The singularity is not present in either of the examples.

Jun9-09, 01:46 AM
#12
EnumaElish
Can you provide either the standard errors (of the coefficients) or the t statistics?

Jun9-09, 02:16 AM
#13
junglebeast
Originally Posted by EnumaElish
Can you provide either the standard errors (of the coefficients) or the t statistics?

Your question does not even make sense, as the coefficients are not random variables. The coefficients are mathematical solutions to the line equation for a fixed data set of points.
By showing that numerical precision was not responsible for their differences, this proves that the parameters of the recovered lines are indeed different (i.e., different equations). At least, I cannot think of any other possible way of interpreting those results. Let me know if you can...

Jun9-09, 07:23 AM
Last edited by statdad; Jun9-09 at 08:35 AM.
#14
statdad
Originally Posted by junglebeast
Your question does not even make sense, as the coefficients are not random variables. The coefficients are mathematical solutions to the line equation for a fixed data set of points.
By showing that numerical precision was not responsible for their differences, this proves that the parameters of the recovered lines are indeed different (i.e., different equations). At least, I cannot think of any other possible way of interpreting those results. Let me know if you can...

The coefficients in a regression are statistics, so it certainly does make sense to talk about their standard errors.
Since R² is simply the square of the correlation coefficient, that quantity will be the same whether you regress Y on x or X on y.
Sorry - hitting post too soon is the result of posting before morning coffee.
The slopes of Y on x and X on y won't be equal (unless you have an incredible stroke of luck), but the t-statistics in each case, used for testing H0: β1 = 0, will be, since the test statistic for the slope can be written as a function of r.
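
The last point can be made explicit with a short derivation, given here as a sketch under the usual simple-regression setup with n > 2 and |r| < 1. The symbols S_xx, S_yy, S_xy (centered sums of squares and cross-products) and n (number of data pairs) are notation introduced for this note.

```latex
\[
R^2 = r^2 = \frac{S_{xy}^2}{S_{xx}\,S_{yy}}
\]
\[
t = \frac{b_1}{\operatorname{s.e.}(b_1)}
  = \frac{r\,\sqrt{S_{yy}/S_{xx}}}{\sqrt{\dfrac{S_{yy}\,(1 - r^2)}{(n-2)\,S_{xx}}}}
  = \frac{r\,\sqrt{n-2}}{\sqrt{1 - r^2}}
\]
```

Both expressions depend on the data only through r and n, and r is unchanged when x and y are swapped, so the regression of X on y yields the same R² and the same t-statistic even though its slope is different.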

Jun9-09, 09:34 AM
#15
EnumaElish
(i) Y is random, (ii) b estimates are a function of Y, (iii) therefore estimated b's are random.

Jun9-09, 09:37 AM
#16
statdad
Originally Posted by EnumaElish
(i) Y is random, (ii) b estimates are a function of Y, (iii) therefore estimated b's are random.

Er, I was agreeing with you earlier (if this post is aimed at me).