# The least squares line

## Homework Statement

Does the "through-origin" least squares line

yhat = b*X

pass through the point (xbar, ybar)?

The "through-origin" model is the least squares model without the intercept.

## Homework Equations

b = sum[YX]/sum[X^2]

yhat = b*X

## The Attempt at a Solution

When I calculate a sample linear model yhat = b*X, I find that ybar =/= b*xbar. I got this result for two different data sets.

Online it says that for:

y = a + bX

(xbar, ybar) does lie on the line. However, for the model in question:

y = bX

(xbar, ybar) does not lie on the line.

I have the answer, but I don't understand why it is so.

Ray Vickson
Homework Helper
Dearly Missed

You don't need data sets to see that (xbar, ybar) usually does not lie on the line: from the fit
##y = [\sum_i x_i y_i/\sum_i x_i^2]\, x##, it will normally not be the case that ##\bar{y}## equals ##[\sum_i x_i y_i/\sum_i x_i^2]\, \bar{x}##. That is, for most data sets the equality will fail.
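If you do want to see it numerically, here is a minimal sketch. The data values and the helper name `through_origin_slope` are made up for illustration and are not from the original problem; the intercept fit uses NumPy's `np.polyfit` only as a comparison.

```python
import numpy as np

def through_origin_slope(x, y):
    """Least-squares slope for the no-intercept model yhat = b*x,
    i.e. b = sum(x*y) / sum(x^2)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    return np.sum(x * y) / np.sum(x ** 2)

# Made-up illustrative data (an assumption, not the poster's data sets).
x = np.array([1.0, 2.0, 3.0, 4.0])
y = np.array([3.0, 4.0, 6.0, 7.0])

xbar, ybar = x.mean(), y.mean()

# Through-origin fit: compare the fitted value at xbar with ybar.
b = through_origin_slope(x, y)
print("through origin:", b * xbar, "vs ybar =", ybar)

# Fit with an intercept, for comparison (degree-1 polynomial fit).
slope, intercept = np.polyfit(x, y, 1)
print("with intercept:", intercept + slope * xbar, "vs ybar =", ybar)
```

For these numbers the through-origin line gives about 4.75 at xbar while ybar is 5, whereas the line with an intercept reproduces ybar up to rounding.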

As to WHY it fails, consider two data sets ##\{ (x_i, y_{1i})\}## and ##\{ (x_i, y_{2i})\}##, with ##y_{2i} = y_{1i} + c## for all i and some ##c \neq 0##; that is, y for set 2 is just shifted upward (or downward) by ##c##. You can easily check that for the least-squares lines with intercepts, the intercept for set 2 is just that for set 1 plus c, and the slopes are the same. However, if you force the two lines to pass through the origin, the two slopes will be different: ##\text{slope 2} - \text{slope 1} = c \sum_i x_i/\sum_i x_i^2##, and so ##y_2(\bar{x}) - y_1(\bar{x}) = c\, \bar{x} \sum_i x_i/\sum_i x_i^2##, while ##\bar{y}_2 - \bar{y}_1 = c.## So, as long as ##\bar{x}\sum_i x_i/\sum_i x_i^2 \neq 1##, we could not have both lines passing through their respective ##(\bar{x},\bar{y})## points.
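The shift argument can be checked with a short sketch. The data, the shift `c = 2.0`, and the helper `through_origin_slope` are illustrative assumptions, not values from the thread.

```python
import numpy as np

def through_origin_slope(x, y):
    """Slope of the no-intercept least-squares line: sum(x*y) / sum(x^2)."""
    return np.sum(x * y) / np.sum(x ** 2)

# Made-up data: set 2 is set 1 shifted upward by c.
x = np.array([1.0, 2.0, 3.0, 4.0])
y1 = np.array([3.0, 4.0, 6.0, 7.0])
c = 2.0
y2 = y1 + c

# With an intercept: slopes agree, intercepts differ by c.
s1, a1 = np.polyfit(x, y1, 1)
s2, a2 = np.polyfit(x, y2, 1)
print("intercept fits:", s2 - s1, a2 - a1)

# Through the origin: the slope changes by c * sum(x) / sum(x^2).
b1 = through_origin_slope(x, y1)
b2 = through_origin_slope(x, y2)
predicted_slope_change = c * np.sum(x) / np.sum(x ** 2)
print("slope change:", b2 - b1, "predicted:", predicted_slope_change)

# Gap between the two lines at xbar, versus the gap between the two means.
ratio = x.mean() * np.sum(x) / np.sum(x ** 2)
line_gap_at_xbar = (b2 - b1) * x.mean()
mean_gap = y2.mean() - y1.mean()
print("line gap at xbar:", line_gap_at_xbar, "mean gap:", mean_gap, "ratio:", ratio)
```

Here the ratio ##\bar{x}\sum_i x_i/\sum_i x_i^2## comes out to 5/6, so the lines move apart by less than c at xbar while the means move apart by exactly c. Since ##\sum_i x_i^2 = n\bar{x}^2 + \sum_i (x_i - \bar{x})^2##, that ratio equals 1 only when all the ##x_i## are equal.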

So the ybar in your example is [mean(y_1i) + mean(y_2i)] / 2?

Ray Vickson