Identity in statistics that frequently pops up

SUMMARY

The discussion centers on the identity in statistics related to linear regression, specifically the equation ∑(x_i)^2 - n*x-bar^2 = Σ(x_i - x-bar)^2. The user initially struggled to understand the identity but later solved it independently. Feedback from other forum members confirmed that the user's method was standard and efficient, suggesting that the proof could be concluded earlier than initially attempted.

PREREQUISITES
  • Understanding of linear regression concepts, including the role of x_i and y_0 = b_0 + b_1*x_i.
  • Familiarity with statistical notation, particularly summation (Σ) and mean (x-bar).
  • Knowledge of algebraic manipulation and proof techniques in statistics.
  • Experience with variance calculations and their derivations.
NEXT STEPS
  • Study the derivation of variance in statistics, focusing on the formula Σ(x_i - x-bar)^2.
  • Explore alternative proofs for statistical identities, particularly in linear regression contexts.
  • Learn about the implications of the Central Limit Theorem on regression analysis.
  • Investigate the role of residuals in linear regression and their relationship to the identity discussed.
USEFUL FOR

Students of statistics, educators teaching linear regression, and data analysts seeking to deepen their understanding of statistical identities and proofs.

pandaBee

Homework Statement


In my statistics notes and lectures, my professor often uses the identity below.
Here x_i is a non-random variable and the sum runs from i = 1 to n.
It comes from notes on linear regression (y_0 = b_0 + b_1*x_i).

I forgot to mention earlier that x-bar is supposed to be squared on the LHS, sorry about that.
Σ(x_i)^2 - n*x-bar^2 = Σ(x_i - x-bar)^2
However, I just do not see how this works out at all!
When I work it out, I get x-bar = 1, which doesn't make sense to me in this context.

Does anyone have some insight they could provide me?

********EDIT*****
After working through it a few times I solved the identity myself. However, I am still curious whether there is a more elegant way to do the proof than the one I used below. Sorry for the confusion.


Homework Equations

The Attempt at a Solution


Σ(x_i - x-bar)^2
= Σ(x_i^2 - 2*x-bar*x_i + x-bar^2)

= Σ(x_i^2) - 2*x-bar*Σ(x_i) + n*x-bar^2 = Σ(x_i)^2 - n*x-bar^2 (by the above equation in part 1)

⇒ -2*x-bar*Σ(x_i) + n*x-bar^2 = -n*x-bar^2

If you divide both sides by x-bar^2:
-2*Σ(x_i)/(x-bar) + n = -n
⇒ -2*n*x-bar/(x-bar) + n = -n
⇒ -2n + n = -n
or -n = -n
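
A quick numerical check can be reassuring here. The sketch below is not part of the original post; the function names and sample data are made up for illustration. It evaluates both sides of the identity on two small data sets, including one with x-bar = 0, where dividing by x-bar^2 as in the attempt above would not be allowed but the identity itself still holds.

```python
import math


def centered_sum_of_squares(xs):
    """Right-hand side: sum of (x_i - x_bar)^2."""
    n = len(xs)
    x_bar = sum(xs) / n
    return sum((x - x_bar) ** 2 for x in xs)


def shortcut_sum_of_squares(xs):
    """Left-hand side: sum of x_i^2 minus n * x_bar^2."""
    n = len(xs)
    x_bar = sum(xs) / n
    return sum(x * x for x in xs) - n * x_bar ** 2


# Hypothetical sample data, chosen only for illustration.
samples = {
    "generic": [2.0, 4.0, 4.0, 5.0, 7.0, 9.0],
    "zero mean": [-3.0, -1.0, 0.0, 1.0, 3.0],
}

for label, xs in samples.items():
    lhs = shortcut_sum_of_squares(xs)
    rhs = centered_sum_of_squares(xs)
    # Compare with a tolerance, since floating-point sums can differ slightly.
    print(label, lhs, rhs, math.isclose(lhs, rhs))
```

Agreement on a few examples is only a sanity check, not a proof; the algebra is what establishes the identity.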
 
Your method is the standard one, and is about as elegant and simple as you can get. However, you carried on way past the point you needed to. You were already done at line (3) [the one with the statement "by the above equation in part 1"]. At that point you have obtained exactly what you started out wanting to prove, so what possible use would any of the remaining material (in lines 4--9) be to you?
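
For reference, the whole argument can be written as a short chain that stops as soon as the target expression appears. This is a sketch in LaTeX notation, assuming only the definition of the sample mean, \bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i, which gives \sum_{i=1}^{n} x_i = n\bar{x}; that substitution is the one step needed to justify the third line of the attempt without citing the identity itself.

```latex
\begin{align*}
\sum_{i=1}^{n} (x_i - \bar{x})^2
  &= \sum_{i=1}^{n} \left( x_i^2 - 2\bar{x}\,x_i + \bar{x}^2 \right) \\
  &= \sum_{i=1}^{n} x_i^2 - 2\bar{x} \sum_{i=1}^{n} x_i + n\bar{x}^2 \\
  &= \sum_{i=1}^{n} x_i^2 - 2\bar{x}\,(n\bar{x}) + n\bar{x}^2 \\
  &= \sum_{i=1}^{n} x_i^2 - n\bar{x}^2 .
\end{align*}
```

Nothing in this chain divides by \bar{x}, so it holds for any data, including the case \bar{x} = 0.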
 