Show the two forms of the sample variance are equivalent

In summary, we can show that the two forms of the sample variance are equivalent by expanding and simplifying both equations. The (n-1) in the denominators cancels out and after distributing the summations to all the terms, we can use the fact that the sum of all the observations is equal to n times the mean. However, we still have to deal with the j summations and the Yj terms.
  • #1
TeenieBopper

Homework Statement


Show the two forms of the sample variance are equivalent:
[itex]\frac{1}{n-1}\sum_{i=1}^n (Y_i-\bar{Y})^2 = \frac{1}{n(n-1)}\sum_{i=1}^n \sum_{j>i}^n (Y_i-Y_j)^2[/itex]

The first summation is from i=1 to n, the second is i=1 to n and the third is j>i to n. Sorry, I don't know how to format those.

Homework Equations





The Attempt at a Solution


I don't really know where to begin, so I tried just expanding and cancelling where I could. I know the (n-1) in the denominator on both sides cancel, and then after expanding I get

[itex]\sum (Y_i^2 - 2Y_i\bar{Y} + \bar{Y}^2) = \frac{1}{n}\sum \sum (Y_i^2 - 2Y_iY_j + Y_j^2)[/itex]

Then, I can distribute the 1/n and the summations on the right side. If I do that, and I have a term that does not have j (such as Yi), I can essentially drop the j summation from that term, correct? After I do that, I have the following:

[itex]\sum Y_i^2 - 2\sum Y_i\bar{Y} + \sum \bar{Y}^2 = \frac{1}{n}\sum Y_i^2 - \frac{2}{n}\sum \sum Y_iY_j + \sum \sum Y_j^2[/itex]

And here's where I'm stuck (assuming I even did everything right to get here, which I doubt). I don't know how to deal with the Yj, among other things. Any help would be greatly appreciated.
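
On the question of dropping the j summation, a short sketch that is not taken from the thread itself: a term with no j in it is constant with respect to j, so the inner sum multiplies it by the number of values j takes. It does not disappear:

[tex]\sum_{j=i+1}^{n} Y_i^2 = (n-i)\,Y_i^2[/tex]

and therefore

[tex]\sum_{i=1}^{n-1}\sum_{j=i+1}^{n} Y_i^2 = \sum_{i=1}^{n-1}(n-i)\,Y_i^2[/tex]

The factor 1/n also multiplies every term on the right-hand side, including the last one.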

edit: I'm sorry about the terrible formatting. I tried using the LaTeX tag buttons and I've looked at the FAQ; not sure what I'm doing wrong.
 
  • #2
What you wrote is very hard to read. The problem you're having with formatting comes from mixing HTML tags with LaTeX script. Use one or the other, but not both.

Here's your first equation, cleaned up:
[itex]\frac{1}{n-1} \sum_{i=1}^n (Y_i - Ybar)^2 = \frac{1}{n(n-1)}\sum_{i=1}^n \sum_{j>i}^n (Y_i-Y_j)^2[/itex]
 
  • #3
Mark44 said:
What you wrote is very hard to read. The problem you're having with formatting comes from mixing HTML tags with LaTeX script. Use one or the other, but not both.

Here's your first equation, cleaned up:
[itex]\frac{1}{n-1} \sum_{i=1}^n (Y_i - Ybar)^2 = \frac{1}{n(n-1)}\sum_{i=1}^n \sum_{j>i}^n (Y_i-Y_j)^2[/itex]


Even better:
[tex]\frac{1}{n-1} \sum_{i=1}^n (Y_i - \bar{Y})^2 = \frac{1}{n(n-1)}\sum_{i=1}^{n-1} \sum_{j=i+1}^n (Y_i-Y_j)^2[/tex]
Note that in the second form, if we have j>i with j going up to n, then i can only go up to (n-1).
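
The point about the index ranges can be checked by listing the pairs directly. Below is a minimal Python sketch; the function names `pairs_full_range` and `pairs_trimmed_range` are made up for illustration and do not come from the thread. It enumerates the (i, j) pairs with j > i under both conventions, letting i run to n and letting i run only to n-1.

```python
def pairs_full_range(n):
    """Index pairs with i = 1..n and j = i+1..n (the inner range is empty when i = n)."""
    return [(i, j) for i in range(1, n + 1) for j in range(i + 1, n + 1)]


def pairs_trimmed_range(n):
    """Index pairs with i = 1..n-1 and j = i+1..n."""
    return [(i, j) for i in range(1, n) for j in range(i + 1, n + 1)]


n = 5
full = pairs_full_range(n)
trimmed = pairs_trimmed_range(n)

# Both conventions list exactly the same pairs, because i = n contributes nothing.
print(full == trimmed)
# The number of pairs is n(n-1)/2.
print(len(trimmed), n * (n - 1) // 2)
```

Either upper limit for i gives the same double sum, so the trimmed range is just the tidier way to write it.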
 
  • #4
Ah, I see. I was just using the buttons at the top and side. Hopefully I'll get this right this time.

The original equation.

[tex]\frac{1}{n-1} \sum_{i=1}^n (Y_i - \bar{Y})^2 = \frac{1}{n(n-1)}\sum_{i=1}^{n-1} \sum_{j=i+1}^n (Y_i-Y_j)^2[/tex]

Like I said, the (n-1) from both sides cancel, and then I expand both sides

[tex]\sum_{i=1}^n(Y_i^2 -2Y_i \bar{Y} + \bar{Y}^2) = \frac{1}{n}\sum_{i=1}^{n-1} \sum_{j=i+1}^n(Y_i^2 - 2Y_iY_j + Y_j^2)[/tex]

I then distribute the summations to all the terms

[itex]\sum_{i=1}^nY_i^2-2 \sum_{i=1}^nY_i\bar{Y} + \sum_{i=1}^n\bar{Y}^2 = \frac{1}{n}\sum_{i=1}^{n-1} \sum_{j=i+1}^nY_i^2 - \frac{2}{n}\sum_{i=1}^{n-1} \sum_{j=i+1}^n Y_iY_j + \frac{1}{n}\sum_{i=1}^{n-1} \sum_{j=i+1}^nY_j^2[/itex]

I don't really know where to go from here. I know that [itex]\frac{1}{n}\sum_{i=1}^nY_i=\bar{Y}[/itex], but I'm not sure what that does for me. I'll still have the j summations to worry about on the right hand side, as well as the [itex]Y_j[/itex] terms.
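
One possible way past this sticking point, offered as a sketch and not taken from the thread: the summand [itex](Y_i-Y_j)^2[/itex] is symmetric in i and j and vanishes when i = j, so the sum over j > i is half the sum over all pairs.

[tex]\sum_{i=1}^{n-1}\sum_{j=i+1}^n (Y_i-Y_j)^2 = \frac{1}{2}\sum_{i=1}^{n}\sum_{j=1}^n (Y_i-Y_j)^2[/tex]

In the full double sum each index runs over everything, so [itex]\sum_{j=1}^n 1 = n[/itex] and [itex]\sum_{j=1}^n Y_j = n\bar{Y}[/itex] apply directly:

[tex]\frac{1}{2}\sum_{i=1}^{n}\sum_{j=1}^n (Y_i^2 - 2Y_iY_j + Y_j^2) = \frac{1}{2}\left(n\sum_{i=1}^n Y_i^2 - 2(n\bar{Y})^2 + n\sum_{j=1}^n Y_j^2\right) = n\sum_{i=1}^n Y_i^2 - n^2\bar{Y}^2[/tex]

Dividing by n gives [itex]\sum_{i=1}^n Y_i^2 - n\bar{Y}^2[/itex] on the right. The left side reduces to the same expression, since [itex]\sum_{i=1}^n Y_i\bar{Y} = n\bar{Y}^2[/itex]:

[tex]\sum_{i=1}^nY_i^2-2 \sum_{i=1}^nY_i\bar{Y} + \sum_{i=1}^n\bar{Y}^2 = \sum_{i=1}^n Y_i^2 - 2n\bar{Y}^2 + n\bar{Y}^2 = \sum_{i=1}^n Y_i^2 - n\bar{Y}^2[/tex]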
 

1. What is the sample variance?

The sample variance is a measure of the spread of a set of data points in a sample. It is calculated by summing the squared differences between each data point and the sample mean, then dividing by n-1, where n is the sample size.

2. What are the two forms of the sample variance?

In this thread, the two forms are the mean-deviation form, [itex]\frac{1}{n-1}\sum_{i=1}^n (Y_i-\bar{Y})^2[/itex], and the pairwise-difference form, [itex]\frac{1}{n(n-1)}\sum_{i=1}^{n-1}\sum_{j=i+1}^n (Y_i-Y_j)^2[/itex]. Both describe the same quantity. Neither is the population variance, which divides by the number of data points and uses the population mean.

3. How are the two forms of the sample variance calculated?

The mean-deviation form first computes the sample mean, then sums the squared deviations from it and divides by n-1. The pairwise-difference form needs no mean: it sums the squared differences over all n(n-1)/2 pairs of distinct observations and divides by n(n-1).

4. How are the two forms of the sample variance equivalent?

They are equivalent because the sum of squared differences over all pairs with i < j equals n times the sum of squared deviations from the sample mean. Once the common factor 1/(n-1) is removed, expanding both sides and using [itex]\sum_{i=1}^n Y_i = n\bar{Y}[/itex] reduces each to [itex]\sum_{i=1}^n Y_i^2 - n\bar{Y}^2[/itex].

5. Why is it important to understand the two forms of the sample variance?

The pairwise form shows that the sample variance depends only on the differences between observations, so it is unchanged when the same constant is added to every observation. The mean-deviation form is the one usually computed in practice, since it works with n deviations instead of n(n-1)/2 pairs.
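
The two forms discussed in this thread, the mean-deviation form and the pairwise-difference form, can also be compared numerically. The sketch below is illustrative only: the function names and the sample values are assumptions chosen for the example, and exact `Fraction` arithmetic is used so that the comparison is not blurred by floating-point rounding. The standard library's `statistics.variance` is included as an independent reference.

```python
from fractions import Fraction
from statistics import variance


def variance_mean_form(ys):
    """Sum of squared deviations from the sample mean, divided by n - 1."""
    n = len(ys)
    mean = sum(ys) / n
    return sum((y - mean) ** 2 for y in ys) / (n - 1)


def variance_pairwise_form(ys):
    """Sum of squared differences over all pairs i < j, divided by n(n - 1)."""
    n = len(ys)
    total = sum(
        (ys[i] - ys[j]) ** 2
        for i in range(n - 1)         # i = 0 .. n-2 (0-based)
        for j in range(i + 1, n)      # j = i+1 .. n-1
    )
    return total / (n * (n - 1))


# Hypothetical sample, stored as exact fractions.
sample = [Fraction(v) for v in (2, 4, 4, 5, 7, 9)]

a = variance_mean_form(sample)
b = variance_pairwise_form(sample)
print(a, b)
print(a == b)
print(a == variance(sample))
```

A numerical check like this is not a proof, but it is a quick way to catch a slipped factor such as a missing 1/n while working through the algebra.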
