# Standard deviation is wrong whatever I do?

1. Feb 13, 2014

### santoki

After filling out the table, I used it to finally calculate the standard deviation as 0.105. Just to make sure, I ran it through Microsoft Excel and got 0.942809042. I put two together and figured I'm doing something wrong after the xw column because Excel agrees that the mean is also 1.7.

Sample calculations:
d = x - <x>
d = (1.50 m) - (1.7 m)
d = -0.2 m

d2 = (-0.2 m)2 = 0.04

wd2 = (1)(0.04) = 0.04

SD = $\sqrt{[0.04+0.0225+0.09+0.0075+0+0+0.02+0+0.04]/(21-1)}$
SD = 0.1048808848

2. Feb 13, 2014

### BvU

If you take the weights seriously, you end up with ${\Sigma x_i w_i \over \Sigma w_i }= 1.652$

3. Feb 13, 2014

### haruspex

I think your problem is rounding error. 1.7 is not a sufficiently accurate value for the mean.

4. Feb 13, 2014

### BvU

In practice, we avoid having to correct all di (in case of a change or correction) by keeping track of x and x2 and later using
< (x - <x>)2 > = <x2> - 2 < x <x> > + (<x>)2= <x2> - 2 <x><x> > + (<x>)2 = <x2> - <x>2

5. Feb 14, 2014

### santoki

changed mean as suggested

Actually it wasn't just the mean, I didn't round anything till I solved for the standard deviation. Though what I ended up with was in the hundredths place!

SD = $\sqrt{0.172381853/20}$
SD = 0.0928390686

I really don't understand what kind of math error I'm doing or if I'm missing something?

6. Feb 14, 2014

### haruspex

That's the right answer (now). You must have made a mistake in how you put it into Excel.

7. Feb 14, 2014

### santoki

I've rechecked my data on Excel and it matched up. I've only highlighted the x and w columns to make a descriptive statistics analysis on Excel and came up with that.

Do you have any idea what could have caused the two answers to be a decimal place away from each other?

8. Feb 14, 2014

### BvU

[strike]Ah, the denominator in the square root for SD is not the sum of the weights -1, but the number of observations - 1, so 8. Then you get 0.147, so you report <x> = 1.65 +/- 0.05 with a standard deviation of 0.15[/strike]

Edited: no, the ∑w -1 should be in the denominator.

I had a hard time finding the smaller discrepancies, but: there seems to be a 0 missing in the d for x=1.65.
Code (Text):

x   w   xw      d   d^2     wd^2

1.5 1   1.5 -0.152380952    0.023219955 0.023219955
1.55    1   1.55    -0.102380952    0.010481859 0.010481859
1.6 9   14.4    -0.052380952    0.002743764 0.024693878
1.65    3   4.95    [B]-0.002380952[/B] 0.000005669 0.000017007
1.7 4   6.8 0.047619048 0.002267574 0.009070295
1.75    0   0   0.097619048 0.009529478 0.000000000
1.8 2   3.6 0.147619048 0.021791383 0.043582766
1.85    0   0   0.197619048 0.039053288 0.000000000
1.9 1   1.9 0.247619048 0.061315193 0.061315193

Sum 15.3    21  34.7    0.428571429 0.170408163 0.172380952

XL mean 1.70
XL stdev    0.136930639

weighted actual
<x>     [B]1.652[/B]        (from 34.7/21)
[strike]    SD      0.147       (from sqrt( 0.17238/(9-1) )[/strike]
SD      0.09283     (from sqrt( 0.17238/([b]20[/b]-1) )

Edited: Sorry to make a mess of things. Now I really have to think what the best estimate for the uncertainty in the mean is..... probably sd/√∑w = 0.0202, so you would still report something like

1.65 +/- 0.02 with σ = 0.09

Last edited: Feb 14, 2014
9. Feb 14, 2014

### santoki

I checked my calculation for x = 1.65 and you were right, I three zeroes instead of two. My last question is that why did the N (21) from N = ∑w and <x> = (∑xw)/N changed to the N (9) in the SD formula. Wouldn't that be unclear for students and would all make the same mistake of using 21 as N for each column?

10. Feb 14, 2014

### BvU

So your SD 0.093 was correct. As Haruspex already stated. I was so embarrassed I forgot the $\Sigma$w that I edited my last post. Somewhat confusing, I'm sorry to say.

Your 3 zeros instead of 2 surprised me: why not let XL do the work ?

Then: why XL Descriptive statistics doesn't do what you expect: well for the height because it isn't weighed. For the frequency it calculates the average weight (21/9) which isn't what you want. I don't know how to seduce XL to take the weights into account as weights, so I had to trick it by simply typing 1.6 nine times, 1.65 three times, etc. Then Descriptive statistics does do the remainder of the work:

Code (Text):
Test

Mean            1.652380952
Standard Error      0.020259093
Median          1.6
Mode            1.6
Standard Deviation  0.092838826
Sample Variance     0.008619048
...
Sum 34.7
Count   21

11. Feb 14, 2014

### santoki

I think I get what you mean now that I can compare two different outcomes. Thank you very much for your help!