Standard deviation is wrong whatever I do?

AI Thread Summary
The discussion revolves around discrepancies in calculating standard deviation using different methods, specifically between manual calculations and Microsoft Excel results. The user initially calculated a standard deviation of 0.105 but found Excel's output to be 0.9428, leading to confusion. Key issues identified include incorrect handling of weights in the calculations and misunderstandings about the appropriate denominator for the standard deviation formula. After corrections, the user confirmed that the standard deviation should be reported as approximately 0.093, emphasizing the importance of accurate data entry and understanding of statistical formulas. The conversation highlights the complexities of weighted calculations and the need for careful attention to detail in statistical analysis.
santoki
Messages
34
Reaction score
0
0AcVc67.jpg


After filling out the table, I used it to finally calculate the standard deviation as 0.105. Just to make sure, I ran it through Microsoft Excel and got 0.942809042. I put two together and figured I'm doing something wrong after the xw column because Excel agrees that the mean is also 1.7.

Sample calculations:
d = x - <x>
d = (1.50 m) - (1.7 m)
d = -0.2 m

d2 = (-0.2 m)2 = 0.04

wd2 = (1)(0.04) = 0.04

SD = \sqrt{[0.04+0.0225+0.09+0.0075+0+0+0.02+0+0.04]/(21-1)}
SD = 0.1048808848
 
Physics news on Phys.org
If you take the weights seriously, you end up with ##{\Sigma x_i w_i \over \Sigma w_i }= 1.652##
 
I think your problem is rounding error. 1.7 is not a sufficiently accurate value for the mean.
 
  • Like
Likes 1 person
In practice, we avoid having to correct all di (in case of a change or correction) by keeping track of x and x2 and later using
< (x - <x>)2 > = <x2> - 2 < x <x> > + (<x>)2= <x2> - 2 <x><x> > + (<x>)2 = <x2> - <x>2
 
changed mean as suggested

L2MTD6y.png


Actually it wasn't just the mean, I didn't round anything till I solved for the standard deviation. Though what I ended up with was in the hundredths place!

SD = \sqrt{0.172381853/20}
SD = 0.0928390686

I really don't understand what kind of math error I'm doing or if I'm missing something?
 
santoki said:
SD = 0.0928390686
That's the right answer (now). You must have made a mistake in how you put it into Excel.
 
haruspex said:
That's the right answer (now). You must have made a mistake in how you put it into Excel.

I've rechecked my data on Excel and it matched up. I've only highlighted the x and w columns to make a descriptive statistics analysis on Excel and came up with that.

Do you have any idea what could have caused the two answers to be a decimal place away from each other?
 
[strike]Ah, the denominator in the square root for SD is not the sum of the weights -1, but the number of observations - 1, so 8. Then you get 0.147, so you report <x> = 1.65 +/- 0.05 with a standard deviation of 0.15[/strike]

Edited: no, the ∑w -1 should be in the denominator.

I had a hard time finding the smaller discrepancies, but: there seems to be a 0 missing in the d for x=1.65.
Code:
		x	w	xw		d	d^2		wd^2
							
		1.5	1	1.5	-0.152380952	0.023219955	0.023219955
		1.55	1	1.55	-0.102380952	0.010481859	0.010481859
		1.6	9	14.4	-0.052380952	0.002743764	0.024693878
		1.65	3	4.95	[B]-0.002380952[/B]	0.000005669	0.000017007
		1.7	4	6.8	0.047619048	0.002267574	0.009070295
		1.75	0	0	0.097619048	0.009529478	0.000000000
		1.8	2	3.6	0.147619048	0.021791383	0.043582766
		1.85	0	0	0.197619048	0.039053288	0.000000000
		1.9	1	1.9	0.247619048	0.061315193	0.061315193
							
	Sum	15.3	21	34.7	0.428571429	0.170408163	0.172380952
							
	XL mean	1.70					
	XL stdev	0.136930639
					
weighted actual
	<x>		[B]1.652[/B]		(from 34.7/21)
[strike]	SD		0.147		(from sqrt( 0.17238/(9-1) )[/strike]
	SD		0.09283		(from sqrt( 0.17238/([b]20[/b]-1) )
Edited: Sorry to make a mess of things. Now I really have to think what the best estimate for the uncertainty in the mean is... probably sd/√∑w = 0.0202, so you would still report something like

1.65 +/- 0.02 with σ = 0.09
 
Last edited:
BvU said:
Ah, the denominator inder the square root for SD is not the sum of the weights -1, but the number of observations - 1, so 8.[/code]

I checked my calculation for x = 1.65 and you were right, I three zeroes instead of two. My last question is that why did the N (21) from N = ∑w and <x> = (∑xw)/N changed to the N (9) in the SD formula. Wouldn't that be unclear for students and would all make the same mistake of using 21 as N for each column?
 
  • #10
So your SD 0.093 was correct. As Haruspex already stated. :redface: I was so embarrassed I forgot the ##\Sigma##w that I edited my last post. Somewhat confusing, I'm sorry to say.

Your 3 zeros instead of 2 surprised me: why not let XL do the work ?

Then: why XL Descriptive statistics doesn't do what you expect: well for the height because it isn't weighed. For the frequency it calculates the average weight (21/9) which isn't what you want. I don't know how to seduce XL to take the weights into account as weights, so I had to trick it by simply typing 1.6 nine times, 1.65 three times, etc. Then Descriptive statistics does do the remainder of the work:

Code:
Test	
	
Mean			1.652380952
Standard Error		0.020259093
Median			1.6
Mode			1.6
Standard Deviation	0.092838826
Sample Variance		0.008619048
...
Sum	34.7
Count	21
 
  • #11
BvU said:
why XL Descriptive statistics doesn't do what you expect: well for the height because it isn't weighed. For the frequency it calculates the average weight (21/9) which isn't what you want. I don't know how to seduce XL to take the weights into account as weights, so I had to trick it by simply typing 1.6 nine times, 1.65 three times, etc. Then Descriptive statistics does do the remainder of the work[/code]

I think I get what you mean now that I can compare two different outcomes. Thank you very much for your help!
 
Back
Top