Standard deviation is wrong whatever I do?

In summary, the conversation discusses the calculation of the standard deviation for a set of data, which initially resulted in a value of 0.105 when calculated manually and 0.942809042 when calculated through Microsoft Excel. After further analysis and correction of a mistake, the correct value of 0.0928390686 was obtained. The conversation also addresses the use of weights in calculating the mean and the discrepancy in results when using Excel's Descriptive Statistics function.
  • #1
santoki
34
0
0AcVc67.jpg


After filling out the table, I used it to finally calculate the standard deviation as 0.105. Just to make sure, I ran it through Microsoft Excel and got 0.942809042. I put two together and figured I'm doing something wrong after the xw column because Excel agrees that the mean is also 1.7.

Sample calculations:
d = x - <x>
d = (1.50 m) - (1.7 m)
d = -0.2 m

d2 = (-0.2 m)2 = 0.04

wd2 = (1)(0.04) = 0.04

SD = [itex]\sqrt{[0.04+0.0225+0.09+0.0075+0+0+0.02+0+0.04]/(21-1)}[/itex]
SD = 0.1048808848
 
Physics news on Phys.org
  • #2
If you take the weights seriously, you end up with ##{\Sigma x_i w_i \over \Sigma w_i }= 1.652##
 
  • #3
I think your problem is rounding error. 1.7 is not a sufficiently accurate value for the mean.
 
  • Like
Likes 1 person
  • #4
In practice, we avoid having to correct all di (in case of a change or correction) by keeping track of x and x2 and later using
< (x - <x>)2 > = <x2> - 2 < x <x> > + (<x>)2= <x2> - 2 <x><x> > + (<x>)2 = <x2> - <x>2
 
  • #5
changed mean as suggested

L2MTD6y.png


Actually it wasn't just the mean, I didn't round anything till I solved for the standard deviation. Though what I ended up with was in the hundredths place!

SD = [itex]\sqrt{0.172381853/20}[/itex]
SD = 0.0928390686

I really don't understand what kind of math error I'm doing or if I'm missing something?
 
  • #6
santoki said:
SD = 0.0928390686
That's the right answer (now). You must have made a mistake in how you put it into Excel.
 
  • #7
haruspex said:
That's the right answer (now). You must have made a mistake in how you put it into Excel.

I've rechecked my data on Excel and it matched up. I've only highlighted the x and w columns to make a descriptive statistics analysis on Excel and came up with that.

Do you have any idea what could have caused the two answers to be a decimal place away from each other?
 
  • #8
[strike]Ah, the denominator in the square root for SD is not the sum of the weights -1, but the number of observations - 1, so 8. Then you get 0.147, so you report <x> = 1.65 +/- 0.05 with a standard deviation of 0.15[/strike]

Edited: no, the ∑w -1 should be in the denominator.

I had a hard time finding the smaller discrepancies, but: there seems to be a 0 missing in the d for x=1.65.
Code:
		x	w	xw		d	d^2		wd^2
							
		1.5	1	1.5	-0.152380952	0.023219955	0.023219955
		1.55	1	1.55	-0.102380952	0.010481859	0.010481859
		1.6	9	14.4	-0.052380952	0.002743764	0.024693878
		1.65	3	4.95	[B]-0.002380952[/B]	0.000005669	0.000017007
		1.7	4	6.8	0.047619048	0.002267574	0.009070295
		1.75	0	0	0.097619048	0.009529478	0.000000000
		1.8	2	3.6	0.147619048	0.021791383	0.043582766
		1.85	0	0	0.197619048	0.039053288	0.000000000
		1.9	1	1.9	0.247619048	0.061315193	0.061315193
							
	Sum	15.3	21	34.7	0.428571429	0.170408163	0.172380952
							
	XL mean	1.70					
	XL stdev	0.136930639
					
weighted actual
	<x>		[B]1.652[/B]		(from 34.7/21)
[strike]	SD		0.147		(from sqrt( 0.17238/(9-1) )[/strike]
	SD		0.09283		(from sqrt( 0.17238/([b]20[/b]-1) )
Edited: Sorry to make a mess of things. Now I really have to think what the best estimate for the uncertainty in the mean is... probably sd/√∑w = 0.0202, so you would still report something like

1.65 +/- 0.02 with σ = 0.09
 
Last edited:
  • #9
BvU said:
Ah, the denominator inder the square root for SD is not the sum of the weights -1, but the number of observations - 1, so 8.[/code]

I checked my calculation for x = 1.65 and you were right, I three zeroes instead of two. My last question is that why did the N (21) from N = ∑w and <x> = (∑xw)/N changed to the N (9) in the SD formula. Wouldn't that be unclear for students and would all make the same mistake of using 21 as N for each column?
 
  • #10
So your SD 0.093 was correct. As Haruspex already stated. :redface: I was so embarrassed I forgot the ##\Sigma##w that I edited my last post. Somewhat confusing, I'm sorry to say.

Your 3 zeros instead of 2 surprised me: why not let XL do the work ?

Then: why XL Descriptive statistics doesn't do what you expect: well for the height because it isn't weighed. For the frequency it calculates the average weight (21/9) which isn't what you want. I don't know how to seduce XL to take the weights into account as weights, so I had to trick it by simply typing 1.6 nine times, 1.65 three times, etc. Then Descriptive statistics does do the remainder of the work:

Code:
Test	
	
Mean			1.652380952
Standard Error		0.020259093
Median			1.6
Mode			1.6
Standard Deviation	0.092838826
Sample Variance		0.008619048
...
Sum	34.7
Count	21
 
  • #11
BvU said:
why XL Descriptive statistics doesn't do what you expect: well for the height because it isn't weighed. For the frequency it calculates the average weight (21/9) which isn't what you want. I don't know how to seduce XL to take the weights into account as weights, so I had to trick it by simply typing 1.6 nine times, 1.65 three times, etc. Then Descriptive statistics does do the remainder of the work[/code]

I think I get what you mean now that I can compare two different outcomes. Thank you very much for your help!
 

Related to Standard deviation is wrong whatever I do?

What is standard deviation?

Standard deviation is a measure of how spread out a set of data is from its mean. It is a statistical tool that helps to understand the variability or dispersion of a dataset.

Why is standard deviation important?

Standard deviation is important because it provides a measure of the amount of uncertainty or variation in a dataset. It is used in various statistical analyses and helps to compare the spread of data between different groups or populations.

What does it mean if standard deviation is wrong?

If the standard deviation is wrong, it means that there is an error in the calculation or interpretation of the data. This can lead to incorrect conclusions and affect the accuracy of statistical analyses.

Can standard deviation be negative?

No, standard deviation cannot be negative. It is always a positive value or zero, depending on the data. A negative value would not make sense in the context of measuring the spread of data.

How can I avoid making mistakes in calculating standard deviation?

To avoid mistakes in calculating standard deviation, it is important to carefully follow the formula and steps for calculating it. Double-checking the calculations and using software or calculators can also help to ensure accuracy.

Similar threads

  • Introductory Physics Homework Help
Replies
4
Views
2K
  • Set Theory, Logic, Probability, Statistics
Replies
5
Views
1K
  • Calculus and Beyond Homework Help
Replies
8
Views
3K
  • Calculus and Beyond Homework Help
Replies
4
Views
2K
  • Precalculus Mathematics Homework Help
Replies
1
Views
2K
  • Introductory Physics Homework Help
Replies
3
Views
3K
  • Introductory Physics Homework Help
Replies
14
Views
2K
  • Introductory Physics Homework Help
Replies
18
Views
2K
  • Other Physics Topics
Replies
14
Views
7K
  • Precalculus Mathematics Homework Help
Replies
7
Views
3K
Back
Top