Calculating a standard deviation involving weighted samples

In summary, the spreadsheet image lists the 12 values of Ho and their error ranges taken from the source article, together with the combined weighted average of Ho and its error range. The weights are the squares of the reciprocals of the error ranges, $W_i = 1/E_i^2$. From the weighted average and the weighted sum of squared differences, the square root of the quotient of that weighted sum and the sum of the weights comes out as 0.22. Two points remain uncertain: whether $1/E_i^2$ is the right choice of weights, and how the divisor should be adjusted for weighted samples.
  • #1
Buzz Bloom
TL;DR Summary
I used a spreadsheet (PNG image in the main body) to analyze 12 values with +/- error ranges, which I assume for the purpose of this calculation to be standard deviations. I am seeking confirmation or corrections regarding the validity of the method I used.
Below is the spreadsheet image.
[Image: HoCalc.png]

The source of the data in the two columns, Ho and +/-, is the following article
Let AHo be the combined result: the weighted average of the Ho values, AH, together with its error range, AE.
[1] $AH_o = AH \pm AE = 67.65 \pm 0.22$

The Ho column contains the 12 values, Hi, for the Ho variable, i being the index (of a variable value) ranging from 1 to 12.

The +/- column contains, for each Hi value, the corresponding error range, Ei, which I assume to be one standard deviation, σi.

The Source column specifies where in the article the particular 12 values for Hi and Ei are found.

The Wt=... column contains values, Wi, for the weights used in calculating weighted sums. I assume that the appropriate choice of weights is the squares of the reciprocals of the Ei values.
[2] $W_i = 1/E_i^2$
The sum of this column is
[3] $\Sigma W = \sum_{i=1}^{12} W_i = 25.8982$

The Wt*Ho column contains weighted Ho values
[4] $WH_i = W_i \, H_i$
The sum of this column is
[5] $\Sigma WH = \sum_{i=1}^{12} WH_i = 1751.9631$
The weighted average, AH, is
[6] $AH = \Sigma WH / \Sigma W = 67.65$
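Steps [2] through [6] can be sketched in Python as follows. This is only a minimal sketch, not the spreadsheet itself: the function name `inverse_variance_mean` and the three demo values are hypothetical (the 12 values in the spreadsheet image are not reproduced in the text), and the last lines merely re-check the quotient in [6] from the sums quoted in [3] and [5].

```python
def inverse_variance_mean(values, errors):
    """Weighted mean with weights W_i = 1/E_i**2, following [2]-[6].

    Returns (weighted mean, sum of weights).
    """
    weights = [1.0 / e**2 for e in errors]                 # [2]
    sum_w = sum(weights)                                   # [3]
    sum_wh = sum(w * h for w, h in zip(weights, values))   # [4], [5]
    return sum_wh / sum_w, sum_w                           # [6]


# Hypothetical illustration only: these are NOT the values from the spreadsheet.
demo_values = [67.0, 68.0, 69.0]
demo_errors = [1.0, 0.5, 2.0]
demo_mean, demo_sum_w = inverse_variance_mean(demo_values, demo_errors)

# Re-check [6] using only the sums quoted in [3] and [5].
AH = 1751.9631 / 25.8982
print(round(AH, 2))
```

Because the weights grow as the error shrinks, the demo mean is pulled toward the value with the smallest error range.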

I feel confident that this result is OK for AH. I am less confident that my calculation below of AE is OK. I will present my analysis, and explain my main concern about AE, in what follows.

The D2^2=... column contains values, D2i, of the squares of the 12 differences between AH and each value Hi.
[7] $D2_i = (AH - H_i)^2$
Although not part of the calculations, it will later be useful to also have defined the sum of the squared differences.
[8] $\Sigma D2 = \sum_{i=1}^{12} D2_i$

The Wt*D2 column contains values, WD2i, of the product of a weight and a squared difference.
[9] $WD2_i = W_i \, D2_i$
[10] $\Sigma WD2 = \sum_{i=1}^{12} WD2_i = 1.2992$

I now define Aσ as the square root of a quotient: the weighted sum of squared differences divided by the sum of weights. Using [3] and [10],
[11] $A\sigma = (\Sigma WD2 / \Sigma W)^{1/2} = (1.2992 / 25.8982)^{1/2} = 0.22$
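The calculation in [7] through [11] can be sketched the same way. The name `weighted_rms_spread` is hypothetical. The closing lines re-check [11] from the quoted sums and, for comparison only, evaluate 1/sqrt(ΣW), which is the standard error usually quoted for an inverse-variance weighted mean when the Ei are independent one-sigma errors. That comparison is an addition here, not part of the calculation in the post.

```python
import math


def weighted_rms_spread(values, errors):
    """sqrt(sum(W_i * (AH - H_i)**2) / sum(W_i)) with W_i = 1/E_i**2."""
    weights = [1.0 / e**2 for e in errors]
    sum_w = sum(weights)
    mean = sum(w * h for w, h in zip(weights, values)) / sum_w
    sum_wd2 = sum(w * (mean - h) ** 2 for w, h in zip(weights, values))  # [9], [10]
    return math.sqrt(sum_wd2 / sum_w)                                     # [11]


# Re-check [11] using only the sums quoted in [3] and [10].
A_sigma = math.sqrt(1.2992 / 25.8982)
print(round(A_sigma, 2))

# For comparison (not part of the original calculation):
# standard error of an inverse-variance weighted mean, 1/sqrt(sum of weights).
se_mean = 1.0 / math.sqrt(25.8982)
print(round(se_mean, 2))
```

The two quantities answer different questions: the first measures how much the Hi scatter about AH, while the second is the uncertainty of AH implied by the quoted error ranges alone.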
The main reason I am uncertain is that if all of the Ei values were the same, say s, then the sum of the weights, ΣW, would be
[12] $\Sigma W = 12/s^2$.
Then the sum of the weighted squares of differences, ΣWD2, would be
[13] $\Sigma WD2 = (1/s^2) \sum_{i=1}^{12} D2_i = (1/s^2) \, \Sigma D2$
and Aσ would be
[14] $A\sigma = [(1/12) \, \Sigma D2]^{1/2}$.
Ordinarily, since there are 12 samples I would expect
[15] $A\sigma = [(1/11) \, \Sigma D2]^{1/2}$.
Equation [15] is based on the article
246207

I do not know how to adjust this for weighted samples.
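One commonly used adjustment for non-integer ("reliability") weights replaces the divisor ΣW with ΣW − (ΣW²)/ΣW, where ΣW² is the sum of the squared weights. The sketch below illustrates that convention; the function name `unbiased_weighted_variance` and the demo values are hypothetical, and whether this is the appropriate correction for the Ho data is an assumption, not something settled in this thread.

```python
import statistics


def unbiased_weighted_variance(values, weights):
    """Weighted variance with the reliability-weights correction.

    Divisor: V1 - V2/V1, where V1 = sum(w) and V2 = sum(w**2).
    """
    v1 = sum(weights)
    v2 = sum(w * w for w in weights)
    mean = sum(w * x for w, x in zip(weights, values)) / v1
    sum_wd2 = sum(w * (x - mean) ** 2 for w, x in zip(weights, values))
    return sum_wd2 / (v1 - v2 / v1)


# Made-up values; equal weights correspond to all E_i being equal.
demo = [66.0, 67.0, 68.0, 70.0]
equal_w = [4.0] * len(demo)

# With equal weights the result matches the usual 1/(n-1) sample variance.
print(unbiased_weighted_variance(demo, equal_w), statistics.variance(demo))
```

With n equal weights w, the divisor becomes n·w − w = (n − 1)·w, so the expression reduces to the 1/(n − 1) form, which is the behaviour expected in [15].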

Another minor concern is that my assumption to use $1/E_i^2$ as weights may not be the right approach. I think I have a good rationale for doing this, but it is rather complicated, so I will not include it in this post. If anyone would like to see this rationale, I will post it later.
 
  • #2
You can directly calculate the second moments using the weights. Then use the usual formula for the variance as an expression involving the first two moments.
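One way to read this suggestion is: compute the weighted first and second moments, then take the variance as the second moment minus the square of the first. The sketch below follows that reading; the function names and the demo numbers are hypothetical, and the result carries no small-sample correction.

```python
def weighted_moment(values, weights, order):
    """Weighted raw moment: sum(w * x**order) / sum(w)."""
    return sum(w * x**order for w, x in zip(weights, values)) / sum(weights)


def variance_from_moments(values, weights):
    """Variance as E[x^2] - (E[x])^2 using weighted moments."""
    m1 = weighted_moment(values, weights, 1)
    m2 = weighted_moment(values, weights, 2)
    return m2 - m1**2


# Made-up values and weights, for illustration only.
demo_values = [1.0, 2.0, 3.0]
demo_weights = [1.0, 1.0, 2.0]
print(variance_from_moments(demo_values, demo_weights))
```

For values near 67 with a spread of a few tenths, the two terms being subtracted agree to about five significant figures, so the intermediate sums should be kept at full precision rather than rounded.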
 

FAQ: Calculating a standard deviation involving weighted samples

1. What is the formula for calculating standard deviation involving weighted samples?

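A common formula for the standard deviation of weighted samples is:
$\sigma = \sqrt{\sum_i w_i (x_i - \mu)^2 / \sum_i w_i}$
where σ is the standard deviation, $w_i$ is the weight of the ith sample, $x_i$ is the value of the ith sample, and μ is the weighted mean, $\mu = \sum_i w_i x_i / \sum_i w_i$. This is the population form: it divides by the total weight and applies no small-sample correction.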

2. How do you determine the weight for each sample in a weighted sample set?

The weight for each sample reflects its relative importance or frequency in the population. It can come from prior knowledge or from a weighting scheme such as inverse probability weighting or post-stratification weighting. When each sample carries its own measurement uncertainty σi, inverse-variance weights, 1/σi², are a common choice, as in the post above.

3. Can standard deviation be calculated for a sample with negative weights?

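Not reliably. The formula assumes non-negative weights with a positive sum. With negative weights the weighted sum of squared deviations can be negative, or the sum of the weights can be zero, so the expression under the square root may not yield a real number, and any value it does produce is hard to interpret as a measure of variability.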

4. How does calculating standard deviation for weighted samples differ from calculating it for unweighted samples?

The difference is the weights in the formula. In an unweighted calculation every sample counts equally; in a weighted calculation each squared deviation is multiplied by its sample's weight, and the divisor is the sum of the weights rather than the number of samples. When all weights are equal, the weighted formula reduces to the unweighted population standard deviation.
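A small check of this point, assuming the weights are integer frequencies: giving a value weight k is then equivalent to listing it k times in an unweighted calculation. The function name `weighted_std` and the data are made up for illustration.

```python
import math
import statistics


def weighted_std(values, weights):
    """sqrt(sum(w * (x - mu)**2) / sum(w)), with mu the weighted mean."""
    sum_w = sum(weights)
    mu = sum(w * x for w, x in zip(weights, values)) / sum_w
    return math.sqrt(sum(w * (x - mu) ** 2 for w, x in zip(weights, values)) / sum_w)


# Made-up data: the value 3 carries weight 2, i.e. it counts twice.
weighted = weighted_std([1, 2, 3], [1, 1, 2])
repeated = statistics.pstdev([1, 2, 3, 3])   # unweighted population std
equal = weighted_std([1, 2, 3], [1, 1, 1])   # all weights equal
print(weighted, repeated, equal, statistics.pstdev([1, 2, 3]))
```

The first two printed numbers agree, and so do the last two, which is the equal-weights case.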

5. What are some common applications of calculating standard deviation involving weighted samples?

Calculating standard deviation involving weighted samples is commonly used in fields such as economics, social sciences, and market research, where samples may have varying levels of importance or representativeness in the population. It is also used in quality control and statistical process control to measure the variability of a process.
