Calculating a standard deviation involving weighted samples

Buzz Bloom · Jul 6, 2019

Below is the spreadsheet image.

The source of the data in the two columns, Ho and +/-, is the following article

http://planck.caltech.edu/pub/2015results/Planck_2015_Results_XIII_Cosmological_Parameters.pdf .

Let AHo be the conjoined weighted average values for Ho and the corresponding error ranges.

[1] AHo = AH +/- AE = 67.65 +/- 0.22.

The Ho column contains the 12 values, H_i, for the Ho variable, i being the index (of a variable value) ranging from 1 to 12.

The +/- column contains for each H_i value the corresponding value of an error range, E_i, which I assume to be of 1 standard deviation σ_i.

The Source column specifies where in the article the particular 12 values for H_i and E_i are found.

The Wt=... column contains values, W_i, for the weights used in calculating weighted sums. I assume that the appropriate choice of weights is the squares of the reciprocals of the E_i values.

[2] W_i = 1/E_i²

The sum of this column is

[3] ΣW = Σ[i=1 to 12] W_i = 25.8982 .

The Wt*Ho column contains weighted Ho values

[4] WH_i = W_i H_i.

The sum of this column is

[5] ΣWH = Σ[i=1 to 12] WH_i = 1751.9631 .

The weighted average, AH, is

[6] AH = ΣWH / ΣW = 67.65.
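Steps [2] through [6] can be sketched in Python as follows. This is only an illustration: the function name `weighted_mean` is made up for this sketch, and because the 12 spreadsheet values are not listed in the post, the last lines simply recheck [6] from the column sums quoted in [3] and [5].

```python
def weighted_mean(values, errors):
    """Inverse-variance weighted mean, following [2]-[6].

    values: the H_i; errors: the E_i, assumed to be 1-sigma ranges.
    """
    weights = [1.0 / e**2 for e in errors]                # [2] W_i = 1/E_i^2
    sum_w = sum(weights)                                  # [3] sum of weights
    sum_wh = sum(w * h for w, h in zip(weights, values))  # [5] sum of W_i * H_i
    return sum_wh / sum_w                                 # [6] AH


# Recheck [6] from the column sums quoted in [3] and [5].
ah_from_sums = 1751.9631 / 25.8982
print(round(ah_from_sums, 2))  # 67.65
```

A value with a smaller E_i gets a larger weight, so it pulls the average toward itself; for example, `weighted_mean([10.0, 20.0], [1.0, 2.0])` lands much closer to 10 than to 20.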

I feel confident that this result is OK for AH. I am less confident that my calculation below of AE is OK. I will present my analysis, and explain my main concern about AE, in what follows.

The D2^2=... column contains values, D2_i, of the squares of the 12 differences between AH and each value H_i.

[7] D2_i = (AH-H_i)²

Although not part of the calculations, it will later be useful to also have defined the sum of the squared differences.

[8] ΣD2 = Σ[i=1 to 12] (D2_i)

The Wt*D2 column contains values, WD2_i, of the product of a weight and a squared difference.

[9] WD2_i = W_i D2_i

[10] ΣWD2 = Σ[i=1 to 12] (WD2_i) = 1.2992

I now define Aσ as the square root of a quotient: the weighted sum of squared differences divided by the sum of weights. Using [3] and [10],

[11] Aσ = (ΣWD2/ΣW)^(1/2) = (1.2992/25.8982)^(1/2) = 0.22
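As a check on [11], the same definition can be written out in Python. The function name `weighted_sigma` is hypothetical, and the final lines only redo the arithmetic from the sums reported in [3] and [10], since the individual spreadsheet values are not listed in the post.

```python
import math


def weighted_sigma(values, errors):
    """Square root of (sum of W_i * D2_i) / (sum of W_i), as in [7]-[11]."""
    weights = [1.0 / e**2 for e in errors]
    sum_w = sum(weights)
    mean = sum(w * h for w, h in zip(weights, values)) / sum_w  # AH
    sum_wd2 = sum(w * (mean - h) ** 2 for w, h in zip(weights, values))
    return math.sqrt(sum_wd2 / sum_w)


# Redo the arithmetic of [11] from the reported sums.
a_sigma_from_sums = math.sqrt(1.2992 / 25.8982)
print(round(a_sigma_from_sums, 2))  # 0.22
```

The quotient 1.2992/25.8982 by itself is about 0.05; it is the square root of that quotient that gives 0.22.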

The main reason I am uncertain is that if all of the E_i values were the same, say s, then the sum of the weights, ΣW, would be

[12] ΣW = 12/s².

Then the sum of the weighted squares of differences, ΣWD2, would be

[13] ΣWD2 = (1/s²) Σ[i=1 to 12] (D2_i)

= (1/s²) ΣD2

and Aσ would be

[14] Aσ = [(1/12) ΣD2]^(1/2).

Ordinarily, since there are 12 samples I would expect

[15] Aσ = [(1/11) ΣD2]^(1/2).

Equation [15] is based on the article

https://en.wikipedia.org/wiki/Standard_deviation

I do not know how to adjust this for weighted samples.
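One commonly quoted adjustment for weights of this kind (sometimes called reliability weights) replaces the divisor ΣW with ΣW − Σ(W_i²)/ΣW. With equal weights this reduces to the familiar 1/(N−1) factor in [15]. The sketch below illustrates that formula under this assumption and is not necessarily the right estimator for these data; the name `weighted_sample_sigma` and the toy numbers are made up.

```python
import math
import statistics


def weighted_sample_sigma(values, weights):
    """Weighted standard deviation with a Bessel-type correction.

    Divides sum(W_i * (x_i - mean)^2) by (sum(W) - sum(W_i^2)/sum(W))
    instead of sum(W). Assumes reliability (non-frequency) weights.
    """
    v1 = sum(weights)
    v2 = sum(w * w for w in weights)
    mean = sum(w * x for w, x in zip(weights, values)) / v1
    sum_wd2 = sum(w * (x - mean) ** 2 for w, x in zip(weights, values))
    return math.sqrt(sum_wd2 / (v1 - v2 / v1))


# Toy check (made-up numbers): with equal weights the result should match
# the ordinary N-1 sample standard deviation.
toy = [1.0, 2.0, 4.0, 7.0]
equal_weights = [0.25] * len(toy)
print(math.isclose(weighted_sample_sigma(toy, equal_weights),
                   statistics.stdev(toy)))  # True
```

Like the unweighted N−1 formula, this is undefined for a single sample, because the divisor becomes zero.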

Another, minor, concern is that my assumption to use 1/E_i² as weights may not be the right approach. I think I have a good rationale for doing this, but it is rather complicated, so I will not include it in this post. If anyone would like to see this rationale, I will post it later.

mathman · Jul 6, 2019

You can directly calculate the second moments using the weights. Then use the usual formula for the variance as an expression involving the first two moments.
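A minimal sketch of this moments approach, assuming the weights are normalized by their sum and using the identity variance = (second moment) − (first moment)². The function name `sigma_from_moments` and the toy numbers are made up for illustration; they are not the values from the spreadsheet.

```python
import math


def sigma_from_moments(values, weights):
    """Weighted standard deviation from the first two weighted moments."""
    sum_w = sum(weights)
    # First moment: the weighted mean.
    m1 = sum(w * x for w, x in zip(weights, values)) / sum_w
    # Second moment: the weighted mean of the squares.
    m2 = sum(w * x * x for w, x in zip(weights, values)) / sum_w
    # max() guards against a tiny negative value from floating-point round-off.
    return math.sqrt(max(m2 - m1 * m1, 0.0))


# Toy numbers (made up, not the Planck values).
toy_values = [66.0, 68.0, 70.0]
toy_weights = [1.0, 2.0, 1.0]
print(sigma_from_moments(toy_values, toy_weights))
```

This gives the same uncorrected quantity as dividing the weighted sum of squared differences by the sum of weights. When the spread is small compared to the mean, m2 − m1² subtracts two nearly equal numbers, so summing the weighted squared differences directly can be numerically safer.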
