95% Confidence of weighted average

CompuChip · Aug 29, 2013

Hi all,

It's been a while since I have asked a question here, but statistics has never been my forte. I have the feeling that although I know the definitions I do not completely grasp the concept of confidence intervals. Unfortunately I do need to come up with something sensible here.

The situation is that I'm performing n experiments, and for each experiment I'm measuring m values. Step 1 is, for every quantity, to calculate the average over all experiments and provide a 95% confidence interval. So far so good: I have some nice code that will give me a two-sided Student's t-value which I can use to construct the confidence interval.

Now the tricky bit, for me, is that I also need to take a weighted average of these averages. The question is how to calculate a statistically sensible confidence interval on this average.

So to summarize with symbols: I have ##nm## quantities ##q_{ij}## (##i = 1, \cdots, n##; ##j = 1, \cdots, m##). I have calculated ##\bar{x}_j = \frac{1}{n} \sum_{i = 1}^n q_{ij}## with the corresponding 95% CI ##[\bar{x}_j - \Delta x_j, \bar{x}_j + \Delta x_j]##.
Now I wish to calculate the weighted average ##\mu = \sum_{j = 1}^m w_j \bar{x}_j## (if you want, you may assume the ##w_j## sum to 1) and would like to know how I can construct the CI for this, either from the ##\Delta x_j## or from ##q_{ij}## directly.

If they were standard deviations I would expect something like ##\sigma^2 = \frac{1}{n} \sum_{j = 1}^m \Delta x_j^2## but I don't think it works that way for confidence intervals.

[edit]Let's also assume independence where needed, I will worry about that after I get an initial idea.[/edit]

chiro · Aug 30, 2013

Hey CompuChip.

In statistics, this area is typically known as meta-analysis. You have a few options.

If you know the distribution (or assume it to be something) then you can just use E[aX + bY] = aE[X] + bE[Y]. For the variance, this is a bit more complicated since Var[X + Y] = Var[X] + Var[Y] + 2*Cov[X,Y]. If everything is independent, the covariance term is zero, but if not then it will affect your confidence intervals.

If you are testing inferences about means and all the individual group-mean estimators are roughly normal, then you can use the fact that a linear combination of (jointly) normal variables is also normal. If there is covariance, your covariance matrix will have non-zero off-diagonal entries; if there is no covariance, the cov(X,Y) terms are zero.

In the case of independence, and assuming all estimators are roughly normal with some mean and some variance, use:

E[aX + bY] = aE[X] + bE[Y]
Var[aX + bY] = a^2Var[X] + b^2Var[Y]

and do recursive applications to get your final estimator of a sum of weighted estimators of a mean. If there is reason to believe that covariance terms are non-zero, you need to factor this in because if you don't, your estimators (and confidence intervals) will either be way too narrow or way too wide.
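
As a minimal sketch of the independent case, the two rules above can be applied to the per-quantity sample means directly. The function name `weighted_mean_and_se`, the sample data, and the use of a normal critical value are illustrative assumptions rather than anything stated in the thread; with only a handful of measurements per quantity the normal interval is likely too narrow.

```python
import math
from statistics import NormalDist, mean, variance

def weighted_mean_and_se(samples, weights):
    """Weighted average of per-quantity means and its standard error.

    samples: list of lists; samples[j] holds the measurements of quantity j.
    weights: list of weights w_j (assumed to sum to 1).
    Assumes the quantities are independent, so all covariance terms vanish.
    """
    means = [mean(s) for s in samples]
    # Estimated variance of each sample mean: s_j^2 / n_j
    var_of_means = [variance(s) / len(s) for s in samples]
    mu = sum(w * m for w, m in zip(weights, means))
    # Var[sum_j w_j * xbar_j] = sum_j w_j^2 * Var[xbar_j] under independence
    var_mu = sum(w * w * v for w, v in zip(weights, var_of_means))
    return mu, math.sqrt(var_mu)

# Hypothetical data: two quantities, four measurements each.
samples = [[1.0, 2.0, 3.0, 4.0], [10.0, 12.0, 14.0, 16.0]]
weights = [0.5, 0.5]
mu, se = weighted_mean_and_se(samples, weights)

# Normal-approximation 95% interval (an assumption; see the note above).
z = NormalDist().inv_cdf(0.975)
low, high = mu - z * se, mu + z * se
```

If the covariance terms cannot be ignored, the `var_mu` line is the one that would need the extra `2 * w_j * w_k * Cov` contributions.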

Stephen Tashi · Aug 30, 2013

CompuChip said:

The situation is that I'm performing n experiments, and for each experiment I'm measuring m values.

You have to decide whether there is some systematic effect that varies from experiment to experiment (for example, the temperature of the laboratory). If such an effect is possible then the simplest and safest thing to do is compute the weighted sum for the m values in each of the n experiments. Treat these n weighted sums as n measurements. Find the sample mean for those n measurements and state a confidence interval for them.
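
The procedure described above can be sketched as follows. The function name `weighted_sum_ci` and the data are hypothetical, and the critical value is passed in by the caller (here a tabulated two-sided 95% value for 3 degrees of freedom) because the Python standard library has no Student t quantile function.

```python
import math
from statistics import mean, stdev

def weighted_sum_ci(q, weights, t_crit):
    """CI for the weighted sum, treating each experiment as one measurement.

    q: list of n experiments, each a list of m measured values q[i][j].
    weights: the m weights w_j.
    t_crit: two-sided Student t critical value for n - 1 degrees of freedom,
            supplied by the caller (e.g. from a table).
    """
    # One weighted sum per experiment: y_i = sum_j w_j * q_ij
    y = [sum(w * v for w, v in zip(weights, row)) for row in q]
    n = len(y)
    centre = mean(y)
    half_width = t_crit * stdev(y) / math.sqrt(n)
    return centre - half_width, centre + half_width

# Hypothetical data: n = 4 experiments, m = 2 quantities.
q = [[1.0, 10.0], [2.0, 12.0], [3.0, 14.0], [4.0, 16.0]]
weights = [0.5, 0.5]
T_CRIT_DF3 = 3.182  # tabulated t_{0.975} for 3 degrees of freedom
low, high = weighted_sum_ci(q, weights, T_CRIT_DF3)
```

Because the weighted sum is formed within each experiment before any averaging, correlations between the m quantities are absorbed into the spread of the n sums and no covariance terms need to be estimated separately.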

CompuChip · Sep 2, 2013

Hi chiro and Stephen. Thanks for your replies. I think I have worked it out, hopefully I can run it by you to check that I got it right. It's a pretty long post (again) I'm afraid but I have two questions at the end that I would very much appreciate your having a look at.

So let me start from the basics: I am doing K experiments, and each of those is repeated R times giving values ##x_{k,r}##. Stephen, the "experiments" actually consist of metrics calculated on a computer simulation so there is no physical laboratory involved, but what you write is what I had in mind.

I will assume that the per-experiment mean ##X_k## is normally distributed with mean ##\mu_k## and standard deviation ##\sigma_k##. Under this assumption I can estimate these parameters from the sample data as
$$\overline{X}_k = \frac{1}{R} \sum_{r = 1}^R x_{k,r}, \qquad S_k^2 = \frac{1}{R - 1} \sum_{r = 1}^R (x_{k,r} - \overline{X}_k )^2.$$
Then the variable ##T_k = \frac{\overline{X}_k - \mu_k}{S_k / \sqrt{R}}## follows a Student's t-distribution with ##(R-1)## degrees of freedom, which for every experiment ##k## leads to a ##(1 - \alpha)##-level confidence interval (e.g. ##\alpha = 0.05## for 95%)
$$\overline{X}_k \pm t_{1 - \tfrac{\alpha}{2}, R -1 } \frac{S_k}{\sqrt{R}}.$$

So far, so good?

Now I would like to define
$$X = \sum_{k = 1}^K w_k X_k = \sum_{k = 1}^K \sum_{r = 1}^R \frac{1}{R} w_k x_{k, r}$$
where I assume that ##\sum w_k = 1## and ##w_k > 0## for all ##k = 1, \ldots, K##.

Since ##X## is a linear combination of normally distributed variables, it is itself normally distributed with mean ##\mu## and standard deviation ##\sigma## given by
$$\mu = \sum_{k = 1}^K w_k \mu_k, \qquad \sigma^2 = \sum_{k = 1}^K \sum_{k' = 1}^K w_k w_{k'} \operatorname{cov}(X_k, X_{k'})$$
respectively. Here ##\operatorname{cov}(X, Y)## indicates the covariance between ##X## and ##Y##, which reduces to the variance of ##X## when ##X = Y##.

The estimators for the normal parameters in this case are
$$M = \sum_{k = 1}^K w_k \overline{X}_k, \qquad S^2 = \sum_{k = 1}^K \sum_{k' = 1}^K w_k w_{k'} q_{kk'}$$
where
$$q_{kk'} = \sum_{r = 1}^R (x_{k,r} - \overline{X}_k)(x_{k',r} - \overline{X}_{k'})$$
is the sample covariance of all observations.

Now if ##T \equiv \frac{M - \mu}{S / \sqrt{K}}## follows a Student's t-distribution, then as before the ##(1 - \alpha)##-level confidence interval (e.g. ##\alpha = 0.05## for 95%) will be
$$M \pm t_{1 - \tfrac{\alpha}{2}, K - 1 } \frac{S}{\sqrt{K}}.$$

Two outstanding questions I have, apart from a request for general bashing of this approach:
(1) I get the feeling I am missing some prefactor like ##1/(K - 1)## or ##1/\left(1 - \sum (w_k^2) \right)## in my expression for the combined variance ##\sigma^2##. However when using the bilinearity of variance to inductively derive what ##\sigma(X_1 + \ldots + X_n)## should look like it doesn't appear.
(2) How valid is the assumption about the distribution of ##T## in the last paragraph?

Thanks again!

chiro · Sep 2, 2013

Is there any reason why you can't use a Normal approximation? If you have enough observations for each mean term ##X_k## then you might as well use the normal approximation since adding normals with regards to getting a joint distribution is very easy (linear combinations of normals will always be multivariate normal).

CompuChip · Sep 2, 2013

Although I usually expect R to be pretty large, in the sample data I've been given it is 4 (so every quantity is measured 4 times). This gives a 6% deviation between the student-T and normal approach for every experiment separately, but when I combine them I get a discrepancy of about 25% in the calculated overall confidence interval. So it kinda matters which I choose here :-)

CompuChip · Sep 3, 2013

Actually I found out I need the student-T because the 6% I mentioned was an error and that discrepancy is also in the 25% range.

I currently have an answer which "looks about right" based on my post #4 - still have no idea how much statistical sense it makes though :-)

Stephen Tashi · Sep 4, 2013

CompuChip said:

I will assume that the per-experiment mean ##X_k## is normally distributed with mean ##\mu_k## and standard deviation ##\sigma_k##.

That's not very controversial, but there are some specialized fields (like portfolio optimization) where different distributions are fashionable.

However when using the bilinearity of variance to inductively derive what ##\sigma(X_1 + \ldots + X_n)## should look like it doesn't appear.

Don't confuse ##\sigma^2## with an estimator for ##\sigma^2##. It wouldn't surprise me if the estimator also has a bilinearity property, but we should do the algebra to check.

(2) How valid is the assumption about the distribution of ##T## in the last paragraph?

It's valid provided we show that the estimator you used for the weighted sum gives the same result as defining a result for the m-th experiment as ##Y_m = \sum w_i X_i## for the ##X_i## involved in the experiment and computing the estimator for the variance of ##Y## directly.

(It isn't really correct to speak of "the" estimator of the variance since several different estimators are possible. The one that's part of the t-statistic is the one you correctly used.)
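
The bilinearity check mentioned above can be spot-checked numerically. The sketch below uses hypothetical data in CompuChip's notation (##x_{k,r}## with K = 3 and R = 4) and assumes the sample covariance carries the usual ##1/(R-1)## normalisation, which is not how ##q_{kk'}## was written earlier in the thread. Under that assumption the double sum over the covariance matrix and the sample variance of the per-replicate weighted sums ##y_r = \sum_k w_k x_{k,r}## should agree up to rounding.

```python
from statistics import mean, variance

def sample_cov(a, b):
    """Unbiased sample covariance with the 1/(R - 1) normalisation."""
    ma, mb = mean(a), mean(b)
    return sum((x - ma) * (y - mb) for x, y in zip(a, b)) / (len(a) - 1)

# Hypothetical data: x[k][r] is replicate r of experiment k (K = 3, R = 4).
x = [
    [1.0, 2.0, 4.0, 3.0],
    [10.0, 13.0, 11.0, 16.0],
    [0.5, 0.1, 0.9, 0.3],
]
w = [0.2, 0.3, 0.5]
K, R = len(x), len(x[0])

# Route 1: double sum over the sample covariance matrix.
s2_double_sum = sum(
    w[k] * w[l] * sample_cov(x[k], x[l]) for k in range(K) for l in range(K)
)

# Route 2: form the weighted sum per replicate, then take its sample variance.
y = [sum(w[k] * x[k][r] for k in range(K)) for r in range(R)]
s2_direct = variance(y)

print(s2_double_sum, s2_direct)
```

Agreement of the two routes on one data set is only a sanity check, not a proof; the general statement follows from expanding the product of the two weighted deviations inside the sample variance.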
