Deriving conclusions from a small data set

musicgold · Nov 4, 2013

Hi

Please see the table below, which shows the GDP growth rates of a group of 5 countries. I am trying to draw some conclusions from this small sample.

1. Can I conclude that the countries in the group have very similar growth rates, and that there is no significant difference between them?

2. As shown, the highest growth rate is only about one std. dev. above the sample average. However, I am not sure whether the std. dev. of such a small sample means anything.

3. What would I need to do to be able to say that there is no statistically significant difference between these growth rates?

Thanks.

--------------------------------
Country GDP growth rates
A ---- 3.3%
B ---- 3.0%
C ---- 2.9%
D ---- 2.8%
E ---- 2.4%

Average 2.9%
Std. dev 0.3%
Range 0.9%
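
For reference, the summary figures in the table can be reproduced with a short Python sketch. The variable names are illustrative only. The table does not say whether its std. dev. is the sample (n - 1) or population (n) version, so both are computed here; either way the result rounds to 0.3.

```python
import statistics

# GDP growth rates from the table above, in percent
rates = {"A": 3.3, "B": 3.0, "C": 2.9, "D": 2.8, "E": 2.4}
values = list(rates.values())

mean = statistics.mean(values)          # arithmetic mean
sample_sd = statistics.stdev(values)    # sample std. dev. (divides by n - 1)
pop_sd = statistics.pstdev(values)      # population std. dev. (divides by n)
value_range = max(values) - min(values)

print(f"mean                 = {mean:.2f}%")
print(f"sample std. dev.     = {sample_sd:.2f}pp")
print(f"population std. dev. = {pop_sd:.2f}pp")
print(f"range                = {value_range:.1f}pp")
```

With only five values, the two std. dev. conventions differ noticeably in the second decimal place, which is one sign of how loosely a sample this small pins down the spread.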

Simon Bridge · Nov 4, 2013

I imagine each of the individual figures has its own uncertainty, depending on how it was calculated and the data used to do so.

The std.dev. of the mean is not the same as the std.dev. of the dataset.

If this were physics, then I'd take the uncertainty on each quoted rate as 0.1pp (percentage points)
Therefore the uncertainty on the mean will be 0.04 ... so they are all significantly off the mean growth.
But so what?

Does country E have the same growth rate as country A?
Their difference is 0.7pp ... that's outside 2x0.3 right?

musicgold · Nov 5, 2013

Thanks Simon.

Simon Bridge said:

The std.dev. of the mean is not the same as the std.dev. of the dataset.

I am not sure what you mean by 'std dev of the mean'? Do you mean 'std dev. of the sample' or 'std error' ?

Simon Bridge said:

If this were physics, then I'd take the uncertainty on each quoted rate as 0.1pp (percentage points)
Therefore the uncertainty on the mean will be 0.04 ... so they are all significantly off the mean growth.

This is not clear to me. I mean I know what a pp is, but I don't understand how the uncertainty of the mean will be 0.04pp.

Simon Bridge · Nov 5, 2013

musicgold said:

Thanks Simon.

I am not sure what you mean by 'std dev of the mean'? Do you mean 'std dev. of the sample' or 'std error' ?

The std.dev. of the mean is the std.error of the value you got for the mean.
When each of the values in a data set has an uncertainty, the mean of those values will also be uncertain.
Recall "hypothesis testing" in statistics?

This is not clear to me. I mean I know what a pp is, but I don't understand how the uncertainty of the mean will be 0.04pp.

That is the uncertainty on one value divided by the square root of the number of terms: ##0.1/\sqrt{5}\approx 0.04##.
But I don't think the mean of the values is going to tell you what you want to know.
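
As a quick check of that arithmetic, here is a minimal Python sketch. It assumes the 0.1pp per-value uncertainty suggested earlier in the thread, and it assumes the five uncertainties are independent and equal; neither assumption comes from the data itself.

```python
import math

sigma_single = 0.1  # assumed uncertainty on each quoted rate, in pp
n = 5               # number of countries in the table

# Uncertainty on the mean of n independent values that share the same uncertainty
sigma_mean = sigma_single / math.sqrt(n)

print(f"uncertainty on the mean = {sigma_mean:.3f}pp")
```

The result is a little under 0.045pp, which is the figure quoted as 0.04 above.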

Consider: if you only had two values, how would you see if they were significantly similar?

Office_Shredder · Nov 5, 2013

There are two distinct questions that you can be asking:

1.) I measured five data points, but it's impossible to measure the GDP of a nation with 100% accuracy. Are these measurements statistically different given the noise in the measurements?

2.) I have good measurements, but a nation's GDP growth can be affected by temporary effects. Are these five economies growing at significantly different rates, statistically speaking?

Which one are you looking for?

musicgold · Nov 11, 2013

Office_Shredder said:

There are two distinct questions that you can be asking:

1.) I measured five data points, but it's impossible to measure the GDP of a nation with 100% accuracy. Are these measurements statistically different given the noise in the measurements?

2.) I have good measurements, but a nation's GDP growth can be affected by temporary effects. Are these five economies growing at significantly different rates, statistically speaking?

Which one are you looking for?

Thanks. I think I am looking for the first one. How should I figure that out?

Simon Bridge · Nov 11, 2013

E.g. let's say you have two data points ##x\pm\sigma_x## and ##y\pm\sigma_y## and you want to know if they are statistically different from each other.

That would be like asking if the difference ##x-y## is within some confidence interval of zero.

You know how to do "hypothesis testing" right?

In your case, you need some way of estimating the statistical uncertainty in each individual measurement.
A common estimator is to take half the smallest quoted place-value in the measurement.
i.e. A: (3.3±0.05)%

You also have more than two data points.
It is possible that countries A and B have similar enough growth to be statistically the same but A and E are statistically different.
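
The pairwise comparison described above could be sketched in Python roughly as follows. Everything numerical here beyond the table is an assumption: the ±0.05pp uncertainty is the "half the last quoted digit" estimate, the errors are treated as independent and roughly normal, and the 1.96 threshold is the usual two-sided 95% cut-off. The names `z_score` and `Z_CRIT` are made up for illustration, and no correction is applied for running ten comparisons at once.

```python
import math
from itertools import combinations

# GDP growth rates from the table, in percent
rates = {"A": 3.3, "B": 3.0, "C": 2.9, "D": 2.8, "E": 2.4}

SIGMA = 0.05   # assumed per-value uncertainty in pp (half the last quoted digit)
Z_CRIT = 1.96  # assumed two-sided 95% threshold for a normal distribution


def z_score(x, y, sigma_x, sigma_y):
    """Return the difference x - y in units of its combined uncertainty."""
    return (x - y) / math.sqrt(sigma_x**2 + sigma_y**2)


results = {}
for (a, xa), (b, xb) in combinations(rates.items(), 2):
    z = z_score(xa, xb, SIGMA, SIGMA)
    results[(a, b)] = z
    verdict = "different" if abs(z) > Z_CRIT else "not distinguishable"
    print(f"{a} vs {b}: diff = {xa - xb:+.1f}pp, z = {z:+.2f} -> {verdict}")
```

Under these assumed inputs, only the pairs that are 0.1pp apart (B and C, C and D) stay below the threshold. A larger per-value uncertainty, such as the 0.1pp suggested earlier, would change which pairs count as distinguishable, so the conclusion depends heavily on how the measurement uncertainty is estimated.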
