1. Oct 23, 2014

### qspeechc

Hi everyone.

It's been years since I've done any stats, so I need a bit of help, please. I want to include it in a blog post I'm going to do (not here on PF), so I don't want to give away too many details :p I apologise for my terrible understanding of stats, please be patient!

Anyway, over ten years I have 20 data points for each year, i.e. 200 in total, which are positive integers. In practice they are never higher than 2000, although conceivably they could be. The assumption is that each number is generated randomly.

1) How do I test if a given data point is too large to be random, given that the other numbers tend to be smaller?

2) A 'source' produces one data point a year. How can I test whether this source is producing abnormally high numbers over the ten years?

Thank you for any help.

2. Oct 23, 2014

### Staff: Mentor

For both cases I would compute the mean and standard deviation of the distribution of your 200 data points.
1) You can check whether they follow a typical distribution (most notably the Gaussian distribution). If they do, any value that would be very unlikely under this distribution might be some real effect. You cannot be sure without a clear model, but you can get a good idea with that method.
2) Compute the mean of the source's ten values and the expected deviation of this mean (for independent values, the standard deviation of the 200 points divided by √10), then see if it is compatible with the first distribution.
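
A minimal sketch of both checks in Python, assuming the 200 values really are roughly Gaussian and independent (worth checking first, since positive integers can be quite skewed). The function names `point_z_score` and `source_z_score` are made up for illustration, and the p-values are one-sided normal tail probabilities, not a definitive test.

```python
import math
from statistics import NormalDist, mean, stdev


def point_z_score(value, data):
    """Return (z, p) for a single value against the full data set.

    z is the distance from the mean in standard deviations; p is the
    one-sided probability of a value at least this large, ASSUMING the
    data are Gaussian.
    """
    mu = mean(data)
    sigma = stdev(data)  # sample standard deviation
    z = (value - mu) / sigma
    p = 1.0 - NormalDist().cdf(z)
    return z, p


def source_z_score(source_values, data):
    """Return (z, p) for the mean of one source's values.

    The expected deviation of a mean of n independent values is
    sigma / sqrt(n), so the source mean is compared on that scale.
    """
    mu = mean(data)
    sigma = stdev(data)
    standard_error = sigma / math.sqrt(len(source_values))
    z = (mean(source_values) - mu) / standard_error
    p = 1.0 - NormalDist().cdf(z)
    return z, p
```

Here `data` would be the list of all 200 numbers and `source_values` the ten numbers from the one source. Keep in mind that if you scan all 200 points for the largest one, a small p-value for that point is less surprising than it looks, because you gave yourself 200 chances to find it.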

3. Oct 24, 2014