Standard deviation with frequency

Big-Daddy · Mar 16, 2013

How do I work out the standard deviation of a set of variables where the frequency changes by a certain factor each time, but where there is no limit on the number of variables?

Let's just use "test scores" as an example because that's easy to understand.

The average is 80; 1 in 5 people score 79, and the same for 81; 1 in 5^2=25 people score 78, and the same for 82; 1 in 5^3=125 people score 77 or 83; etc., etc. What is the standard deviation? And what if 5 here is replaced with n, and the average by a - how do I work out an expression in a and n for the standard deviation?

There's an additional complication too - say the average is 1000 and 1 in n people score 1100 (1 in n people score 900), 1 in n^2 people score 800 (same for 1200), etc. (The complication of course is that n now refers to the multiplication factor for 100 points difference from the average, not 1 as before). How do I work out the standard deviation then?

pbuk · Mar 16, 2013

You know that the standard deviation of a distribution is usually determined using ## \sigma = \sqrt{\frac{1}{N} \sum_{i=1}^N (x_i - \mu)^2} ##, where ## \mu ## is the mean?

The purpose of the ## \frac{1}{N} ## here is to take the mean of the squared differences. Where there is no ## N ## but we know the relative frequencies (and the frequency of the ## i{\rm th} ## term is ## f_i ##), the equivalent formula is ## \sigma = \sqrt{\sum_{i} f_i (x_i - \mu)^2} ##. Can you see why this is so?

So in your ## \mu = 80 ## example, one term is ## 0.2(79 - 80)^2 ##.
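As a rough illustration of the frequency-weighted formula, the sum can be truncated after enough terms and evaluated numerically. This is only a sketch: the function name and its parameters are made up for the example, and it assumes that whatever probability is not covered by the tails sits exactly at the mean (which needs n of at least 3). The `step` parameter covers the second example, where scores are 100 points apart.

```python
import math

def two_sided_geometric_sd(mean, n, step=1.0, terms=200):
    """Approximate sigma = sqrt(sum_i f_i * (x_i - mean)^2) by truncating the sum.

    Hypothetical helper: scores mean - k*step and mean + k*step each
    have relative frequency (1/n)**k for k = 1, 2, 3, ...
    """
    r = 1.0 / n
    variance = 0.0
    tail_mass = 0.0
    for k in range(1, terms + 1):
        f = r ** k  # frequency of each of the two scores k steps from the mean
        for x in (mean - k * step, mean + k * step):
            variance += f * (x - mean) ** 2
            tail_mass += f
    # Assumption: the remaining probability is the frequency of the mean itself,
    # which contributes nothing to the variance.
    centre_mass = 1.0 - tail_mass
    if centre_mass < -1e-12:
        raise ValueError("frequencies sum to more than 1; n must be at least 3")
    return math.sqrt(variance), centre_mass

sd, centre = two_sided_geometric_sd(80, 5)
print(round(sd, 4), round(centre, 4))  # 0.9682 0.5
```

Calling `two_sided_geometric_sd(1000, 5, step=100)` gives a standard deviation 100 times larger, since every deviation from the mean is scaled by the step.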

Why do you think this distribution is interesting?

Big-Daddy · Mar 16, 2013

MrAnchovy said:

You know that the standard deviation of a distribution is usually determined using ## \sigma = \sqrt{\frac{1}{N} \sum_{i=1}^N (x_i - \mu)^2} ##, where ## \mu ## is the mean?

The purpose of the ## \frac{1}{N} ## here is to take the mean of the squared differences. Where there is no ## N ## but we know the relative frequencies (and the frequency of the ## i{\rm th} ## term is ## f_i ##), the equivalent formula is ## \sigma = \sqrt{\sum_{i} f_i (x_i - \mu)^2} ##. Can you see why this is so?

So in your ## \mu = 80 ## example, one term is ## 0.2(79 - 81)^2 ##.

Why do you think this distribution is interesting?

Shouldn't it be ## 0.2(79 - 80)^2 ## where 80 is the mean ## \mu ## ?

Actually I had not come across the first formula (I know very little statistics myself), though I have used the second one from time to time.

I'm not sure why the distribution is "interesting" ...

pbuk · Mar 16, 2013

Big-Daddy said:

Shouldn't it be ## 0.2(79 - 80)^2 ## where 80 is the mean ## \mu ## ?

Yes of course, sorry - I have edited it.

Big-Daddy said:

I'm not sure why the distribution is "interesting" ...

Why are you asking the question then? My point is that I cannot think of anything in the real world that would have this distribution and, with respect, you don't seem to know enough statistics to be interested in the characteristics of a purely theoretical distribution for its own sake.

Big-Daddy · Mar 17, 2013

MrAnchovy said:

Yes of course, sorry - I have edited it.

Why are you asking the question then? My point is that I cannot think of anything in the real world that would have this distribution and, with respect, you don't seem to know enough statistics to be interested in the characteristics of a purely theoretical distribution for its own sake.

I'm just interested to see how the calculation would be done in a situation which is not defined by a set number of data points but rather by an "infinitely wide" distribution. My interest in this problem was piqued by the idea that standard deviation is better suited to describing a distribution than to measuring the spread of an exact set of data points, which led me to think of this problem.

How would the calculation proceed?

pbuk · Mar 17, 2013

Big-Daddy said:

I'm just interested to see how the calculation would be done in a situation which is not defined by a set number of data points but rather by an "infinitely wide" distribution.

Ah, I thought so. Unfortunately the distribution you described is not like most "infinitely wide" distributions that are actually used in statistics, or that appear in the real world. These distributions are described by their probability density function ## p(x) ##, which is usually continuous, so that it has values at 80.5 etc. For these distributions the standard deviation is not calculated by summing squared differences from the mean at discrete points; it is obtained by integration thus:

[itex]\sigma = \sqrt{\int_\mathbf{X} (x-\mu)^2 \, p(x) \, dx}[/itex]

Big-Daddy said:

My interest in this problem was piqued by the idea that standard deviation is better suited to describing a distribution than to measuring the spread of an exact set of data points, which led me to think of this problem.

It is true that standard deviation is not a good general tool for measuring dispersion ("how variable" the elements of a data set are): because we square the difference from the mean, points that are far from the mean (outliers) are given too much "weight" in the calculation.

So why do we use it? One reason is that it can in general be calculated for a distribution using the formula above; more robust measures of scale can in general only be calculated by evaluating discrete steps. So you see, the answer to your initial question, which involved evaluating discrete steps to calculate SD, could easily cause confusion.

Big-Daddy · Mar 17, 2013

MrAnchovy said:

Ah, I thought so. Unfortunately the distribution you described is not like most "infinitely wide" distributions that are actually used in statistics, or that appear in the real world. These distributions are described by their probability density function ## p(x) ##, which is usually continuous, so that it has values at 80.5 etc. For these distributions the standard deviation is not calculated by summing squared differences from the mean at discrete points; it is obtained by integration thus:

[itex]\sigma = \sqrt{\int_\mathbf{X} (x-\mu)^2 \, p(x) \, dx}[/itex]

It is true that standard deviation is not a good general tool for measuring dispersion ("how variable" the elements of a data set are): because we square the difference from the mean, points that are far from the mean (outliers) are given too much "weight" in the calculation.

So why do we use it? One reason is that it can in general be calculated for a distribution using the formula above; more robust measures of scale can in general only be calculated by evaluating discrete steps. So you see, the answer to your initial question, which involved evaluating discrete steps to calculate SD, could easily cause confusion.

Thank you.

I am happy to work with the continuous function for now. What exactly does your integration mean here, though:

[itex]\sigma = \sqrt{\int_\mathbf{X} (x-\mu)^2 \, p(x) \, dx}[/itex]

You seem to have put on the lower limit, "X" (which is what, by the way?), but no upper limit? And what will p(x) look like for a couple of my examples? (That should help me understand how to calculate with it.)

pbuk · Mar 17, 2013

Sorry, that could have been clearer. X refers to the whole range over which the function p(x) is defined; it is not a lower limit. Often the range is from -∞ to ∞, so the integral is ## \int_{-\infty}^\infty ##.

For your first example, p(x) is exactly as you have described it:

0.2 when x = 79, 81;
0.2² when x = 78, 82;
0.2³ ...

In the real world, exam results (and many other things - this is the most important distribution of all from a number of points of view) are likely to follow a Normal distribution with PDF:

[itex]f(x) = \frac{1}{\sigma\sqrt{2\pi}} e^{ -\frac{(x-\mu)^2}{2\sigma^2} }[/itex], which has mean ## \mu ## and standard deviation ## \sigma ##.
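To make the integral formula for ## \sigma ## concrete, here is a small numerical check that integrating ## (x-\mu)^2 p(x) ## over the Normal PDF does return the ## \sigma ## that was put in. It is only a sketch: the function names, the midpoint rule, the cut-off at ten standard deviations either side of the mean and the values ## \mu = 80 ##, ## \sigma = 5 ## are all choices made for this example.

```python
import math

def normal_pdf(x, mu, sigma):
    """Normal probability density with mean mu and standard deviation sigma."""
    return math.exp(-((x - mu) ** 2) / (2 * sigma ** 2)) / (sigma * math.sqrt(2 * math.pi))

def sd_by_integration(pdf, mu, lo, hi, steps=100_000):
    """Midpoint-rule approximation of sqrt(integral of (x - mu)^2 * pdf(x) dx) on [lo, hi]."""
    h = (hi - lo) / steps
    total = 0.0
    for i in range(steps):
        x = lo + (i + 0.5) * h  # midpoint of the i-th strip
        total += (x - mu) ** 2 * pdf(x) * h
    return math.sqrt(total)

mu, sigma = 80.0, 5.0  # illustrative values only
# The infinite range is truncated to mu +/- 10 sigma; the tails beyond that are negligible.
approx = sd_by_integration(lambda x: normal_pdf(x, mu, sigma), mu, mu - 10 * sigma, mu + 10 * sigma)
print(approx)  # should be very close to sigma
```

Any other density can be passed in place of `normal_pdf`, provided the integration range covers essentially all of its probability.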

Big-Daddy · Mar 18, 2013

MrAnchovy said:

Sorry, that could have been clearer. X refers to the whole range over which the function p(x) is defined; it is not a lower limit. Often the range is from -∞ to ∞, so the integral is ## \int_{-\infty}^\infty ##.

For your first example, p(x) is exactly as you have described it:

0.2 when x = 79, 81;
0.2² when x = 78, 82;
0.2³ ...

Can this be expressed as a single function? Right now it simply seems a set of variables ... how can I get a function representing the probability of my pick being at a certain value of x, as a function of that x?

MrAnchovy said:

In the real world, exam results (and many other things - this is the most important distribution of all from a number of points of view) are likely to follow a Normal distribution with PDF:

[itex]f(x) = \frac{1}{\sigma\sqrt{2\pi}} e^{ -\frac{(x-\mu)^2}{2\sigma^2} }[/itex], which has mean ## \mu ## and standard deviation ## \sigma ##.

Hmm, does the standard deviation not work to leave a fairly constant ratio between values?

pbuk · Mar 18, 2013

Big-Daddy said:

Can this be expressed as a single function? Right now it simply seems a set of variables ... how can I get a function representing the probability of my pick being at a certain value of x, as a function of that x?

Well ## P(x) = 0.2^{|80-x|} ## works for most natural number values of ## x ##. Compare this with the geometric distribution (note that the known mean and SD of the Geometric distribution could help you calculate the SD for your distribution).
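A sketch of how the discrete sum leads to a closed form in ## n ##, under two assumptions that are not stated in the thread: the probability not accounted for by the tails sits exactly at the mean ## a ##, and ## n \ge 3 ## so that this leftover is not negative. Writing ## r = 1/n ##, each of the scores ## a \pm k ## has probability ## r^k ## for ## k \ge 1 ##, so

$$ P(a) = 1 - 2\sum_{k=1}^{\infty} r^k = 1 - \frac{2r}{1-r} = \frac{n-3}{n-1} $$

$$ \sigma^2 = \sum_{k=1}^{\infty} 2\, r^k k^2 = \frac{2r(1+r)}{(1-r)^3} = \frac{2n(n+1)}{(n-1)^3} $$

For ## n = 5 ## this gives ## P(80) = 0.5 ## (the one point where ## 0.2^{|80-x|} ## would give 1 instead) and ## \sigma^2 = 60/64 = 0.9375 ##, so ## \sigma \approx 0.968 ##, whatever the value of ## a ##. If the scores are spaced ## s ## points apart rather than 1 (## s = 100 ## in the second example), every deviation is multiplied by ## s ## and so is ## \sigma ##.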

Big-Daddy said:

Hmm, does the standard deviation not work to leave a fairly constant ratio between values?

I don't understand what you mean. Wikipedia has a good entry on the Normal distribution too.
