# Sampling distribution of the mean

1. Oct 30, 2015

### toothpaste666

1. The problem statement, all variables and given/known data
Suppose that 50 random samples of size n = 10 are to be taken from a population having the discrete uniform distribution
f(x) = 1/10 for x = 0, 1, 2, ..., 9
f(x) = 0 elsewhere
Sampling is with replacement, so that we are sampling from an infinite population. We get 50 random samples whose means are ... (they list 50 means)

Suppose that we convert the 50 samples into 25 samples of size n = 20 by combining the first two, the next two, and so on. Find the means of these samples and calculate their mean and their standard deviation. Compare this mean and this standard deviation with the corresponding values expected in accordance with the following theorem: if a random sample of size n is taken from a population having the mean μ and variance σ^2, then the sample mean X̄ is a random variable whose distribution has the mean μ. For samples from infinite populations, the variance of this distribution is σ^2/n.

3. The attempt at a solution
I just want to make sure my method is correct. I think what they mean by "combining" two means is to find the mean of the two means. So if the first two means out of the 50 that they list are 4.4 and 3.2, I combine them by finding the mean (4.4+3.2)/2 = 3.8, and this is now the mean of a sample of size 20 instead of 10. Once I combine the 50 samples into 25 samples this way, I find the mean and standard deviation of the 25 sample means using the formulas x̄ = Σx/n and s^2 = Σ(x-x̄)^2/(n-1). Then they want me to compare these with the ones I get from the theorem. I find these by using
μ = Σ(from 0 to 9) x(1/10) = 4.5
and
σ^2 = Σ(from 0 to 9)(x-4.5)^2(1/10) = 8.25
Since n = 20, the variance of the sample mean is
8.25/20 = 0.4125

Am I doing this the right way?
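
As a quick check on the theoretical values above, here is a minimal Python sketch that computes them exactly from the stated distribution. The names `support`, `p`, and `var_of_mean` are just illustrative choices, not anything from the textbook.

```python
from fractions import Fraction

# Support of the discrete uniform distribution f(x) = 1/10, x = 0, 1, ..., 9.
support = range(10)
p = Fraction(1, 10)

# Population mean and variance, computed exactly with rational arithmetic.
mu = sum(x * p for x in support)
sigma2 = sum((x - mu) ** 2 * p for x in support)


def var_of_mean(n):
    """Variance of the sample mean for samples of size n (infinite population)."""
    return sigma2 / n


print(float(mu))               # 4.5
print(float(sigma2))           # 8.25
print(float(var_of_mean(10)))  # 0.825
print(float(var_of_mean(20)))  # 0.4125
```

The corresponding theoretical standard deviation for n = 20 is the square root of 0.4125, roughly 0.642.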

2. Oct 31, 2015

### krebs

$$\mu = \sum\limits_{x=0}^9 \frac{x}{10} = 4.5$$

$$\sigma^2 = \sum\limits_{x=0}^9 \frac{(x-4.5)^2}{10} = 8.25$$

Where are you getting the 10 in the denominator from? You have 25 numbers in your data table...

3. Oct 31, 2015

### Ray Vickson

He has 10 x 50 = 500 numbers $X_1, X_2, \ldots, X_{500}$, with each $X_i$ being an independent sampled value from UNIF{0,1,...,9}. I think he is taking $\mu$ and $\sigma^2$ to be $EX_i$ and $\text{Var} X_i$, which do, indeed, have '10' in the denominator. Then, he is computing
$$\text{Var} \left( \frac{1}{20} \sum_{i=1}^{20} X_i \right) = \sigma^2/20$$
I don't think the wording of the question is crystal clear, but his interpretation is one defensible reading.
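
For reference, the step behind that formula can be written out as follows, assuming the $X_i$ are independent with common variance $\sigma^2 = 8.25$ (which is what sampling with replacement gives):
$$\text{Var} \left( \frac{1}{20} \sum_{i=1}^{20} X_i \right) = \frac{1}{20^2} \sum_{i=1}^{20} \text{Var}(X_i) = \frac{20 \sigma^2}{400} = \frac{\sigma^2}{20} = \frac{8.25}{20} = 0.4125$$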

Last edited: Oct 31, 2015
4. Oct 31, 2015

### toothpaste666

Is it correct that I combine the samples like that? For some reason my variance is coming out negative.

5. Oct 31, 2015

### krebs

Oh, that could be. The way I read it is that he was just given a list of 50 means, and he needed to calculate the mean and standard deviation of that list, and then repeat for a list of 25 made by combining every set of two means. I find it hard to believe that his data table has 500 numbers for him to deal with.

Last edited: Oct 31, 2015
6. Oct 31, 2015

### Ray Vickson

Note the typo in the above; I have corrected it in Post # 3. I should have written Var(1/20 sum X_i), not Var (sum X_i).

7. Oct 31, 2015

### Ray Vickson

No, it does not: he was not GIVEN 500 numbers. He was given 50 numbers, each of which is a sample-mean of size 10. However, the data was stated to come from a uniform distribution, and of course came from massaging 500 numbers, 10 at a time.

As I said, the wording of the question (if accurately reported) leaves a lot of room for interpretation. Personally, I would NOT have used the OP's interpretation, because it would have made more sense to me to look at his bundle of 50 numbers $\{ \bar{x}_i, i=1,2, \ldots, 50 \}$ as the data themselves, and to look not at the "theoretical" variance, but rather at the "sample" variance, which would be given by
$$\text{Sample Var} = \frac{1}{49} \sum_{i=1}^{50} (\bar{x}_i - \bar{\bar{x}})^2,$$
where $\bar{\bar{x}} = \sum_{i=1}^{50} \bar{x}_i / 50$ is the sample mean of the $\{ \bar{x}_i \}$ data. That would have given rise to the thornier question of what happens when you combine the data into $y_1 = (\bar{x}_1+\bar{x}_2)/2, \: y_2 = (\bar{x}_3 + \bar{x}_4)/2, \ldots, \: y_{25} = (\bar{x}_{49} + \bar{x}_{50})/2$, and then try to get an appropriate formula for the sample variance of the $\{ y_j \}$ data in terms of sample variances associated with the original data. For example, when we deal with the "theoretical" variance, it does not matter whether we combine the x's into y's before or after taking the variance, because the outcome will be the same either way. However, the question arises whether this remains true for "sample" variances rather than "theoretical" variances.

8. Oct 31, 2015

### krebs

Sorry toothpaste, I misread your question. I see what you are trying to do now. For this distribution,
μ = 4.5
σ^2 = 8.25

If you take infinitely many samples of χ of size n = 1, then the variance of the sample mean is σ^2/1 = 8.25
If you take infinitely many samples of χ of size n = 10, then the variance of the sample mean is σ^2/10 = 0.825
If you take infinitely many samples of χ of size n = 20, then the variance of the sample mean is σ^2/20 = 0.4125

So you know the expected variance of the sample mean for different sample sizes n, if you sample an infinite number of times.

Now you have to verify it using your 50 samples of χ where n = 10, and your 25 samples where n = 20. Can you calculate the variance for your 50 and 25 observations?

9. Oct 31, 2015

### krebs

I went ahead and modeled this in Excel for you, so you can see that it is true. Notice that as the number of sample means increases, their variance approaches the expected value (which depends on the size of each sample). You only have 50 samples for n=10, and 25 for n=20, so your variances should be a bit different from the expected values, unless your textbook massaged the numbers to demonstrate this point.
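
For anyone without a spreadsheet handy, the same kind of experiment can be sketched in Python. This is not the spreadsheet model itself, only a rough stand-in: the function name `variance_of_sample_means`, the seed, and the sample counts are arbitrary assumptions chosen for illustration.

```python
import random
from statistics import variance

random.seed(0)  # arbitrary seed, only so that a run is repeatable


def variance_of_sample_means(n, num_samples):
    """Draw num_samples samples of size n from {0, ..., 9} with replacement
    and return the sample variance (divisor num_samples - 1) of their means."""
    means = [
        sum(random.randrange(10) for _ in range(n)) / n
        for _ in range(num_samples)
    ]
    return variance(means)


# Compare the simulated variance of the sample means with sigma^2 / n.
for n in (10, 20):
    for num_samples in (25, 50, 20000):
        simulated = variance_of_sample_means(n, num_samples)
        print(n, num_samples, round(simulated, 4), 8.25 / n)
```

With only 25 or 50 sample means the simulated variance can land noticeably above or below 8.25/n, while with many thousands it should settle close to it.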

#### Attached Files:

• ###### stats.png
10. Oct 31, 2015

### toothpaste666

It was done for the 50 samples as an example in my book, but they didn't really show the work; they just stated the answers and compared them. The exercise says to combine the 50 from the example into 25 and do it for those. After combining them, these are the 25 values I get:

3.8, 4.3, 4.3, 5.1, 4.9
4.2, 4.1, 4.2, 4.9, 4.2
3.0, 5.2, 4.3, 4.5, 3.8
5.4, 5.6, 5.7, 4.0, 5.1
3.2, 4.5, 3.4, 5.0, 4.5

First I calculated

Σxi = 111.2
and
Σxi^2 = 506.72

Then the mean is
x̄ = Σxi/n = 111.2/20 = 5.56

and the variance is
s^2 = [Σxi^2 - (Σxi)^2/n]/(n-1) = [506.72 - (111.2)^2/20]/19 = -5.87

But I know this can't be right, because it is negative and that would mean the standard deviation is complex. I know I did something wrong but I can't figure out what.

11. Nov 1, 2015

### Ray Vickson

You need to be dividing by 25 and 24, not by 20 and 19. That will leave you with a positive variance.
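
A minimal Python sketch of the corrected arithmetic, using the 25 combined means posted above and the same shortcut formula for the sample variance. The variable names (`y`, `k`, `ybar`, `s2`) are illustrative only.

```python
from math import sqrt

# The 25 combined sample means (each is the mean of a sample of size 20).
y = [3.8, 4.3, 4.3, 5.1, 4.9,
     4.2, 4.1, 4.2, 4.9, 4.2,
     3.0, 5.2, 4.3, 4.5, 3.8,
     5.4, 5.6, 5.7, 4.0, 5.1,
     3.2, 4.5, 3.4, 5.0, 4.5]

k = len(y)                        # 25 observations, not 20
sum_y = sum(y)                    # about 111.2
sum_y2 = sum(v * v for v in y)    # about 506.72

ybar = sum_y / k                                # mean of the 25 means
s2 = (sum_y2 - sum_y ** 2 / k) / (k - 1)        # sample variance, divisor 24
s = sqrt(s2)                                    # sample standard deviation

print(round(ybar, 3))   # 4.448
print(round(s2, 4))     # 0.5043
print(round(s, 3))      # 0.71

# Values expected from the theorem for n = 20.
print(4.5, 8.25 / 20, round(sqrt(8.25 / 20), 3))   # 4.5 0.4125 0.642
```

So the observed mean (about 4.45) and standard deviation (about 0.71) are close to, but not exactly, the theoretical 4.5 and 0.642.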

Last edited: Nov 1, 2015
12. Nov 1, 2015

### krebs

Variance is a statistic you calculate based off of a set of numbers with no other context besides the numbers. Why are you using 20 and 19?

13. Nov 1, 2015

### toothpaste666

I got mixed up with the sample size n=20 for the 25 samples. I see the mistake now. Thank you.