Sampling distribution of mean item number?

In summary: for sampling with replacement, the mean number drawn in a sample of size 30 has expected value 5.5 and variance Var(M) = 8.25/30 = 0.275, giving a standard error of about 0.5244.
  • #1
Economics2012

Homework Statement



If you have a box of 1,000 items, with numbers 1-10 on them, 100 for each!
This gives a discrete uniform probability distribution, with probability 1/10 for each number.

Homework Equations


Mean = μ = E(X)
The std dev is worked out from the variance: std dev = square root of the variance.
Variance (5 columns): x, x − μ, (x − μ) squared, p(x), and (x − μ) squared multiplied by p(x)

The Attempt at a Solution


With probability 1/10 for each number, I got a mean of 5.5 and a std dev of 2.872 when I worked this out.

If I am asked then to conduct a sampling distribution of the mean item number? with a sample size 30? how would you do this? Is it just finding the mean and std error or is there a few steps?

What new mean and standard error would it be? Would it still be a mean of 5.5 and a std error of 0.5244? (old std dev / square root of 30)

This is as much as I can work out? I'm stuck basically on the conduct of a sampling distribution of the mean item? If I posted this wrong can somebody tell me, I don't want a warning :)
 
  • #2
"Central Limit Theorem": If you have n samples from any distribution with mean [itex]\mu[/itex] and finite standard deviation [itex]\sigma[/itex], then the mean value of the samples is approximately normal with mean [itex]\mu[/itex] and standard deviation [itex]\sigma/\sqrt{n}[/itex]. The larger n is, the better the approximation.
 
  • #3
Sorry, I don't understand what you mean?
 
  • #4
Why do you keep using question marks at the ends of sentences that are not questions? (That is a really, really annoying bad habit.) Now on to your questions.

First you need to decide whether the sampling is "with replacement" or "without replacement".

In sampling with replacement we put each sampled item back into the box before drawing out the next item; and we shake up the box vigorously, or in some other way ensure randomness, before each drawing. This ensures that the drawings are independent of each other, and makes the subsequent analysis much easier.

In sampling without replacement we draw out items one-by-one but do not put them back in the box. Therefore, the successive drawings are not independent, because, for example, if I get the number '1' on my first draw there are now 999 items left in the box and 99 of them are labelled '1'. That changes the probabilities on the next draw, etc. The sampling problem without replacement is harder to deal with.

So, let's take the case of sampling with replacement. For a sample of size n (n = 30 in your example) the mean number drawn is the sample average
[tex] M = \frac{X_1 + X_2 + \cdots + X_n}{n}, [/tex] where [itex]X_i[/itex] is the number picked in the ith draw. Here the random variables [itex] X_1, X_2, \ldots, X_n[/itex] are independent and all have the same distribution [itex] P\{X_i = k \} = 1/10, \; k = 1, 2, \ldots, 10.[/itex] There are some basic facts that you can find in books or on-line: (1) the expectation of a sum is the sum of the expectations; (2) the expectation of cX is c times the expectation of X; (3) the variance of a sum of independent random variables is the sum of the variances; and (4) the variance of cX is c^2 times the variance of X. We have [itex] E(X_i) = 5.5 \text{ and } \text{Var}(X_i) = 33/4 = 8.25,[/itex] so
[tex] E(M) = \frac{n \cdot 5.5}{n} = 5.5 \text{ and } \text{Var}(M) = \frac{n \cdot 8.25}{n^2} = \frac{8.25}{n}.[/tex] For n = 30 the variance is 8.25/30 = 0.275 and the standard deviation is the square root of this, which is 0.5244.
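
As a quick check of these numbers, here is a minimal Python sketch that rebuilds the mean, variance and standard error from the probability table. It assumes sampling with replacement, and the variable names are illustrative rather than anything from the thread.

```python
from fractions import Fraction
from math import sqrt

# Discrete uniform distribution on 1..10: P{X = k} = 1/10
values = range(1, 11)
p = Fraction(1, 10)

# E(X) = sum over k of k * p(k)
mean_x = sum(k * p for k in values)
# Var(X) = sum over k of (k - mean)^2 * p(k)
var_x = sum((k - mean_x) ** 2 * p for k in values)

n = 30                  # sample size
mean_m = mean_x         # E(M) = E(X)
var_m = var_x / n       # Var(M) = Var(X) / n
se_m = sqrt(var_m)      # standard deviation (standard error) of M

print(float(mean_x), float(var_x))    # 5.5 8.25
print(float(var_m), round(se_m, 4))   # 0.275 0.5244
```

Exact fractions are used so that 33/4 and 11/40 come out without rounding noise; only the final square root is a float.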

What about the distribution of M? The exact distribution can (for given n) be obtained numerically by recursive methods, but there is no really easy way of getting it. However, if n is 'large', say >= 20, the distribution of M is approximately normal with mean 5.5 and variance 8.25/n; the approximation will be good enough for practical purposes if we stay within 2-3 standard deviations of the mean. So, for example, if n = 30 and you want [itex]P\{ M \leq 6.5 \},[/itex] you can use the fact that 6.5 = 5.5 + k(0.5244), where k = 1/0.5244 ~ 1.9069, so the required probability is approximately P{N(0,1) <= 1.9069} = 0.9717.
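
The normal-approximation step can be sketched in a few lines of Python. This is only an illustration: `normal_cdf` is a helper name chosen here, built on the standard library's `math.erf`, and it stands in for looking the value up in a normal table.

```python
from math import erf, sqrt

def normal_cdf(z):
    """Standard normal CDF, P{N(0,1) <= z}, via the error function."""
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))

mu = 5.5                # E(M)
se = sqrt(8.25 / 30)    # standard deviation of M for n = 30

k = (6.5 - mu) / se     # how many standard deviations 6.5 lies above the mean
p = normal_cdf(k)       # approximate P{M <= 6.5}

print(round(k, 4), round(p, 4))
```

The printed values should agree with k ~ 1.9069 and the probability of roughly 0.9717 quoted above.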

The so-called Central Limit Theorem guarantees that in the limit of large n, the appropriately normalized version of M (sqrt(n)*(M - 5.5)/sqrt(8.25) in this case) converges in distribution to a standard normal, meaning that its distribution converges to that of N(0,1). That justifies the use of a normal approximation for large n.

Good luck working with the case of sampling without replacement; it can be done, but it is a lot more complicated and would require large amounts of computation to get numerical answers.

RGV
 
  • #5
Thank you very much, you're too good.

Sorry about the question marks.

Can I ask you one more thing

What would be the probability of drawing 30 tiles and obtaining a mean tile number of less than 4?

I keep getting .9332 by using the tables of the normal distribution, I know that's probably wrong though :/
 
  • #6
Show your work; otherwise it is impossible for me to tell you what you have done wrong.

RGV
 
  • #7
I knew it had to do with sampling, so I just took 4 from 5.5, got 1.5, and looked that up in the table at http://www.cs.washington.edu/homes/jrl/normal_cdf.pdf. That's how I got it; I think that's more than likely completely incorrect.
P(M>4), that's what I thought you did. Am I way off?
 
  • #8
I did not "do" anything like saying P(M>4); YOU did that. Anyway, you need P{M < 4}, and using the Normal approximation we need the probability that Z < z, where z = (4 - 5.5)/(0.5244) = -2.8604; that is, P{Z < -2.8604}. The normal approximation might be dicey in this case because we are right near or a bit beyond the limit of where the normal approximation is trustworthy when n is not very, very large.

Before calculating anything, it is always a good idea to get a "feel" for the range of an answer. In this case you want P{M < 4} and 4 is less than the population mean 5.5. Therefore, the probability of falling below 4 will be < 1/2. That should be enough to warn you that an answer like 0.9332 cannot possibly be right.

Note added in editing: we can compute the exact answer and the normal approximation and compare them. For n = 30 and M = S/30 <= 4 we have the sum S <= 120; for M < 4 we have S < 120, so S <= 119 (because the values of the sum S are integers). I am not sure whether or not you want M <= 4 or M < 4; it makes a difference in this case.

[tex] P_{exact}\{ S \leq 120 \} = 0.002155197756, \; P_{normal}\{ S \leq 120 \} = 0.002115616446 [/tex]
and
[tex] P_{exact}\{ S \leq 119 \} = 0.001746718891, \; P_{normal} \{ S \leq 119 \} = 0.001728090515.[/tex]
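
For anyone who wants to reproduce this kind of comparison, here is a rough Python sketch. It builds the exact distribution of the sum S by repeated convolution, which is one possible recursive method and not necessarily the one used for the figures above. The function names are made up for illustration, and the normal value is computed without a continuity correction.

```python
from math import erf, sqrt

def sum_pmf(n, faces=10):
    """Exact pmf of S = X_1 + ... + X_n, each X_i uniform on 1..faces.

    Returns a dict {s: P{S = s}}, built by convolving one draw at a time.
    """
    pmf = {0: 1.0}
    for _ in range(n):
        nxt = {}
        for s, ps in pmf.items():
            for k in range(1, faces + 1):
                nxt[s + k] = nxt.get(s + k, 0.0) + ps / faces
        pmf = nxt
    return pmf

def normal_cdf(z):
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))

n = 30
pmf = sum_pmf(n)
mean_s = n * 5.5          # E(S) = 165
sd_s = sqrt(n * 8.25)     # standard deviation of S

for cutoff in (120, 119):
    exact = sum(p for s, p in pmf.items() if s <= cutoff)
    approx = normal_cdf((cutoff - mean_s) / sd_s)
    print(cutoff, exact, approx)
```

Working with the sum S instead of M keeps everything in integers, which is why the cutoffs are 120 and 119 rather than 4.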


RGV
 
  • #9
Thank you so much :)

Can I ask why you use 120 here?
 

Related to Sampling distribution of mean item number?

What is a sampling distribution of mean item number?

A sampling distribution of mean item number is a theoretical probability distribution that represents the possible values of the mean of a sample drawn from a population.

Why is the sampling distribution of mean item number important?

The sampling distribution of mean item number is important because it allows us to make inferences about the population from which the sample was drawn. It also helps us to understand the variability of the sample mean and make more accurate predictions.

How is the sampling distribution of mean item number calculated?

The sampling distribution of mean item number is calculated by taking multiple random samples from a population and calculating the mean for each sample. These means are then plotted on a graph to create the distribution.
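
A simulation along these lines might look like the following Python sketch for the box of items numbered 1-10. The seed and the number of repetitions are arbitrary choices made here, and the draws are taken with replacement.

```python
import random
from statistics import mean, pstdev

random.seed(2012)   # arbitrary seed so the run is repeatable

n = 30              # sample size
reps = 10_000       # number of simulated samples (arbitrary)

# Each sample: n draws with replacement from the numbers 1..10,
# reduced to its sample mean.
sample_means = [
    sum(random.randint(1, 10) for _ in range(n)) / n
    for _ in range(reps)
]

print(round(mean(sample_means), 3))     # should be close to 5.5
print(round(pstdev(sample_means), 3))   # should be close to 0.524
```

Plotting a histogram of `sample_means` would show the roughly bell-shaped sampling distribution centred near 5.5.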

What factors can affect the shape of the sampling distribution of mean item number?

The shape of the sampling distribution of mean item number can be affected by the sample size, the variability of the population, and the shape of the population distribution. As the sample size increases, the shape of the distribution becomes more symmetrical and resembles a normal distribution.

How does the central limit theorem relate to the sampling distribution of mean item number?

The central limit theorem states that as the sample size increases, the sampling distribution of mean item number will become more normal regardless of the shape of the population distribution. This allows us to make inferences about the population mean using the sample mean.
