Bootstrap sample probability question

In summary, when a bootstrap sample of size ##n## is drawn with replacement from ##n## distinct values, the probability that it contains exactly two unique values is ##\binom{n}{2}\left( \frac{2^n}{n^n} - \frac{2}{n^n}\right)##.
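As a sanity check on the formula in the summary, here is a minimal sketch that compares it with a brute-force enumeration of all ##n^n## ordered bootstrap samples for small ##n##. The function names `exact_probability` and `brute_force_probability` are made up for this illustration, and the original values are represented by the indices 0 to n-1.

```python
from fractions import Fraction
from itertools import product
from math import comb


def exact_probability(n: int) -> Fraction:
    """Closed form from the summary: C(n,2) * (2^n - 2) / n^n."""
    return Fraction(comb(n, 2) * (2**n - 2), n**n)


def brute_force_probability(n: int) -> Fraction:
    """Enumerate every ordered sample of size n drawn from n distinct items
    and count those that contain exactly two unique values."""
    hits = sum(
        1 for sample in product(range(n), repeat=n) if len(set(sample)) == 2
    )
    return Fraction(hits, n**n)


if __name__ == "__main__":
    # Enumeration grows as n^n, so keep n small.
    for n in range(2, 7):
        assert exact_probability(n) == brute_force_probability(n)
        print(n, exact_probability(n))
```

Exact fractions are used instead of floats so the comparison is an equality test rather than a tolerance check.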
  • #1
infk
If we have ##x_1, \ldots , x_n##, all distinct values, and sample from them with replacement to obtain a bootstrap sample ##x^{\star}_1, \ldots , x^{\star}_n##, what is the probability that the bootstrap sample has only two unique values?

My attempt at a solution:

There are ##\binom{n}{2}## possible pairs in the original sample.
When sampling with replacement, there are ##n^n## possible bootstrap samples. The number of ways that two unique values can occur is ##n-1##, so the sought-for probability is:
##\binom{n}{2} \frac{n-1}{n^n}##.
 
  • #2
You have ##n^n## possible samples, which we assume are all equally likely. To find the probability that a sample contains only two unique values, you need to count the number of cases that meet that criterion. So ask yourself...

1. How many ways can you choose the two unique values?

2. Given two unique values, how many samples of size n can you choose which contain only those values?
 
  • #3
awkward said:
You have ##n^n## possible samples, which we assume are all equally likely. To find the probability that a sample contains only two unique values, you need to count the number of cases that meet that criterion. So ask yourself...

1. How many ways can you choose the two unique values?

2. Given two unique values, how many samples of size n can you choose which contain only those values?
1. There are ##\binom{n}{2}## ways to do that.

2. The two unique values must fill all ##n## positions between them. If the first one occurs once, the other must occur ##n-1## times; if the first occurs twice, the other occurs ##n-2## times; and so on, giving in total ##n-1## possible ways to distribute the two unique values in the bootstrap sample.

These two steps combined mean that the probability is ##\binom{n}{2} \frac{n-1}{n^n}##, but this is incorrect (why?).

Alternatively, the probability of choosing either of the two values on all ##n## draws is ##\frac{2^n}{n^n}##. It is also true that for two given distinct values this can occur in ##n-1## ways. Thus the probability of choosing two given values ##n## times is ##\frac{2^n}{n^n}(n-1)##. Moreover, this holds for exactly ##\binom{n}{2}## pairs of values.
So the probability is ##\binom{n}{2}\frac{2^n}{n^n}(n-1) = \binom{n}{2}\left( \frac{2^n(n-1)}{n^n}\right)##. According to the solution, the correct answer should be:
##\binom{n}{2}\left( \frac{2^n}{n^n} - \frac{2}{n^n}\right)##
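One way to see which of the three candidate expressions can be right is to compare their numerators with a direct count for a small case. The sketch below does that; the helper names (`count_two_value_samples`, `attempt_one`, `attempt_two`, `quoted_solution`) are hypothetical labels for the expressions in this post, not anything from a library.

```python
from itertools import product
from math import comb


def count_two_value_samples(n: int) -> int:
    """Brute force: ordered samples of size n from n items
    that contain exactly two unique values."""
    return sum(1 for s in product(range(n), repeat=n) if len(set(s)) == 2)


def attempt_one(n: int) -> int:
    """Numerator of the first attempt: C(n,2) * (n-1).
    This counts only the splits (k, n-k), not the ordered samples."""
    return comb(n, 2) * (n - 1)


def attempt_two(n: int) -> int:
    """Numerator of the second attempt: C(n,2) * 2^n * (n-1)."""
    return comb(n, 2) * 2**n * (n - 1)


def quoted_solution(n: int) -> int:
    """Numerator of the quoted solution: C(n,2) * (2^n - 2)."""
    return comb(n, 2) * (2**n - 2)


if __name__ == "__main__":
    n = 4
    print("total samples n^n:", n**n)
    print("brute-force count:", count_two_value_samples(n))
    print("attempt one      :", attempt_one(n))
    print("attempt two      :", attempt_two(n))
    print("quoted solution  :", quoted_solution(n))
```

For ##n = 4## there are 256 samples in total. The direct count is 84, which matches the quoted solution; the first attempt gives 18, and the second gives 288, which would be a probability above 1. The first attempt falls short because the ##n^n## equally likely outcomes are ordered sequences, and a split in which one value occurs ##k## times corresponds to ##\binom{n}{k}## sequences, not one.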
 
  • #4
You are off on your answer to 2. If you have coins numbered 1 through n, each of which can be heads or tails (and order counts), how many arrangements are possible?

(Two of those arrangements are special, because they are either all tails or all heads.)
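The coin picture can be checked by enumeration. This is only a sketch: heads and tails stand in for the two chosen values, and `coin_arrangements` is a name invented for the illustration.

```python
from itertools import product


def coin_arrangements(n: int) -> tuple[int, int]:
    """Return (all arrangements of n ordered coins,
    arrangements that show both heads and tails)."""
    arrangements = list(product("HT", repeat=n))
    mixed = [a for a in arrangements if "H" in a and "T" in a]
    return len(arrangements), len(mixed)


if __name__ == "__main__":
    for n in range(2, 8):
        total, mixed = coin_arrangements(n)
        # Only "all heads" and "all tails" fail to show both faces.
        assert total == 2**n and mixed == 2**n - 2
        print(n, total, mixed)
```

Each mixed arrangement corresponds to one ordered sample that uses both of the two chosen values, which is where the ##2^n - 2## in the quoted solution comes from.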
 

FAQ: Bootstrap sample probability question

1. What is a bootstrap sample in probability?

A bootstrap sample is a sample drawn at random, with replacement, from the observed data set, usually with the same size as the original data. It is commonly used in statistics to estimate the variability of a statistic or to make inferences about a population from a single observed sample.

2. How is a bootstrap sample different from a traditional sample?

A traditional sample is a subset of a population that is selected without replacement, meaning that once an individual is selected, they cannot be chosen again. In contrast, a bootstrap sample is drawn with replacement from the observed data, so the same observation can appear more than once. Repeating this resampling approximates how a statistic would vary across samples from the population.
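The difference can be sketched with Python's standard `random` module. The data labels and the seed are arbitrary choices for illustration.

```python
import random

# Arbitrary labels standing in for five distinct observations.
data = ["x1", "x2", "x3", "x4", "x5"]

rng = random.Random(0)  # fixed seed only so the run is repeatable

# Bootstrap sample: n draws WITH replacement, so repeats are allowed.
bootstrap_sample = rng.choices(data, k=len(data))

# Sampling WITHOUT replacement: with k = n this is just a permutation.
without_replacement = rng.sample(data, k=len(data))

print("with replacement   :", bootstrap_sample)
print("without replacement:", without_replacement)
print("unique values in bootstrap sample:", len(set(bootstrap_sample)))
```

`random.choices` draws with replacement and `random.sample` draws without, so only the first can return the same observation twice.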

3. What is the purpose of using bootstrap samples in probability?

The purpose of using bootstrap samples is to estimate the variability of a statistic, for example its standard error or a confidence interval, or to make inferences about a population when analytical formulas are unavailable or rely on assumptions, such as normality, that may not hold. Bootstrap results can still be unreliable when the original sample is very small.
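As a rough illustration of estimating variability this way, the sketch below computes a bootstrap standard error of the mean using only the standard library. The data values are made up, and `bootstrap_standard_error`, its default of 2000 resamples, and the seed are assumptions chosen for the example rather than recommendations.

```python
import random
import statistics


def bootstrap_standard_error(data, statistic=statistics.mean,
                             n_resamples=2000, seed=0):
    """Estimate the standard error of `statistic` by resampling `data`
    with replacement and taking the spread of the replicates."""
    rng = random.Random(seed)
    n = len(data)
    replicates = [
        statistic(rng.choices(data, k=n))  # one bootstrap sample of size n
        for _ in range(n_resamples)
    ]
    return statistics.stdev(replicates)


# Made-up measurements, for illustration only.
data = [2.1, 3.4, 1.9, 5.0, 4.2, 3.3, 2.8, 4.7]

se = bootstrap_standard_error(data)
analytic = statistics.stdev(data) / len(data) ** 0.5  # textbook s / sqrt(n)

print("bootstrap SE of the mean:", round(se, 3))
print("analytic  SE of the mean:", round(analytic, 3))
```

For the mean the two numbers should be close, which makes it a convenient check; the bootstrap becomes more useful for statistics such as the median, where no simple formula is available.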

4. How is the size of a bootstrap sample determined?

In the standard bootstrap, each bootstrap sample has the same size ##n## as the original data set, as in the question above. What the analyst chooses is the number of bootstrap replicates; more replicates reduce the Monte Carlo error of the estimate.

5. What is the difference between bootstrap samples and resampling?

The terms are often used interchangeably, but resampling is the broader one. Resampling refers to any method that repeatedly draws samples from the observed data, such as permutation tests, the jackknife, or cross-validation, while bootstrapping specifically means randomly sampling with replacement from a single data set to estimate variability or make inferences about a population.
