Bootstrap sample probability question

  • Context: Graduate 
  • Thread starter: infk
  • Tags: Probability

Discussion Overview

The discussion revolves around calculating the probability that a bootstrap sample, drawn with replacement from a set of distinct values, contains only two unique values. Participants explore different approaches to derive this probability, examining combinatorial aspects and the implications of sampling methods.

Discussion Character

  • Mathematical reasoning
  • Debate/contested

Main Points Raised

  • One participant proposes that the probability can be expressed as ##\binom{n}{2} \frac{n-1}{n^n}##, based on the number of ways to choose pairs and the total number of bootstrap samples.
  • Another participant emphasizes the need to count the cases that meet the criterion of having only two unique values, suggesting a breakdown into steps for clarity.
  • A further reply reiterates the counting method, stating there are ##\binom{n}{2}## ways to choose the two unique values and discusses the distribution of these values in the bootstrap sample.
  • One participant challenges the correctness of the initial probability calculation, suggesting an alternative formulation of ##\binom{n}{2}\left( \frac{2^n(n-1)}{n^n}\right)## and referencing a solution that includes a correction term.
  • Another participant questions the second step of the calculation, using a coin analogy to prompt further consideration of arrangements and outcomes.

Areas of Agreement / Disagreement

Participants do not reach consensus on the correct probability expression, with multiple competing views and formulations presented throughout the discussion.

Contextual Notes

Participants express uncertainty regarding the counting methods and the implications of different arrangements in the bootstrap sampling process. There are unresolved mathematical steps and assumptions that influence the proposed solutions.

infk
If we have ##x_1, \ldots , x_n##, all distinct values, and then sample from these with replacement and thus obtain a bootstrap sample ##x^{\star}_1, \ldots , x^{\star}_n##, what is the probability that the bootstrap sample has only two unique values?

My attempt at a solution:

There are ##\binom{n}{2}## possible pairs in the original sample.
When sampling with replacement, there are ##n^n## possible bootstrap samples. The number of ways that two unique values can occur is ##n-1## so the sought-for probability is:
##\binom{n}{2} \frac{n-1}{n^n}##.
 
You have ##n^n## possible samples, which we assume are all equally likely. In order to find the probability that a sample contains only two unique values, you need to count the number of cases that meet that criterion. So ask yourself...

1. How many ways can you choose the two unique values?

2. Given two unique values, how many samples of size n can you choose which contain only those values?
 
awkward said:
You have ##n^n## possible samples, which we assume are all equally likely. In order to find the probability that a sample contains only two unique values, you need to count the number of cases that meet that criterion. So ask yourself...

1. How many ways can you choose the two unique values?

2. Given two unique values, how many samples of size n can you choose which contain only those values?

1. There are ##\binom{n}{2}## ways to do that.

2. There should be in total ##n## occurrences of the only two unique values. If the first one occurs once, the other one must occur ##n-1## times; if the first one occurs twice, the other one occurs ##n-2## times, and so on, giving in total ##n-1## possible ways to distribute the two unique values in the bootstrap sample.

These two steps combined mean that the probability is ##\binom{n}{2} \frac{n-1}{n^n}##, but this is incorrect (why?).

Alternatively, the probability of choosing either of the two values ##n## times is ##\frac{2^n}{n^n}##. It is also true that for 2 given distinct values this can occur in ##n-1## ways. Thus the probability of choosing two given values ##n## times is ##\frac{2^n}{n^n}(n-1)##. Moreover, this holds for exactly ##\binom{n}{2}## pairs of values.
So the probability is ##\binom{n}{2}\frac{2^n}{n^n}(n-1) = \binom{n}{2}\left( \frac{2^n(n-1)}{n^n}\right)##. According to the solution, the correct answer should be:
##\binom{n}{2}\left( \frac{2^n}{n^n} - \frac{2}{n^n}\right)##
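One way to see which of the three expressions above can be right is to enumerate every bootstrap sample for small ##n## and count directly. The following is only a rough sketch, assuming "only two unique values" means exactly two distinct values; the function names (`brute_force`, `first_attempt`, `second_attempt`, `quoted_solution`) are made up for this illustration and are not from the thread.

```python
from fractions import Fraction
from itertools import product
from math import comb


def brute_force(n):
    """Exact probability, by enumerating all n**n equally likely samples."""
    # Indices 0..n-1 stand in for the distinct values x_1..x_n.
    hits = sum(1 for s in product(range(n), repeat=n) if len(set(s)) == 2)
    return Fraction(hits, n**n)


def first_attempt(n):
    """binom(n,2) * (n-1) / n^n, the first expression in the post."""
    return Fraction(comb(n, 2) * (n - 1), n**n)


def second_attempt(n):
    """binom(n,2) * 2^n * (n-1) / n^n, the second expression in the post."""
    return Fraction(comb(n, 2) * 2**n * (n - 1), n**n)


def quoted_solution(n):
    """binom(n,2) * (2^n - 2) / n^n, the answer quoted from the solution."""
    return Fraction(comb(n, 2) * (2**n - 2), n**n)


# Enumeration grows as n**n, so keep n small.
for n in range(2, 7):
    print(n, brute_force(n), first_attempt(n), second_attempt(n), quoted_solution(n))
```

For ##n = 3## the second expression evaluates to ##48/27##, which exceeds 1 and so cannot be a probability, while the enumeration agrees with the quoted solution for every ##n## tried here.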
 
You are off on your answer to 2. If you have coins numbered 1 through n, each of which can be heads or tails (and order counts), how many arrangements are possible?

(Two of those arrangements are special, because they are either all tails or all heads.)
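
The coin hint can be written out as a short count. This is a sketch, assuming "only two unique values" means that exactly two distinct values appear in the bootstrap sample.

Fix one pair of values, say ##a## and ##b##. Each of the ##n## positions in the sample is either ##a## or ##b## (heads or tails), so there are ##2^n## ordered samples using only these values. Two of them, all ##a## and all ##b##, contain a single unique value and must be excluded, leaving
##2^n - 2##
samples for that pair. Counting by the number ##k## of times ##a## occurs gives the same total, because the ##k## copies of ##a## can sit in ##\binom{n}{k}## different positions:
##\sum_{k=1}^{n-1} \binom{n}{k} = 2^n - 2##.
The ##n-1## in the earlier attempt counts only the possible values of ##k## and ignores these orderings. Every sample with exactly two unique values belongs to exactly one of the ##\binom{n}{2}## pairs, so
##P = \binom{n}{2}\,\frac{2^n - 2}{n^n}##,
which matches the quoted solution. For ##n = 3## this gives ##3 \cdot 6 / 27 = 2/3##.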
 