Help I need the expected value for a tricky situation

  • Context: Undergrad 
  • Thread starter Thread starter exdejesus
  • Start date Start date
  • Tags Tags
    Expected value Value
Click For Summary

Discussion Overview

The discussion revolves around calculating the expected number of distinct numbers assigned to 60 bins, each randomly selected from a set of 256 possible numbers. Participants explore theoretical approaches and mathematical formulations related to probability, statistics, and combinatorial methods.

Discussion Character

  • Exploratory
  • Mathematical reasoning
  • Technical explanation

Main Points Raised

  • One participant describes a scenario involving 60 bins and 256 numbers, seeking to determine the average number of distinct numbers across the bins.
  • Another participant references Stirling numbers of the second kind as a potential tool for calculating the number of ways to assign numbers to bins.
  • A subsequent reply clarifies the use of Stirling numbers in partitioning sets and suggests a method for counting distinct numbers among the bins.
  • One participant proposes a probabilistic approach using indicator random variables to calculate the expected number of distinct numbers drawn, leading to a specific formula.
  • Another participant confirms that the probabilistic method yields a value consistent with earlier simulations, indicating its validity.

Areas of Agreement / Disagreement

Participants present multiple approaches to the problem, including combinatorial methods and probabilistic reasoning. There is no consensus on a single method, as different perspectives and calculations are explored without resolution.

Contextual Notes

The discussion includes various mathematical formulations and assumptions related to probability distributions and combinatorial counting, which may depend on specific definitions and interpretations of the problem.

exdejesus
Messages
3
Reaction score
0
Hello!
I need your help. I’m really terrible at statistics, probability, permutations, and combinations. I’ve looked around a lot for the answer to this, but I haven’t seen anything just like it. I need the answer to this for some research that I’m doing.

Here’s the situation:
Imagine 60 bins or locations. Each bin is assigned one randomly-selected number, out of 256 possible numbers. It is possible that two or more bins can receive the same number.

My question is this: out of those 60 bins, on average, how many bins contain different numbers? Or, to put it another way, on average, how many different numbers are contained in the 60 bins?

Clearly, it’s possible that on some runs of this experiment, the 60 bins could all contain different numbers, so the count would be exactly 60. It’s also possible, although extremely unlikely, that all 60 bins could contain the same number, so the count would be 1.

From doing computer simulations of this situation, I get that the average count is between 53 and 54, which sounds reasonable. I just don’t know how to calculate that theoretical average count.

I also actually created simpler versions of this situation, with fewer bins and fewer numbers to assign to the bins. I then counted how many different numbers there were, and calculated the average. Here’s what I got:
For 3 bins and 4 numbers, the average count is 2.3125.
For 3 bins and 5 numbers, the average count is 2.44.
For 3 bins and 6 numbers, the average count is 2.527778.
For 3 bins and 7 numbers, the average count is 2.591837.
For 3 bins and 8 numbers, the average count is 2.640625.

So, a general formula for this situation should give these results.

I will be very grateful if you can tell me what the answer is for 60 bins and 256 numbers, and even more grateful if you can show me the formula for calculating this.

Thank you in advance!
 
Physics news on Phys.org
Last edited:
Excellent!

It sounds like Stirling numbers of the second kind are definitely related to what I need.

(By the way, I love this line from the Wikipedia article: "Stirling numbers of the second kind is one of two types of Stirling numbers, the other type being called Stirling numbers of the first kind." :) )

As I understand the description, the Stirling number tells me the total number of ways that 256 numbers can go into the 60 bins.

What I need now is a way to count how many of those "ways" use 60 different numbers, how many use 59, then 58, and so forth.

Thanks.
 
S(n,k) tells you the number of ways to partition a set of n objects into k non-empty (nondistinct) subsets. Multiply by k! to partition them into k non-empty distinct subsets (I forgot that second step - edited my earlier post).

For example: how many ways are there to have 3 distinct numbers among 5 bins? Well let's ask a slightly easier question, how many ways are there to have the specific numbers 10, 11, and 12 among 5 bins (and no other numbers)? This is the number of ways you can distribute the 5 bins among the 3 distinct sets labeled 10, 11, and 12, such that each set gets at least 1 bin. Putting a bin in one of those sets means you assign the corresponding number to that bin. So it's 3! S(5,3) = 150

But in that case we used the specific 3 numbers 10, 11, and 12, but they could really be any 3 numbers from the 60. So multiply by C(60,3) to find the number of ways to get exactly 3 numbers among 5 bins: 3! S(5,3) C(60,3) = 34220.
 
Let [tex]X_i = 1[/tex] if number i is drawn at least once, = 0 otherwise, for i = 1, 2, 3, ..., 256.

Then [tex]P(X_i = 1) = 1 - (1 - 1/256)^{60}[/tex]
and
[tex]E(X_i) = 1 - (1 - 1/256)^{60}[/tex]

So the expected number of distinct numbers drawn is
[tex]E(\sum_{i=1}^{256} X_i) = \sum_{i=1}^{256} E(X_i) = 256 \; [1 - (1 - 1/256)^{60}][/tex]

Here we have made use of the theorem that E(X+Y) = E(X) + E(Y). It's important to realize that this theorem holds even if X and Y are not independent, which is good for us in this case because the X_i's are not independent.
 
awkward said:
Let [tex]X_i = 1[/tex] if number i is drawn at least once, = 0 otherwise, for i = 1, 2, 3, ..., 256.

Then [tex]P(X_i = 1) = 1 - (1 - 1/256)^{60}[/tex]
and
[tex]E(X_i) = 1 - (1 - 1/256)^{60}[/tex]

So the expected number of distinct numbers drawn is
[tex]E(\sum_{i=1}^{256} X_i) = \sum_{i=1}^{256} E(X_i) = 256 \; [1 - (1 - 1/256)^{60}][/tex]

Here we have made use of the theorem that E(X+Y) = E(X) + E(Y). It's important to realize that this theorem holds even if X and Y are not independent, which is good for us in this case because the X_i's are not independent.
That also works... you only need the Stirling numbers if you want an explicit distribution.
 
awkward, that is it!

That gives a value of 53.5802554, which is right in the range that my simulation suggests!

Thank you so much for the formula and the explanation!
 

Similar threads

  • · Replies 37 ·
2
Replies
37
Views
5K
Replies
28
Views
4K
  • · Replies 3 ·
Replies
3
Views
2K
  • · Replies 18 ·
Replies
18
Views
3K
  • · Replies 4 ·
Replies
4
Views
2K
  • · Replies 8 ·
Replies
8
Views
5K
  • · Replies 6 ·
Replies
6
Views
2K
  • · Replies 4 ·
Replies
4
Views
8K
  • · Replies 11 ·
Replies
11
Views
2K
  • · Replies 1 ·
Replies
1
Views
2K