Question of finding the expected value of repeated numbers

In summary, the conversation discusses finding the expected number of duplicate pairs among randomly picked numbers (1000 numbers, each drawn from 1 to 100). The speaker initially calculates the expectation to be 4500, but is later told it should be 4995. The correct method uses the formula [tex]E[M] = \binom{N}{2} E[I_{ij}][/tex], where N is the total number of numbers drawn and [tex]E[I_{ij}][/tex] is the probability that two given numbers match. This results in an expectation of 4995, which agrees with the speaker's simulation average of about 4997.8. The speaker's initial method fails because it substitutes the average count of 10 for each number into the pair-count formula, which is not the same as averaging the pair count itself.
  • #1
KFC
Hi there,
I am thinking about the question of finding the expected number of duplicates (how many matching pairs can be formed from N rows that each contain one number). I start with 1000 numbers, each picked at random from 1 to 100. I sort the 1000 numbers, count the total number of "1"s (call it [tex]s_1[/tex]), the total number of "2"s ([tex]s_2[/tex]), and so on up to "100" ([tex]s_{100}[/tex]). For the [tex]s_1[/tex] copies of "1", the number of ways to choose a pair of "1"s is [tex]C_{s_1}^2 = \displaystyle\frac{s_1(s_1-1)}{2}[/tex]. Similarly, the number of ways to choose a pair from the [tex]s_2[/tex] copies of "2" is [tex]C_{s_2}^2 = \displaystyle\frac{s_2(s_2-1)}{2}[/tex], and so on.

So the total number of duplicate pairs is just [tex]\sum_{i=1}^{100} \frac{s_i(s_i-1)}{2}[/tex]

However, since each number appears with equal probability, in the ideal case the 1000 numbers contain ten "1"s, ten "2"s, ten "3"s, ..., ten "100"s. In other words, each of the numbers 1, 2, 3, ..., 100 appears 10 times, which gives

[tex]S = 10 \times (10-1)/2 = 45[/tex] pairs per number, so the total expectation should be [tex]\text{Total EV} = 100\times S = \frac{100 \times 10 \times (10-1)}{2} = 4500[/tex]

But someone told me the expectation should be 4995. I also set up a simulation of the above random process, ran it 10000000 times, and calculated the average. The average number of pairs comes out close to 5000 (about 4997.8). So my logic in calculating the expectation is wrong. Would anyone please show me the right way to calculate the expectation for this question? Thanks.
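
The simulation described above could be sketched roughly as follows. This is a minimal Python sketch and not the original poster's code; the function names, the default of 2000 trials (far fewer than the 10000000 mentioned), and the seed are assumptions chosen for illustration.

```python
import random
from collections import Counter


def count_matching_pairs(values):
    """Sum s*(s-1)/2 over the counts s of each distinct value."""
    counts = Counter(values)
    return sum(s * (s - 1) // 2 for s in counts.values())


def simulate_mean_pairs(n_draws=1000, n_values=100, n_trials=2000, seed=0):
    """Average the matching-pair count over n_trials independent experiments."""
    rng = random.Random(seed)  # seeded so the run is reproducible
    total = 0
    for _ in range(n_trials):
        # One experiment: n_draws numbers, each uniform on 1..n_values.
        draws = [rng.randint(1, n_values) for _ in range(n_draws)]
        total += count_matching_pairs(draws)
    return total / n_trials
```

Calling `simulate_mean_pairs()` should give an average in the neighbourhood of the 4997.8 reported above rather than 4500, with the exact figure depending on the seed and the number of trials.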
 
  • #2
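OK, so you have independent random variables [tex]X_1, \ldots, X_N[/tex] uniformly distributed on {1, ..., 100}, and if M is the number of matching pairs you want to find E[M]. Set [tex]I_{ij} = 1[/tex] if [tex]X_i = X_j[/tex] and 0 otherwise, so [tex]M = \sum_{1 \le i < j \le N} I_{ij}[/tex] and thus [tex]E[M] = \sum_{1 \le i < j \le N} E[I_{ij}] = \binom{N}{2} E[I_{ij}][/tex]. Here [tex]\binom{N}{2} = N(N-1)/2 = 499500[/tex] for N = 1000 and [tex]E[I_{ij}] = 1/100[/tex], so E[M] = 4995. The other method fails because it plugs the average count 10 into s(s-1)/2, and for a random count S the expectation of S(S-1) is not E[S](E[S]-1). The counts are also not independent of each other, but that is not the obstacle, since expectations add whether or not the terms are independent.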
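
The gap between 4500 and 4995 can also be worked out per number. As a sketch, assume the count [tex]S_i[/tex] of the value i among the N = 1000 draws is binomial with success probability p = 1/100 (this follows from independent uniform draws; the symbols [tex]S_i[/tex], p and Var are introduced here only for illustration). Then [tex]E[S_i] = Np = 10[/tex] and [tex]\mathrm{Var}(S_i) = Np(1-p) = 9.9[/tex], so

[tex]E\left[\frac{S_i(S_i-1)}{2}\right] = \frac{\mathrm{Var}(S_i) + E[S_i]^2 - E[S_i]}{2} = \frac{9.9 + 100 - 10}{2} = 49.95[/tex]

[tex]E[M] = \sum_{i=1}^{100} E\left[\frac{S_i(S_i-1)}{2}\right] = 100 \times 49.95 = 4995[/tex]

Replacing [tex]S_i[/tex] by its mean drops the variance term and gives [tex]100 \times \frac{10 \times (10-1)}{2} = 4500[/tex]. The missing 495 is exactly [tex]100 \times \mathrm{Var}(S_i)/2 = 100 \times 4.95[/tex].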
 

1. What is the expected value of repeated numbers?

The expected value of the number of repeats is the average count that would be obtained if the same random experiment were repeated a very large number of times. In other words, it is the long-term average that can be expected from the random process.

2. How is the expected value of repeated numbers calculated?

The expected value of repeated numbers is calculated by multiplying each possible outcome by its probability and then summing all of these values together. This can be represented mathematically as E(X) = Σ(x * P(x)), where x is the possible outcome and P(x) is the probability of that outcome occurring.
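
As a hedged illustration of this formula applied to the problem in the thread, the sketch below evaluates Σ(x * P(x)) exactly. It assumes the count of any one value among 1000 draws follows a Binomial(1000, 1/100) distribution, which holds for independent uniform draws; the helper names `expected_value` and `binomial_pmf` are made up for this example.

```python
from fractions import Fraction
from math import comb


def expected_value(outcomes_with_probs):
    """E(X) = sum of x * P(x) over (x, P(x)) pairs."""
    return sum(x * p for x, p in outcomes_with_probs)


def binomial_pmf(n, p):
    """Yield (k, P(S = k)) for S ~ Binomial(n, p), using exact fractions."""
    q = 1 - p
    for k in range(n + 1):
        yield k, comb(n, k) * p**k * q ** (n - k)


# Assumed model: 1000 draws, each value has probability 1/100.
n, p = 1000, Fraction(1, 100)

# Expected count of one particular value among the n draws.
mean_count = expected_value(binomial_pmf(n, p))

# Expected number of matching pairs formed by that one value,
# where the outcome attached to count k is x = k*(k-1)/2.
pairs_per_value = expected_value(
    (Fraction(k * (k - 1), 2), prob) for k, prob in binomial_pmf(n, p)
)

print(mean_count)             # prints 10
print(100 * pairs_per_value)  # prints 4995
```

The average count per value is 10, yet averaging k(k-1)/2 over the distribution gives 49.95 per value rather than 45, which is where the totals 4995 and 4500 part ways.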

3. Can the expected value of repeated numbers be negative?

In general, yes: an expected value can be negative if the possible outcomes include negative values that carry enough probability. A count of repeated numbers or matching pairs, however, is never negative, so its expected value cannot be negative either.

4. How is the expected value of repeated numbers used in real-world applications?

The concept of expected value is used in a variety of real-world applications, particularly in fields such as finance, economics, and gambling. For example, when making investments, individuals and businesses use expected value to determine the potential return on their investments and make informed decisions.

5. What is the relationship between expected value and actual outcomes?

The expected value is not necessarily equal to the actual outcome of a single experiment or scenario. It represents the average value that can be expected over a large number of repetitions. As the number of repetitions grows, the average of the actual outcomes gets closer to the expected value (the law of large numbers).
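
The difference between a single outcome and the long-run average can be seen with a small experiment. The sketch below is illustrative only: it reuses the thread's setup (1000 draws from 1 to 100), and the function names, checkpoints, and seed are assumptions.

```python
import random
from collections import Counter


def matching_pairs_once(rng, n_draws=1000, n_values=100):
    """One experiment: draw n_draws numbers and count matching pairs."""
    counts = Counter(rng.randint(1, n_values) for _ in range(n_draws))
    return sum(s * (s - 1) // 2 for s in counts.values())


def running_means(checkpoints=(1, 10, 100, 1000), seed=0):
    """Return {number_of_trials: average pair count over that many trials}."""
    rng = random.Random(seed)  # seeded so the run is reproducible
    total, means = 0, {}
    for trial in range(1, max(checkpoints) + 1):
        total += matching_pairs_once(rng)
        if trial in checkpoints:
            means[trial] = total / trial
    return means


for trials, mean in running_means().items():
    gap = abs(mean - 4995)
    print(f"{trials:>5} trials: mean = {mean:.1f}, gap from 4995 = {gap:.1f}")
```

A single trial can easily land tens of pairs away from 4995, while the average over many trials typically sits much closer, although the gap need not shrink at every checkpoint.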
