Question of finding the exepcted value of repeated numbers

  • Thread starter Thread starter KFC
  • Start date Start date
  • Tags Tags
    Numbers Value
AI Thread Summary
The discussion centers on calculating the expected value of duplicate numbers from a set of 1000 randomly chosen integers ranging from 1 to 100. The initial calculation suggested an expected value of 4500 based on the assumption of equal distribution, but a simulation indicated an average closer to 4997.8. The correct approach involves defining random variables for matching pairs and applying the formula for expected value, leading to an expected value of 4995. The discrepancy arises because the distribution of occurrences is not independent, which invalidates the initial method. Understanding the independence of events is crucial for accurately calculating expectations in probability.
KFC
Messages
477
Reaction score
4
Hi there,
I am thinking a question of finding the expectation of duplicate numbers (how many time any pair could be made in N rows each contains one number). I start from 1000 numbers each randomly picked ranged 1 to 100. I sort the randomized 1000 numbers, count the total number of "1" called s1, count the total number of "2" called s2 and up to "100" (s100), obviously , for s1 "1", the way to combine any a pair of "1" is C_{s1}^2 = \displaystyle\frac{s1(s1-1)}{2}. Similarly, the way for have a pair for all S2 "2" is C_{s2}^2 = \displaystyle\frac{s2(s2-1)}{2} ...

so the total duplicates is just \sum_i \frac{S_i(S_i-1)}{2}

However, we assume each number appears at equal probability such that in ideal case in the 1000 numbers, there will be 10 x "1", 10 x "2", 10 x "3" ... 10 x "100". Or we can say for any number 1 2 3 ... 100, the duplicate number will be 10 and which can be used to make

S = 10 \times (10-1)/2, so total expectation should be TOTAL EV = 100\times S = \frac{100 \times 10 \times (10-1)}{2} = 4500

But someone told me the expectation should be 4995. And I setup a simulation to simulate the above random process for 10000000 and calculate the average. I find that the average expectation number is close to 5000 (about 4997.8). So my logic in calculating the expectation is wrong. Would anyone please show me the right way to calculate the expectation for the question? Thanks.
 
Physics news on Phys.org
Ok so you've got random variables X1,...,XN uniformly distributed on {1,...,100} and if M is the number of matching pairs you want to find E[M]. Set Iij=1 if Xi=Xj and 0 otherwise, so M=sum_{1<=i<j<=N}Iij and thus E[M]=sum(E[Iij])=(NC2).E[Iij]. NC2=N(N-1)/2 and E[Iij]=1/100 so E[M]=4995. The other method fails because the distribution of the number of occurrences of each number is not independent.
 
I was reading documentation about the soundness and completeness of logic formal systems. Consider the following $$\vdash_S \phi$$ where ##S## is the proof-system making part the formal system and ##\phi## is a wff (well formed formula) of the formal language. Note the blank on left of the turnstile symbol ##\vdash_S##, as far as I can tell it actually represents the empty set. So what does it mean ? I guess it actually means ##\phi## is a theorem of the formal system, i.e. there is a...

Similar threads

Replies
18
Views
3K
Replies
5
Views
1K
Replies
1
Views
1K
Replies
22
Views
2K
Replies
2
Views
2K
Replies
2
Views
3K
Back
Top