Calculating lottery probability

Mugged
Hello, this lottery problem has been bothering me for months. I set it up here. Help appreciated.

Here is the setup:

Suppose you have to pick 5 numbers from the integers 1 to 39. The probability of matching exactly x of the 5 numbers (where x can be 1, 2, 3, 4, or 5) is given by the hypergeometric distribution formula, as found on Wikipedia. We have that:

probability of matching x/5 = {5 choose x}*{(39-5) choose (5-x)} / {39 choose 5}

where {a choose b} = a! / (b!*(a-b)!)

Computing for x = 2,3,4,5 yields:

x=2: probability = 59840/575757 ≈ 0.104
x=3: probability = 1870/191919 ≈ 0.00974
x=4: probability = 170/575757 ≈ 0.000295
x=5: probability = 1/575757 ≈ 1.737e-6
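
As a cross-check on the values above, here is a minimal Python sketch of the same formula. The function name `match_probability` and its parameters `pool` and `picks` are chosen here for illustration and are not from the original post.

```python
from math import comb

def match_probability(x, pool=39, picks=5):
    """Hypergeometric probability that one ticket matches exactly x of the drawn numbers."""
    return comb(picks, x) * comb(pool - picks, picks - x) / comb(pool, picks)

# Print the single-ticket probabilities for x = 2..5.
for x in range(2, 6):
    print(f"x={x}: probability = {match_probability(x):.6g}")
```

The probabilities for x = 0 through 5 sum to 1, which is a quick sanity check that the formula has been entered correctly.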

Now here is the problem:

Suppose you buy 8 tickets, all with distinct sets of 5 integers, and play these 8 combinations for 1,000,000 drawings of the lottery. How many tickets in total (out of 8*1,000,000) do you expect to match x out of 5? For example, in 8,000,000 plays, how many tickets match 3 out of 5?

I had originally thought that the solution was simply the probability of matching x/5 multiplied by the total number of plays, in this case 8e6, but this doesn't seem to be the case. The approach works for 2 out of 5 matches but not for 3 or 4; I'm unsure about 5/5.

Any help is much appreciated.
 
If you know the expected number of hits for each ticket (for example, the probability of "3 out of 5"), the expected total number of hits is just the sum of those expectation values, because expectation is linear even when the tickets are not independent. In your case, that is 8 million times the probability that a single ticket matches.

The variance is a bit trickier if those 8 tickets are not independent, but I think this can be neglected for any reasonable calculation.
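
The expected totals for the 8 tickets over 1,000,000 drawings can be sketched as follows. This is an illustration under the thread's stated setup (5 numbers from 39, 8,000,000 plays); the names `match_probability` and `expected_count` are hypothetical helpers, not from the thread.

```python
from math import comb

def match_probability(x, pool=39, picks=5):
    """Hypergeometric probability that one ticket matches exactly x of the drawn numbers."""
    return comb(picks, x) * comb(pool - picks, picks - x) / comb(pool, picks)

def expected_count(x, plays):
    """Expected number of plays matching exactly x numbers (sum of per-play expectations)."""
    return plays * match_probability(x)

plays = 8 * 1_000_000  # 8 tickets, 1,000,000 drawings
for x in range(2, 6):
    print(f"{x}/5 matches expected: {expected_count(x, plays):.1f}")
```

With these numbers the expectation comes out to roughly 831,000 for 2/5, 78,000 for 3/5, 2,400 for 4/5, and 14 for 5/5.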
 
Right, but that approach doesn't seem to work. I'm posing this question because I've been running lottery simulations in MATLAB and getting results that are inconsistent with that idea.

I run 1,000,000 "games", where in each game 8 chosen sets of 5 numbers are compared to a winning combination. Out of the 8 million plays, I get roughly 800,000 2/5 matches and 40,000 3/5 matches; I can't remember the counts for 4/5 and 5/5.

The point is that the probability of a 2/5 match is roughly 10%, so 8e6 * 0.10 = 800k, which is fine. But for a 3/5 match the probability is about 0.0097, and 0.0097 * 8e6 is roughly 78k, yet I only get 40k, so this doesn't match.
 
I would expect a bug in the code then.
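
One way to check is to compare against an independent simulation. The following is a minimal Python sketch, not the MATLAB code from the thread (which was not posted); the function `simulate`, its parameters, and the reduced run of 20,000 drawings are assumptions made here to keep the example quick.

```python
import random
from collections import Counter
from math import comb

def simulate(n_drawings, n_tickets=8, pool=39, picks=5, seed=0):
    """Play n_tickets fixed, distinct tickets against n_drawings random draws.

    Returns a Counter mapping number of matched numbers -> number of plays.
    """
    rng = random.Random(seed)
    numbers = range(1, pool + 1)
    # Build distinct tickets; using a set of frozensets rejects duplicates.
    tickets = set()
    while len(tickets) < n_tickets:
        tickets.add(frozenset(rng.sample(numbers, picks)))
    counts = Counter()
    for _ in range(n_drawings):
        # Each draw is 5 numbers without replacement.
        winning = frozenset(rng.sample(numbers, picks))
        for ticket in tickets:
            counts[len(ticket & winning)] += 1
    return counts

n_drawings = 20_000  # smaller than 1,000,000 so the sketch runs quickly
counts = simulate(n_drawings)
plays = 8 * n_drawings
for x in range(2, 6):
    p = comb(5, x) * comb(34, 5 - x) / comb(39, 5)
    print(f"{x}/5: simulated {counts[x]}, expected {plays * p:.1f}")
```

If a simulation along these lines agrees with plays times probability while the original one does not, that points to the original code, for instance in how the draw is generated or how matches are counted.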
 