# Generalized Birthday Problem

## Main Question or Discussion Point

There is a lot of information on the web about how to calculate the probability that, in an arbitrarily-sized group, 2 people will share a birthday. However, I am trying to determine the probability that a larger number of people are born on a specific day (e.g., a group of people have a birthday today, as opposed to "sharing" a birthday any time during the year). I can't find anything on this point.

I am trying to address this problem in the context of the following example: Assuming a group of 800 people, what are the odds that 10 people in that group have a birthday today?

The probability that any particular one of those 800 people has a birthday today is 1/365. The probability that any particular one does not have a birthday today is 364/365. We want exactly ten people to have a birthday today, and the probability of any 10 having a birthday today and any 790 not having a birthday today is ((1/365)^10)*((364/365)^790). However, this needs to be multiplied by 800!/10!790!, because that is the number of combinations meeting the specified criteria. Does this reasoning make sense?

Many thanks.

Related Set Theory, Logic, Probability, Statistics News on Phys.org
mathman
There is a lot of information on the web about how to calculate the probability that, in an arbitrarily-sized group, 2 people will share a birthday. However, I am trying to determine the probability that a larger number of people are born on a specific day (e.g., a group of people have a birthday today, as opposed to "sharing" a birthday any time during the year). I can't find anything on this point.

I am trying to address this problem in the context of the following example: Assuming a group of 800 people, what are the odds that 10 people in that group have a birthday today?

The probability that any particular one of those 800 people has a birthday today is 1/365. The probability that any particular one does not have a birthday today is 364/365. We want exactly ten people to have a birthday today, and the probability of any 10 having a birthday today and any 790 not having a birthday today is ((1/365)^10)*((364/365)^790). However, this needs to be multiplied by 800!/10!790!, because that is the number of combinations meeting the specified criteria. Does this reasoning make sense?

Many thanks.
Yes - it is a direct application of binomial distribution.

Great. Thank you. There are two additional cases I would like to explore: first, if the number of people having a birthday today (let than number = n) is defined as n >=10; second, if the condition for success is not merely that n people's birthday is today, but, rather, is that n people have the same birthday at any time during the year (ignoring leap years).

In order to solve for the first case, would it be necessary to express the probability for each n? [i.e., ((1/365)^10)*((364/365)^790)*(800!/(10!790!)) + ... + ((1/365)^800)*((364/365)^0)*(800!/800!)]

In order to solve for the second case, it seems like the probability increases by a factor of 365. This is because the probability of any one person having a birthday on any particular day of the year is (365/365)^1. For n=10, this needs to be multiplied by the probability of 9 people having the same birthday and 790 people not having such birthday. This is no different from the expression I initially used, so the entire expression for this second case is: ((365/365)^1)*((1/365)^9)*((364/365)^790)*(800!/10!790!). I can't find anything wrong with this, but I have a gut feeling that it doesn't work.

mathman
You have to be a little careful for the second case. You may have more than one occurrence of 10 people having birthdays on the same day.

Excellent point. However, accounting for all of the cases in which groups of n=10 people share the same birthday produces a lengthy expression. First, you must account for the case I described above. Then you need to account for a case in which there are two groups, with each group composed of 10 people sharing the same birthday. So on and so forth for all possible groups of ten people. As there are 800 people, the most extreme case is one in which all 800 people are distributed amongst 80 groups of 10 people who share the same birthday. Each individual case (e.g., 2 groups of 10 sharing a birthday) is conjunctive. But the total probability of any one of these cases occurring is disjunctive. So it is necessary to determine the individual probability of each such case and then sum such individual probabilities. An expression for this sum would take the form (A) + (B) + ... + (Z), where (A) = ((365/365)^1)*((1/365)^9)*((364/365)^790)*(800!/10!790!); (B) = ((365/365)^1)*((1/365)^9)*((364/365)^1)*((1/365)^9)*((363/365)^780)*(800!/10!10!780!); and (Z) = ((365/365)^1)*((1/365)^9)*((364/365)^1)*((1/365)^9)*...*((286/365)^1)*((1/365)^9)*(800!/10!10!...10!514!).

This expression is quite unwieldy (and possibly also incorrect-I'm not entirely confident in it). And even though it could probably be stated more elegantly using sigma notation, the actual calculations required are substantial. Additionally, if one wanted to further generalize the equation by permitting n>=10 (rather than n=10) in the case for which the above expression applies, an even more complicated expression would be required.

Is there a "trick" for dealing with this type of counting problem, or is it really necessary to approach it through a lengthy expression as I have.

Many thanks.

mathman
I haven't tried it, but I don't think there is any trick.

Do you see the potential for the application of the poisson distribution? I have no particular way to do so, but it was suggested to me as a potential simplifying solution.

mathman
Poisson distribution is a good approximation to binomial when the mean is much smaller than the total number. I am not sure how it would apply here.

haruspex