Number of Die Throws until you get a repeated number

  • Thread starter Master1022
  • Start date
  • #1
554
109
Homework Statement:
If we have a 6-sided die (it is a fair die), how many throws would we expect to make before we see a repeated number?
Relevant Equations:
Probability
Hi,

I found a similar problem online, but there wasn't any solution. The question actually had a larger number of sides (e.g. 50 or 100, etc.), but I thought the underlying solution method should be the same for any arbitrary ##n##-sided die.

Question:
If we have a 6-sided die (it is a fair die), how many throws would we expect to make before we see a repeated number?

Attempt:
I have been thinking about this problem for a while, and am very stuck on how to proceed. I did write a Python script to simulate this game for different ##n##, and the graph aligned with intuition (larger ##n## required more throws until repetition). I remember reading that this problem was similar to the birthday paradox conceptually, however I haven't been able to use that to solve the problem.

If I follow that birthday paradox line of reasoning, I suppose I could do:
[tex] P(\text{repetition on 5th throw}) = 1 \times \frac{5}{6} \times \frac{4}{6} \times \frac{3}{6} \times \frac{4}{6} = \frac{(6)! \cdot 4}{(6-4)! \cdot 6^5} [/tex]

Thus this could be generalized to the probability of a repetition on the ##k##-th throw:
[tex] P(\text{repetition on k-th throw for n sided die}) = \frac{(n)! \times (k-1)}{(n-(k-1))! \cdot n^k} [/tex]

Is the above correct? I am not sure I am fully convinced by the abiThis question was an interview question, so how could one find the expected number without a calculator - are there tricks or approximations?

Any help would be greatly appreciated.
 

Answers and Replies

  • #2
PeroK
Science Advisor
Homework Helper
Insights Author
Gold Member
2020 Award
18,744
10,372
You can't go more than ##7## throws without repetition. Using your formula for ##k =2 \dots 7## should be easy?
 
  • Like
Likes WWGD, FactChecker and Master1022
  • #3
Buzz Bloom
Gold Member
2,405
441
The chances are 0 in 6 = 0.0 ... that you hit on the first roll.
The chances are 1 in 6 = 0.1666 ... that you hit on the second roll.
The chances are 2 in 6 = 0.3333 ... that you hit on the third roll.
The chances are 3 in 6 = 0.5 that you hit on the fourth roll.
The chances are 4 in 6 = 0.6666 ... that you hit on the fifth roll.
The chances are 5 in 6 = 0.8333 ... that you hit on the sixth roll.
The chances are 6 in 6 =1.0 that you hit on the seventh roll.

I hope you find this helpful.

I corrected this during the time @PeroK posted his comment.
 
  • #4
PeroK
Science Advisor
Homework Helper
Insights Author
Gold Member
2020 Award
18,744
10,372
The chances are 1 in 6 = 0.1666 ... that you hit on the first roll.
The chances are 2 in 6 = 0.3333 ... that you hit on the second roll.
The chances are 3 in 6 = 0.5 that you hit on the third roll.
The chances are 4 in 6 = 0.6666 ... that you hit on the fourth roll.
The chances are 5 in 6 = 0.8333 ... that you hit on the fifth roll.
The chances are 6 in 6 =1.0 that you hit on the sixth roll.

I hope you find this helpful.
Or, wrong!
 
  • #5
2,145
1,340
As @PeroK pointed out, you can't go more than 7 throws without repetition ##-##
You can't go more than ##7## throws without repetition. Using your formula for ##k =2 \dots 7## should be easy?
More precisely, 7 throws is enough to guarantee repetition ##-## this is an example of the pigeonhole principle.
 
  • #6
554
109
Thanks @PeroK, @Buzz Bloom , and @sysprog!

Sure, for this 6-sided die, going through different values of ##k## isn't too bad. However, I just used 6 to simplify the problem (was originally 100-sided die), in which case I think it becomes more difficult. Was my underlying method/reasoning correct?

EDIT: this question was asked in an interview with no access to calculators
 
  • #7
PeroK
Science Advisor
Homework Helper
Insights Author
Gold Member
2020 Award
18,744
10,372
Thanks @PeroK, @Buzz Bloom , and @sysprog!

Sure, for this 6-sided die, going through different values of ##k## isn't too bad. However, I just used 6 to simplify the problem (was originally 100-sided die), in which case I think it becomes more difficult. Was my underlying method/reasoning correct?

EDIT: this question was asked in an interview with no access to calculators
What do you mean by repetition? Is that two in a row the same?
 
  • #8
2,145
1,340
Thanks @PeroK, @Buzz Bloom , and @sysprog!

Sure, for this 6-sided die, going through different values of ##k## isn't too bad. However, I just used 6 to simplify the problem (was originally 100-sided die), in which case I think it becomes more difficult. Was my underlying method/reasoning correct?

EDIT: this question was asked in an interview with no access to calculators
By the pigeonhole principle, if you throw a 100-sided die 100 times and get a different number on each throw, there is no number for throw 101 that is not a repeated number. You of course could get a repeat on throw 2; however, you have only a 1% chance of doing that. Throw 1 sets the first number that is a candidate for repetition. If throw 2 is a different number, then you have 2 candidate numbers for throw 3, meaning you have a 2% chance on throw 3 ##\dots##
 
Last edited:
  • #9
FactChecker
Science Advisor
Gold Member
6,578
2,644
One thing to realize is that the original problem with "100-sided die" might be expecting an entirely different approach to the solution. A large number of sides makes me suspect that there is a continuous approximation that they expect you to apply. I'm sorry that I am too rusty to recognize what the approximation might be (Poisson? It doesn't seem right.). There might be something appropriate in the section where the problem appears.
 
Last edited:
  • Like
Likes WWGD, sysprog and Master1022
  • #10
554
109
What do you mean by repetition? Is that two in a row the same?
Sorry. Just mean't the same number appears again, but doesn't have to be consecutive. E.g. 1, 4, 5, 1 would count as 1 has been repeated now
 
  • #11
pbuk
Science Advisor
Gold Member
2,528
1,257
EDIT: this question was asked in an interview with no access to calculators
Sometimes an interviewer will ask a question that cannot be answered within the available time or resources: they are interested in how you approach the problem rather than what the solution is.

So a good answer might go "Hmmm, that looks tricky. I can see straight away that the answer must be less than 101 [interviewer notes "can narrow down a problem by setting bounds"], but it looks like a lot less [Interviewer notes "demonstrates ability to estimate"]. It reminds me of the birthday paradox [interviewer notes "has experience of common problems in probability"]". At this point the interviewer interrupts: "yes that's right: do you know anywhere this might be useful in your job at CodesRUs?".

Candidate: "Well the same problem is encountered when we create a hash key for indexing a table - the longer the key the less frequently we get hash collisions [Interviewer notes: "follow-up interview with database team"]. And again in cryptography - I think it's even called a 'birthday attack'; this is where an attacker exploits the repetition associated with the length of the key [Interviewer notes: "follow-up interview with security team"]. And of course the birthday paradox always goes down well at parties [Interviewer notes: "not suitable for customer facing role" :-p]."
 
Last edited:
  • Like
Likes Master1022, jim mcnamara, DaveC426913 and 3 others
  • #12
PeroK
Science Advisor
Homework Helper
Insights Author
Gold Member
2020 Award
18,744
10,372
Or, it must be about ##\frac n 2 + 1## but it's a tricky calculation to get an exact answer!
 
  • #13
2,145
1,340
Or, it must be about ##\frac n 2 + 1## but it's a tricky calculation to get an exact answer!
I think that's partly because there are multiple ways to do it ##-## here's one such that for ##n=100## I'd need a calculator to get an exact answer: to get a repetition in ##k## rolls of an ##n##-sided die, ##\sum_{k=2}^{n+1}\frac{k!}{n^{k-1}}\binom{n-1}{k-2}##. (ref)
 
Last edited:
  • #14
2,145
1,340
One thing to realize is that the original problem with "100-sided die" might be expecting an entirely different approach to the solution. A large number of sides makes me suspect that there is a continuous approximation that they expect you to apply. I'm sorry that I am too rusty to recognize what the approximation might be (Poisson? It doesn't seem right.). There might be something appropriate in the section where the problem appears.
An asymptotic approximation would be: ##\sum_{k=2}^{n+1}\frac{k!}{n^{k-1}}\binom{n-1}{k-2}\sim\sqrt{\frac{n\pi}2}+\frac23##
(with an error of approximately ##\frac1{10\sqrt{n}}##). (ref)
 
  • Like
  • Informative
Likes Master1022, FactChecker and PeroK
  • #15
FactChecker
Science Advisor
Gold Member
6,578
2,644
An asymptotic approximation would be: ##\sum_{k=2}^{n+1}\frac{k!}{n^{k-1}}\binom{n-1}{k-2}\sim\sqrt{\frac{n\pi}2}+\frac23##
(with an error of approximately ##\frac1{10\sqrt{n}}##). (ref)
Nice! I guess I was thinking of an asymptotic approximation to a known common probability distribution, like Poisson. But the question was asked in an interview, so it may be just to see how the person responds and analysis the problem, as @pbuk pointed out. There might not be any reasonably simple solution.
 
  • Like
Likes Master1022 and sysprog
  • #16
2,145
1,340
Nice! I guess I was thinking of an asymptotic approximation to a known common probability distribution, like Poisson. But the question was asked in an interview, so it may be just to see how the person responds and analysis the problem, as @pbuk pointed out. There might not be any reasonably simple solution.
Handling the (similar) 'birthday problem' as Poisson processes is discussed here.

Also, the expected number of rolls until the first ##k##-repetition with an ##n##-sided die is given by the integral: ##\mathbb{E}(T) = \int_0^\infty \left(\sum_{j=0}^{k-1} {1\over j!}\left({x\over n}\right)^j\right)^n \,e^{-x}\,dx##. (ref)

I would use my TI-83 calculator for this if I wanted a reasonably quick answer.
 
  • Like
Likes uart, FactChecker and PeroK
  • #17
PeroK
Science Advisor
Homework Helper
Insights Author
Gold Member
2020 Award
18,744
10,372
An asymptotic approximation would be: ##\sum_{k=2}^{n+1}\frac{k!}{n^{k-1}}\binom{n-1}{k-2}\sim\sqrt{\frac{n\pi}2}+\frac23##
(with an error of approximately ##\frac1{10\sqrt{n}}##). (ref)
For ##n =6## I got ##3.847##, and your formula gives ##3.737##. It appears to converge quickly.
 
  • Like
Likes FactChecker and sysprog
  • #18
PeroK
Science Advisor
Homework Helper
Insights Author
Gold Member
2020 Award
18,744
10,372
Also, quite amusing is the case of a coin, I.e. ##n =2##. The formula gives ##2.44 ##, which is pretty close to the precise ##2.5##.
 
  • #19
2,145
1,340
Doing the recursion manually for ##n=6## yields ##{1223\over324}\approx 3.77##. (ref)
 
  • #20
WWGD
Science Advisor
Gold Member
5,654
5,373
Sometimes an interviewer will ask a question that cannot be answered within the available time or resources: they are interested in how you approach the problem rather than what the solution is.

So a good answer might go "Hmmm, that looks tricky. I can see straight away that the answer must be less than 101 [interviewer notes "can narrow down a problem by setting bounds"], but it looks like a lot less [Interviewer notes "demonstrates ability to estimate"]. It reminds me of the birthday paradox [interviewer notes "has experience of common problems in probability"]". At this point the interviewer interrupts: "yes that's right: do you know anywhere this might be useful in your job at CodesRUs?".

Candidate: "Well the same problem is encountered when we create a hash key for indexing a table - the longer the key the less frequently we get hash collisions [Interviewer notes: "follow-up interview with database team"]. And again in cryptography - I think it's even called a 'birthday attack'; this is where an attacker exploits the repetition associated with the length of the key [Interviewer notes: "follow-up interview with security team"]. And of course the birthday paradox always goes down well at parties [Interviewer notes: "not suitable for customer facing role" :-p]."
A technical point : 101 throws gives you a guarantee, but I believe the question was about the expected number of throws. Would n/2 or ( least integer greater than n/2 ) for an n-sided fair die be a good rule of thumb?
 
  • #21
PeroK
Science Advisor
Homework Helper
Insights Author
Gold Member
2020 Award
18,744
10,372
A technical point : 101 throws gives you a guarantee, but I believe the question was about the expected number of throws. Would n/2 or ( least integer greater than n/2 ) for an n-sided fair die be a good rule of thumb?
A formula that provides a good approximation was provided here:

An asymptotic approximation would be: ##\sum_{k=2}^{n+1}\frac{k!}{n^{k-1}}\binom{n-1}{k-2}\sim\sqrt{\frac{n\pi}2}+\frac23##
(with an error of approximately ##\frac1{10\sqrt{n}}##). (ref)
 
  • Like
Likes Master1022, WWGD and sysprog
  • #22
FactChecker
Science Advisor
Gold Member
6,578
2,644
A technical point : 101 throws gives you a guarantee, but I believe the question was about the expected number of throws. Would n/2 or ( least integer greater than n/2 ) for an n-sided fair die be a good rule of thumb?
That seems like a good guess, but it is not close to the approximation, ##\sqrt{n\pi/2}+2/3##, that @sysprog gave. That gives about 13 when n=100.
Compare this problem to the well-known problem of the odds of two students in a class of 30 having the same birthday. The answer to that is that it is very likely. But half of 365 would make it very unlikely.
 
  • Like
Likes Master1022, sysprog and WWGD
  • #23
WWGD
Science Advisor
Gold Member
5,654
5,373
That seems like a good guess, but it is not close to the approximation, ##\sqrt{n\pi/2}+2/3##, that @sysprog gave. That gives about 13 when n=100.
Compare this problem to the well-known problem of the odds of two students in a class of 30 having the same birthday. The answer to that is that it is very likely. But half of 365 would make it very unlikely.
Thanks. So much for my intuition ;).
 
  • #24
FactChecker
Science Advisor
Gold Member
6,578
2,644
Thanks. So much for my intuition ;).
Mine too. When I read your post I thought you had to be right. But now I guess not.
 
  • #25
PeroK
Science Advisor
Homework Helper
Insights Author
Gold Member
2020 Award
18,744
10,372
Mine too. When I read your post I thought you had to be right. But now I guess not.
I beat you both to it:

Or, it must be about ##\frac n 2 + 1## but it's a tricky calculation to get an exact answer!
 
  • Like
Likes FactChecker

Related Threads on Number of Die Throws until you get a repeated number

Replies
3
Views
255
Replies
7
Views
12K
Replies
2
Views
7K
Replies
5
Views
2K
  • Last Post
Replies
2
Views
1K
  • Last Post
Replies
3
Views
1K
  • Last Post
Replies
6
Views
1K
  • Last Post
Replies
1
Views
8K
Replies
6
Views
1K
Top