Random sequence - full alphabet run length

  • Context: Undergrad
  • Thread starter: Monte_Carlo
  • Tags: Length, Random Sequence
SUMMARY

The discussion centers on calculating the average run length of a random sequence of digits from 0 to 9 until every digit has been observed at least once. This problem is formally known as the "coupon collector's problem," which falls under the branch of mathematics dealing with probability theory. The average run length for such sequences is approximately 29.2896, derived from summing the series 1 + 10/9 + 10/8 + ... + 10/1. Monte Carlo simulations can also be employed to estimate this average length.

PREREQUISITES
  • Understanding of probability theory
  • Familiarity with the coupon collector's problem
  • Basic knowledge of geometric series
  • Experience with Monte Carlo simulations
NEXT STEPS
  • Research the coupon collector's problem in-depth
  • Learn about geometric series and their applications in probability
  • Explore Monte Carlo simulation techniques for statistical estimation
  • Investigate other applications of run length encoding in data analysis
USEFUL FOR

Mathematicians, statisticians, data scientists, and anyone interested in probability theory and its applications in random sequences.

Monte_Carlo
Hi,

Suppose we're looking at a random sequence of digits from 0 to 9. We start off reading the digits until every digit from 0 to 9 has been seen at least once and we mark the count of digits read up to that point (run length). We then reset the run length and continue until the whole random sequence has been read. In the end, for every finite random sequence, there is a corresponding sequence of run lengths.

How could we arrive analytically at the average length of these runs? What is the formal mathematical name given to such a run? What branch of mathematics concerns itself with this?

Monte
 
Let me know if I am in the wrong place

Just want to make sure this forum is appropriate for the question above. I see that some people have looked at the problem, but so far it has garnered zero responses. This is not a homework problem.

Again, basically, given a finite alphabet and a finite sequence composed using that alphabet, we start reading the sequence until every symbol in the alphabet has been read at least once. We then record how many symbols it took us to reach that point. We repeat until we've read the whole sequence, thus finishing with a sequence of run lengths. The question is, what is the average (expected) length of these runs? (It looks like around 27 for a random sequence of digits from 0 to 9.)
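For anyone who wants to try the procedure described above, here is a minimal Python sketch of it. The function name run_lengths, the fixed seed, and the sequence length of 200,000 digits are illustrative assumptions and not part of the original question. The sketch also assumes that a trailing run that never completes is simply discarded.

```python
import random


def run_lengths(sequence, alphabet):
    """Split a sequence into consecutive runs, each ending as soon as every
    symbol of the alphabet has been seen, and return the run lengths.

    Assumption: a trailing run that never completes is discarded.
    """
    needed = set(alphabet)
    seen = set()
    lengths = []
    count = 0
    for symbol in sequence:
        count += 1
        if symbol in needed:
            seen.add(symbol)
        if len(seen) == len(needed):
            # Every symbol has appeared: record the run and reset.
            lengths.append(count)
            seen.clear()
            count = 0
    return lengths


# Monte Carlo estimate on pseudo-random digits (seed and size are arbitrary).
rng = random.Random(12345)
digits = [rng.randrange(10) for _ in range(200_000)]
runs = run_lengths(digits, range(10))
average = sum(runs) / len(runs)
print(f"{len(runs)} complete runs, average length {average:.2f}")
```

The estimate fluctuates from one seed to another, so it is worth running with a long sequence before comparing it against any analytic value.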
 
I suspect that you haven't gotten any response because it appears to be a very difficult calculation. Did you use some sort of Monte Carlo simulation to get 27?
 
I believe that your question is equivalent to what's known as the "coupon collector's problem." You should be able to find an answer using that information.
 
Once you've seen n distinct digits, the next digit has a probability of n/10 of having already been seen. So there is a probability of n/10 that a second digit is needed, n^2/100 that a third digit is needed, and so on. Summing these gives 1 + n/10 + n^2/100 + ... = 1/(1 - n/10) (the sum of a geometric series). This is the average number of digits you need to read until you get a new digit.
To get the average run length until all the digits are seen, sum 1/(1 - n/10) for n ranging from 0 up to 9. This gives 1 + 10/9 + 10/8 + 10/7 + ... + 10/2 + 10/1, which is 29.2896...
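The sum above can be checked exactly with a few lines of Python. This is only a sketch: the function name expected_run_length and its parameter m (the alphabet size) are names chosen here for illustration.

```python
from fractions import Fraction


def expected_run_length(m):
    """Expected number of symbols read until all m equally likely symbols
    have appeared: the sum of 1/(1 - n/m) = m/(m - n) for n = 0 .. m-1."""
    return sum(Fraction(m, m - n) for n in range(m))


exact = expected_run_length(10)
print(exact, float(exact))
```

Using Fraction keeps the arithmetic exact, so the result can be compared with the decimal value quoted above without worrying about rounding in the individual terms.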
 
