# I Probability of some sequence in a list of numbers

1. Aug 10, 2017

### Vrbic

Hello,
would like to derive a length of list of random numbers in which I may find some special sequence of few numbers with some probability.

For clearness I give an example: I have two generator of (pseudo) random numbers with same range of numbers, let's say (1-k). First generator give a random sequence of n numbers from range (1-k) and numbers may repiet. The second do the same but sequence of m numbers and m>n. What has to be m and n if I would like to find n in m (anywhere) with probability p?

2. Aug 10, 2017

### Vrbic

Ok my opinion is:
in long sequence "m" is $m-n+1$ of "n" sequences. A number of combinations of small sequence "n" is $k^n$. So probability than is $p=\frac{m-n+1}{k^n}$
Is it right?

3. Aug 10, 2017

### FactChecker

You need to explain more. For instance, you should say if you are talking about integers in some range or about real numbers. And what type of "special sequence" are you talking about? -- exact values, increasing numbers, etc. etc.

4. Aug 10, 2017

### Vrbic

Sorry, I'm talking about integers. And both sequences are random sequences of numbers from range (1-k). The word special was wrong.

5. Aug 10, 2017

### FactChecker

Sorry, I see that that was probably clear in your original post. I misread it.

6. Aug 10, 2017

### Vrbic

Nevermind, and what do you mean about my formula from post #2?

7. Aug 10, 2017

### FactChecker

I don't know the answer. And I'm still not completely clear on the question. Are you starting with a given sequence that needs to be matched or are you asking about any repeats of a certain length? Either way, I am not sure that I can answer your questions. I'll have to leave that to people who are better at it.

8. Aug 10, 2017

### Staff: Mentor

It is complicated.

Let's start with an easier question: For a given sequence of n integers from 1 to k, what is the probability that this sequence appears in a sequence of m random integers from 1 to k?
It turns out that this probability depends on the sequence. As a simple example, consider k=1, m=3 and compare the sequences "11" and "12". For the string of length 3, there are 8 options (111, 112, 121, 122, 211, 212, 221, 222), three of them contain "11" but four of them contain "12". That does not mean "11" would be less likely to appear, but it appears twice in the same string (in 111) while "12" does not.

For a given sequence, you can calculate the probability that it appears with a Markov chain. There is no nice general formula, and you have to do this for every type of pattern a sequence of n integers can form.
If n=1 or k is large, the approximation of m-n+1 independent sequences of length n will give a reasonable approximation in most cases, and the cases where it fails are long repetitions, they are unlikely for large k anyway (especially for large n).

9. Aug 11, 2017

### Vrbic

What is behind:
Roulette and very well known myth about winning due a betting on one color and doubling of a bet. Probably you know it. You bet still one color (black/red) and when you lose you double your bet untill you win.
So question is what is probability of sequence of let's say 6-10 same color in line?
May I use my formula for that?

10. Aug 11, 2017

### Staff: Mentor

Neglecting 0 we have the case k=2, where your formula shows large deviations.

11. Aug 14, 2017

### Vrbic

Ok, so there doesn't exist some easy formula how to describe such situation?

12. Aug 14, 2017

### Staff: Mentor

Not that I am aware of, but calculating it with Markov chains is fast. You can probably find some approximation for most cases, where only some special cases (like long series of the same color) lead to some deviations, but these special cases are rare.

13. Aug 15, 2017

### Vrbic

Ok, thank you. And do you know about good article or some text about Markov chains?

14. Aug 15, 2017

### Staff: Mentor

The Wikipedia article should give a good introduction, and there are various textbooks with more details.