Probability of consecutive elements in set

Click For Summary
The discussion focuses on calculating the probability of having 'Y' consecutive initial numbers remaining after removing a fraction 'F' from a set of 'N' numbers. It suggests that if numbers are removed independently, the probability that 'Y' numbers follow a given sequence member is approximately (1-p)^Y, where p is the removal probability. For large N, the chance that a given member is followed by Y-1 others is approximated by (1-F)^(Y-1). The overall probability that at least one remaining member is followed by Y-1 others is given by 1 - (1 - (1-F)^(Y-1))^(N(1-F)). The formula's reliability diminishes for smaller N, with a recommendation that N should ideally exceed 50Y for accuracy.
lzkelley
Messages
276
Reaction score
2
Lets say i have some number 'N' of numbers - in a particular order.
I then remove some fraction 'F' of those numbers.

I want to know the probability of there being (some number) 'Y' consecutive initial-numbers remaining.

Any ideas?

Would it just be F^Y ?
 
Physics news on Phys.org
If the numbers are removed independently with probability p, then the chance that Y numbers follow some given sequence member (not too close to the end) is (1-p)^Y.

If one element is removed uniformly at random from the sequence, then a second, a third, and so on until FN have been removed, then (for large N) this approximates the above with p = F.

Thus for each remaining member, the chance that it is followed by Y-1 members is approximately (1-F)^(Y-1).

Since we're assuming N is large, then very few members are close to the end, so we'll assume all N(1-F) remaining members are far from the end.

Now the chance that a given member is not followed by Y-1 others is, of course, 1 - (1-F)^(Y-1) in our approximation. Thus the chance that all N(1-F) remaining members are not followed by Y-1 others is about
(1 - (1-F)^(Y-1))^(N(1-F))
and so the chance that at least one not-removed member is followed by Y-1 others
1 - (1 - (1-F)^(Y-1))^(N(1-F))

For example, removing half of a million members, the chance that 20 in a row remain is about 62%.

If N is not large, the formula falls apart. I'm not sure how large you'd need, but maybe N > 50Y would be good enough.
 
The standard _A " operator" maps a Null Hypothesis Ho into a decision set { Do not reject:=1 and reject :=0}. In this sense ( HA)_A , makes no sense. Since H0, HA aren't exhaustive, can we find an alternative operator, _A' , so that ( H_A)_A' makes sense? Isn't Pearson Neyman related to this? Hope I'm making sense. Edit: I was motivated by a superficial similarity of the idea with double transposition of matrices M, with ## (M^{T})^{T}=M##, and just wanted to see if it made sense to talk...

Similar threads

  • · Replies 7 ·
Replies
7
Views
3K
  • · Replies 1 ·
Replies
1
Views
2K
  • · Replies 2 ·
Replies
2
Views
2K
  • · Replies 16 ·
Replies
16
Views
2K
  • · Replies 14 ·
Replies
14
Views
6K
  • · Replies 13 ·
Replies
13
Views
2K
  • · Replies 6 ·
Replies
6
Views
2K
  • · Replies 3 ·
Replies
3
Views
3K
  • · Replies 9 ·
Replies
9
Views
2K
  • · Replies 11 ·
Replies
11
Views
3K