Probability of consecutive elements in set

SUMMARY

The discussion focuses on calculating the probability that 'Y' numbers which were consecutive in the original order all remain after a fraction 'F' of a set of 'N' ordered numbers is removed. The probability is estimated with the formula 1 - (1 - (1-F)^(Y-1))^(N(1-F)), which approximates the likelihood that at least one remaining member is followed by 'Y-1' other remaining members. For large 'N', such as removing half of a million members, the formula gives a chance of about 61.5% that 20 consecutive numbers remain. The approximation breaks down when 'N' is not sufficiently large; the reply tentatively suggests N > 50Y as a rough threshold.

PREREQUISITES
  • Understanding of probability theory and independent events
  • Familiarity with combinatorial mathematics
  • Basic knowledge of sequences and their properties
  • Experience with mathematical approximations and limits
NEXT STEPS
  • Explore advanced probability concepts in combinatorial settings
  • Study the implications of large sample sizes in probability calculations
  • Learn about uniform random selection and its effects on sequences
  • Investigate applications of probability in statistical modeling
USEFUL FOR

Mathematicians, statisticians, data scientists, and anyone interested in probability theory and its applications in analyzing sequences and random selections.

lzkelley
Let's say I have some number 'N' of numbers, in a particular order.
I then remove some fraction 'F' of those numbers.

I want to know the probability that (some number) 'Y' of the numbers that were consecutive in the initial ordering all remain.

Any ideas?

Would it just be F^Y?
 
If the numbers are removed independently, each with probability p, then the chance that the Y numbers following some given sequence member (not too close to the end) all remain is (1-p)^Y.

If instead one element is removed uniformly at random from the sequence, then a second, a third, and so on until FN have been removed, then (for large N) this is well approximated by the independent model above with p = F.
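To get a feel for how close the two models are, here is a small Python sketch. It is my own illustration, not something from the thread: it assumes exactly K = FN members are removed without replacement, in which case the chance that Y specified positions all survive is the product of (N-K-i)/(N-i) for i = 0..Y-1 (standard counting), and compares that with (1-F)^Y. The function names are made up for the example.

```python
def exact_all_survive(n: int, k: int, y: int) -> float:
    """Chance that y specified positions all survive when exactly k of n
    members are removed uniformly at random, without replacement."""
    if y > n - k:
        return 0.0  # fewer than y members remain at all
    prob = 1.0
    for i in range(y):
        # i of the y positions are already known to have survived
        prob *= (n - k - i) / (n - i)
    return prob


def independent_all_survive(f: float, y: int) -> float:
    """Same chance under the independent-removal model with p = f."""
    return (1.0 - f) ** y


if __name__ == "__main__":
    y = 20
    for n in (100, 10_000, 1_000_000):
        k = n // 2  # remove half, i.e. F = 0.5
        print(n, exact_all_survive(n, k, y), independent_all_survive(0.5, y))
```

For small N the two numbers differ noticeably, and they move together as N grows, which is the sense in which the approximation is meant.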

Thus for each remaining member, the chance that it is followed by Y-1 remaining members is approximately (1-F)^(Y-1).

Since we're assuming N is large, very few members are close to the end, so we'll treat all N(1-F) remaining members as being far from the end.

Now the chance that a given member is not followed by Y-1 others is, of course, 1 - (1-F)^(Y-1) in our approximation. Thus the chance that all N(1-F) remaining members are not followed by Y-1 others is about
(1 - (1-F)^(Y-1))^(N(1-F))
and so the chance that at least one not-removed member is followed by Y-1 others is about
1 - (1 - (1-F)^(Y-1))^(N(1-F))
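The formula is easy to evaluate numerically. A minimal Python sketch follows; the function name is hypothetical, and the use of log1p/expm1 is my own choice to keep precision when (1-F)^(Y-1) is tiny, not something taken from the derivation.

```python
import math


def prob_run_remaining(n: int, f: float, y: int) -> float:
    """Approximation from the post: 1 - (1 - (1-f)^(y-1))^(n(1-f))."""
    q = (1.0 - f) ** (y - 1)   # chance a survivor is followed by y-1 survivors
    survivors = n * (1.0 - f)  # number of remaining members
    if q >= 1.0:
        # f == 0 or y == 1: any survivor already starts a long-enough run
        return 1.0 if survivors > 0 else 0.0
    # 1 - (1-q)^survivors, written with log1p/expm1 for numerical stability
    return -math.expm1(survivors * math.log1p(-q))


if __name__ == "__main__":
    print(prob_run_remaining(1_000_000, 0.5, 20))
```

Note that this only evaluates the approximation; it inherits the large-N and independence assumptions made above.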

For example, when removing half of a million members (N = 1,000,000, F = 0.5), the formula gives a chance of about 61.5% that 20 in a row remain.

If N is not large, the formula falls apart. I'm not sure how large N would need to be, but maybe N > 50Y would be good enough.
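One way to probe where the approximation starts to fail is to simulate the removal directly and compare against the formula. The sketch below is only an illustration under my own assumptions: exactly round(F*N) members are removed uniformly at random, the trial count and seed are arbitrary, and all function names are made up for the example.

```python
import random


def longest_run(kept: list[bool]) -> int:
    """Length of the longest block of consecutive surviving members."""
    best = current = 0
    for is_kept in kept:
        current = current + 1 if is_kept else 0
        best = max(best, current)
    return best


def simulate(n: int, f: float, y: int, trials: int = 2000, seed: int = 0) -> float:
    """Monte Carlo estimate of the chance that at least y consecutive members
    survive when round(f * n) members are removed uniformly at random."""
    rng = random.Random(seed)
    n_removed = round(f * n)
    hits = 0
    for _ in range(trials):
        kept = [True] * n
        for idx in rng.sample(range(n), n_removed):
            kept[idx] = False
        if longest_run(kept) >= y:
            hits += 1
    return hits / trials


def formula(n: int, f: float, y: int) -> float:
    """The approximation from the post, evaluated directly."""
    return 1.0 - (1.0 - (1.0 - f) ** (y - 1)) ** (n * (1.0 - f))


if __name__ == "__main__":
    for n in (50, 500, 5000):
        print(n, simulate(n, 0.5, 10, trials=500), formula(n, 0.5, 10))
```

Running it for a few values of N and Y shows how far the simulated frequency sits from the formula in each regime, which is a more direct check than guessing a threshold.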
 
