High School Not sure what this problem would be called....

Click For Summary
SUMMARY

The discussion centers on calculating the probability of mismatching bits between two randomly generated N-length binary arrays, with a focus on the binomial distribution. Users clarify that the probability of generating a "1" must be specified, and that the bell curve observed relates to the distribution of exact matches rather than mismatches. The binomial distribution formula, P(N,M) = (N over M) × (1/2)^N, is highlighted as essential for determining the probability of M matches in this context. The conversation concludes with acknowledgment of the binomial distribution curve as the necessary solution.

PREREQUISITES
  • Understanding of binomial distribution
  • Familiarity with Pascal's triangle
  • Basic knowledge of probability theory
  • Experience with binary arrays and bit manipulation
NEXT STEPS
  • Study the properties of binomial distribution in depth
  • Learn how to implement Pascal's triangle in programming
  • Explore applications of binomial distribution in real-world scenarios
  • Investigate statistical software tools for probability calculations
USEFUL FOR

Mathematicians, data scientists, software developers, and anyone interested in statistical modeling and probability calculations involving binary data.

Povilas M
Messages
9
Reaction score
1
Hi,

I am trying to write a function which would take two arguments:
-> number of bits (which are a binary 0 or 1 value) N
-> and acceptable number of mismatching bits M
The function would statistically determine the probability of having M or less mismatching bits when randomly generating two N long bit arrays and comparing them. Difference between 1001 and 1011 would be 1 for example as only the third bit differs. My testing indicates a bell curve is formed around N/2, but I've not done maths for years, it's hard for me to figure the formula out...

Thanks a lot
 
Physics news on Phys.org
Povilas M said:
My testing indicates a bell curve is formed around N/2
Hi Povilas:

Your problem statement confuses me a bit.
1. You did not mention what the the probability of generating a "1" is with the bit generator you are using. In the above quote the "N/2" suggests the probability is exactly 1/2, but you did not say that explicitly.
2. You say that your experiments produce a "bell curve", but your description says you want a curve for "M or less mismatching bits". That seems wrong to me since a curve for that result would not be bell shaped. What would be approximately bell shaped is a curve for exactly M matches.

If the "bell curve" you want is for exactly M matches, the answer is a binomial distribution curve. Take a look at

Hope this helps.

Regards,
Buzz
 
  • Like
Likes Povilas M and jedishrfu
Can you provide a context of where you might use this function?

I was thinking that you could simplify the problem by considering the first N bit array as all zeros and
now you only need to count all the N bit arrays with <=M bits as ones:

Basically its a binomial distribution problem ala pascal's triangle kind of problem:

https://en.wikipedia.org/wiki/Pascal's_triangle

540px-Pascal%27s_triangle_5.svg.png


so for a five bit string you'd have:
- 1 string of zeros
- 5 strings with 1 one bit
- 10 strings with 2 one bits
- 10 strings with 3 one bits
- 5 strings with 4 one bitys
- 1 string with all one bits
 
  • Like
Likes Povilas M
jedishrfu said:
Can you provide a context of where you might use this function?
Hi @jedishrfu:

I am not sure you were asking the question for me to answer, but here it is.

The binomial distribution has many uses. A common example is to calculate the probability of getting M heads when flipping "an honest coin" N times.
This is exactly the same probability of getting M matches (heads and heads or tails and tails) when flipping two "honest coins" M times.

jedishrfu said:
Basically its a binomial distribution problem ala pascal's triangle kind of problem:
Pascals triangle is a way of calculating the number of combinations of N things when M are of one kind and N-M are of a different kind. This usually noted as
(N over M).​
(Sorry about the awkwardness of the way I show the notation. The Wikipedia citation in post #2 shows it correctly.)
In the quote above I am not sure what "ala" means exactly, but
P(N,M) = (N over M) × (1/2)N,​
where P is the binomial distribution probability when P(1,1) = 1/2, the probability of heads for an "honest coin".

Regards,
Buzz
 
  • Like
Likes Povilas M
Thanks all! Binomial distribution curve was exactly what I needed, I figured out the rest from there. Jedishrfu, simplifying the problem was a very useful point as well!
Thanks again!
 
  • Like
Likes jedishrfu
My context question was for the original poster.

Hopefully, he can tell us more about the problem he's trying to solve.
 
Povilas M said:
Thanks all! Binomial distribution curve was exactly what I needed, I figured out the rest from there. Jedishrfu, simplifying the problem was a very useful point as well!
Thanks again!
So i guess youre not going to tell us the context.
 

Similar threads

  • · Replies 29 ·
Replies
29
Views
4K
  • · Replies 2 ·
Replies
2
Views
2K
  • · Replies 6 ·
Replies
6
Views
3K
  • · Replies 13 ·
Replies
13
Views
3K
  • · Replies 16 ·
Replies
16
Views
2K
  • · Replies 4 ·
Replies
4
Views
1K
  • · Replies 3 ·
Replies
3
Views
2K
Replies
2
Views
2K
Replies
1
Views
1K
  • · Replies 11 ·
Replies
11
Views
3K