Probability of finding a k-subset in a d-dimensional random

Click For Summary
SUMMARY

The discussion centers on calculating the probability of finding a k-subset in a d-dimensional random binary array, specifically with parameters n=10, d=2, k=4, s_1=3, and s_2=4. The problem involves determining the likelihood of selecting a subset where at least three individuals wear hats and at least four wear glasses, given fixed totals of 5 hats and 6 glasses. The participants suggest that this is a combinatorial problem potentially linked to Ramsey Theory and recommend using simulations for approximate solutions due to the complexity of analytical solutions.

PREREQUISITES
  • Understanding of combinatorial probability
  • Familiarity with binary arrays and their properties
  • Knowledge of Ramsey Theory concepts
  • Experience with simulation techniques for probability estimation
NEXT STEPS
  • Research combinatorial probability methods for k-subset selection
  • Explore Ramsey Theory and its applications in probability
  • Learn about Monte Carlo simulations for estimating probabilities
  • Investigate the properties of random binary arrays in statistical analysis
USEFUL FOR

Mathematicians, statisticians, data scientists, and anyone interested in combinatorial probability and its applications in random binary systems.

derpetermann
Messages
1
Reaction score
0
For a somwehat simple version of the problem, imagine a group of ten people (n=10), for which we observe two binary attributes (d= 2). One of the attributes tells us if any given person wears glasses (d_1) the other if any given person wears a hat (d_2). We know that in total 5 people wear hats and 6 people wear glasses, but we don't know who wears what (random binary array). What is the probability of finding a subset of size 4 (k= 4) such that at least three people in the subset wear a hat (s_1 = 3) AND at least four people wear glasses (s_2 = 4)? It's somewhat clear that there are realisations where wearing a hat and wearing glasses are mutually exclusive, such that those who have a hat don't have glasses, and realisations where having glasses and having a hat go together. Thus, for any realisation of the random binary array there might or might not be a subset of size k with the above properties. I can't figure, though, how to calculate the probabilities.

The more general definition of the problem is this:

What is the probability of finding a k-subset in a d-dimensional random binary array of length n, such that this subset has at least s_1 ones in dimension 1, s_2 ones in dimension 2, ..., s_d ones in dimension d. The total number of ones per dimension (ts_1 to ts_d) are fixed.

Is there a name to this problem? I tried to google, but I really did not know what to look for, since the problem definition is quite unwieldy to begin with.

Is the problem tractable analytically? If yes does the solution scale to larger d, k and n when solved algorithmically?

Your help is highly appreciated! Thanks a lot in advance!
 
Physics news on Phys.org
derpetermann said:
For a somwehat simple version of the problem, imagine a group of ten people (n=10), for which we observe two binary attributes (d= 2). One of the attributes tells us if any given person wears glasses (d_1) the other if any given person wears a hat (d_2). We know that in total 5 people wear hats and 6 people wear glasses, but we don't know who wears what (random binary array). What is the probability of finding a subset of size 4 (k= 4) such that at least three people in the subset wear a hat (s_1 = 3) AND at least four people wear glasses (s_2 = 4)? It's somewhat clear that there are realisations where wearing a hat and wearing glasses are mutually exclusive, such that those who have a hat don't have glasses, and realisations where having glasses and having a hat go together. Thus, for any realisation of the random binary array there might or might not be a subset of size k with the above properties. I can't figure, though, how to calculate the probabilities.

The more general definition of the problem is this:

What is the probability of finding a k-subset in a d-dimensional random binary array of length n, such that this subset has at least s_1 ones in dimension 1, s_2 ones in dimension 2, ..., s_d ones in dimension d. The total number of ones per dimension (ts_1 to ts_d) are fixed.

Is there a name to this problem? I tried to google, but I really did not know what to look for, since the problem definition is quite unwieldy to begin with.

Is the problem tractable analytically? If yes does the solution scale to larger d, k and n when solved algorithmically?

Your help is highly appreciated! Thanks a lot in advance!

Sorry, I know this is vague ( and I will look into it further and get back to you when/if I find something better) but it seems like a Combinatorial question maybe related to Ramsey Theory https://duckduckgo.com/?q=Ramsey+numbers&t=h_&ia=web
 
  • Like
Likes   Reactions: Klystron
Are d_1 and d_2 independent?
There are 5 options for the number of people wearing both a hat and glasses, you can calculate the probability separately for each case and then add them.

I don't know about nice analytic solutions when there are too many cases to consider all of them, but simulations should work very well for approximate answers.
 

Similar threads

  • · Replies 29 ·
Replies
29
Views
6K
  • · Replies 2 ·
Replies
2
Views
2K
  • · Replies 6 ·
Replies
6
Views
6K
  • · Replies 7 ·
Replies
7
Views
2K
  • · Replies 3 ·
Replies
3
Views
2K
  • · Replies 4 ·
Replies
4
Views
3K
  • · Replies 5 ·
Replies
5
Views
3K
  • · Replies 6 ·
Replies
6
Views
4K
Replies
4
Views
5K
Replies
9
Views
2K