Combinatorial Family of Problems

  • A
  • Thread starter mathman
  • Start date
  • #1
mathman
Science Advisor
7,741
406

Summary:

Distributing balls in boxes

Main Question or Discussion Point

Given i identical balls, tossed in at random in k boxes, what is the probability that there will be a box containing m balls? More difficult question, what is the probability that there will be two boxes containing m and n balls respectively where m##\ne ##n.
 

Answers and Replies

  • #2
11,269
4,732
Throwing one ball means 1/k chance of it being in a box so that for m balls in a box it would be ##(1/k)^m## right?
 
  • #3
Math_QED
Science Advisor
Homework Helper
2019 Award
1,349
484
Throwing one ball means 1/k chance of it being in a box so that for m balls in a box it would be ##(1/k)^m## right?
I don't think it is that easy. Your answer doesn't even depend on the amount of balls.
 
  • #4
pbuk
Science Advisor
Gold Member
1,235
262
Throwing one ball means 1/k chance of it being in a box so that for m balls in a box it would be ##(1/k)^m## right?
I don't think it is that easy. Your answer doesn't even depend on the amount of balls.
Indeed. Assuming the question is what is the probability that there will be exactly 1 box containing exactly m balls it is fairly easy though, its just a binomial trial with m successes out of i trials.
$$ \binom{i}{m} (1/k)^m (1-1/k)^{(i-m)} $$

More difficult question, what is the probability that there will be two boxes containing m and n balls respectively where m##\ne ##n.
Yes this is more difficult, again restating as what is the probability that there will be exactly two boxes containing exactly m and n balls respectively where m##\ne ##n.

It is a bit late here to be doing things like this, but it should be something like
$$ P(i, 2) p_m p_n (1 - p_m - p_n)^{(i - 2)} $$ where ## P(i, 2) ## is the number of permutations and ## p_m ## and ## p_n ## are calculated as in the solution to the first question above.
 
  • #5
PeroK
Science Advisor
Homework Helper
Insights Author
Gold Member
11,266
4,448
Indeed. Assuming the question is what is the probability that there will be exactly 1 box containing exactly m balls it is fairly easy though, its just a binomial trial with m successes out of i trials.
$$ \binom{i}{m} (1/k)^m (1-1/k)^{(i-m)} $$
That's the probability that if you pick a particular box there will be be ##m## balls in it. That doesn't test whether there are ##m## balls in some other box.
 
  • #6
PeroK
Science Advisor
Homework Helper
Insights Author
Gold Member
11,266
4,448
Summary:: Distributing balls in boxes

Given i identical balls, tossed in at random in k boxes, what is the probability that there will be a box containing m balls?
You can get the expected number of boxes with ##m## balls. First, pick a box and compute the probability that box has exactly ##m## balls in it - as @pbuk did above. The expected number of boxes with ##m## balls is then ##k## times this.

Clearly if ##m > i/2## then that gives the answer, as you can have at most one box with ##m## balls.

But, if ##m \le i/2## then you can get two or more boxes with ##m## balls. If you let ##r## be the number of boxes with ##m## balls, then you want to calculate the probability that ##r \ne 0##.

In general, I think this is difficult. But, if ##i## is large you could perhaps estimate ##r## as some sort of distribution with a mean of the calculated expected value, hence estimate the probability that ##r = 0##.
 
  • #7
S.G. Janssens
Science Advisor
Education Advisor
862
624
I would consult Volume 1 of Feller (An Introduction to Probability Theory and Its Applications), particularly his rather extensive treatment of what he refers to as "occupancy problems". There is a positive probability that you will find there what you are looking for.
 
  • #8
pbuk
Science Advisor
Gold Member
1,235
262
That's the probability that if you pick a particular box there will be be ##m## balls in it. That doesn't test whether there are ##m## balls in some other box.
Good catch. So let
$$ p_m = \binom{i}{m} (1/k)^m (1-1/k)^{(i-m)} $$
be the probability that if you pick a particular box there will be be ##m## balls in it, then isn't the probability that there will be exactly one such box
$$ \binom{k}{1} p_m(1-p_m)^{(k-1)} $$
? And isn't my answer to part 2 almost correct, barring confusion over i, k and m?
$$ P(k, 2) p_m p_n (1 - p_m - p_n)^{(k - 2)} $$
I may run a Monte Carlo check of these later.
 
  • #9
PeroK
Science Advisor
Homework Helper
Insights Author
Gold Member
11,266
4,448
Good catch. So let
$$ p_m = \binom{i}{m} (1/k)^m (1-1/k)^{(i-m)} $$
be the probability that if you pick a particular box there will be be ##m## balls in it, then isn't the probability that there will be exactly one such box
$$ \binom{k}{1} p_m(1-p_m)^{(k-1)} $$
No, because there being ##m## balls in one box and ##m## balls in another are not independent.
 
  • #10
pbuk
Science Advisor
Gold Member
1,235
262
No, because there being ##m## balls in one box and ##m## balls in another are not independent.
Of course - I have been very obtuse. This is a difficult problem.
 
  • #11
mathman
Science Advisor
7,741
406
Background: There was a specific problem (math.stackexchange.com) where i=10, k=4, and (m,n)=(2,3), which was not solved and then was closed by the nannies of that forum.

To clarify original question - m and n were meant to be exact, not at least.
 
  • #12
Klystron
Gold Member
516
599
Tossing the balls in to boxes "at random" rather than choosing balls, potentially adds more complication to this question.

Consider the 'skee-ball effect' where the player rolls each ball up an inclined ramp attempting to score points by tossing the ball into concentric circles with highest scores nearest the origin. Even stipulating a grid of |sqrt(k)| boxes and a blind toss should favor the central boxes, say utilizing a 10x10 grid.

While the question does not address the design of the box pattern or the size of the box opening in relation to the ball radius or the angle of the 'tosser' (NPI) to the box openings, this is not just a question of counting. The pigeon-hole principle as I understand it, presumes the pigeons fly into the first/next least occupied box, depending on the exact wording of the question.
 
  • #13
mathman
Science Advisor
7,741
406
Tossing the balls in to boxes "at random" rather than choosing balls, potentially adds more complication to this question.

Consider the 'skee-ball effect' where the player rolls each ball up an inclined ramp attempting to score points by tossing the ball into concentric circles with highest scores nearest the origin. Even stipulating a grid of |sqrt(k)| boxes and a blind toss should favor the central boxes, say utilizing a 10x10 grid.

While the question does not address the design of the box pattern or the size of the box opening in relation to the ball radius or the angle of the 'tosser' (NPI) to the box openings, this is not just a question of counting. The pigeon-hole principle as I understand it, presumes the pigeons fly into the first/next least occupied box, depending on the exact wording of the question.
The poblem I posed assumed the simplest possible arrangement. For each ball, all boxes are equally likely to be chosen and the choices are independent between balls.
 
  • #14
WWGD
Science Advisor
Gold Member
2019 Award
5,124
2,322
Summary:: Distributing balls in boxes

Given i identical balls, tossed in at random in k boxes, what is the probability that there will be a box containing m balls? More difficult question, what is the probability that there will be two boxes containing m and n balls respectively where m##\ne ##n.
Any condition on whether i>k, i<k i=k? EDIT: It seems we can use the balls in boxes results to:
throw m balls in one box and then count the number of ways of having i-m balls in other boxes and combine with some inclusion-exclusion formula.
 
  • #15
StoneTemplePython
Science Advisor
Gold Member
2019 Award
1,138
540
Background: There was a specific problem (math.stackexchange.com) where i=10, k=4, and (m,n)=(2,3), which was not solved and then was closed by the nannies of that forum.

To clarify original question - m and n were meant to be exact, not at least.
for small i like this, what I'm going to say won't be that useful, but for medium to large i, there's a nice way to get an upper bound in the 1 dimensional case using Azuma-Hoeffding. The structural key features are (a) easy to estimate expected value and (b) bounded differences for each 'random placement'. For a 2-d problem as you've posed, though I haven't looked that closely the structural features look about the same so I imagine Hoeffding or Bernstein type concentration inequalities should work.

Again these type of inequalities are pretty weak for small i, but the bounds do tighten exponentially.

There may be a closed form solution here but it's likely to be rather unpleasant. The pleasant approach for these kinds of problem is a one-two punch of (1) work it out numerically (enumeration or simulation) and (2) get some nice analytical bounds that you can use. An important special case: for vary large i and k, enumeration will tap out, and for low probability events, simulation could be problematic but (2) works very nicely in that case.
 
  • #16
WWGD
Science Advisor
Gold Member
2019 Award
5,124
2,322
Any condition on whether i>k, i<k i=k? EDIT: It seems we can use the balls in boxes results to:
throw m balls in one box and then count the number of ways of having i-m balls in other boxes and combine with some inclusion-exclusion formula.
I am thinking on inducting on the quotient q of number of balls by number b of boxes. This will give us the number of boxes that can have a certain number n of balls as q/n ( taking largest integer less than the quotient) then you know how many boxes can have n or more balls. So say we have 20 balls in 7 boxes and want to find the probability of having 3 in one. From 20/3 , we know that we may have at most 6 boxes with 3 or more balls, etc.
 
Last edited:
  • #17
pbuk
Science Advisor
Gold Member
1,235
262
Background: There was a specific problem (math.stackexchange.com) where i=10, k=4, and (m,n)=(2,3), which was not solved and then was closed by the nannies of that forum.

To clarify original question - m and n were meant to be exact, not at least.
Well there are only ## 4^{10} \approx 10^6 ## ways to toss 10 balls into 4 containers so this can easily be brute forced.
 
  • #18
PeroK
Science Advisor
Homework Helper
Insights Author
Gold Member
11,266
4,448
Background: There was a specific problem (math.stackexchange.com) where i=10, k=4, and (m,n)=(2,3), which was not solved and then was closed by the nannies of that forum.

To clarify original question - m and n were meant to be exact, not at least.
It's not too hard if you have specific numbers. In this case, there are four successful "patterns" for how many balls are in each box: ##0, 2, 3, 5## and ##1, 2, 3, 4##. These are the only patterns that have exactly one box with two balls and exactly one box with three balls.

Each pattern has ##24## permutations for which box has each number of balls. So, we can calculate the probablity that we get ##0, 2, 3, 5## in that order and then multiply by ##24## to get the total.

Now, we use the trick to number the balls and count how many ways we can get ##0, 2, 3, 5##, which is:

##N_1 = 24 \times 1 \times \binom{10}{2} \times \binom{8}{3} = 60,480##

And, for ##1, 2, 3, 4## we have:

##N_2 = 24 \times 10 \times \binom{9}{2} \times \binom{7}{3} = 302,400##

With the balls numbered, there are a total of ##4^{10}## possibilities, so the probability we want is:

##p = \frac{N_1 + N_2}{4^{10}} \approx 0.35##

It seems quite high, but I checked some other possibilities and it looks about right.

PS It all works out. The pattern ##1,2,3,4## is far and away the most likely, with a probability of ##0.29##. The second most likely pattern is ##2,2,3,3## at ##0.14##.

PPS The only additional pattern where there is at least one box with two balls and at least one box with three balls is ##2,2,3,3##. That would bump the probability up to ##0.49## for the alternative problem.
 
Last edited:
  • #19
mathman
Science Advisor
7,741
406
The statement includes 2,2,3,3 as allowed.
 
  • #20
PeroK
Science Advisor
Homework Helper
Insights Author
Gold Member
11,266
4,448
The statement includes 2,2,3,3 as allowed.
Okay, so it's ##0.49## then.
 
  • #21
pbuk
Science Advisor
Gold Member
1,235
262
The statement includes 2,2,3,3 as allowed.
No, it was later clarified:
To clarify original question - m and n were meant to be exact, not at least.
 

Related Threads for: Combinatorial Family of Problems

  • Last Post
Replies
4
Views
1K
  • Last Post
Replies
1
Views
522
  • Last Post
Replies
0
Views
1K
  • Last Post
Replies
10
Views
3K
Replies
0
Views
2K
Replies
13
Views
3K
  • Last Post
Replies
7
Views
3K
  • Last Post
Replies
8
Views
1K
Top