# Combinatorial Family of Problems

• A

## Summary:

Distributing balls in boxes

## Main Question or Discussion Point

Given i identical balls, tossed in at random in k boxes, what is the probability that there will be a box containing m balls? More difficult question, what is the probability that there will be two boxes containing m and n balls respectively where m$\ne$n.

Related Set Theory, Logic, Probability, Statistics News on Phys.org
jedishrfu
Mentor
Throwing one ball means 1/k chance of it being in a box so that for m balls in a box it would be $(1/k)^m$ right?

Math_QED
Homework Helper
2019 Award
Throwing one ball means 1/k chance of it being in a box so that for m balls in a box it would be $(1/k)^m$ right?
I don't think it is that easy. Your answer doesn't even depend on the amount of balls.

pbuk
Gold Member
Throwing one ball means 1/k chance of it being in a box so that for m balls in a box it would be $(1/k)^m$ right?
I don't think it is that easy. Your answer doesn't even depend on the amount of balls.
Indeed. Assuming the question is what is the probability that there will be exactly 1 box containing exactly m balls it is fairly easy though, its just a binomial trial with m successes out of i trials.
$$\binom{i}{m} (1/k)^m (1-1/k)^{(i-m)}$$

More difficult question, what is the probability that there will be two boxes containing m and n balls respectively where m$\ne$n.
Yes this is more difficult, again restating as what is the probability that there will be exactly two boxes containing exactly m and n balls respectively where m$\ne$n.

It is a bit late here to be doing things like this, but it should be something like
$$P(i, 2) p_m p_n (1 - p_m - p_n)^{(i - 2)}$$ where $P(i, 2)$ is the number of permutations and $p_m$ and $p_n$ are calculated as in the solution to the first question above.

PeroK
Homework Helper
Gold Member
Indeed. Assuming the question is what is the probability that there will be exactly 1 box containing exactly m balls it is fairly easy though, its just a binomial trial with m successes out of i trials.
$$\binom{i}{m} (1/k)^m (1-1/k)^{(i-m)}$$
That's the probability that if you pick a particular box there will be be $m$ balls in it. That doesn't test whether there are $m$ balls in some other box.

PeroK
Homework Helper
Gold Member
Summary:: Distributing balls in boxes

Given i identical balls, tossed in at random in k boxes, what is the probability that there will be a box containing m balls?
You can get the expected number of boxes with $m$ balls. First, pick a box and compute the probability that box has exactly $m$ balls in it - as @pbuk did above. The expected number of boxes with $m$ balls is then $k$ times this.

Clearly if $m > i/2$ then that gives the answer, as you can have at most one box with $m$ balls.

But, if $m \le i/2$ then you can get two or more boxes with $m$ balls. If you let $r$ be the number of boxes with $m$ balls, then you want to calculate the probability that $r \ne 0$.

In general, I think this is difficult. But, if $i$ is large you could perhaps estimate $r$ as some sort of distribution with a mean of the calculated expected value, hence estimate the probability that $r = 0$.

S.G. Janssens
I would consult Volume 1 of Feller (An Introduction to Probability Theory and Its Applications), particularly his rather extensive treatment of what he refers to as "occupancy problems". There is a positive probability that you will find there what you are looking for.

pbuk
Gold Member
That's the probability that if you pick a particular box there will be be $m$ balls in it. That doesn't test whether there are $m$ balls in some other box.
Good catch. So let
$$p_m = \binom{i}{m} (1/k)^m (1-1/k)^{(i-m)}$$
be the probability that if you pick a particular box there will be be $m$ balls in it, then isn't the probability that there will be exactly one such box
$$\binom{k}{1} p_m(1-p_m)^{(k-1)}$$
? And isn't my answer to part 2 almost correct, barring confusion over i, k and m?
$$P(k, 2) p_m p_n (1 - p_m - p_n)^{(k - 2)}$$
I may run a Monte Carlo check of these later.

PeroK
Homework Helper
Gold Member
Good catch. So let
$$p_m = \binom{i}{m} (1/k)^m (1-1/k)^{(i-m)}$$
be the probability that if you pick a particular box there will be be $m$ balls in it, then isn't the probability that there will be exactly one such box
$$\binom{k}{1} p_m(1-p_m)^{(k-1)}$$
No, because there being $m$ balls in one box and $m$ balls in another are not independent.

pbuk
Gold Member
No, because there being $m$ balls in one box and $m$ balls in another are not independent.
Of course - I have been very obtuse. This is a difficult problem.

Background: There was a specific problem (math.stackexchange.com) where i=10, k=4, and (m,n)=(2,3), which was not solved and then was closed by the nannies of that forum.

To clarify original question - m and n were meant to be exact, not at least.

Klystron
Gold Member
Tossing the balls in to boxes "at random" rather than choosing balls, potentially adds more complication to this question.

Consider the 'skee-ball effect' where the player rolls each ball up an inclined ramp attempting to score points by tossing the ball into concentric circles with highest scores nearest the origin. Even stipulating a grid of |sqrt(k)| boxes and a blind toss should favor the central boxes, say utilizing a 10x10 grid.

While the question does not address the design of the box pattern or the size of the box opening in relation to the ball radius or the angle of the 'tosser' (NPI) to the box openings, this is not just a question of counting. The pigeon-hole principle as I understand it, presumes the pigeons fly into the first/next least occupied box, depending on the exact wording of the question.

Tossing the balls in to boxes "at random" rather than choosing balls, potentially adds more complication to this question.

Consider the 'skee-ball effect' where the player rolls each ball up an inclined ramp attempting to score points by tossing the ball into concentric circles with highest scores nearest the origin. Even stipulating a grid of |sqrt(k)| boxes and a blind toss should favor the central boxes, say utilizing a 10x10 grid.

While the question does not address the design of the box pattern or the size of the box opening in relation to the ball radius or the angle of the 'tosser' (NPI) to the box openings, this is not just a question of counting. The pigeon-hole principle as I understand it, presumes the pigeons fly into the first/next least occupied box, depending on the exact wording of the question.
The poblem I posed assumed the simplest possible arrangement. For each ball, all boxes are equally likely to be chosen and the choices are independent between balls.

WWGD
Gold Member
2019 Award
Summary:: Distributing balls in boxes

Given i identical balls, tossed in at random in k boxes, what is the probability that there will be a box containing m balls? More difficult question, what is the probability that there will be two boxes containing m and n balls respectively where m$\ne$n.
Any condition on whether i>k, i<k i=k? EDIT: It seems we can use the balls in boxes results to:
throw m balls in one box and then count the number of ways of having i-m balls in other boxes and combine with some inclusion-exclusion formula.

StoneTemplePython
Gold Member
2019 Award
Background: There was a specific problem (math.stackexchange.com) where i=10, k=4, and (m,n)=(2,3), which was not solved and then was closed by the nannies of that forum.

To clarify original question - m and n were meant to be exact, not at least.
for small i like this, what I'm going to say won't be that useful, but for medium to large i, there's a nice way to get an upper bound in the 1 dimensional case using Azuma-Hoeffding. The structural key features are (a) easy to estimate expected value and (b) bounded differences for each 'random placement'. For a 2-d problem as you've posed, though I haven't looked that closely the structural features look about the same so I imagine Hoeffding or Bernstein type concentration inequalities should work.

Again these type of inequalities are pretty weak for small i, but the bounds do tighten exponentially.

There may be a closed form solution here but it's likely to be rather unpleasant. The pleasant approach for these kinds of problem is a one-two punch of (1) work it out numerically (enumeration or simulation) and (2) get some nice analytical bounds that you can use. An important special case: for vary large i and k, enumeration will tap out, and for low probability events, simulation could be problematic but (2) works very nicely in that case.

WWGD
Gold Member
2019 Award
Any condition on whether i>k, i<k i=k? EDIT: It seems we can use the balls in boxes results to:
throw m balls in one box and then count the number of ways of having i-m balls in other boxes and combine with some inclusion-exclusion formula.
I am thinking on inducting on the quotient q of number of balls by number b of boxes. This will give us the number of boxes that can have a certain number n of balls as q/n ( taking largest integer less than the quotient) then you know how many boxes can have n or more balls. So say we have 20 balls in 7 boxes and want to find the probability of having 3 in one. From 20/3 , we know that we may have at most 6 boxes with 3 or more balls, etc.

Last edited:
pbuk
Gold Member
Background: There was a specific problem (math.stackexchange.com) where i=10, k=4, and (m,n)=(2,3), which was not solved and then was closed by the nannies of that forum.

To clarify original question - m and n were meant to be exact, not at least.
Well there are only $4^{10} \approx 10^6$ ways to toss 10 balls into 4 containers so this can easily be brute forced.

PeroK
Homework Helper
Gold Member
Background: There was a specific problem (math.stackexchange.com) where i=10, k=4, and (m,n)=(2,3), which was not solved and then was closed by the nannies of that forum.

To clarify original question - m and n were meant to be exact, not at least.
It's not too hard if you have specific numbers. In this case, there are four successful "patterns" for how many balls are in each box: $0, 2, 3, 5$ and $1, 2, 3, 4$. These are the only patterns that have exactly one box with two balls and exactly one box with three balls.

Each pattern has $24$ permutations for which box has each number of balls. So, we can calculate the probablity that we get $0, 2, 3, 5$ in that order and then multiply by $24$ to get the total.

Now, we use the trick to number the balls and count how many ways we can get $0, 2, 3, 5$, which is:

$N_1 = 24 \times 1 \times \binom{10}{2} \times \binom{8}{3} = 60,480$

And, for $1, 2, 3, 4$ we have:

$N_2 = 24 \times 10 \times \binom{9}{2} \times \binom{7}{3} = 302,400$

With the balls numbered, there are a total of $4^{10}$ possibilities, so the probability we want is:

$p = \frac{N_1 + N_2}{4^{10}} \approx 0.35$

It seems quite high, but I checked some other possibilities and it looks about right.

PS It all works out. The pattern $1,2,3,4$ is far and away the most likely, with a probability of $0.29$. The second most likely pattern is $2,2,3,3$ at $0.14$.

PPS The only additional pattern where there is at least one box with two balls and at least one box with three balls is $2,2,3,3$. That would bump the probability up to $0.49$ for the alternative problem.

Last edited:
The statement includes 2,2,3,3 as allowed.

PeroK
Homework Helper
Gold Member
The statement includes 2,2,3,3 as allowed.
Okay, so it's $0.49$ then.

pbuk