Combinatorial Family of Problems

mathman · Jan 13, 2020

Given i identical balls, tossed in at random in k boxes, what is the probability that there will be a box containing m balls? More difficult question, what is the probability that there will be two boxes containing m and n balls respectively where m##\ne ##n.

jedishrfu · Jan 13, 2020

Throwing one ball means 1/k chance of it being in a box so that for m balls in a box it would be ##(1/k)^m## right?

member 587159 · Jan 13, 2020

jedishrfu said:

Throwing one ball means 1/k chance of it being in a box so that for m balls in a box it would be ##(1/k)^m## right?

I don't think it is that easy. Your answer doesn't even depend on the amount of balls.

pbuk · Jan 13, 2020

Math_QED said:

jedishrfu said:

Throwing one ball means 1/k chance of it being in a box so that for m balls in a box it would be ##(1/k)^m## right?

I don't think it is that easy. Your answer doesn't even depend on the amount of balls.

Indeed. Assuming the question is what is the probability that there will be exactly 1 box containing exactly m balls it is fairly easy though, its just a binomial trial with m successes out of i trials.
$$ \binom{i}{m} (1/k)^m (1-1/k)^{(i-m)} $$

mathman said:

More difficult question, what is the probability that there will be two boxes containing m and n balls respectively where m##\ne ##n.

Yes this is more difficult, again restating as what is the probability that there will be exactly two boxes containing exactly m and n balls respectively where m##\ne ##n.

It is a bit late here to be doing things like this, but it should be something like
$$ P(i, 2) p_m p_n (1 - p_m - p_n)^{(i - 2)} $$ where ## P(i, 2) ## is the number of permutations and ## p_m ## and ## p_n ## are calculated as in the solution to the first question above.

PeroK · Jan 14, 2020

pbuk said:

Indeed. Assuming the question is what is the probability that there will be exactly 1 box containing exactly m balls it is fairly easy though, its just a binomial trial with m successes out of i trials.
$$ \binom{i}{m} (1/k)^m (1-1/k)^{(i-m)} $$

That's the probability that if you pick a particular box there will be be ##m## balls in it. That doesn't test whether there are ##m## balls in some other box.

PeroK · Jan 14, 2020

mathman said:

Summary:: Distributing balls in boxes

Given i identical balls, tossed in at random in k boxes, what is the probability that there will be a box containing m balls?

You can get the expected number of boxes with ##m## balls. First, pick a box and compute the probability that box has exactly ##m## balls in it - as @pbuk did above. The expected number of boxes with ##m## balls is then ##k## times this.

Clearly if ##m > i/2## then that gives the answer, as you can have at most one box with ##m## balls.

But, if ##m \le i/2## then you can get two or more boxes with ##m## balls. If you let ##r## be the number of boxes with ##m## balls, then you want to calculate the probability that ##r \ne 0##.

In general, I think this is difficult. But, if ##i## is large you could perhaps estimate ##r## as some sort of distribution with a mean of the calculated expected value, hence estimate the probability that ##r = 0##.

S.G. Janssens · Jan 14, 2020

I would consult Volume 1 of Feller (An Introduction to Probability Theory and Its Applications), particularly his rather extensive treatment of what he refers to as "occupancy problems". There is a positive probability that you will find there what you are looking for.

pbuk · Jan 14, 2020

PeroK said:

That's the probability that if you pick a particular box there will be be ##m## balls in it. That doesn't test whether there are ##m## balls in some other box.

Good catch. So let
$$ p_m = \binom{i}{m} (1/k)^m (1-1/k)^{(i-m)} $$
be the probability that if you pick a particular box there will be be ##m## balls in it, then isn't the probability that there will be exactly one such box
$$ \binom{k}{1} p_m(1-p_m)^{(k-1)} $$
? And isn't my answer to part 2 almost correct, barring confusion over i, k and m?
$$ P(k, 2) p_m p_n (1 - p_m - p_n)^{(k - 2)} $$
I may run a Monte Carlo check of these later.

PeroK · Jan 14, 2020

pbuk said:

Good catch. So let
$$ p_m = \binom{i}{m} (1/k)^m (1-1/k)^{(i-m)} $$
be the probability that if you pick a particular box there will be be ##m## balls in it, then isn't the probability that there will be exactly one such box
$$ \binom{k}{1} p_m(1-p_m)^{(k-1)} $$

No, because there being ##m## balls in one box and ##m## balls in another are not independent.

pbuk · Jan 14, 2020

PeroK said:

No, because there being ##m## balls in one box and ##m## balls in another are not independent.

Of course - I have been very obtuse. This is a difficult problem.

mathman · Jan 14, 2020

Background: There was a specific problem (math.stackexchange.com) where i=10, k=4, and (m,n)=(2,3), which was not solved and then was closed by the nannies of that forum.

To clarify original question - m and n were meant to be exact, not at least.

Klystron · Jan 15, 2020

Tossing the balls into boxes "at random" rather than choosing balls, potentially adds more complication to this question.

Consider the 'skee-ball effect' where the player rolls each ball up an inclined ramp attempting to score points by tossing the ball into concentric circles with highest scores nearest the origin. Even stipulating a grid of |sqrt(k)| boxes and a blind toss should favor the central boxes, say utilizing a 10x10 grid.

While the question does not address the design of the box pattern or the size of the box opening in relation to the ball radius or the angle of the 'tosser' (NPI) to the box openings, this is not just a question of counting. The pigeon-hole principle as I understand it, presumes the pigeons fly into the first/next least occupied box, depending on the exact wording of the question.

mathman · Jan 15, 2020

Klystron said:

Tossing the balls into boxes "at random" rather than choosing balls, potentially adds more complication to this question.

Consider the 'skee-ball effect' where the player rolls each ball up an inclined ramp attempting to score points by tossing the ball into concentric circles with highest scores nearest the origin. Even stipulating a grid of |sqrt(k)| boxes and a blind toss should favor the central boxes, say utilizing a 10x10 grid.

While the question does not address the design of the box pattern or the size of the box opening in relation to the ball radius or the angle of the 'tosser' (NPI) to the box openings, this is not just a question of counting. The pigeon-hole principle as I understand it, presumes the pigeons fly into the first/next least occupied box, depending on the exact wording of the question.

The poblem I posed assumed the simplest possible arrangement. For each ball, all boxes are equally likely to be chosen and the choices are independent between balls.

WWGD · Jan 15, 2020

mathman said:

Summary:: Distributing balls in boxes

Given i identical balls, tossed in at random in k boxes, what is the probability that there will be a box containing m balls? More difficult question, what is the probability that there will be two boxes containing m and n balls respectively where m##\ne ##n.

Any condition on whether i>k, i<k i=k? EDIT: It seems we can use the balls in boxes results to:
throw m balls in one box and then count the number of ways of having i-m balls in other boxes and combine with some inclusion-exclusion formula.

StoneTemplePython · Jan 15, 2020

mathman said:

Background: There was a specific problem (math.stackexchange.com) where i=10, k=4, and (m,n)=(2,3), which was not solved and then was closed by the nannies of that forum.

To clarify original question - m and n were meant to be exact, not at least.

for small i like this, what I'm going to say won't be that useful, but for medium to large i, there's a nice way to get an upper bound in the 1 dimensional case using Azuma-Hoeffding. The structural key features are (a) easy to estimate expected value and (b) bounded differences for each 'random placement'. For a 2-d problem as you've posed, though I haven't looked that closely the structural features look about the same so I imagine Hoeffding or Bernstein type concentration inequalities should work.

Again these type of inequalities are pretty weak for small i, but the bounds do tighten exponentially.

There may be a closed form solution here but it's likely to be rather unpleasant. The pleasant approach for these kinds of problem is a one-two punch of (1) work it out numerically (enumeration or simulation) and (2) get some nice analytical bounds that you can use. An important special case: for vary large i and k, enumeration will tap out, and for low probability events, simulation could be problematic but (2) works very nicely in that case.

WWGD · Jan 16, 2020

WWGD said:

Any condition on whether i>k, i<k i=k? EDIT: It seems we can use the balls in boxes results to:
throw m balls in one box and then count the number of ways of having i-m balls in other boxes and combine with some inclusion-exclusion formula.

I am thinking on inducting on the quotient q of number of balls by number b of boxes. This will give us the number of boxes that can have a certain number n of balls as q/n ( taking largest integer less than the quotient) then you know how many boxes can have n or more balls. So say we have 20 balls in 7 boxes and want to find the probability of having 3 in one. From 20/3 , we know that we may have at most 6 boxes with 3 or more balls, etc.

pbuk · Jan 16, 2020

mathman said:

Background: There was a specific problem (math.stackexchange.com) where i=10, k=4, and (m,n)=(2,3), which was not solved and then was closed by the nannies of that forum.

To clarify original question - m and n were meant to be exact, not at least.

Well there are only ## 4^{10} \approx 10^6 ## ways to toss 10 balls into 4 containers so this can easily be brute forced.

PeroK · Jan 17, 2020

mathman said:

Background: There was a specific problem (math.stackexchange.com) where i=10, k=4, and (m,n)=(2,3), which was not solved and then was closed by the nannies of that forum.

To clarify original question - m and n were meant to be exact, not at least.

It's not too hard if you have specific numbers. In this case, there are four successful "patterns" for how many balls are in each box: ##0, 2, 3, 5## and ##1, 2, 3, 4##. These are the only patterns that have exactly one box with two balls and exactly one box with three balls.

Each pattern has ##24## permutations for which box has each number of balls. So, we can calculate the probability that we get ##0, 2, 3, 5## in that order and then multiply by ##24## to get the total.

Now, we use the trick to number the balls and count how many ways we can get ##0, 2, 3, 5##, which is:

##N_1 = 24 \times 1 \times \binom{10}{2} \times \binom{8}{3} = 60,480##

And, for ##1, 2, 3, 4## we have:

##N_2 = 24 \times 10 \times \binom{9}{2} \times \binom{7}{3} = 302,400##

With the balls numbered, there are a total of ##4^{10}## possibilities, so the probability we want is:

##p = \frac{N_1 + N_2}{4^{10}} \approx 0.35##

It seems quite high, but I checked some other possibilities and it looks about right.

PS It all works out. The pattern ##1,2,3,4## is far and away the most likely, with a probability of ##0.29##. The second most likely pattern is ##2,2,3,3## at ##0.14##.

PPS The only additional pattern where there is at least one box with two balls and at least one box with three balls is ##2,2,3,3##. That would bump the probability up to ##0.49## for the alternative problem.

mathman · Jan 17, 2020

The statement includes 2,2,3,3 as allowed.

PeroK · Jan 17, 2020

mathman said:

The statement includes 2,2,3,3 as allowed.

Okay, so it's ##0.49## then.

pbuk · Jan 17, 2020

mathman said:

The statement includes 2,2,3,3 as allowed.

No, it was later clarified:

mathman said:

To clarify original question - m and n were meant to be exact, not at least.

Combinatorial Family of Problems

Discussion Overview

Discussion Character

Main Points Raised

Areas of Agreement / Disagreement

Contextual Notes

Similar threads

Graduate Hypothesis testing: Defining H0, HA hypotheses so that ( H_A)_A' makes sense

Undergrad My basic understanding of set theory

Undergrad How do E[X] and E[|X|] relate?

Graduate Expected numbers of cards of a last color remaining

Undergrad How does axiom of foundation prevent infinite sequence of elements?

Insights Remote Operated Gate Control System

Insights AI Enriched Problem Solving

Insights Thinking Outside The Box Versus Knowing What’s In The Box

Insights Why Entangled Photon-Polarization Qubits Violate Bell’s Inequality

Insights Quantum Entanglement is a Kinematic Fact, not a Dynamical Effect

Insights What Exactly is Dirac’s Delta Function? - Insight