# Homework Help: Probability problem

1. Feb 8, 2015

### tony24810

1. The problem statement, all variables and given/known data

A teacher would like to distribute 20 candies to 5 children, each of whom receives at least two candies.

(a) Find the probability that at least one child receives at least 6 candies.

(b) Find the probability that at least one child receives at least 7 candies, given that at least one child receives at least 6 candies.

2. Relevant equations

(a) P(required) = $1 - \frac {1} {C^{20-1}_{5-1}}$

= $\frac {3875}{3876}$

(b) P(required) = $\frac{\frac {3875}{3876}-\frac{5 C_{4-1}^{14-1}}{C_{5-1}^{20-1}}+\frac{C_2^5 C_{3-1}^{8-1}}{C_{5-1}^{20-1}}-\frac{C_3^5}{C_{5-1}^{20-1}}}{\frac {3875}{3876}}$

= $\frac{529}{775}$

3. The attempt at a solution

I first thought of simplifying the problem. Because each child has to get at least 2 candies, I thought of changing the question to distributing 10 candies to 5 children with no restriction.

And then I have no idea what to do.

I looked at the answer but I couldn't work out what it is doing.

Please could someone give me some hints? Thanks.

2. Feb 8, 2015

### haruspex

The problem is badly worded. The probabilities depend on the teacher's algorithm for distributing them.
Suppose the algorithm is to divide the candies (considered as identical) into five piles (considered non-identical), then discard the distribution if it doesn't meet the criteria and repeat until we happen on one that does. Can you make some progress from there?
(Making such a choice about whether a given collection, children or candies, consists of identical, interchangeable entities or distinct, 'labelled' entities affects the answer, and can make a huge difference to the ease of calculation. I've chosen a combination that commonly applies to such questions and is mathematically manageable.)
Whatever the correct interpretation, your simplification is not going to work.
The given answer looks peculiar, but I haven't checked it yet. Bedtime.

Must have been past bedtime. I'll correct one statement: your simplification may well be right (probably is) with some interpretations, but not with others.

Last edited: Feb 8, 2015
3. Feb 8, 2015

### Ray Vickson

haruspex brings up a very good point, which I will expand upon. Suppose we want the conditional probability distribution $P(j_1,j_2,j_3,j_4,j_5 | E)$, where $E = \{\text{ each} \: j_i \geq 2 \}$. Here, $j_i$ is the number of candies received by child $i = 1,2,3,4,5$.

You can look at the problem in two ways:

Method 1:
As a straight conditional probability problem, so that for each $j_i \geq 2$ we have
$$P(j_1,j_2,j_3,j_4,j_5 | E) = \frac{P(j_1,j_2,j_3,j_4,j_5)}{P(E)}$$
(This holds because when each $j_i \geq 2$ the event $E$ automatically occurs, so $(j_1,j_2,j_3,j_4,j_5) \cap E = (j_1,j_2,j_3,j_4,j_5)$ in the numerator.)

Method 2: First give two candies to each child, then distribute the remaining candies to the children at random and without restrictions.

Method 1 is essentially the "rejection method" proposed by haruspex, while Method 2 is the one you proposed in your partial solution.

Here comes the absolutely amazing part: the two methods give different answers!
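To see the disagreement on the original problem as well, here is a minimal sketch. It assumes that "at random" means each candy goes independently to a uniformly chosen child, so an allocation $(j_1,\ldots,j_5)$ has weight proportional to a multinomial coefficient. That assumption, and every name in the code, is illustrative rather than something stated in the problem.

```python
from fractions import Fraction
from itertools import product
from math import factorial

CHILDREN, CANDIES, MINIMUM = 5, 20, 2

def multinomial(counts):
    """Ways to hand out labelled candies so the children get these counts."""
    ways = factorial(sum(counts))
    for c in counts:
        ways //= factorial(c)
    return ways

def allocations(total, parts, low):
    """Yield every tuple of `parts` integers >= low that sums to `total`."""
    spare = total - parts * low
    for extra in product(range(spare + 1), repeat=parts - 1):
        last = spare - sum(extra)
        if last >= 0:
            yield tuple(low + e for e in extra) + (low + last,)

def prob_some_child_at_least(threshold, weight):
    """P(max count >= threshold) when allocations are weighted by `weight`."""
    hit = total = 0
    for alloc in allocations(CANDIES, CHILDREN, MINIMUM):
        w = weight(alloc)
        total += w
        if max(alloc) >= threshold:
            hit += w
    return Fraction(hit, total)

# Method 1 (rejection): condition the unrestricted multinomial on every j_i >= 2.
p_rejection = prob_some_child_at_least(6, multinomial)

# Method 2 (prefill): give 2 to each child, then spread the other 10 at random.
p_prefill = prob_some_child_at_least(
    6, lambda alloc: multinomial([j - MINIMUM for j in alloc])
)

print(float(p_rejection), float(p_prefill))
```

Both numbers answer part (a), but for two different readings of the teacher's procedure.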

I will illustrate this on a smaller example: we have 6 items to be placed in two cells, and we want the conditional distribution of cell occupancy numbers, given at least one item in each cell. I will look at occupancy numbers $(1+i, 5-i), i=0, \ldots, 4$, giving at least one item in each cell. We need only know $i$ to know everything.

If $X$ (the number of items in the first cell) is binomial(6,1/2), then the conditional probability of occupancy $(1+i,5-i)$, given at least one item in each cell, is $P_1(i)$, computed as follows:

$$P_1(i) = P(X=1+i \mid 1 \leq X \leq 5) = \frac{P(X =1+i)}{\sum_{j=0}^4 P(X = 1+j)}, \; i = 0,1,2,3,4$$
The denominator is
$$P(E) = \sum_{j=0}^4 P(X = 1+j) = \sum_{j=0}^4 C(6,1+j)/2^6 = 31/32,$$
so
$$P_1(i) = C(6,1+i)/62, \; i = 0, \ldots, 4$$
That is what we get from Method 1.

On the other hand, from Method 2 we obtain the (supposedly-conditional) probability of occupancy $(1+i,5-i)$ as
$$P_2(i) = C(4, i)/2^4, \; i=0, \ldots, 4$$

Here is a comparison:
$$\begin{array}{ccc} i & P_1(i) & P_2(i) \\ 0 & 3/31 & 1/16 \\ 1 & 15/62 & 1/4 \\ 2 & 10/31 & 3/8 \\ 3 & 15/62 & 1/4 \\ 4 & 3/31 & 1/16 \end{array}$$
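The comparison can be checked with exact fractions. This is only a sketch of the two formulas for $P_1(i)$ and $P_2(i)$ given above; the variable names are made up for the illustration.

```python
from fractions import Fraction
from math import comb

N_ITEMS = 6  # items dropped independently into two equally likely cells

# P(E): at least one item in each cell, i.e. 1 <= X <= 5 for X ~ binomial(6, 1/2).
p_event = sum(Fraction(comb(N_ITEMS, k), 2**N_ITEMS) for k in range(1, N_ITEMS))

# Method 1: condition the binomial on E.
P1 = [Fraction(comb(N_ITEMS, 1 + i), 2**N_ITEMS) / p_event for i in range(5)]

# Method 2: put one item in each cell, then place the remaining 4 at random.
P2 = [Fraction(comb(4, i), 2**4) for i in range(5)]

for i in range(5):
    print(i, P1[i], P2[i])
```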

I will leave you to ponder the question: if Method 2 gives the wrong answer, why is that? How could the distribution of the remaining 4 items NOT be independent and at random?

Last edited: Feb 8, 2015
4. Feb 8, 2015

### haruspex

Thanks for taking it further, Ray.
To reach an understanding, it might help to reduce it to as simple an example as possible. Consider four identical items placed into two cells, at least one in each.
Rejection method: a single attempt puts 0, 1, 2, 3 or 4 items in the first cell with relative weights 1, 4, 6, 4, 1, so rejecting the extremes leaves weights 4, 6, 4.
Prefill method: Placing one in each then distributing the rest gives 1, 2, 1.
Now consider the items distinct, labelling them a, b, c, d. If we choose to prefill with a and b, we get four distributions:
a bcd, ac bd, ad bc, acd b. The prefill method considers these equally likely.
In the rejection method, we don't single out a and b like that, so we need to consider what happens if we vary which ordered pair of items to prefill with. In principle, each of the four possibilities expands to 12: four for 'a' times three for 'b'.
The pattern a bcd encompasses
a bcd, a cbd, a dcb, b acd, b cad, b dac, c abd, c bad, c dab, d abc, d bac, d cab.
On inspection, we see that this is really only four patterns, each repeated three times.
Now consider the ac bd pattern of the four where a and b were singled out. Cycling through the same twelve choices of the prefill items:
ac bd, ab cd, ab dc, bc ad, ba cd, ba dc, cb ad, ca bd, ca db, db ac, da bc, da cb. Again, four actual patterns, each appearing three times.
But here's the difference: the four patterns we generated from a bcd are distinct from any patterns generated by ac bd, ad bc or acd b, whereas the four generated by ac bd overlap with the four generated by ad bc, producing only six distinct labelled 2-2 patterns.
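The counting can also be done by brute force. The sketch below assumes the items a, b, c, d are labelled and each goes to cell 1 or cell 2. It then groups the assignments by how many items end up in cell 1. The names are hypothetical, chosen for this illustration.

```python
from collections import Counter
from itertools import product

ITEMS = "abcd"

def size_counts(assignments):
    """Count assignments by the number of items placed in cell 1."""
    return Counter(cells.count(1) for cells in assignments)

# Every way to send each labelled item to cell 1 or cell 2 (2**4 = 16 ways).
every = list(product((1, 2), repeat=len(ITEMS)))

# Rejection: keep only the assignments that leave neither cell empty.
rejection = [cells for cells in every if 1 in cells and 2 in cells]

# Prefill: force a into cell 1 and b into cell 2, leaving c and d free.
prefill = [cells for cells in every if cells[0] == 1 and cells[1] == 2]

rejection_counts = size_counts(rejection)
prefill_counts = size_counts(prefill)
print(sorted(rejection_counts.items()))  # weights 4, 6, 4
print(sorted(prefill_counts.items()))    # weights 1, 2, 1
```

The 14 surviving labelled assignments split 4, 6, 4, while fixing a and b in advance leaves only 4 assignments, split 1, 2, 1.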

5. Feb 10, 2015

### tony24810

Gosh, this is rather difficult. Give me some time, I'll think for a while longer.

6. Feb 10, 2015

### haruspex

At the risk of confusing you further, there is at least one more way of looking at it. Many combinatorial problems involve arranging items into groups under constraints, but ask for a count of the distinct arrangements rather than a probability. A neat trick is used.
In my last example, four objects into two cells, consider 4+2-1=5 things in a line. These represent the four objects and the dividing line between the two cells. The number of ways of partitioning the objects into the cells is the same as the number of ways of choosing which 'thing' is the dividing line: 5. When we apply the 'at least one in each cell' rule it drops to three. If we use these as the basis for a probability question, we treat them as equally likely. So we now have three different interpretations yielding three different probability distributions:
2/7, 3/7, 2/7 (rejection); 1/4, 1/2, 1/4 (prefill); 1/3, 1/3, 1/3 (dividing line).
This last view likely gives you the easiest version of your original problem. For (a) I get 0.9 (almost exactly).
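A sketch of that count for the original problem, assuming every allocation of 20 identical candies with at least 2 per child is equally likely (the "dividing line" view). The function names are made up for the illustration.

```python
from fractions import Fraction
from itertools import product

def compositions(total, parts, low):
    """Yield every tuple of `parts` integers >= low that sums to `total`."""
    spare = total - parts * low
    for extra in product(range(spare + 1), repeat=parts - 1):
        last = spare - sum(extra)
        if last >= 0:
            yield tuple(low + e for e in extra) + (low + last,)

# All allocations of 20 candies to 5 children with at least 2 each.
allocs = list(compositions(20, 5, 2))

def prob_max_at_least(threshold):
    """P(some child gets >= threshold) with all allocations equally likely."""
    hits = sum(1 for a in allocs if max(a) >= threshold)
    return Fraction(hits, len(allocs))

p_a = prob_max_at_least(6)
print(len(allocs), p_a, float(p_a))  # 1001 allocations, 900/1001, about 0.899
```

Under the same assumption, part (b) would be `prob_max_at_least(7) / prob_max_at_least(6)`, because "at least 7" can only happen when "at least 6" does.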

The given answer just looks wild. I'd love to know how anyone came up with it.