Balls and bins problem from 'introduction to algorithms' textbook

Click For Summary

Discussion Overview

The discussion revolves around the "balls and bins" problem from the "Introduction to Algorithms" textbook, specifically focusing on calculating the expected number of tosses required before at least one bin contains two balls. Participants explore various approaches to derive a closed form solution, expressing their thoughts on the problem's ambiguity and the nature of the expected value calculation.

Discussion Character

  • Exploratory
  • Mathematical reasoning
  • Debate/contested

Main Points Raised

  • One participant describes their understanding of the problem and the need to find the expected number of tosses before a collision occurs.
  • Another participant suggests that the probability of all bins having at most one ball can be computed as a product, which can be expressed in terms of factorials and powers.
  • A participant defines a random variable X to represent the event of a collision occurring on the k'th toss and provides a formula for its probability.
  • Different methods for calculating the expected value E(X) are proposed, including the use of indicator random variables and summations.
  • There is a discussion about the relationship between the probability of collisions occurring and the specific conditions of the problem, emphasizing the need for clarity in the definitions used.
  • One participant expresses frustration at not being able to derive a closed form solution despite having a sum representation of the expected value.
  • Another participant points out that the expression provided is already in closed form, leading to a clarification about the desire to eliminate the summation while retaining factorials.

Areas of Agreement / Disagreement

Participants do not reach a consensus on whether a closed form solution is achievable. Multiple competing views on the interpretation of the problem and the methods to approach it remain evident throughout the discussion.

Contextual Notes

Participants note ambiguities in the problem statement and the assumptions underlying their calculations. There are unresolved mathematical steps and dependencies on definitions that affect the clarity of the discussion.

Who May Find This Useful

This discussion may be of interest to those studying probability theory, combinatorial analysis, or algorithm design, particularly in the context of expected values and random processes.

mikepol
Messages
19
Reaction score
0
Hi Everyone,

I've been trying to do the following problem, which is exercise 5.4-2 in "Introduction to Algorithms 2ED" textbook by Cormen/Leiserson/Rivest/Stein. It's not starred, so should be fairly easy, but I just can't come up with a closed form solution! The problem is:

Suppose that balls are tossed into b bins. Each toss is independent, and each ball is equally likely to end up in any bin. What is the expected number of ball tosses before at least one of the bins contains two balls?

I have a solution (which I'm pretty sure is correct), but in a form of a sum. Can anyone give me a little hint of how to get a closed form solution or even if a closed form solution is possible?

Thanks in advance!
 
Physics news on Phys.org
The probability that all bins will have at most one ball can be computed by a simple product. The product can then be re-expressed in terms of factorials and powers.
 
Thanks for your fast reply CRGreathouse! I have tried the way you suggested, but still a closed form solution evades me! Here is my analysis:

First, I think the problem is stated a little ambiguously, so this is my understanding on it. Suppose I perform the following experiment: take b empty bins and start tossing balls into them at random. As soon as a ball lands in already occupied bin, I record the number of throws I had made. The problem asks to find the expected number these recorded numbers, which will be the average over this experiment performed great many times. In other words, I need to find the expected number tosses before a collision occurs.

I have approached this from two paths.
First way is define a random variable X such that event X=k means that after k-1 tosses, there were no collisions, but k'th toss resulted in a collission. In other words, X=k is the event that k'th toss results in a single bin having two balls, while all the rest have at most one. Then probability of X=k is:

[tex]P(X=k) = \frac{(b)_{k-1}(k-1)}{b^k}[/tex]
 
Sorry, please disregard my previous post, I hit submit by mistake.

Thanks for your fast reply CRGreathouse! I have tried the way you suggested, but still a closed form solution evades me! Here is my analysis:

First, I think the problem is stated a little ambiguously, so this is my understanding of it. Suppose I perform the following experiment: take b empty bins and start tossing balls into them at random. As soon as a ball lands in already occupied bin, I record the number of throws I had made. The problem asks to find the expected value of these recorded numbers, which will be the average over this experiment performed great many times. In other words, I need to find the expected number of tosses that result in a single collision on the last toss.

I have approached this from two paths.

First way is define a random variable X such that event X=k means that after k-1 tosses, there were no collisions, but k'th toss resulted in a collission. In other words, X=k is the event that k'th toss results in a single bin having two balls, while all the rest have at most one. Then probability of X=k is:

[tex]P(X=k) = \frac{(b)_{k-1}(k-1)}{b^k}[/tex]

Where [tex](b)_{k}=\frac{b!}{(b-k)!}[/tex]. And the expected value that the problem is to compute is:

[tex]E(X) = \sum_{k=2}^{b+1} kP(X=k) = \sum_{k=2}^{b+1} \frac{(b)_{k-1}(k-1)}{b^k}k[/tex]

This is a solution in a form of a sum.

From the fact that X is a random variable and that it takes values from 2 to b+1, we also have:

[tex]\sum_{k=2}^{b+1}P(X=k) = \sum_{k=2}^{b+1}\frac{(b)_{k-1}(k-1)}{b^k} = \frac{1}{b}\sum_{k=1}^{b}\frac{(b)_{k}k}{b^k}=1[/tex]

A second way of getting an answer is define an indicator random variable [tex]I_{k}[/tex] such that the event [tex]I_{k}=1[/tex] means that there was k'th toss, and define [tex]I_{k}=0[/tex] otherwise. In other words, it means that previous tosses have resulted in no collisions, and therefore we made k'th toss, the result of which is unknown. The probability of this event is equals to the probability that previous k-1 tosses had no collisions. It is:

[tex]P(I_{k}=1) = \frac{(b)_{k-1}}{b^{k-1}} = E(I_{k})[/tex]

And random variable X can be expressed in terms of I's:

[tex]X=\sum_{k=1}^{b+1}I_k[/tex]

So by linearity of expectations we have:

[tex]E(X)=\sum_{k=1}^{b+1}E(I_k) = \sum_{k=1}^{b+1}\frac{(b)_{k-1}}{b^{k-1}} = \sum_{k=0}^{b}\frac{(b)_k}{b^k}[/tex]

And I can show that this formula for E(X) is consistent with previous formula for E(X) by direct derivation:

[tex]\sum_{k=1}^{b+1}\frac{(b)_{k-1}(k-1)}{b^k}k = \sum_{k=1}^{b}\frac{(b)_{k-1}}{b^{k-1}}k - \sum_{k=1}^{b}\frac{(b)_k k}{b^k} + \frac{b! b (b+1)}{b^{b+1}} = \sum_{k=1}^{b-1}\frac{(b)_k}{b^k}(k+1) - \sum_{k=1}^{b}\frac{(b)_k k}{b^k} + \frac{b! (b+1)}{b^b} = \sum_{k=1}^{b-1}\frac{(b)_k}{b^k} + \frac{b!}{b^b} = \sum_{k=1}^{b}\frac{(b)_k}{b^k}[/tex]

But I still don't have a closed form solution! This problem has been annoying me for about a day already. I think the authors wouldn't put this with other simple problems if it wasn't relatively simple to solve, so I still believe that a closed form solution is out there. Does anyone have any ideas?

Thank you.
 
Chance of all bins having at most one ball:
Turn 1, 1
Turn 2, 1 - 1/b
Turn 3, (turn 2) * (1 - 2/b) = (1 - 1/b)(1 - 2/b)
Turn 4, (turn 3) * (1 - 3/b) = (1 - 1/b)(1 - 2/b)(1 - 3/b)
etc.
 
Thanks CRGreathouse.

The formula for all b bins having at most 1 ball after k tossings that you wrote is:

[tex]\frac{(b)_k}{b^k}[/tex]

So the probability that at least one collision occurs in k throwings (that at least one bin will contain at least two balls) is:

[tex]1-\frac{(b)_k}{b^k}[/tex]

But that probability is not related directly to this problem since the collision does not necessarily occur at the k'th toss and there might be more than one bin with at least two balls or bins with more than two balls.

I think the problem is asking for probability of the first and only collision in k throws to occur on the last toss. In other words, after k-1 throws there were no collisions, but k'th throw resulted in a collision. Thus, there is only one bin with two balls, and the second ball in that bin resulted from the last toss. The probability of this event is:

[tex]\frac{(b)_{k-1}(k-1)}{b^k}[/tex]

And then the expected number of throws that the problem is asking for would be:

[tex]E(X) = \sum_{k=1}^{b+1} \frac{(b)_{k-1}(k-1)}{b^k} k[/tex]

But now I don't know how to express this in closed form, if that is at all possible. I also have another expression for E(X) as a sum in my previous post, but that does not help me either.

Any other ideas?

Thanks.
 
mikepol said:
[tex]E(X) = \sum_{k=1}^{b+1} \frac{(b)_{k-1}(k-1)}{b^k} k[/tex]

But now I don't know how to express this in closed form, if that is at all possible. I also have another expression for E(X) as a sum in my previous post, but that does not help me either.

Any other ideas?

Thanks.

Sorry but your answer is already in closed form. Why doesn't this help you?
 
Hi Focus, thanks for pointing this out.

Sorry for the confusion, I didn't use this term in a prcise way. What I want is get rid of the sum. However, factorials are still allowed. Is this possible for this problem?

Thanks!
 
mikepol said:
But that probability is not related directly to this problem since the collision does not necessarily occur at the k'th toss and there might be more than one bin with at least two balls or bins with more than two balls.

Pr(collision in exactly k turns) = Pr(collision happens in at most k turns) - Pr(collision happens in at most k-1 turns)
 

Similar threads

  • · Replies 29 ·
Replies
29
Views
6K
  • · Replies 45 ·
2
Replies
45
Views
6K
  • · Replies 57 ·
2
Replies
57
Views
7K
Replies
2
Views
3K
  • · Replies 25 ·
Replies
25
Views
12K
  • · Replies 0 ·
Replies
0
Views
4K
  • · Replies 7 ·
Replies
7
Views
4K
Replies
2
Views
1K
  • · Replies 2 ·
Replies
2
Views
5K
Replies
5
Views
3K