Probability question involving picking balls from a bag

SUMMARY

This discussion focuses on calculating the probability distribution of run lengths when drawing balls from a bag containing 70 red balls and 30 blue balls, without replacement. The user seeks to determine the expected value of the number of runs and the corresponding probability distribution for red and blue run lengths. Key insights include that the minimum number of runs is 2 and the maximum is 31, with the maximum run lengths being 70 for red and 30 for blue. The user is advised to either derive formulas for the distributions or utilize simulations for approximations.

PREREQUISITES
  • Understanding of basic probability concepts
  • Familiarity with combinatorial mathematics
  • Knowledge of statistical distributions, particularly exponential distribution
  • Experience with simulation techniques in probability
NEXT STEPS
  • Calculate the expected value of the number of runs in a simulation of ball-picking
  • Research the derivation of probability distributions for run lengths in combinatorial settings
  • Implement a simulation to model the ball-picking process and analyze run lengths
  • Explore the properties of exponential distributions and their applications in probability theory
USEFUL FOR

Students and professionals in mathematics, statistics, and data science, particularly those interested in probability theory and simulation methods.

Ryuzaki
I’m working on a chemistry problem, which essentially translates to a related probability problem. However, my knowledge of probability is very limited, and I'd be grateful if someone could help me out with it. The problem is as follows:

Suppose I have a bag containing 70 red balls and 30 blue balls. For the purpose of illustration, let’s call them Rs (red balls) and Bs (blue balls). Now, I am going to pick one ball at a time from this bag, without replacement. I define a run to be a sequence of consecutive Rs (or, alternatively, Bs) picked, along with the first B (or R) that is picked after them. And I define a red (or blue) run length to be the number of consecutive Rs (or Bs) I pick in a run before I encounter a B (or R), or before the balls run out.

As examples, RRRRRRB is a run (for simplicity, let me denote it by R_6 in shorthand) with red run length 6, RB is a run (denoted by R_1) with red run length 1, BBBR is a run (denoted by B_3) with blue run length 3.

In each simulation, I keep doing runs until all the 100 balls are picked out (since the balls are picked without replacement, the number of runs and the red/blue run lengths are both finite).

Let’s look at a typical simulation of ball-picking: R_{50}R_{10}B_{28}R_9. In this simulation, there are 4 runs. The first run consists of 50 consecutive red balls, until a blue ball is encountered. The second run consists of 10 consecutive red balls, until a blue ball is encountered. The third run consists of 28 consecutive blue balls, until a red ball is encountered. And the last run consists of 9 consecutive red balls, and the simulation ends as there are no more balls to be picked.

It is easy to see that the minimum possible number of runs is 2 (attained by R_{70} followed by B_{29}, or B_{30} followed by R_{69}) and the maximum possible number of runs is 31 (attained by R_1 30 times followed by R_{40}, or B_1 30 times followed by R_{40}). The maximum is 31 because every run except the last uses up at least one blue ball, and there are only 30 of them.

Also, the maximum possible value of red run length is 70 and that of blue run length is 30.
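
To make the run definition above concrete, here is a minimal Python sketch that splits a draw sequence into runs. The function name `split_runs` and the string encoding of a draw ('R' and 'B' characters) are my own assumptions for illustration, not anything from the original problem. The key point is that the ball which ends a run belongs to that run, so the next run starts one ball later.

```python
def split_runs(seq):
    """Split a string of 'R'/'B' draws into (colour, run_length) pairs.

    Follows the definition in the post: a run is a stretch of one colour
    plus the first ball of the other colour that ends it. That terminating
    ball is consumed by the run and does not start the next one.
    """
    runs = []
    i = 0
    n = len(seq)
    while i < n:
        colour = seq[i]
        j = i
        # Advance over the consecutive balls of the same colour.
        while j < n and seq[j] == colour:
            j += 1
        runs.append((colour, j - i))
        # Skip the terminating ball of the other colour (if any).
        i = j + 1
    return runs


# The "typical simulation" from the post: R_50 R_10 B_28 R_9
typical = "R" * 50 + "B" + "R" * 10 + "B" + "B" * 28 + "R" + "R" * 9
print(split_runs(typical))

# The extreme cases: fewest runs and most runs
print(len(split_runs("R" * 70 + "B" * 30)))
print(len(split_runs("RB" * 30 + "R" * 40)))
```

Under this encoding, the typical example gives four runs, and the two extreme sequences give 2 and 31 runs respectively.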

Now, I’m interested in knowing the probability distribution of the red and blue run lengths. For this, I believe that I must first find the expected value of the number of runs in a simulation. But I’m not sure how to proceed from here. So, to sum up, these are my questions:

1. How do I find the expected value of the number of runs in a simulation?

2. For that expected value, how do I calculate the probability distribution of red and blue run lengths?
 
It might be possible to find formulas, but that problem looks messy if you want exact answers.

You can get a reasonable approximation for small run lengths (the most frequent case) if you assume each ball has a 0.7 probability of being red and a 0.3 probability of being blue, ignoring the fact that those numbers change during a run. You'll get a geometric distribution for run lengths (the discrete analogue of the exponential distribution).
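
As a rough sketch of that approximation in Python: if a run has already started with a given colour, and each further ball matches that colour with a fixed probability `p_same`, then the run has length k with probability `p_same**(k-1) * (1 - p_same)`. The function name and the fixed probabilities are assumptions of the approximation, not exact values for drawing without replacement.

```python
def approx_run_length_pmf(k, p_same):
    """Approximate P(run length = k) for a run that has already started.

    Assumes every later ball matches the run's colour with the constant
    probability p_same, i.e. it ignores that the bag's composition
    changes as balls are removed.
    """
    if k < 1:
        return 0.0
    return p_same ** (k - 1) * (1.0 - p_same)


P_RED = 0.7   # assumed constant chance that the next ball is red
P_BLUE = 0.3  # assumed constant chance that the next ball is blue

# Truncate at the largest run lengths that are physically possible.
red_pmf = {k: approx_run_length_pmf(k, P_RED) for k in range(1, 71)}
blue_pmf = {k: approx_run_length_pmf(k, P_BLUE) for k in range(1, 31)}

for k in range(1, 6):
    print(k, round(red_pmf[k], 4), round(blue_pmf[k], 4))
```

Under this approximation the mean run length is 1 / (1 - p_same), so about 3.3 for red runs and about 1.4 for blue runs. It should get worse for long runs and late in the draw, where the changing composition of the bag matters most.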

Alternatively, simulate it. Especially for the expected number of runs, this is probably the easiest way.
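
A minimal Monte Carlo sketch of that suggestion, in Python, could look like the following. The function names, the trial count, and the seed are arbitrary choices of mine. The run splitting follows the definition in the question, where the ball that ends a run is counted as part of that run.

```python
import random
from collections import Counter


def simulate_once(rng, n_red=70, n_blue=30):
    """Shuffle the bag once and return the runs as (colour, length) pairs."""
    balls = ["R"] * n_red + ["B"] * n_blue
    rng.shuffle(balls)  # a uniformly random draw order, without replacement
    runs = []
    i = 0
    while i < len(balls):
        colour = balls[i]
        j = i
        while j < len(balls) and balls[j] == colour:
            j += 1
        runs.append((colour, j - i))
        i = j + 1  # the terminating ball belongs to the run just finished
    return runs


def estimate(n_trials=10_000, seed=0):
    """Estimate the mean number of runs and the run-length distributions."""
    rng = random.Random(seed)
    total_runs = 0
    counts = {"R": Counter(), "B": Counter()}
    for _ in range(n_trials):
        runs = simulate_once(rng)
        total_runs += len(runs)
        for colour, length in runs:
            counts[colour][length] += 1
    mean_runs = total_runs / n_trials
    # Convert raw counts into relative frequencies per colour.
    dist = {}
    for colour, cnt in counts.items():
        n = sum(cnt.values())
        dist[colour] = {k: v / n for k, v in sorted(cnt.items())}
    return mean_runs, dist


if __name__ == "__main__":
    mean_runs, dist = estimate()
    print("estimated mean number of runs:", mean_runs)
    print("red run lengths 1..5:", [round(dist["R"].get(k, 0.0), 3) for k in range(1, 6)])
    print("blue run lengths 1..5:", [round(dist["B"].get(k, 0.0), 3) for k in range(1, 6)])
```

Note that the run-length frequencies here are pooled over all runs in all trials, which is one reasonable reading of "the distribution of run lengths"; conditioning on a particular number of runs would need the counts to be grouped by `len(runs)` instead.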
 
