I’m working on a chemistry problem that essentially reduces to a related probability problem. However, my knowledge of probability is very limited, and I'd be grateful if someone could help me out with it. The problem is as follows:

Suppose I have a bag containing [itex]70[/itex] red balls and [itex]30[/itex] blue balls. For the purpose of illustration, let’s call them [itex]R[/itex]s (red balls) and [itex]B[/itex]s (blue balls). Now, I am going to pick one ball at a time from this bag, without replacement. I define a run to be a sequence of consecutive [itex]R[/itex]s (or alternatively, [itex]B[/itex]s) picked, together with the first [itex]B[/itex] (or [itex]R[/itex]) that ends the sequence. And I define a red (or blue) run length to be the number of consecutive [itex]R[/itex]s (or [itex]B[/itex]s) I pick in a run before I encounter a [itex]B[/itex] (or [itex]R[/itex]), or before the balls run out.

As examples, [itex]RRRRRRB[/itex] is a run (for simplicity, let me denote it by [itex]R_6[/itex] in shorthand) with red run length [itex]6[/itex], [itex]RB[/itex] is a run (denoted by [itex]R_1[/itex]) with red run length [itex]1[/itex], [itex]BBBR[/itex] is a run (denoted by [itex]B_3[/itex]) with blue run length [itex]3[/itex].

In each simulation, I keep doing runs until all the [itex]100[/itex] balls are picked out (since the balls are picked without replacement, the number of runs and the red/blue run lengths are both finite).

Let’s look at a typical simulation of ball-picking: [itex]R_{50}R_{10}B_{28}R_9[/itex]. In this simulation, there are [itex]4[/itex] runs. The first run consists of [itex]50[/itex] consecutive red balls, until a blue ball is encountered. The second run consists of [itex]10[/itex] consecutive red balls, until a blue ball is encountered. The third run consists of [itex]28[/itex] consecutive blue balls, until a red ball is encountered. And the last run consists of [itex]9[/itex] consecutive red balls, after which the simulation ends because there are no more balls to be picked.
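To make the run definition concrete, here is a minimal Python sketch that splits a sequence of draws into runs in the shorthand used above. The function name `split_into_runs` and the string encoding of the draws are illustrative assumptions, not part of the original problem.

```python
def split_into_runs(draws):
    """Split a sequence of 'R'/'B' draws into (colour, run_length) pairs.

    A run is a block of consecutive balls of one colour plus the first
    ball of the other colour that ends it. The final run may end because
    the balls run out, in which case it has no terminating ball.
    """
    runs = []
    i = 0
    n = len(draws)
    while i < n:
        colour = draws[i]
        length = 0
        # Count the consecutive balls of the run's own colour.
        while i < n and draws[i] == colour:
            length += 1
            i += 1
        # Consume the terminating ball of the other colour, if there is one.
        if i < n:
            i += 1
        runs.append((colour, length))
    return runs


if __name__ == "__main__":
    # The example simulation R_50 R_10 B_28 R_9 written out ball by ball.
    example = "R" * 50 + "B" + "R" * 10 + "B" + "B" * 28 + "R" + "R" * 9
    print(len(example))              # 100
    print(split_into_runs(example))  # [('R', 50), ('R', 10), ('B', 28), ('R', 9)]
```

Note that the terminating ball belongs to the run it ends, so the next run always starts with a fresh draw.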

It is easy to see that the minimum possible number of runs is [itex]2[/itex] (attained by [itex]R_{70}[/itex] followed by [itex]B_{29}[/itex], or [itex]B_{30}[/itex] followed by [itex]R_{69}[/itex]) and the maximum possible number of runs is [itex]31[/itex] (attained by [itex]R_1[/itex] [itex]30[/itex] times followed by [itex]R_{40}[/itex], or [itex]B_1[/itex] [itex]30[/itex] times followed by [itex]R_{40}[/itex]).
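As a sanity check on these bounds, the sketch below enumerates every arrangement of a scaled-down bag and records the smallest and largest number of runs. The bag sizes (7 red and 3 blue, or 4 red and 2 blue) and the helper names `count_runs` and `run_count_range` are assumptions chosen for illustration; the full 70/30 bag has far too many arrangements to enumerate this way.

```python
from itertools import combinations


def count_runs(draws):
    """Count runs, where each run includes the ball that terminates it."""
    runs = 0
    i = 0
    n = len(draws)
    while i < n:
        colour = draws[i]
        # Skip over the consecutive balls of the run's own colour.
        while i < n and draws[i] == colour:
            i += 1
        # Skip the terminating ball of the other colour, if any.
        if i < n:
            i += 1
        runs += 1
    return runs


def run_count_range(n_red, n_blue):
    """Return (min, max) number of runs over all distinct arrangements."""
    n = n_red + n_blue
    counts = set()
    # Each arrangement is determined by where the blue balls sit.
    for blue_positions in combinations(range(n), n_blue):
        blue = set(blue_positions)
        draws = ["B" if k in blue else "R" for k in range(n)]
        counts.add(count_runs(draws))
    return min(counts), max(counts)


if __name__ == "__main__":
    print(run_count_range(7, 3))  # (2, 4)
    print(run_count_range(4, 2))  # (2, 3)
```

In both small cases the maximum is the number of blue balls plus one, which matches the pattern behind the value [itex]31[/itex] for [itex]30[/itex] blue balls: every run that ends with a terminating ball uses at least one ball of each colour.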

Also, the maximum possible value of red run length is [itex]70[/itex] and that of blue run length is [itex]30[/itex].

Now, I’m interested in knowing the probability distribution of the red and blue run lengths. For this, I believe that I must first find the expected value of the number of runs in a simulation. But I’m not sure how to proceed from here. So to sum up, the following are my questions:

1. How do I find the expected value of the number of runs in a simulation?

2. For that expected value, how do I calculate the probability distribution of red and blue run lengths?
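While an exact answer is being worked out, both quantities can be estimated numerically. The following is a minimal Monte Carlo sketch, assuming that every ordering of the [itex]100[/itex] balls is equally likely; the names `simulate_runs` and `split_into_runs`, the number of trials and the seed are all illustrative choices. It gives estimates only, not the exact distribution.

```python
import random
from collections import Counter


def split_into_runs(draws):
    """Split a sequence of 'R'/'B' draws into (colour, run_length) pairs."""
    runs = []
    i = 0
    n = len(draws)
    while i < n:
        colour = draws[i]
        length = 0
        while i < n and draws[i] == colour:
            length += 1
            i += 1
        if i < n:
            i += 1  # the terminating ball belongs to this run
        runs.append((colour, length))
    return runs


def simulate_runs(n_red=70, n_blue=30, trials=10_000, seed=0):
    """Estimate the mean number of runs and tally the run lengths by colour."""
    rng = random.Random(seed)
    bag = ["R"] * n_red + ["B"] * n_blue
    total_runs = 0
    red_lengths = Counter()
    blue_lengths = Counter()
    for _ in range(trials):
        rng.shuffle(bag)  # a uniformly random drawing order
        runs = split_into_runs(bag)
        total_runs += len(runs)
        for colour, length in runs:
            if colour == "R":
                red_lengths[length] += 1
            else:
                blue_lengths[length] += 1
    return total_runs / trials, red_lengths, blue_lengths


if __name__ == "__main__":
    mean_runs, red_lengths, blue_lengths = simulate_runs()
    print("estimated mean number of runs:", mean_runs)
    n_red_runs = sum(red_lengths.values())
    for length in sorted(red_lengths)[:10]:
        # Empirical probability that a red run has this length.
        print("red run length", length, red_lengths[length] / n_red_runs)
```

The tallies pool runs from all simulations, so the resulting frequencies describe a randomly chosen red (or blue) run rather than runs in simulations with a fixed number of runs.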

**Physics Forums | Science Articles, Homework Help, Discussion**

# Probability question involving picking balls from a bag
