drmalawi said:
Hi, I think I am stuck in my understanding of "inverse" probability distributions.
This is a question I would like help understanding.
I want to figure out the distribution of the number of trials for a given fixed number of successes and a given probability of success for Bernoulli trials.
Denote the probability of success per trial as p, the number of successes as s, and the number of trials as t; I want to obtain the distribution of t if p and s are given/fixed.
I know that for a fixed number of trials t, I can get the probability distribution of the number of successes s from the binomial distribution, but can I "invert" it, so to say?
Let's say the probability of success is 1 in 10 (p = 0.10).
If I get, say, s = 20 of those events, is there a way to figure out a probability distribution for the number of trials t?
Thanks in advance
I consider this an excellent and interesting question, and I tried to follow it through to the end.
The question itself is quite clear. If we perform ##n## Bernoulli experiments (for which the probability of the favorable event is ##p## and of the unfavorable event ##1-p##), then the probability that ##k## favorable events occur is ## \binom n k p^k (1-p)^{n-k} ##. The question rightly asks us to determine the inverse probability: if we know that ##k## favorable events occurred,
what is the probability that we performed a Bernoulli experiment series of length ##n##?
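In the language of conditional probability (a restatement, not part of the original question): the binomial formula gives the likelihood
$$ P(k \mid n) = \binom n k p^k (1-p)^{n-k}, $$
and what is sought is the inverse conditional probability ##P(n \mid k)##.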
It should be emphasized that this is not the setting where we perform Bernoulli experiments without limit and observe how many unfavorable events occur on average until the desired ##k## favorable events are obtained. (In that case, the number of unfavorable events follows the so-called Pascal distribution.) We do not repeat unbounded sequences of Bernoulli experiments; rather, we perform
a series of experiments of fixed length ##n## and observe that ##k## favorable events occurred. We just do not know what this ##n## is.
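For reference, a standard form of the Pascal (negative binomial) distribution mentioned above: the probability that exactly ##r## unfavorable events occur before the ##k##-th favorable one is
$$ P(r) = \binom {k+r-1} r p^k (1-p)^r, \qquad r = 0, 1, 2, \dots $$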
The number of favorable events is ##k##. With fewer Bernoulli experiments than that we cannot obtain ##k## favorable events, so only experiment series of length ## k, k+1, k+2, ...## need to be examined.
Experiment series of length ##k##
The probability that ##k## favorable events occur is ## a_0 = \binom k k p^k (1-p)^0 = p^k ##.
Experiment series of length ##k+1##
The probability that ##k## favorable events occur is ## a_1 = \binom {k+1} k p^k (1-p) ##.
##...##
Experiment series of length ##k+r##
The probability that ##k## favorable events occur is ## a_r = \binom {k+r} k p^k (1-p)^r ##.
Experiment series of length ##k+r+1##
The probability that ##k## favorable events occur is ## a_{r+1}=\binom {k+r+1} k p^k (1-p)^{r+1} ##.
##...##
Form the sum ## a_0+a_1+...+a_r+a_{r+1}+...## of the above probabilities:
$$ p^k + \binom {k+1} k p^k (1-p) +...+\binom {k+r} k p^k (1-p)^r+\binom {k+r+1} k p^k (1-p)^{r+1}+...
\\=p^k \left[ 1+\binom {k+1} k (1-p)+...+\binom {k+r} k (1-p)^r+\binom {k+r+1} k (1-p)^{r+1}+...\right]. $$
Here we see a power series in ##(1-p)## multiplied by ##p^k##. The quotient of the ##r##-th and ##(r+1)##-th coefficients of the power series is
$$ \frac {\binom{k+r} k } { \binom{k+r+1} k } = \frac {r+1} {k+r+1}. $$
The limit of this ratio as ##r \to \infty## is equal to 1, so the radius of convergence is 1; since ##|1-p| < 1## whenever ## 0 < p \leq 1##, the power series converges. Denote the sum ## a_0+a_1+...+a_r+a_{r+1}+...## by ##A##.
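One step worth making explicit: the sum ##A## has a closed form. By the negative binomial series, ##\sum_{r=0}^{\infty} \binom{k+r}{k} x^r = (1-x)^{-(k+1)}## for ##|x|<1##, and substituting ##x = 1-p## gives
$$ A = p^k \cdot \frac 1 {\left(1-(1-p)\right)^{k+1}} = \frac {p^k} {p^{k+1}} = \frac 1 p. $$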
If the probability that ##k## favorable events occur is higher for one series length than for another, then, having observed ##k## favorable events, we can quite rightly say that
it is more likely that they came from the series length in which they are more likely to occur; and the posterior probabilities are proportional to these occurrence probabilities. (This is the basic idea of Bayes' theorem, applied here with every series length treated as a priori equally likely.)
Therefore, given that ##k## favorable events have occurred, the Bernoulli experiment series of length ##k, k+1, k+2, ...## have probabilities ## \frac {a_0} {A}, \frac {a_1} {A}, \frac {a_2} {A},...##.
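With ##A = 1/p## from above, these probabilities take the explicit form
$$ P(\text{series length} = k+r \mid k \text{ favorable events}) = \frac {a_r} {A} = \binom {k+r} k p^{k+1} (1-p)^r, \qquad r = 0, 1, 2, \dots $$
One may recognize this as the Pascal distribution for ##k+1## favorable events shifted down by one experiment: the series length is distributed like the number of trials needed for the ##(k+1)##-th favorable event, minus one.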
In the case of ##k=1## it is easy to see that this distribution differs from the geometric distribution: here a series of length ##1+r## has probability proportional to ##(r+1)(1-p)^r##, whereas the geometric distribution assigns ##p(1-p)^r##. The extra factor ##r+1## counts the possible positions of the single favorable event within the series, while in the geometric case the favorable event must come last.
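As a numerical sanity check, here is a minimal Monte Carlo sketch in Python (the values of ##p##, ##k##, the truncation ##R##, and the run count are arbitrary illustrative choices, not taken from the discussion above): draw the series length uniformly from ##\{k, ..., k+R\}##, perform that many Bernoulli experiments, keep only the runs with exactly ##k## favorable events, and compare the empirical distribution of the length with ##a_r/A##.

```python
import random
from math import comb

p, k = 0.10, 3     # illustrative values: success probability and observed favorable events
R = 200            # truncation of the (in principle unbounded) range of series lengths
runs = 200_000     # number of simulated experiment series

counts = {}
for _ in range(runs):
    n = random.randint(k, k + R)                # uniform prior over the series length
    favorable = sum(random.random() < p for _ in range(n))
    if favorable == k:                          # condition on exactly k favorable events
        counts[n] = counts.get(n, 0) + 1

total = sum(counts.values())
print("  n   empirical     a_r/A")
for n in range(k, k + 10):
    r = n - k
    theory = comb(k + r, k) * p ** (k + 1) * (1 - p) ** r   # a_r / A with A = 1/p
    print(f"{n:3d}   {counts.get(n, 0) / total:9.4f}   {theory:7.4f}")
```

The empirical column should approach the theoretical one as the number of runs grows; the truncation ##R## only matters insofar as the tail beyond ##k+R## carries negligible probability.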