"Inverse" probability distribution question

In summary: for a fixed number of trials, the binomial distribution gives the probability of each number of successes; with a stopping rule, the geometric distribution (for one success) and the negative binomial distribution (for a fixed number of successes) give the distribution of the number of trials required.
  • #1
malawi_glenn
Hi, I think I am stuck in my understanding of "inverse" probability distributions.

This is a question I would like to have help understanding.

I want to figure out the distribution of number of trials for a given fixed number of successes and given probability for success for Bernoulli trials.

Denote the probability of success per trial as p, the number of successes as s, and the number of trials as t. I want to obtain the distribution of t if p and s are given/fixed.

I know that for a fixed number of trials t, I can get the probability distribution for the number of successes s with the Binomial distribution, but can I "invert" it, so to say?

Let's say the probability of success is 1:10 (p = 0.10).
If I get say s = 20 of those events, is there a way to figure out a probability distribution of the number of trials t?

Thanks in advance
 
  • #2
It's an underspecified problem. Start with something simpler, like ##s=1##. Where did this come from? If, for example, there is a stopping rule in place (i.e., stop when you have 1 success), then you have the distribution of ##X##, where ##X## is geometric.

Now for the ##s=20## case, supposing there is a stopping rule in place (i.e., stop when there are 20 successes), you have the distribution of ##\sum_{i=1}^{20} X_i##, i.e. the convolution of 20 geometric distributions. If you work through the math, you end up with a negative binomial.

- - - -
But there are many other ways in which this ##s=20## could have occurred without a stopping rule. In that case your question is a bit like asking about non-elephants: what do they look like, how do they behave, and so on. In short, you need to impose some more structure here to make it a sensible question. The general approach to 'inverse probability' is Bayes' rule, but some more structure is needed first.
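For instance, here is a minimal simulation sketch of the stopping-rule case (Python, assuming NumPy and SciPy are available; the values p = 0.1 and s = 20 are taken from the question), checking the sample mean of the trial count against the negative binomial's ##s/p##:

```python
import numpy as np
from scipy.stats import nbinom

rng = np.random.default_rng(0)
p, s = 0.1, 20  # success probability and required number of successes

def trials_until_s_successes(rng, p, s):
    """Run Bernoulli(p) trials until the s-th success; return the total trial count."""
    trials = successes = 0
    while successes < s:
        trials += 1
        if rng.random() < p:
            successes += 1
    return trials

samples = [trials_until_s_successes(rng, p, s) for _ in range(20_000)]

# scipy's nbinom counts the failures before the s-th success,
# so the total trial count is that plus s.
print(np.mean(samples))       # close to s/p = 200
print(nbinom.mean(s, p) + s)  # 200.0
```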
 
  • #3
StoneTemplePython said:
It's an underspecified problem. Start with something simpler, like ##s=1##. Where did this come from? If, for example, there is a stopping rule in place (i.e., stop when you have 1 success), then you have the distribution of ##X##, where ##X## is geometric.

Now for the ##s=20## case, supposing there is a stopping rule in place (i.e., stop when there are 20 successes), you have the distribution of ##\sum_{i=1}^{20} X_i##, i.e. the convolution of 20 geometric distributions. If you work through the math, you end up with a negative binomial.

- - - -
But there are many other ways in which this ##s=20## could have occurred without a stopping rule. In that case your question is a bit like asking about non-elephants: what do they look like, how do they behave, and so on. In short, you need to impose some more structure here to make it a sensible question. The general approach to 'inverse probability' is Bayes' rule, but some more structure is needed first.

The geometric distribution assumes that the Bernoulli experiment can be continued without limit. However, that is not the case here. Here we have only finite (n-length) sequences of Bernoulli experiments, and the question concerns the distribution of n.

I agree that there is a problem with the question, but the real question is precisely where the error lies.
 
  • #4
Periwinkle said:
The geometric distribution assumes that the Bernoulli experiment can be continued without limit. However, that is not the case here. Here we have only finite (n-length) sequences of Bernoulli experiments, and the question concerns the distribution of n.

I find your comment to be a non sequitur. I clearly stated a stopping rule as an example and am trying to get the OP to come back with something more coherent.
 
  • #5
drmalawi said:
Hi, I think I am stuck in my understanding of "inverse" probability distributions.

This is a question I would like to have help understanding.

I want to figure out the distribution of number of trials for a given fixed number of successes and given probability for success for Bernoulli trials.

Denote the probability of success per trial as p, the number of successes as s, and the number of trials as t. I want to obtain the distribution of t if p and s are given/fixed.

I know that for a fixed number of trials t, I can get the probability distribution for the number of successes s with the Binomial distribution, but can I "invert" it, so to say?

Let's say the probability of success is 1:10 (p = 0.10).
If I get say s = 20 of those events, is there a way to figure out a probability distribution of the number of trials t?

Thanks in advance

Yes, it is all standard Probability 101 material. The number of Bernoulli trials ##N## until ##n=1## success is the so-called geometric distribution ##P(N=k) = p (1-p)^{k-1}, \;k = 1, 2, \ldots.## Note that there is no upper limit on ##k##; possibly you might need 10 billion coin tosses until you get your first "head", but that is unlikely.

The number of trials until the ##n##th success is the sum of ##n## independent geometric random variables. It has the so-called negative binomial distribution. The expected value and variance (etc.) of ##N## have well-known formulas in terms of ##p## and ##n##. I could, of course, write everything explicitly here, but many others have already written everything out, and you can read those on other web pages just as easily as on this one.
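As a sketch of those well-known formulas (Python with SciPy, using the question's ##p = 0.1## and ##n = 20##; note that scipy.stats.nbinom counts the failures before the ##n##th success, so the trial count is shifted by ##n##):

```python
from scipy.stats import nbinom

p, n = 0.1, 20  # success probability, required number of successes

def pmf_trials(k, n=n, p=p):
    """P(N = k): probability that the n-th success occurs on trial k (k >= n)."""
    return nbinom.pmf(k - n, n, p)  # k - n failures before the n-th success

print(pmf_trials(200))     # pmf at the mean, k = 200
print(n / p)               # E[N] = n/p = 200.0
print(n * (1 - p) / p**2)  # Var[N] = n(1-p)/p^2 = 1800.0
```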
 
  • #6
Ray Vickson said:
Yes, it is all standard Probability 101 material. The number of Bernoulli trials ##N## until ##n=1## success is the so-called geometric distribution ##P(N=k) = p (1-p)^{k-1}, \;k = 1, 2, \ldots.## Note that there is no upper limit on ##k##; possibly you might need 10 billion coin tosses until you get your first "head", but that is unlikely.

The number of trials until the ##n##th success is the sum of ##n## independent geometric random variables. It has the so-called negative binomial distribution. The expected value and variance (etc.) of ##N## have well-known formulas in terms of ##p## and ##n##. I could, of course, write everything explicitly here, but many others have already written everything out, and you can read those on other web pages just as easily as on this one.

However, the person who asked the question did not ask: if we perform an unbounded series of Bernoulli experiments and there were ##s## successful events, what is the probability that ##n## attempts were needed? If he had asked this, yours would be the right answer.

He asked: if we run a finite series of 1, 2, 3, ... Bernoulli experiments, and it is known that there were ##s## successful events, what is the probability that this took a finite number ##n## of Bernoulli trials? That is a very different question.

I'm still thinking about solving it.
 
  • #7
Periwinkle said:
However, the person who asked the question did not ask: if we perform an unbounded series of Bernoulli experiments and there were ##s## successful events, what is the probability that ##n## attempts were needed? If he had asked this, yours would be the right answer.

He asked: if we run a finite series of 1, 2, 3, ... Bernoulli experiments, and it is known that there were ##s## successful events, what is the probability that this took a finite number ##n## of Bernoulli trials? That is a very different question.

I'm still thinking about solving it.

I just answered the questions as I read them:

"I want to figure out the distribution of number of trials for a given fixed number of successes and given probability for success for Bernoulli trials." ---- no mention of a finite bound on the total number of trials.

"Let's say the probability of success is 1:10 (p = 0.10).
If I get say s = 20 of those events, is there a way to figure out a probability distribution of the number of trials t?"

Again, he/she fixes the number of successes and wants the number of trials.

Anyway, you have your interpretation and I have mine. I would like to hear from the OP before commenting further.
 
  • #8
drmalawi said:
Hi, I think I am stuck in my understanding of "inverse" probability distributions.

This is a question I would like to have help understanding.

I want to figure out the distribution of number of trials for a given fixed number of successes and given probability for success for Bernoulli trials.

Denote the probability of success per trial as p, the number of successes as s, and the number of trials as t. I want to obtain the distribution of t if p and s are given/fixed.

I know that for a fixed number of trials t, I can get the probability distribution for the number of successes s with the Binomial distribution, but can I "invert" it, so to say?

Let's say the probability of success is 1:10 (p = 0.10).
If I get say s = 20 of those events, is there a way to figure out a probability distribution of the number of trials t?

Thanks in advance

I consider it an excellent and interesting question, and I have tried to follow it through to the end.

The question itself is quite clear. If we perform ##n## Bernoulli experiments (for which the probability of the favorable event is ##p## and of the unfavorable event ##1-p##), then the probability that ##k## favorable events occur is ##\binom n k p^k (1-p)^{n-k}##. The OP rightly asks that we determine the inverse probability: if we know that ##k## favorable events occurred, what is the probability that we ran an ##n##-long series of Bernoulli experiments?

Note that this is not about performing an unlimited number of Bernoulli experiments and observing how many unfavorable events occur on average until the desired ##k## favorable events are obtained. (In that case, the number of unfavorable events follows the so-called Pascal distribution.) We do not repeat unbounded sequences of Bernoulli experiments; we run a fixed, bounded, ##n##-length series of experiments and observe ##k## favorable events. We just do not know what this ##n## is.

The number of favorable events is ##k##. With fewer Bernoulli experiments than that, we cannot possibly get ##k## favorable events, so only experiment series of length ##k, k+1, k+2, \ldots## need to be examined.

##k##-length experiment series

The probability that ##k## favorable events occur is ##a_0 = \binom k k p^k (1-p)^0 = p^k##.

##k+1##-length experiment series

The probability that ##k## favorable events occur is ##a_1 = \binom {k+1} k p^k (1-p)##.

##...##

##k+r##-length experiment series

The probability that ##k## favorable events occur is ##a_r = \binom {k+r} k p^k (1-p)^r##.

##k+r+1##-length experiment series

The probability that ##k## favorable events occur is ##a_{r+1} = \binom {k+r+1} k p^k (1-p)^{r+1}##.

##...##

Form the sum ##a_0+a_1+...+a_r+a_{r+1}+...## of the above probabilities:

$$ p^k + \binom {k+1} k p^k (1-p) +...+\binom {k+r} k p^k (1-p)^r+\binom {k+r+1} k p^k (1-p)^{r+1}+...
\\=p^k \left[ 1+\binom {k+1} k (1-p)+...+\binom {k+r} k (1-p)^r+\binom {k+r+1} k (1-p)^{r+1}+...\right]. $$

Here we see a power series in ##(1-p)##, multiplied by ##p^k##. If we take the quotient of the coefficients for the ##(k+r)##- and ##(k+r+1)##-length series,

$$ \frac {\binom{k+r} k } { \binom{k+r+1} k } = \frac {r+1} {k+r+1}. $$

The limit of this ratio of successive coefficients is 1, so the radius of convergence is 1, and the power series converges if ##0 < p \leq 1##. Denote the sum ##a_0+a_1+...+a_r+a_{r+1}+...## by ##A##; in fact, by the negative binomial series ##\sum_{r=0}^{\infty} \binom{k+r} k x^r = (1-x)^{-(k+1)}## with ##x = 1-p##, we get ##A = p^k \cdot p^{-(k+1)} = 1/p##.

If the probability that ##k## favorable events occur is higher for one experiment series than for another, then, upon observing ##k## favorable events, we can quite rightly say that the series for which the observation is more likely is the more probable cause, and (assuming each series length is a priori equally likely) the probabilities of the lengths are proportional to these likelihoods. This is the basic idea of Bayes' theorem.

Therefore, given that ##k## favorable events have occurred, the ##k, k+1, k+2, ...##-length Bernoulli experiment series have probabilities ##\frac {a_0} {A}, \frac {a_1} {A}, \frac {a_2} {A}, \ldots## respectively.

In the case of ##k=1## it is easy to see that this distribution is different from the geometric distribution.
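A short numerical sketch of this construction (Python with SciPy; ##p = 0.1## is taken from the question, the infinite sum is truncated at a large ##r##, and for ##k = 1## the result is compared with the geometric distribution):

```python
from scipy.special import comb

p, k = 0.1, 1
R = 2000  # truncation point for the infinite sum; (1-p)^r is negligible beyond it

a = [comb(k + r, k) * p**k * (1 - p)**r for r in range(R)]
A = sum(a)                      # numerically close to 1/p
posterior = [x / A for x in a]  # P(series length = k + r | k favorable events)

# Geometric distribution for comparison: P(first success on trial r + 1)
geometric = [p * (1 - p)**r for r in range(R)]

print(A)              # ~10.0, i.e. 1/p
print(posterior[:3])  # ~[0.01, 0.018, 0.0243]
print(geometric[:3])  # [0.1, 0.09, 0.081]
```

Incidentally, the normalized weights ##a_r/A = \binom{k+r} k p^{k+1} (1-p)^r## coincide with the negative binomial pmf for the number of failures before the ##(k+1)##th success.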
 
  • #9
Periwinkle said:
However, the person who asked the question did not ask: if we perform an unbounded series of Bernoulli experiments and there were ##s## successful events, what is the probability that ##n## attempts were needed? If he had asked this, yours would be the right answer.

He asked: if we run a finite series of 1, 2, 3, ... Bernoulli experiments, and it is known that there were ##s## successful events, what is the probability that this took a finite number ##n## of Bernoulli trials? That is a very different question.

I'm still thinking about solving it.

Every Bernoulli experiment is finite (with probability 1), because the probability that you need more than ##n## trials (to get your first "success") is ##(1-p)^n##, which goes to zero in the limit ##n \to \infty.## It is true that you cannot put a guaranteed bound on the number of trials to reach success, but it is always a finite number.

The number of failures until the ##n##th success is ##N_f##, which is one of the two types of negative-binomial random variables:
$$P(N_f = k) = {n+k-1 \choose k} p^n (1-p)^k, \; k=0,1,2, \ldots.$$
This is unbounded, but definitely finite: ##P(N_f = \infty) = 0.## No "infinite" experiments are performed, but at the same time we cannot predict a 100% guaranteed finite bound on the number of experiments.

I can see one type of alteration to the usual setup, that might (I cannot tell) be roughly what you are thinking about. Say we perform 20 Bernoulli(p) trials and observe 7 successes. One might, possibly, be interested in the statistical properties of the trial at which the 7th success was obtained, given a restriction to 20 trials altogether. This would be some type of conditional probability, because when we restrict ourselves to any fixed number of trials, it might be that we never reach 7 successes, but conditioned on the event that we DO reach 7 successes, the probability distribution of when that happens can be of interest, I guess.
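A sketch of that conditional calculation (Python with SciPy; the value ##p = 0.3## is just an illustrative choice, since, as the code confirms, ##p## cancels out of the conditional distribution):

```python
from scipy.special import comb
from scipy.stats import binom

n, s, p = 20, 7, 0.3  # 20 trials, 7 successes observed; p is illustrative

def joint(t):
    """P(s-th success exactly on trial t, and no successes on trials t+1..n)."""
    return comb(t - 1, s - 1) * p**s * (1 - p)**(t - s) * (1 - p)**(n - t)

total = binom.pmf(s, n, p)  # P(exactly s successes in n trials)
cond = [joint(t) / total for t in range(s, n + 1)]

print(sum(cond))  # 1.0 (up to floating-point rounding)
# The powers of p and (1-p) cancel, leaving a purely combinatorial answer:
# P(s-th success on trial t | s successes in n trials) = C(t-1, s-1) / C(n, s)
print(all(abs(c - comb(t - 1, s - 1) / comb(n, s)) < 1e-12
          for c, t in zip(cond, range(s, n + 1))))  # True
```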

I am going to leave this thread now, because I no longer possess the energy or inclination to enter into a very meaningful "deep" discussion/analysis. The unfortunate fact is that I am dying of stomach cancer, and have lately just been doing simple posts on PF to take my mind off things. I might be able to sit at a computer terminal for another week or two, but that is unpredictable. (I debated with myself whether to include this paragraph, because it may come across as a bit "whiney", etc. But then I figured...what's the difference at this point.)
 
  • #10
drmalawi said:
Hi, I think I am stuck in my understanding of "inverse" probability distributions.

This is a question I would like to have help understanding.

I want to figure out the distribution of number of trials for a given fixed number of successes and given probability for success for Bernoulli trials.

Denote the probability of success per trial as p, the number of successes as s, and the number of trials as t. I want to obtain the distribution of t if p and s are given/fixed.

I know that for a fixed number of trials t, I can get the probability distribution for the number of successes s with the Binomial distribution, but can I "invert" it, so to say?

Let's say the probability of success is 1:10 (p = 0.10).
If I get say s = 20 of those events, is there a way to figure out a probability distribution of the number of trials t?

Thanks in advance
Please be careful with notation like 1:10. It may be understood either as "1 in 10" or as an odds ratio of 10 to 1, where the associated probability is 1/11. Regarding the number of trials, etc., this just seems like the value of a binomial for a fixed number of successes. A problem I see is that you may have infinitely many failures before reaching a fixed number of successes. Did I misunderstand (misunderestimate? ;)) your question?
 

1. What is an inverse probability distribution?

An inverse probability distribution is a statistical concept that describes the likelihood of an underlying cause or input given an observed outcome, rather than the other way around. It is used to determine the probability of a certain input value given a specific output value.

2. How is an inverse probability distribution different from a regular probability distribution?

The two run in opposite directions. While a regular probability distribution describes the likelihood of an output value given an input value, an inverse probability distribution describes the likelihood of an input value given an observed output value.

3. What are the applications of an inverse probability distribution?

Inverse probability distributions are commonly used in fields such as finance, engineering, and physics to model and analyze complex systems. They can also be used in data analysis and machine learning to make predictions based on observed data.

4. How is an inverse probability distribution calculated?

The calculation of an inverse probability distribution involves using Bayes' theorem, which states that the posterior probability of an event is equal to the prior probability of the event multiplied by the likelihood of the evidence given the event, divided by the total probability of the evidence.
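In symbols, for the question in this thread (a sketch: the prior ##P(t)## over the number of trials must be supplied, which is exactly the extra structure asked for above):

$$P(t \mid s) = \frac{P(s \mid t)\,P(t)}{\sum_{t'} P(s \mid t')\,P(t')}, \qquad P(s \mid t) = \binom{t}{s} p^s (1-p)^{t-s}.$$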

5. What are some common examples of inverse probability distributions?

Some common examples of distributions with "inverse" in their names include the inverse Gaussian distribution, the inverse exponential distribution, and the inverse Weibull distribution; note, however, that these names refer to reciprocal or first-passage constructions, which is distinct from the Bayesian notion of inverse probability discussed above. These distributions are commonly used in various fields to model real-world phenomena.
