drmalawi said:
Hi, I think I am stuck in my understanding of "inverse" probability distributions.
This is a question I would like help understanding.
I want to figure out the distribution of the number of trials for a given fixed number of successes and a given probability of success for Bernoulli trials.
Denote the probability of success per trial as p, the number of successes as s, and the number of trials as t; I want to obtain the distribution of t if p and s are given/fixed.
I know that for a fixed number of trials t, I can get the probability distribution of the number of successes s from the binomial distribution, but can I "invert" it, so to say?
Let's say the probability of success is 1 in 10 (p = 0.10).
If I get, say, s = 20 of those events, is there a way to figure out a probability distribution for the number of trials t?
Thanks in advance
I consider this an excellent and interesting question, and I tried to follow it through to the end.
The question itself is quite clear. If we perform ##n## Bernoulli experiments (for which the probability of the favorable event is ##p## and of the unfavorable event ##1-p##), then the probability that ##k## favorable events occur is ## \binom n k p^k (1-p)^{n-k} ##. The question rightly asks us to determine the inverse probability: if we know that ##k## favorable events occurred,
what is the probability that we performed a Bernoulli experiment series of length ##n##?
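In the language of conditional probability (a restatement, not part of the original question): the binomial formula gives the likelihood
$$ P(k \mid n) = \binom n k p^k (1-p)^{n-k}, $$
and what is sought is the inverse conditional probability ##P(n \mid k)##.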
It should be emphasized that this is not the setting where we perform Bernoulli experiments without limit and observe how many unfavorable events occur on average until the desired ##k## favorable events are obtained. (In that case, the number of unfavorable events follows the so-called Pascal distribution.) We do not repeat unbounded sequences of Bernoulli experiments; rather, we perform
a series of experiments of fixed length ##n## and observe that ##k## favorable events occurred. We just do not know what this ##n## is.
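For reference, a standard form of the Pascal (negative binomial) distribution mentioned above: the probability that exactly ##r## unfavorable events occur before the ##k##-th favorable one is
$$ P(r) = \binom {k+r-1} r p^k (1-p)^r, \qquad r = 0, 1, 2, \dots $$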
The number of favorable events is ##k##. With fewer Bernoulli experiments than that we cannot obtain ##k## favorable events, so only experiment series of length ## k, k+1, k+2, ...## need to be examined.
Experiment series of length ##k##
The probability that ##k## favorable events occur is ## a_0 = \binom k k p^k (1-p)^0 = p^k ##.
Experiment series of length ##k+1##
The probability that ##k## favorable events occur is ## a_1 = \binom {k+1} k p^k (1-p) ##.
##...##
Experiment series of length ##k+r##
The probability that ##k## favorable events occur is ## a_r = \binom {k+r} k p^k (1-p)^r ##.
Experiment series of length ##k+r+1##
The probability that ##k## favorable events occur is ## a_{r+1}=\binom {k+r+1} k p^k (1-p)^{r+1} ##.
##...##
Form the sum ## a_0+a_1+...+a_r+a_{r+1}+...## of the above probabilities:
$$ p^k + \binom {k+1} k p^k (1-p) +...+\binom {k+r} k p^k (1-p)^r+\binom {k+r+1} k p^k (1-p)^{r+1}+...
\\=p^k \left[ 1+\binom {k+1} k (1-p)+...+\binom {k+r} k (1-p)^r+\binom {k+r+1} k (1-p)^{r+1}+...\right]. $$
Here we see a power series in ##(1-p)## multiplied by ##p^k##. The quotient of the ##r##-th and ##(r+1)##-th coefficients of the power series is
$$ \frac {\binom{k+r} k } { \binom{k+r+1} k } = \frac {r+1} {k+r+1}. $$
The limit of this ratio as ##r \to \infty## is equal to 1, so the radius of convergence is 1; since ##|1-p| < 1## whenever ## 0 < p \leq 1##, the power series converges. Denote the sum ## a_0+a_1+...+a_r+a_{r+1}+...## by ##A##.
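One step worth making explicit: the sum ##A## has a closed form. By the negative binomial series, ##\sum_{r=0}^{\infty} \binom{k+r}{k} x^r = (1-x)^{-(k+1)}## for ##|x|<1##, and substituting ##x = 1-p## gives
$$ A = p^k \cdot \frac 1 {\left(1-(1-p)\right)^{k+1}} = \frac {p^k} {p^{k+1}} = \frac 1 p. $$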
If the probability that ##k## favorable events occur is higher for one series length than for another, then, having observed ##k## favorable events, we can quite rightly say that
it is more likely that they came from the series length in which they are more likely to occur; and the posterior probabilities are proportional to these occurrence probabilities. (This is the basic idea of Bayes' theorem, applied here with every series length treated as a priori equally likely.)
Therefore, given that ##k## favorable events have occurred, the Bernoulli experiment series of length ##k, k+1, k+2, ...## have probabilities ## \frac {a_0} {A}, \frac {a_1} {A}, \frac {a_2} {A},...##.
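With ##A = 1/p## from above, these probabilities take the explicit form
$$ P(\text{series length} = k+r \mid k \text{ favorable events}) = \frac {a_r} {A} = \binom {k+r} k p^{k+1} (1-p)^r, \qquad r = 0, 1, 2, \dots $$
One may recognize this as the Pascal distribution for ##k+1## favorable events shifted down by one experiment: the series length is distributed like the number of trials needed for the ##(k+1)##-th favorable event, minus one.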
In the case of ##k=1## it is easy to see that this distribution differs from the geometric distribution: here a series of length ##1+r## has probability proportional to ##(r+1)(1-p)^r##, whereas the geometric distribution assigns ##p(1-p)^r##. The extra factor ##r+1## counts the possible positions of the single favorable event within the series, while in the geometric case the favorable event must come last.
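As a numerical sanity check, here is a minimal Monte Carlo sketch in Python (the values of ##p##, ##k##, the truncation ##R##, and the run count are arbitrary illustrative choices, not taken from the discussion above): draw the series length uniformly from ##\{k, ..., k+R\}##, perform that many Bernoulli experiments, keep only the runs with exactly ##k## favorable events, and compare the empirical distribution of the length with ##a_r/A##.

```python
import random
from math import comb

p, k = 0.10, 3     # illustrative values: success probability and observed favorable events
R = 200            # truncation of the (in principle unbounded) range of series lengths
runs = 200_000     # number of simulated experiment series

counts = {}
for _ in range(runs):
    n = random.randint(k, k + R)                # uniform prior over the series length
    favorable = sum(random.random() < p for _ in range(n))
    if favorable == k:                          # condition on exactly k favorable events
        counts[n] = counts.get(n, 0) + 1

total = sum(counts.values())
print("  n   empirical     a_r/A")
for n in range(k, k + 10):
    r = n - k
    theory = comb(k + r, k) * p ** (k + 1) * (1 - p) ** r   # a_r / A with A = 1/p
    print(f"{n:3d}   {counts.get(n, 0) / total:9.4f}   {theory:7.4f}")
```

The empirical column should approach the theoretical one as the number of runs grows; the truncation ##R## only matters insofar as the tail beyond ##k+R## carries negligible probability.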