# Probability of at least two happening

## Homework Statement

A computer program is tested by 5 independent tests. If there is an error, these tests
will discover it with probabilities 0.1, 0.2, 0.3, 0.4, and 0.5, respectively. Suppose that the
program contains an error. What is the probability that it will be found by at least two tests?

## The Attempt at a Solution

I found the probability that the error will be found by at least one test:

1-(.9*.8*.7*.6*.5) = .8488

Now my question is: what is the complement of "at least two tests"?

I think for some reason it is "1 and none", but how do I know whether it is "1 and none" or "1 or none"?

Pr(at least 2) + Pr(1 and none) = 1
Pr(at least 2) + Pr(1 or none) = 1

it has to be one of these but I'm not sure which.

With Pr(1 and none), Pr(1 and none) = Pr(1)*.8488

With Pr(1 or none) = Pr(1)+Pr(none) - Pr(1)*.8488

Wrichik Basu
Now my question is: what is the complement of "at least two tests"?
It should be Pr(None or 1), that is,

Required Probability = 1 - (Pr(None of the tests find an error) + Pr(Exactly one of the tests find an error)).
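This identity can be checked by brute force. The sketch below (not from the thread; the variable names are mine) enumerates all ##2^5## pass/fail outcomes, weights each by its probability, and sums the ones with at least two detections:

```python
# Brute-force check of the complement identity by enumerating
# all 2^5 pass/fail outcomes of the five independent tests.
from itertools import product

p = [0.1, 0.2, 0.3, 0.4, 0.5]   # detection probabilities from the problem

prob_at_least_two = 0.0
for outcome in product([0, 1], repeat=len(p)):   # 1 = this test finds the error
    weight = 1.0
    for found, pi in zip(outcome, p):
        weight *= pi if found else (1 - pi)
    if sum(outcome) >= 2:
        prob_at_least_two += weight

print(round(prob_at_least_two, 4))   # 0.4774
```

Enumeration scales as ##2^n##, so it is only a sanity check here, but it agrees with the closed-form answers below.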

Ray Vickson

Problems like this one cry out for the use of generating functions. Let
$$I_i = \begin{cases} 1 & \text{if test i discovers the fault} \\ 0 & \text{otherwise} \end{cases}$$ The different random variables ##I_1, I_2, I_3, I_4, I_5## are independent, and the number of tests that detect the fault is:
$$\text{number that detect the fault} = N = I_1 + I_2 + I_3 + I_4 + I_5$$
The distribution of ##N## can be found by looking at its generating function ##\tilde{N}(z),## which in this case is a fifth-degree polynomial:
$$\tilde{N}(z) = \sum_{n=0}^5 P(N = n) z^n$$
The really neat thing about generating functions is the following theorem (which is actually quite easy to prove):

Theorem: The generating function of a sum of independent random variables is the product of their individual generating functions.

In this case, ##\tilde{I_1}(z) = 0.9 + 0.1 z, ## etc., so we have
$$\tilde{N}(z) = (.9+.1 z)(.8+.2 z) (.7 + .3 z) (.6 + .4 z) (.5 +.5 z).$$
If you expand that out algebraically to get the fifth-degree polynomial, the coefficient of ##z^j## is just ##P(N = j)##, so all you need to do is sum some coefficients.
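The expansion is easy to mechanize. A minimal sketch (not from the thread; helper names are mine) multiplies out the five factors by coefficient convolution, so that coefficient ##j## of the result is ##P(N = j)##:

```python
# Expand N~(z) = (.9+.1z)(.8+.2z)(.7+.3z)(.6+.4z)(.5+.5z)
# by repeated polynomial multiplication; coefficient j is P(N = j).

def poly_mul(a, b):
    """Multiply two polynomials given as coefficient lists (low degree first)."""
    out = [0.0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] += ai * bj
    return out

p = [0.1, 0.2, 0.3, 0.4, 0.5]           # detection probabilities
gen = [1.0]                             # start with the constant polynomial 1
for pi in p:
    gen = poly_mul(gen, [1 - pi, pi])   # multiply in the factor (q_i + p_i z)

# P(at least two) = 1 - P(N=0) - P(N=1)
answer = 1 - gen[0] - gen[1]
print(round(answer, 4))                 # 0.4774
```

Note that `1 - gen[0]` reproduces the OP's ##0.8488## for "at least one", so the same expansion answers both questions.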

StoneTemplePython
The generating function approach seems like a pretty good one in general. (Alternatively, various bounding methods like Chernoff were made for inhomogeneous Bernoulli trial setups like this.)

In this particular case, since the OP recently had a question on sample spaces and events, perhaps a third approach is illuminating.

- - - - -
we can look at the problem in terms of its complement:

##P\big(\text{at least two heads}\big) = 1 - P\big(\text{at most one heads}\big)##

where the (compound) event of exactly zero heads over all trials is denoted by ##A_0## and the event of exactly one heads over all trials is denoted by ##A_1##. These events are mutually exclusive.
- - - -
side note:
it may be a bit overkill, but for avoidance of doubt, we can easily derive the above from first principles. (I spoiler tagged it as it may be too much symbol manipulation for a basic result.)

##P\big(\text{at least two heads}\big) = P\big(A_2 \cup A_3 \cup... \cup A_{n-1} \cup A_n\big)##
and over any sequence of ##n## tosses, exactly one of 0, 1, 2, ..., n-1, or n heads must have occurred. These events are mutually exclusive, so the associated probabilities add. That is

##P\big(A_0 \cup A_1 \cup A_2 \cup A_3 \cup... \cup A_{n-1} \cup A_n\big) = 1##
because we've summed over all mutually exclusive probabilities in the sample space.

But taking advantage of mutual exclusivity, we know
## P\big(A_0 \cup A_1\big) + P\big(A_2 \cup A_3 \cup... \cup A_{n-1} \cup A_n\big) = P\big(A_0 \cup A_1 \cup A_2 \cup A_3 \cup... \cup A_{n-1} \cup A_n\big) = 1 ##

hence the opening setup is to take the above equation and subtract ##P\big(A_0 \cup A_1\big)## from each side, giving

## P\big(\text{at least two heads}\big)= P\big(A_2 \cup A_3 \cup... \cup A_{n-1} \cup A_n\big) =1 - P\big(A_0 \cup A_1\big) = 1- P\big(\text{at most one heads}\big) ##
which is what was shown at the top
- - - -
##P\big(\text{at most one heads}\big) = P\big(A_0 \cup A_1\big) = P\big(A_0\big) + P\big(A_1\big)##
because the probabilities of mutually exclusive events add.

now consider the n-tuple that has the probability of failures for each trial, i.e.
##\mathbf q = (q_1, q_2, ..., q_n)##
(where n=5 in the problem)

##P\big(A_0\big) = q_1 \cdot q_2 \cdot ... \cdot q_n = e_n\big(\mathbf q\big)##
where ##e_n## is the ##n##th elementary symmetric function, which here is just the product of all ##n## terms. These probabilities multiply because the underlying trials / coin tosses (the simple events) are independent.

now
##P\big(A_1\big)## is given by the probability that coin ##i## is heads and all other coins are tails, summed over all ##i## -- equivalently we are 'leaving out' one failure and 'including' one success, and we are summing over all ##\binom{n}{n-1} = n## possibilities here.

so consider the case of ##i = 1##, the associated probability is
##(1-q_1)\cdot \big(q_2\cdot q_3 \cdot ... \cdot q_n \big) = \big(q_2\cdot q_3 \cdot ... \cdot q_n\big) - \big(q_1 \cdot q_2\cdot q_3 \cdot ... \cdot q_n\big) = \big( q_2\cdot q_3 \cdot ... \cdot q_n\big) - e_n\big(\mathbf q\big)##

and in general for having trial ##i## be the success, this is given by
##\big(\prod_{k\neq i} q_k \big) - e_n\big(\mathbf q\big)##

- - - - -
side note:
each one of these corresponds to an event
the above may be interpreted as the probability of the event where trial ##i## is heads and all other trials are tails. We can denote this event as ##A_1^{(i)}##. A little bit of thought tells us that

##A_1 = A_1^{(1)} \cup A_1^{(2)} \cup ... \cup A_1^{(n)}##
and each of these events are mutually exclusive (e.g. it is impossible to have both trial 1 be the only one with heads and for trial 2 to be the only one with heads). So

##P\big(A_1\big) = P\big(A_1^{(1)} \cup A_1^{(2)} \cup ... \cup A_1^{(n)}\big) = P\big(A_1^{(1)} \big) + P\big( A_1^{(2)}\big) + ... + P\big(A_1^{(n)}\big)= \sum_{i=1}^n P\big(A_1^{(i)} \big)##

because we can add probabilities for mutually exclusive events. This gives the rationale for summing over all ##i##, which is what is shown below.

- - - - -
and summing over all i gives

##P\big(A_1\big) =\sum_{i=1}^n P\big(A_1^{(i)} \big) = \sum_{i=1}^n \Big(\big(\prod_{k\neq i} q_k \big) - e_n\big(\mathbf q\big)\Big) = \Big(\sum_{i=1}^n \prod_{k\neq i} q_k \Big) - \sum_{i=1}^n e_n\big(\mathbf q\big) = e_{n-1}\big(\mathbf q\big) - n \cdot e_n\big(\mathbf q\big)##

where we recognized that
##e_{n-1}\big(\mathbf q\big)## is the ##(n-1)##th elementary symmetric function -- i.e. it is equivalent to choosing ##n-1## of the ##n## q values and multiplying them, and then summing over all ##\binom{n}{n-1} = n## combinations.
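Plugging numbers into this derivation is straightforward. A short sketch (not from the thread; function names are mine) computes ##e_{n-1}(\mathbf q)## and ##e_n(\mathbf q)## directly from their subset definition:

```python
# Compute P(at least 2) via elementary symmetric functions of the
# failure probabilities q_i, using P(A_0) = e_n(q) and
# P(A_1) = e_{n-1}(q) - n * e_n(q) from the derivation above.
from itertools import combinations
from math import prod

def elem_sym(vals, r):
    """r-th elementary symmetric function: sum of products over all r-subsets."""
    return sum(prod(c) for c in combinations(vals, r))

q = [0.9, 0.8, 0.7, 0.6, 0.5]   # failure probabilities (1 - p_i)
n = len(q)

p_A0 = elem_sym(q, n)                   # all trials fail
p_A1 = elem_sym(q, n - 1) - n * p_A0    # exactly one success
answer = 1 - p_A0 - p_A1
print(round(answer, 4))                 # 0.4774
```

This matches the coefficient sum from the generating function, as it must.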

- - - - -
This approach may become a bit cumbersome if there is a follow-up problem of finding the "probability of at least 3 happening", while Ray's method will easily capture it.

On the other hand this approach is good practice for emphasizing the role of events and the sample space -- with special emphasis on the power of mutually exclusive events. (It also gives you another way of seeing that the special case of homogeneous probabilities for each coin toss simplifies all the calculations, leading to the binomial distribution.) The common thread between both approaches, I suppose, is that we are modelling / embedding the probabilities in terms of a polynomial -- Ray's method does this explicitly, whereas in my approach it's implicit: the elementary symmetric functions are intimately connected with roots of polynomials.

Ray Vickson
Along the same lines: Feller's book (among many others) develops usable formulas for such probabilities as "P{exactly k of the events occur}" and "P{at least k of the events occur}", for events ##A_1, A_2, \ldots, A_n##. These formulas are generalizations of the standard inclusion/exclusion formula. They involve the quantities
$$S_r = \displaystyle \sum_{1 \leq i_1 < i_2 < \cdots < i_r \leq n}^{} P(A_{i_1} \cap A_{i_2} \cap \cdots \cap A_{i_r})$$ for ##r = 1,2, \ldots, n.## Basically, they look at higher ##S_j, j > i## to cancel the double-counting contained in ##S_i##, and so the formulas have the ##S_j## occurring with alternating signs, together with some combinatorial coefficients. (These coefficients are different for the "exactly k" and the "at least k" cases.) The alternating signs in this method arise as an alternative to the generating-function method, which uses both ##p =P(A)## and ##q = 1-P(A).## Both methods involve subtraction, but in different ways and places.
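One standard form of the "at least k" formula Ray alludes to (found in Feller and elsewhere) is ##P(\text{at least } k) = \sum_{j=k}^{n} (-1)^{j-k} \binom{j-1}{k-1} S_j##. A sketch under that assumption (names are mine; for independent events, ##S_r## reduces to the ##r##-th elementary symmetric function of the individual probabilities):

```python
# Generalized inclusion-exclusion for P(at least k of the events occur),
# using P(at least k) = sum_{j=k}^{n} (-1)^(j-k) * C(j-1, k-1) * S_j.
# For independent events, S_r is the r-th elementary symmetric
# function of the individual probabilities p_i.
from itertools import combinations
from math import comb, prod

def S(probs, r):
    """S_r for independent events: sum over r-subsets of the product."""
    return sum(prod(c) for c in combinations(probs, r))

def p_at_least(probs, k):
    n = len(probs)
    return sum((-1) ** (j - k) * comb(j - 1, k - 1) * S(probs, j)
               for j in range(k, n + 1))

p = [0.1, 0.2, 0.3, 0.4, 0.5]
print(round(p_at_least(p, 1), 4))   # 0.8488, the OP's "at least one"
print(round(p_at_least(p, 2), 4))   # 0.4774
```

The alternating signs visible in the loop are exactly the cancellation of double-counting that the text describes.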

I avoided mentioning the generalized inclusion-exclusion method, just because it is harder to justify than the straightforward generating function method. On the other hand, it (the inclusion-exclusion approach) applies generally, even to events that are not independent, while the generating function method needs independence in order to be useful.
