# Probability and normal distribution

1. Apr 29, 2015

### ichabodgrant

1. The problem statement, all variables and given/known data
Due to pollution from the industry around an apple farm, the apples grown there may be contaminated by heavy metals. It is believed that the amount of heavy metals in an apple from the farm follows the normal distribution N(16, 16), which has mean μ = 16 units and standard deviation σ = √16 = 4 units. An apple is said to be "acceptable" if the amount of heavy metals is less than 18 units.

First, what is the probability that a randomly selected apple is acceptable?

Second, if apples are tested one at a time for heavy metal content (assuming each test is perfectly accurate), what is the probability that more than 2 tests are required to obtain the first acceptable apple?

Third, the pollution is now getting worse, and an investigation is conducted by randomly selecting 100 apples. The average amount of heavy metals in the sample is 20 units. Assume that the standard deviation remains the same as stated above. Construct a 95% confidence interval for the mean amount of heavy metals μ.

2. Relevant equations
P(Z <= -k) = P(Z >= k) for the standard normal Z
The normal distribution pdf

Z = (X - μ)/σ

P( |barX - μ|/(σ/√n) <= Zα/2 ) = β, where confidence level β = 1 - α

3. The attempt at a solution
Let X be the random variable denoting the amount of heavy metals found in an apple.
For the first question, I tried to calculate P(X<=18) by converting to the standard normal distribution using Z = (X - μ)/σ. But should I be calculating P(X<18) instead? If so, does that mean I have to subtract P(X=18) from P(X<=18)? Or are they the same? Or should I use the continuity correction factor?

For the second question, I thought about calculating P(more than 2 tests) = 1 - P(1 test) - P(2 tests).
Does that mean I use the probability P from the first question, i.e. P(1st rejected) = 1 - P(1st acceptable) = 1 - P?

For the third question, do I just plug into the formula P( |barX - μ|/(σ/√n) <= Zα/2 ) = β to find the interval?

Last edited: Apr 29, 2015
2. Apr 29, 2015

### Ray Vickson

(1) For a continuous random variable, like your X, the probability of getting any particular exact value is 0, so P(X = 18) = 0. There is no difference between P(X < 18) and P(X <= 18). There is no reason why the metal content X must come in discrete chunks; presumably, X can be any real value >= 0.
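As a quick numerical check of the point above, here is a minimal sketch using Python's standard-library `statistics.NormalDist`. The variable names are illustrative only; the values μ = 16, σ = 4 and the threshold 18 come from the problem statement.

```python
from statistics import NormalDist

# Heavy-metal content X ~ N(mean 16, standard deviation 4), from the problem statement.
X = NormalDist(mu=16, sigma=4)

# For a continuous variable P(X < 18) = P(X <= 18); both equal the CDF at 18.
p_acceptable = X.cdf(18)

# Same result via standardisation: Z = (18 - 16) / 4 = 0.5.
z = (18 - 16) / 4
p_via_z = NormalDist().cdf(z)

print(round(p_acceptable, 4))  # approximately 0.6915
```

Both routes give the same number, since standardising only rescales the variable.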

The "continuity correction factor" you ask about is used when approximating a discrete random variable (which might be difficult to work with in computations) by a continuous approximating random variable. So, for example, we might replace a binomial random variable N, taking values in {0,1,2,3,..., 100} by a normal approximation X with the same mean and variance as N. The normal X can take any real value, and the continuity correction would come into play when re-writing events like {N <= k} in terms of X.
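To make the continuity correction concrete, here is a small sketch comparing an exact binomial probability with its normal approximation, with and without the correction. The values n = 100, success probability 0.5 and k = 55 are illustrative assumptions, not taken from the apple problem.

```python
from math import comb, sqrt
from statistics import NormalDist

n, prob, k = 100, 0.5, 55  # illustrative values, not from the thread

# Exact binomial probability P(N <= k).
exact = sum(comb(n, i) * prob**i * (1 - prob)**(n - i) for i in range(k + 1))

# Normal approximation X with the same mean and variance as N.
approx = NormalDist(mu=n * prob, sigma=sqrt(n * prob * (1 - prob)))

without_correction = approx.cdf(k)      # rewrites {N <= k} as {X <= k}
with_correction = approx.cdf(k + 0.5)   # rewrites {N <= k} as {X <= k + 0.5}

print(exact, without_correction, with_correction)
```

In this example the corrected value lands much closer to the exact one. None of this is needed for the apple problem, where X is continuous to begin with.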

(2) If the apples are independent, do you think that whether or not apple #1 passes the test will influence whether or not apple #2 passes or fails? What do you know about probabilities for independent events?

(3) Try it and see for yourself.
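For reference, trying it might look like the following sketch. It assumes the usual known-σ interval, barX ± z·σ/√n, with n = 100, barX = 20 and σ = 4 taken from the problem statement; the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

n = 100            # sample size
x_bar = 20.0       # sample mean
sigma = 4.0        # standard deviation, assumed unchanged
confidence = 0.95

alpha = 1 - confidence
z = NormalDist().inv_cdf(1 - alpha / 2)   # two-sided critical value, about 1.96
half_width = z * sigma / sqrt(n)

lower, upper = x_bar - half_width, x_bar + half_width
print(round(lower, 3), round(upper, 3))  # approximately 19.216 20.784
```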

Last edited: Apr 29, 2015
3. Apr 29, 2015

### ichabodgrant

I understand the 1st and 3rd questions now.
For the 2nd question, my approach now is:

Let p be P(acceptable) from the 1st question.
P(more than 2 tests) = 1 - [ P(1st acceptable) + P(1st rejected, 2nd acceptable) ] = 1 - [ p + (1-p)p ]. Is this correct?
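One way to check this is to expand the bracket: 1 - p - (1-p)p = (1-p)(1-p) = (1-p)^2, which is the probability that the first two apples are both rejected. A minimal numerical sketch, assuming independent apples and the N(16, 16) model from the problem statement (variable names are illustrative):

```python
from statistics import NormalDist

p = NormalDist(mu=16, sigma=4).cdf(18)   # P(acceptable), about 0.6915

# Complement of "first acceptable apple found on test 1 or test 2".
via_complement = 1 - (p + (1 - p) * p)

# Direct route: the first two apples are both rejected.
via_rejections = (1 - p) ** 2

print(round(via_complement, 4), round(via_rejections, 4))  # both approximately 0.0952
```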

4. Apr 29, 2015

### BvU

For the first question: you cannot add or subtract things that do not have the same dimension. The value of the pdf at x = 18, call it f(18), is a probability density (a height, if you want to picture it that way), whereas P(X = 18) itself is 0. In contrast, P(X<=18) is a probability, a short form for $$\int_{-\infty}^{18} f(x)\, dx$$ (an area in this picture).

There is no difference between P(X<=18) and P(X<18), strange as it may seem. The difference between the two is an integral over a range with width zero.

For the second question you want to think of a different formula: if you continue tests, your formula would end up with a negative probability!

 slow typist

5. Apr 29, 2015

### ichabodgrant

So is this P(more than 2 tests) = 1 - [ P(1st acceptable) + P(1st rejected, 2nd acceptable) ] = 1 - [ p + (1-p)p ] wrong?

6. Apr 29, 2015

### BvU

Makes me think of "what is the probability there is no 6 when you throw 6 dice". Lots of folks go 1/6 + 1/6 + 1/6 + 1/6 + 1/6 + 1/6 = 1, so the answer is 0.
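The dice puzzle can be worked out exactly by multiplying the per-die probabilities instead of adding them. A short sketch, assuming fair and independent dice:

```python
from fractions import Fraction

# Each die independently avoids a 6 with probability 5/6.
p_no_six_one_die = Fraction(5, 6)

# All six dice avoid a 6: multiply, do not add.
p_no_six_in_six = p_no_six_one_die ** 6

print(p_no_six_in_six, float(p_no_six_in_six))  # 15625/46656, about 0.335
```

So the chance of seeing no 6 is roughly one in three, not zero.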

You want it a lot simpler: P = P(1st rejected and 2nd rejected)

7. Apr 29, 2015

### ichabodgrant

because question 1 is P(acceptable)...so I use it directly...

8. Apr 29, 2015

### BvU

So P(unacceptable) = 1 - P(acceptable). To make it necessary to grab more than 2 apples you square that.

There is an implicit assumption here that there are a lot of apples to select from.

And yes, $(1-p)^2 \ne 1-p^2$
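A rough simulation can confirm which expression matches "more than 2 tests are required". This is only a sketch: the function name, trial count and seed are arbitrary choices, and it assumes each apple is an independent draw from N(16, 16).

```python
import random
from statistics import NormalDist

def simulate_more_than_two_tests(trials=200_000, seed=0):
    """Estimate P(first two apples are both unacceptable) by simulation."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        first = rng.gauss(16, 4)    # heavy-metal content of apple 1
        second = rng.gauss(16, 4)   # heavy-metal content of apple 2
        if first >= 18 and second >= 18:
            hits += 1
    return hits / trials

p = NormalDist(mu=16, sigma=4).cdf(18)
estimate = simulate_more_than_two_tests()

# The estimate should sit near (1 - p)^2 and far from 1 - p^2.
print(estimate, (1 - p) ** 2, 1 - p ** 2)
```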

9. Apr 29, 2015

### Ray Vickson

P(reject) = 1 - P(accept), so you can use whichever is more convenient. Once and for all, rid your mind of the thought that you must always work directly with what you are given initially. Sometimes, when you want information about acceptances, it is easier to look instead at rejections and work with those. In probability especially, it is very common to compute the probability of an event by computing the probability of the complementary event instead, then subtracting that from 1.