# Integrating a normal density to find a CDF

1. Jul 1, 2013

### Catchfire

1. The problem statement, all variables and given/known data
Let X ~ norm(5,10). Find P(X>10).

2. Relevant equations
f(x) = $\frac{1}{δ\sqrt{2π}} e^{-\frac{(x-μ)^2}{2δ^2}}$
F(x) = P(X<x) = $\int_{-∞}^x f(u) du$

3. The attempt at a solution
P(X>10) = 1 - P(X<10)

P(X<10) = $\int_{-∞}^{10} \frac{1}{δ\sqrt{2π}} e^{-\frac{(x-μ)^2}{2δ^2}} dx$
= $\frac{1}{\sqrt{2π}} \int_{-∞}^{10} \frac{1}{δ} e^{-\frac{(x-μ)^2}{2δ^2}} dx$

Let $-\frac{(x-μ)^2}{2δ^2} = -\frac{y^2}{2}$, then
$y = \frac{x-μ}{δ}$, Since $μ = 5, δ = 10 → y = 0.5$, when $x = 10$
$y = \frac{x-μ}{δ}$, Since $μ = 5, δ = 10 → y = -∞$, when $x = -∞$
$x = δy + μ$, So $dx = δ dy$.

So P(X<10) = $\frac{1}{\sqrt{2π}} \int_{-∞}^{0.5} e^{-\frac{y^2}{2}} dy$
= $\frac{1}{\sqrt{2π}} (\int_{-∞}^{0.5} e^{-\frac{y^2}{2}} dy \int_{-∞}^{0.5} e^{-\frac{x^2}{2}} dx )^{0.5}$
= $\frac{1}{\sqrt{2π}} (\int_{0}^{2π} \int_0^{0.5}re^{-\frac{r^2}{2}} drdθ)^{0.5}$
= $\frac{1}{\sqrt{2π}} (\int_{0}^{2π} [-e^{-\frac{r^2}{2}}]_{0}^{0.5} dθ)^{0.5}$
= $\frac{1}{\sqrt{2π}} (\int_{0}^{2π} (1-e^{-\frac{1}{8}}) dθ)^{0.5}$
= $\frac{1}{\sqrt{2π}} (2π (1-e^{-\frac{1}{8}}))^{0.5}$
= $(1-e^{-\frac{1}{8}})^{0.5}$

So $P(X>10) = 1 - (1-e^{-\frac{1}{8}})^{0.5} = 0.6572...$, but it should be 0.3085.... What went wrong?

Last edited: Jul 1, 2013
2. Jul 1, 2013

### krome

The mistake happens in the third line when you convert to polar coordinates. You are no longer integrating over the same region. In the Cartesian case (second line) the integration region is a semi-infinite square with top-right corner at coordinates (0.5,0.5). In the polar case (third line) the integration region is the region inside the circle of radius 0.5 centered at the origin.

It's a neat thought to use that trick to try to calculate the integral, but it is only useful when the integral goes over the entire line. Otherwise, the integral has to be computed numerically or expressed in terms of the Gauss error function.
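To make the mismatch concrete, here is a minimal sketch comparing the value of $\iint e^{-(x^2+y^2)/2}$ over the two regions described above. The helper name `phi_cdf` and the variable names are made up for illustration, and the square-region value is obtained from the error function rather than by direct integration.

```python
import math

def phi_cdf(z: float) -> float:
    """Standard normal CDF written via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

# Cartesian region: the semi-infinite square (-inf, 0.5] x (-inf, 0.5].
# The double integral factorises into the square of a one-dimensional integral.
one_dim = math.sqrt(2.0 * math.pi) * phi_cdf(0.5)
square_region = one_dim ** 2

# Polar region actually used in the attempt: the disc r <= 0.5.
disc_region = 2.0 * math.pi * (1.0 - math.exp(-(0.5 ** 2) / 2.0))

# Whole plane: both descriptions cover the same region, which is why the
# trick works when the integral runs over the entire line.
whole_plane = 2.0 * math.pi

print(f"semi-infinite square: {square_region:.4f}")
print(f"disc of radius 0.5:   {disc_region:.4f}")
print(f"whole plane:          {whole_plane:.4f}")
```

The first two numbers differ substantially, so the polar step replaced the original integral with a different one.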

By the way, you probably have context for the notation norm(*,*) so you know perfectly well what it means. But, I don't think the notation is universal. For example, in Wikipedia and in my learning, the second number is the variance, not the standard deviation. So I actually thought that $\delta = \sqrt{10}$ rather than $\delta = 10$. But, it's definitely $\delta = 10$ for this problem since that gives P(X>10)=0.3085.
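A quick way to see which convention the problem intends is to evaluate the tail probability under both readings. This is only a sketch; `upper_tail` is a hypothetical helper built on the standard library's `math.erfc`.

```python
import math

def upper_tail(x: float, mu: float, sd: float) -> float:
    """P(X > x) for a normal variable with mean mu and standard deviation sd."""
    return 0.5 * math.erfc((x - mu) / (sd * math.sqrt(2.0)))

# Second parameter read as the standard deviation (delta = 10).
p_sd = upper_tail(10.0, 5.0, 10.0)

# Second parameter read as the variance (delta = sqrt(10)).
p_var = upper_tail(10.0, 5.0, math.sqrt(10.0))

print(f"delta = 10:       P(X>10) = {p_sd:.4f}")
print(f"delta = sqrt(10): P(X>10) = {p_var:.4f}")
```

Only the standard-deviation reading reproduces the expected 0.3085.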

Last edited: Jul 1, 2013
3. Jul 1, 2013

### Simon Bridge

You want the probability of finding a result greater than $\mu+\sigma /2$ ?
That should certainly be less than 0.5.

1 - 0.6572 = 0.3428 ... closer right?
But not the expected answer (which I take it is a model answer?)
It is suggestive though ... I'd check the limits when you change coordinates.

Note: I won't do the problem, but I will guide you in looking for the trouble with what you've done.

Aside:
I'm puzzled you didn't just go: $$P(X>10)=\frac{1}{10\sqrt{2\pi}}\int_{10}^\infty \exp \left [ -\frac{(x-5)^2}{200}\right ] \; dx$$

Observe your choices: P(X>10) = 1-P(X<10) = P(X<0). That last one looks like it could have easier limits, but I think the calculation would proceed the same way.

Krome beat me to it.
@ Krome - I hadn't noticed that quirk in Wikipedia; I automatically read it as X~N(μ,σ), though I'm used to "norm(x,y)" meaning something different. I think a lot of older stats texts use the standard deviation version as standard.

Last edited: Jul 1, 2013
4. Jul 1, 2013

### Catchfire

Yeah I had a feeling that's where the issue was.

Thanks, I wasn't sure if it would work but I'd already tried integration by parts and that didn't seem to be going in the right direction. Guess I should scrap this solution.

And yes you're right, it should be X~N(5,100), at least according to my text. As I'm sure it's plain to see, I'm new to this notation.

5. Jul 1, 2013

### Catchfire

Can you elaborate? How does P(X>10) = P(X<0)?

Also both of you mention changing the limits when going from Cartesian to Polar. I'm guessing going to polar coordinates isn't the right method since we're dealing with a rectangular area.

6. Jul 1, 2013

### Catchfire

Sorry, I think you misread my post. The minus sign almost looks like a dot if not in LaTeX tags. I've edited the original post to clarify things.

7. Jul 2, 2013

### krome

$P(X>10) = P(X<0)$ in this specific case. It is due to the fact that the normal distribution is symmetric about the mean. Therefore, $P(X> \mu + \epsilon) = P(X< \mu - \epsilon )$ for arbitrary real $\epsilon$. In this case, $P(X>10) = P(X> \mu + 5) = P(X < \mu - 5) = P(X < 0)$.
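The symmetry identity can be checked numerically. The sketch below assumes Python 3.8+ for `statistics.NormalDist` and takes the second parameter as the standard deviation, as established earlier in the thread.

```python
from statistics import NormalDist

# X ~ norm(5, 10), with 10 read as the standard deviation.
X = NormalDist(mu=5, sigma=10)

upper = 1 - X.cdf(10)   # P(X > mu + 5)
lower = X.cdf(0)        # P(X < mu - 5)

print(f"P(X > 10) = {upper:.4f}")
print(f"P(X < 0)  = {lower:.4f}")
```

Both lines print the same value, as the symmetry argument predicts.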

8. Jul 2, 2013

### krome

Correct.

9. Jul 2, 2013

### Simon Bridge

What krome said :)

In the bad old days you could only do normal distribution calcs for X~N(0,1), and only for P(0<X<x), so we had to learn to exploit all kinds of properties of the Gaussian.

Although you can work a rectangular area in polar co-ords, it's not usually a pleasant experience. You translated the function and the area element correctly, but you got the wrong limits so it won't work.

Don't see anything to change what I wrote ... since your result was bigger than 0.5, there was the possibility that you misplaced a minus sign or had one too many "1-"'s. "1-<your result>" being close to what you needed suggests this may be worth looking into.

The other source of mistakes is, of course, integrating with the wrong limits... you started out right, but several transformations down the track...

10. Jul 2, 2013

### Catchfire

Ahh yes I see. That's a very handy property I would suspect, thanks.

So I was getting the impression I was going to have to solve this with the error function (which isn't mentioned in my text... or at the very least it's not in the index and I haven't come across it yet). So are you saying I can solve this problem with the line of reasoning I initially presented (minus the incorrect limits and other arithmetic errors that might be present)?

11. Jul 2, 2013

### Ray Vickson

Yes, you need to use the error function, or a normal cdf table, or (nowadays) a scientific calculator or a spreadsheet. You are asking for the probability that a normal random variable is more than 1/2 standard deviation above the mean, and there is no simple, closed-form formula for that.
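As a worked version of that lookup, standardise with $Z = (X-5)/10$. The symbols $Z$, $\Phi$ (the standard normal CDF) and $\mathrm{erfc}$ (the complementary error function) are notation introduced here, not taken from the original post, and 0.6915 is the usual table entry for $\Phi(0.5)$.

$$P(X>10) = P(Z>0.5) = 1-\Phi(0.5) = \tfrac{1}{2}\,\mathrm{erfc}\!\left(\frac{0.5}{\sqrt{2}}\right) \approx 1-0.6915 = 0.3085$$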

Last edited: Jul 2, 2013
12. Jul 2, 2013

### Catchfire

So that's what all those tables in the back of my text are for, lol. I guess the text I'm using assumes some level of familiarity with these tables as I haven't seen them mentioned yet. Well that makes things much easier. Still it's much more satisfying to work the problem than look it up.

In the future how would one know before attempting to find a closed-form, if one actually exists or not? I'm guessing like most things that have to do with integration it all comes down to experience.

Thanks for all the help, it may not have been what I was looking for initially but I've learned plenty!

13. Jul 2, 2013

### Simon Bridge

FWIW: I've never done this without the error function ... which basically means some sort of numerical method.
Paying careful attention to the limits would have shown you why - it was a pedagogical ploy ;)

Those tables at the back of your book are the lifesaver here - learn to use them.
Aside: your approach was kinda neat though - it sure looks like it should work doesn't it? - I don't think I've seen anyone try that before.
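For a sense of what "some sort of numerical method" can look like, here is a rough sketch that rebuilds one table entry with composite Simpson's rule. The function names and the choice of 100 subintervals are assumptions made for illustration; real tables and libraries use more careful approximations.

```python
import math

def std_normal_pdf(z: float) -> float:
    """Density of the standard normal distribution."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)

def simpson(f, a: float, b: float, n: int = 100) -> float:
    """Composite Simpson's rule on [a, b] with n (even) subintervals."""
    if n % 2:
        raise ValueError("n must be even")
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        # Odd interior points get weight 4, even interior points weight 2.
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3.0

# Table-style entry P(0 < Z < 0.5), then the tail via symmetry.
table_entry = simpson(std_normal_pdf, 0.0, 0.5)
p_tail = 0.5 - table_entry

print(f"P(0 < Z < 0.5) = {table_entry:.4f}")
print(f"P(X > 10)      = {p_tail:.4f}")
```

The tail follows from the table entry because half of the probability lies above the mean.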

14. Jul 5, 2013

### Catchfire

So after some refreshing on integration with polar coordinates, the limits should have been something like $π$ to $\frac{3π}{2}$ and 0 to infinity. This, however, would have left me with my centre at the origin, not (0.5, 0.5), which is why when I integrate from the origin I get 0.5, which is less than 0.6915 (the actual area). I'm guessing there is no way to translate the integral to the origin. (Was that the goal of your ploy?) This is still not a proof of why there is no closed form; could it lead to one?
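For reference, the quadrant calculation mentioned above works out as follows. It is written in the same form as the original attempt, and $Z$ is used here (as an assumption of notation) for the standardised variable $(X-5)/10$.

$$\frac{1}{\sqrt{2π}} \left(\int_{π}^{\frac{3π}{2}} \int_0^{∞} re^{-\frac{r^2}{2}}\, dr\, dθ\right)^{0.5} = \frac{1}{\sqrt{2π}} \left(\frac{π}{2}\right)^{0.5} = \frac{1}{2} = P(Z<0)$$

The polar form is exact here only because the quadrant has its corner at the origin.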

In my quest to figure out how to prove a function has no closed form I found this paper:

http://projecteuclid.org/DPubS?service=UI&version=1.0&verb=Display&handle=euclid.pjm/1102991609

Could someone post the partial fraction this is referring to?

15. Jul 5, 2013

### Simon Bridge

... there's something nagging me about those limits... I'll figure it out later.
It may help with your overall objective to look at the overall geometry that you have constructed. eg. Is this a surface integral or are you looking at the volume between a surface and the x-y plane?

It should throw light on your "center" issue too ... should there be a lower limit on r or maybe it should depend on theta?

Assuming a' = da/dz, then e.g.

$$\frac{da}{dz}+2za = 1$$
Never mind that; books are always saying that this or that is obvious if you just look at the other, which drives me up the wall!
Have a go solving it - it's 1st order linear ODE right? You can do those!
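For anyone who wants to check their attempt, here is a sketch of the integrating-factor route for the equation as written above. The constant $C$ and the dummy variable $t$ are notation introduced here, not taken from the paper.

$$\frac{d}{dz}\left(a\,e^{z^2}\right) = e^{z^2}\left(\frac{da}{dz}+2za\right) = e^{z^2} \quad\Rightarrow\quad a(z) = e^{-z^2}\left(C+\int_0^z e^{t^2}\,dt\right)$$

So the general solution still contains an integral of the same Gaussian type, which hints at why a simple expression for $a$ is hard to come by. That is an observation, not a proof.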

Note "proving" and "understanding" are different things.
I don't know how to prove that there is no closed form.