Integrating a normal density to find a CDF

Catchfire · Jul 1, 2013

Homework Statement

Let X ~ norm(5,10). Find P(X>10).

Homework Equations

f(x) = [itex]\frac{1}{δ\sqrt{2π}} e^{-\frac{(x-μ)^2}{2δ^2}}[/itex]
F(x) = P(X<x) = [itex]\int_{-∞}^x f(u) du[/itex]

The Attempt at a Solution

P(X>10) = 1 - P(X<10)

P(X<10) = [itex]\int_{-∞}^{10} \frac{1}{δ\sqrt{2π}} e^{-\frac{(x-μ)^2}{2δ^2}} dx[/itex]
= [itex]\frac{1}{\sqrt{2π}} \int_{-∞}^{10} \frac{1}{δ} e^{-\frac{(x-μ)^2}{2δ^2}} dx[/itex]

Let [itex]-\frac{(x-μ)^2}{2δ^2} = -\frac{y^2}{2}[/itex], then
[itex]y = \frac{x-μ}{δ}[/itex], Since [itex]μ = 5, δ = 10 → y = 0.5[/itex], when [itex]x = 10[/itex]
[itex]y = \frac{x-μ}{δ}[/itex], Since [itex]μ = 5, δ = 10 → y = -∞[/itex], when [itex]x = -∞[/itex]
[itex]x = δy + μ[/itex], So [itex]dx = δ dy[/itex].

So P(X<10) = [itex]\frac{1}{\sqrt{2π}} \int_{-∞}^{0.5} e^{-\frac{y^2}{2}} dy[/itex]
= [itex]\frac{1}{\sqrt{2π}} (\int_{-∞}^{0.5} e^{-\frac{y^2}{2}} dy \int_{-∞}^{0.5} e^{-\frac{x^2}{2}} dx )^{0.5}[/itex]
= [itex]\frac{1}{\sqrt{2π}} (\int_{0}^{2π} \int_0^{0.5}re^{-\frac{r^2}{2}} drdθ)^{0.5}[/itex]
= [itex]\frac{1}{\sqrt{2π}} (\int_{0}^{2π} [-e^{-\frac{r^2}{2}}]_{0}^{0.5} dθ)^{0.5}[/itex]
= [itex]\frac{1}{\sqrt{2π}} (\int_{0}^{2π} (1-e^{-\frac{1}{8}}) dθ)^{0.5}[/itex]
= [itex]\frac{1}{\sqrt{2π}} (2π (1-e^{-\frac{1}{8}}))^{0.5}[/itex]
= [itex](1-e^{-\frac{1}{8}})^{0.5}[/itex]

So [itex]P(X>10) = 1 - (1-e^{-\frac{1}{8}})^{0.5} = 0.6572...[/itex], but it should be 0.3085... What went wrong?

krome · Jul 1, 2013

The mistake happens in the third line when you convert to polar coordinates. You are no longer integrating over the same region. In the Cartesian case (second line) the integration region is a semi-infinite square with top-right corner at coordinates (0.5,0.5). In the polar case (third line) the integration region is the region inside the circle of radius 0.5 centered at the origin.

It's a neat thought to use that trick to try to calculate the integral, but it is only useful when the integral runs over the entire line. Otherwise, the integral has to be computed numerically or expressed in terms of the Gauss error function.
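
To make the region mismatch concrete, here is a minimal numerical sketch. It assumes the squaring trick from the first post with upper limit 0.5; the helper `phi` and the variable names are illustrative only and come from nowhere in the thread.

```python
import math

def phi(z):
    """Standard normal CDF written with the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

upper = 0.5

# I = integral of exp(-y^2/2) from -inf to 0.5, so I^2 is the integral of
# exp(-(x^2 + y^2)/2) over the quarter-plane x <= 0.5, y <= 0.5.
quarter_plane = (math.sqrt(2.0 * math.pi) * phi(upper)) ** 2

# The polar limits 0 <= r <= 0.5, 0 <= theta <= 2*pi cover something else:
# the disc of radius 0.5 centred at the origin.
disc = 2.0 * math.pi * (1.0 - math.exp(-upper ** 2 / 2.0))

# Over the whole plane the Cartesian and polar descriptions agree,
# which is why the trick works for the full-line integral.
whole_plane = 2.0 * math.pi

print(quarter_plane, disc, whole_plane)
```

The quarter-plane value comes out at roughly 3.00 and the disc value at roughly 0.74, so the two double integrals are not the same quantity.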

By the way, you probably have context for the notation norm(*,*) so you know perfectly well what it means. But, I don't think the notation is universal. For example, in Wikipedia and in my learning, the second number is the variance, not the standard deviation. So I actually thought that [itex]\delta = \sqrt{10}[/itex] rather than [itex]\delta = 10[/itex]. But, it's definitely [itex]\delta = 10[/itex] for this problem since that gives P(X>10)=0.3085.

Simon Bridge · Jul 1, 2013

You want the probability of finding a result greater than ##\mu+\sigma /2## ?
That should certainly be less than 0.5.

1 - 0.6572 = 0.3428 ... closer right?
But not the expected answer (which I take-it is a model answer?)
It is suggestive though ... I'd check the limits when you change coordinates.

Note: I won't do the problem, but I will guide you in looking for the trouble with what you've done.

Aside:
I'm puzzled you didn't just go: $$P(X>10)=\frac{1}{10\sqrt{2\pi}}\int_{10}^\infty \exp \left [ -\frac{(x-5)^2}{200}\right ] \; dx$$

Observe your choices: P(X>10) = 1-P(X<10) = P(X<0) ... that last one looks like it could have easier limits... anyway: but I think the calculation would proceed the same way.

[edit]Krome beat me to it.
@ Krome - I hadn't noticed that quirk in wikipedia, I automatically read it as P~N(μ,σ) though I'm used to "norm(x,y)" meaning something different. I think a lot of older stats texts use the standard deviation version as standard.

Catchfire · Jul 1, 2013

krome said:

The mistake happens in the third line when you convert to polar coordinates. You are no longer integrating over the same region. In the Cartesian case (second line) the integration region is a semi-infinite square with top-right corner at coordinates (0.5,0.5). In the polar case (third line) the integration region is the region inside the circle of radius 0.5 centered at the origin.

It's a neat thought to use that trick to try to calculate the integral, but it is only useful when the integral runs over the entire line. Otherwise, the integral has to be computed numerically or expressed in terms of the Gauss error function.

By the way, you probably have context for the notation norm(*,*) so you know perfectly well what it means. But, I don't think the notation is universal. For example, in Wikipedia and in my learning, the second number is the variance, not the standard deviation. So I actually thought that [itex]\delta = \sqrt{10}[/itex] rather than [itex]\delta = 10[/itex]. But, it's definitely [itex]\delta = 10[/itex] for this problem since that gives P(X>10)=0.3085.

Yeah I had a feeling that's where the issue was.

Thanks, I wasn't sure if it would work but I'd already tried integration by parts and that didn't seem to be going in the right direction. Guess I should scrap this solution.

And yes you're right, it should be X~N(5,100), at least according to my text. As I'm sure it's plain to see, I'm new to this notation.
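
A quick way to compare the two readings of the second parameter is sketched below. It relies on Python's `statistics.NormalDist`, which takes a standard deviation; the variable names are only for illustration.

```python
from statistics import NormalDist

mean = 5.0

# Second parameter read as a standard deviation: sigma = 10,
# which is the same distribution as N(5, 100) in variance notation.
p_sd = 1.0 - NormalDist(mu=mean, sigma=10.0).cdf(10.0)

# Second parameter read as a variance of 10 instead: sigma = sqrt(10).
p_var = 1.0 - NormalDist(mu=mean, sigma=10.0 ** 0.5).cdf(10.0)

print(round(p_sd, 4), round(p_var, 4))
```

Only the first reading reproduces the expected 0.3085; the variance-of-10 reading gives a much smaller tail probability.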

Catchfire · Jul 1, 2013

Simon Bridge said:

You want the probability of finding a result greater than ##\mu+\sigma /2## ?
That should certainly be less than 0.5.

1 - 0.6572 = 0.3428 ... closer right?
But not the expected answer (which I take-it is a model answer?)
It is suggestive though ... I'd check the limits when you change coordinates.

Note: I won't do the problem, but I will guide you in looking for the trouble with what you've done.

Aside:
I'm puzzled you didn't just go: $$P(X>10)=\frac{1}{10\sqrt{2\pi}}\int_{10}^\infty \exp \left [ -\frac{(x-5)^2}{200}\right ] \; dx$$

Observe your choices: P(X>10) = 1-P(X<10) = P(X<0) ... that last one looks like it could have easier limits... anyway: but I think the calculation would proceed the same way.

[edit]Krome beat me to it.
@ Krome - I hadn't noticed that quirk in wikipedia, I automatically read it as P~N(μ,σ) though I'm used to "norm(x,y)" meaning something different. I think a lot of older stats texts use the standard deviation version as standard.

Can you elaborate? How does P(X>10) = P(X<0)?

Also both of you mention changing the limits when going from Cartesian to Polar. I'm guessing going to polar coordinates isn't the right method since we're dealing with a rectangular area.

Catchfire · Jul 1, 2013

Simon Bridge said:

You want the probability of finding a result greater than ##\mu+\sigma /2## ?
That should certainly be less than 0.5.

1 - 0.6572 = 0.3428 ... closer right?

Sorry, I think you misread my post. The minus sign almost looks like a dot when it isn't in LaTeX tags. I've edited the original post to clarify things.

krome · Jul 2, 2013

[itex]P(X>10) = P(X<0)[/itex] in this specific case. It is due to the fact that the normal distribution is symmetric about the mean. Therefore, [itex]P(X> \mu + \epsilon) = P(X< \mu - \epsilon )[/itex] for arbitrary real [itex]\epsilon[/itex]. In this case, [itex]P(X>10) = P(X> \mu + 5) = P(X < \mu - 5) = P(X < 0)[/itex].
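
The symmetry claim is easy to check numerically. This is a small sketch assuming μ = 5 and σ = 10 as in the problem; the names `right_tail` and `left_tail` are illustrative.

```python
from statistics import NormalDist

X = NormalDist(mu=5.0, sigma=10.0)

right_tail = 1.0 - X.cdf(10.0)  # P(X > 10) = P(X > mu + 5)
left_tail = X.cdf(0.0)          # P(X < 0)  = P(X < mu - 5)

print(right_tail, left_tail)
```

Both tails agree to floating-point precision, as the symmetry about the mean requires.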

Catchfire said:

Also both of you mention changing the limits when going from Cartesian to Polar. I'm guessing going to polar coordinates isn't the right method since we're dealing with a rectangular area.

Correct.

Simon Bridge · Jul 2, 2013

Catchfire said:

Can you elaborate? How does P(X>10) = P(X<0)?

What krome said :)

In the bad old days you could only do normal distribution calcs for X~N(0,1), and only for P(0<X<x), so we had to learn to exploit all kinds of properties of the Gaussian.

Also both of you mention changing the limits when going from Cartesian to Polar. I'm guessing going to polar coordinates isn't the right method since we're dealing with a rectangular area.

Although you can work a rectangular area in polar co-ords, it's not usually a pleasant experience. You translated the function and the area element correctly, but you got the wrong limits so it won't work.

Catchfire said:

Sorry, I think you misread my post. The minus sign almost looks like a dot when it isn't in LaTeX tags. I've edited the original post to clarify things.

Don't see anything to change what I wrote ... since your result was bigger than 0.5, there was the possibility that you misplaced a minus sign or had one too many "1-"'s. "1-<your result>" being close to what you needed suggests this may be worth looking into.

The other source of mistakes is, of course, integrating with the wrong limits... you started out right, but several transformations down the track...

Catchfire · Jul 2, 2013

krome said:

[itex]P(X>10) = P(X<0)[/itex] in this specific case. It is due to the fact that the normal distribution is symmetric about the mean. Therefore, [itex]P(X> \mu + \epsilon) = P(X< \mu - \epsilon )[/itex] for arbitrary real [itex]\epsilon[/itex]. In this case, [itex]P(X>10) = P(X> \mu + 5) = P(X < \mu - 5) = P(X < 0)[/itex].

Ahh yes I see. That's a very handy property I would suspect, thanks.

Simon Bridge said:

Don't see anything to change what I wrote ... since your result was bigger than 0.5, there was the possibility that you misplaced a minus sign or had one too many "1-"'s. "1-<your result>" being close to what you needed suggests this may be worth looking into.

The other source of mistakes is, of course, integrating with the wrong limits... you started out right, but several transformations down the track...

So I was getting the impression I was going to have to solve this with the error function (which isn't mentioned in my text... or at the very least it's not in the index and I haven't come across it yet). So are you saying I can solve this problem with the line of reasoning I initially presented (minus the incorrect limits and other arithmetic errors that might be present)?

Ray Vickson · Jul 2, 2013

Catchfire said:

Ahh yes I see. That's a very handy property I would suspect, thanks.
So I was getting the impression I was going to have to solve this with the error function (which isn't mentioned in my text... or at the very least it's not in the index and I haven't come across it yet). So are you saying I can solve this problem with the line of reasoning I initially presented (minus the incorrect limits and other arithmetic errors that might be present)?

Yes, you need to use the error function, or a normal cdf table, or (nowadays) a scientific calculator or a spreadsheet. You are asking for the probability that a normal random variable is more than 1/2 standard deviation above the mean, and there is no simple, closed-form formula for that.
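
For the "scientific calculator" route, a minimal sketch with the complementary error function from Python's standard library follows. The identity used is P(X > x) = 0.5 * erfc(z / sqrt(2)) with z = (x - μ)/σ; the variable names are illustrative.

```python
import math

mu, sigma = 5.0, 10.0
x = 10.0

z = (x - mu) / sigma                     # 0.5 standard deviations above the mean
p = 0.5 * math.erfc(z / math.sqrt(2.0))  # upper-tail probability P(X > x)

print(round(p, 4))
```

This is the same number a normal CDF table gives when you look up z = 0.5 and subtract from 1.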

Catchfire · Jul 2, 2013

So that's what all those tables in the back of my text are for, lol. I guess the text I'm using assumes some level of familiarity with these tables as I haven't seen them mentioned yet. Well that makes things much easier. Still it's much more satisfying to work the problem than look it up.

In the future how would one know before attempting to find a closed-form, if one actually exists or not? I'm guessing like most things that have to do with integration it all comes down to experience.

Thanks for all the help, it may not have been what I was looking for initially but I've learned plenty!

Simon Bridge · Jul 2, 2013

Catchfire said:

So that's what all those tables in the back of my text are for, lol. I guess the text I'm using assumes some level of familiarity with these tables as I haven't seen them mentioned yet. Well that makes things much easier. Still it's much more satisfying to work the problem than look it up.

In the future how would one know before attempting to find a closed-form, if one actually exists or not? I'm guessing like most things that have to do with integration it all comes down to experience.

Thanks for all the help, it may not have been what I was looking for initially but I've learned plenty!

FWIW: I've never done this without the error function ... which basically means some sort of numerical method.
Paying careful attention to the limits would have shown you why - it was a pedagogical ploy ;)

Those tables at the back of your book are the lifesaver here - learn to use them.
Aside: your approach was kinda neat though - it sure looks like it should work doesn't it? - I don't think I've seen anyone try that before.

Catchfire · Jul 5, 2013

Simon Bridge said:

FWIW: I've never done this without the error function ... which basically means some sort of numerical method.
Paying careful attention to the limits would have shown you why - it was a pedagogical ploy ;)

Those tables at the back of your book are the lifesaver here - learn to use them.
Aside: your approach was kinda neat though - it sure looks like it should work doesn't it? - I don't think I've seen anyone try that before.

So, after refreshing my memory on integration with polar coordinates: the limits should have been something like pi to 3pi/2 for θ and 0 to infinity for r. This, however, would have left me with my centre at the origin, not at (0.5, 0.5), which is why integrating from the origin gives 0.5, less than 0.6915 (the actual area). I'm guessing there is no way to translate the integral to the origin. (Was that the goal of your ploy?) This is still not a proof that there is no closed form; could it lead to one?
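
Written out, assuming the same squaring trick as in the first post and the limits just described (θ from π to 3π/2, r from 0 to ∞), the region is the quarter-plane x ≤ 0, y ≤ 0 and the calculation closes:

$$\frac{1}{\sqrt{2\pi}}\left(\int_{\pi}^{3\pi/2}\int_0^{\infty} r e^{-\frac{r^2}{2}} \, dr \, d\theta\right)^{0.5} = \frac{1}{\sqrt{2\pi}}\left(\frac{\pi}{2}\right)^{0.5} = \frac{1}{2}$$

That is the standard normal CDF at 0, not at 0.5, so it answers a different question from the one asked.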

In my quest to figure out how to prove a function has no closed form I found this paper:

http://projecteuclid.org/DPubS?service=UI&version=1.0&verb=Display&handle=euclid.pjm/1102991609

The classic integrals [itex]\int e^{z^2} dz[/itex], [itex]\int e^z/z dz[/itex] are not elementary, for they give rise to the equations 1 = a' + 2za and 1/z = a' + a, neither of which has a solution a in C(z), as one sees by looking at the partial fraction expression for a.

Could someone post the partial fraction this is referring to?

Simon Bridge · Jul 5, 2013

So, after refreshing my memory on integration with polar coordinates: the limits should have been something like pi to [3pi/2] and 0 to infinity.

... there's something nagging me about those limits... I'll figure it out later.
It may help with your overall objective to look at the overall geometry that you have constructed. eg. Is this a surface integral or are you looking at the volume between a surface and the x-y plane?

It should throw light on your "center" issue too ... should there be a lower limit on r or maybe it should depend on theta?

give rise to the equations 1 = a' + 2za and 1/z = a' + a, neither of which has a solution a in C(z), as one sees by looking at the partial fraction expression for a.

assuming a' = da/dz then eg.

$$\frac{da}{dz}+2za = 1$$

Could someone post the partial fraction this is referring to?

Never mind that; books are always saying that this or that is obvious if you just look at the other, and it drives me up the wall!
Have a go solving it - it's 1st order linear ODE right? You can do those!

Note "proving" and "understanding" are different things.
I don't know how to prove that there is no closed form.
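
For what it's worth, here is one way to read the "partial fraction" remark. It is a sketch of a pole-counting argument, not a quotation from the paper, and it assumes a' = da/dz with a a rational function of z. Write a in partial fractions as a polynomial plus terms of the form [itex]c/(z-z_0)^j[/itex]. If a had a pole of order [itex]k \geq 1[/itex] at some [itex]z_0[/itex], with leading term [itex]c/(z-z_0)^k[/itex] and [itex]c \neq 0[/itex], then

$$a' = \frac{-kc}{(z-z_0)^{k+1}} + (\text{poles of order} \leq k), \qquad 2za = (\text{poles of order} \leq k)$$

so a' + 2za would have a pole of order k+1 at [itex]z_0[/itex] and could not equal 1. Hence a would have to be a polynomial, say of degree [itex]n \geq 0[/itex] (a = 0 clearly fails). But then

$$\deg(2za) = n+1 \geq 1, \qquad \deg(a') \leq n-1 \ \text{(or } a' = 0\text{)}$$

so a' + 2za has degree n+1 and cannot be the constant 1 either. The same counting handles 1/z = a' + a: a pole of order k in a forces a pole of order [itex]k+1 \geq 2[/itex] on the right-hand side, while 1/z has only a simple pole, and a polynomial a makes a' + a a polynomial, which cannot equal 1/z.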
