Integrating a normal density to find a CDF

  • Thread starter Catchfire
  • Start date
  • #1
30
0

Homework Statement


Let X ~ norm(5,10). Find P(X>10).


Homework Equations


f(x) = [itex]\frac{1}{δ\sqrt{2π}} e^{-\frac{(x-μ)^2}{2δ^2}}[/itex]
F(x) = P(X<x) = [itex]\int_{-∞}^x f(u) du[/itex]

The Attempt at a Solution


P(X>10) = 1 - P(X<10)

P(X<10) = [itex]\int_{-∞}^{10} \frac{1}{δ\sqrt{2π}} e^{-\frac{(x-μ)^2}{2δ^2}} dx[/itex]
= [itex]\frac{1}{\sqrt{2π}} \int_{-∞}^{10} \frac{1}{δ} e^{-\frac{(x-μ)^2}{2δ^2}} dx[/itex]

Let [itex]-\frac{(x-μ)^2}{2δ^2} = -\frac{y^2}{2}[/itex], then
[itex]y = \frac{x-μ}{δ}[/itex], Since [itex]μ = 5, δ = 10 → y = 0.5[/itex], when [itex]x = 10[/itex]
[itex]y = \frac{x-μ}{δ}[/itex], Since [itex]μ = 5, δ = 10 → y = -∞[/itex], when [itex]x = -∞[/itex]
[itex]x = δy + μ[/itex], So [itex]dx = δ dy[/itex].

So P(X<10) = [itex]\frac{1}{\sqrt{2π}} \int_{-∞}^{0.5} e^{-\frac{y^2}{2}} dy[/itex]
= [itex]\frac{1}{\sqrt{2π}} (\int_{-∞}^{0.5} e^{-\frac{y^2}{2}} dy \int_{-∞}^{0.5} e^{-\frac{x^2}{2}} dx )^{0.5}[/itex]
= [itex]\frac{1}{\sqrt{2π}} (\int_{0}^{2π} \int_0^{0.5}re^{-\frac{r^2}{2}} drdθ)^{0.5}[/itex]
= [itex]\frac{1}{\sqrt{2π}} (\int_{0}^{2π} [-e^{-\frac{r^2}{2}}]_{0}^{0.5} dθ)^{0.5}[/itex]
= [itex]\frac{1}{\sqrt{2π}} (\int_{0}^{2π} (1-e^{-\frac{1}{8}}) dθ)^{0.5}[/itex]
= [itex]\frac{1}{\sqrt{2π}} (2π (1-e^{-\frac{1}{8}}))^{0.5}[/itex]
= [itex](1-e^{-\frac{1}{8}})^{0.5}[/itex]

So [itex]P(X>10) = 1 - (1-e^{-\frac{1}{8}})^{0.5} = 0.6572...[/itex], but it should be 0.3085.... What went wrong?
 
Last edited:

Answers and Replies

  • #2
37
4
The mistake happens in the third line when you convert to polar coordinates. You are no longer integrating over the same region. In the Cartesian case (second line) the integration region is a semi-infinite square with top-right corner at coordinates (0.5,0.5). In the polar case (third line) the integration region is the region inside the circle of radius 0.5 centered at the origin.

It's a neat thought to use that trick to try to calculate the integral, but it is only useful when the integral runs over the entire real line. Otherwise, the integral has to be computed numerically or expressed in terms of the Gauss error function.

By the way, you probably have context for the notation norm(*,*) so you know perfectly well what it means. But, I don't think the notation is universal. For example, in Wikipedia and in my learning, the second number is the variance, not the standard deviation. So I actually thought that [itex]\delta = \sqrt{10}[/itex] rather than [itex]\delta = 10[/itex]. But, it's definitely [itex]\delta = 10[/itex] for this problem since that gives P(X>10)=0.3085.
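To make the region mismatch concrete, here is a minimal numerical sketch (not part of the original reply). It assumes the standardised upper limit a = 0.5 from the attempt and uses only Python's standard `math` module; the helper name `std_normal_cdf` is made up for illustration. The quadrant x < a, y < a carries probability Φ(a)², while the disc r < a carries 1 − e^{−a²/2}, and these differ.

```python
import math

def std_normal_cdf(z: float) -> float:
    """Phi(z) for a standard normal variable, written with the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

a = 0.5  # standardised upper limit: (10 - 5) / 10

# Cartesian region (second line of the attempt): the quadrant x < a, y < a.
# The joint standard normal density factorises, so its mass is Phi(a)^2.
quadrant_mass = std_normal_cdf(a) ** 2

# Polar region (third line of the attempt): the disc r < a.
# (1 / 2pi) * integral of r * exp(-r^2 / 2) over the disc = 1 - exp(-a^2 / 2).
disc_mass = 1.0 - math.exp(-a * a / 2.0)

print(math.sqrt(quadrant_mass))  # P(X < 10), about 0.6915
print(math.sqrt(disc_mass))      # what the attempt computed, about 0.3428
```

The square root of the disc mass reproduces the 0.3428 from the attempt, which is why the final answer came out as 0.6572 instead of 0.3085.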
 
Last edited:
  • #3
Simon Bridge
Science Advisor
Homework Helper
17,874
1,655
You want the probability of finding a result greater than ##\mu+\sigma /2## ?
That should certainly be less than 0.5.

1 - 0.6572 = 0.3428 ... closer right?
But not the expected answer (which I take-it is a model answer?)
It is suggestive though ... I'd check the limits when you change coordinates.

Note: I won't do the problem, but I will guide you in looking for the trouble with what you've done.


Aside:
I'm puzzled you didn't just go: $$P(X>10)=\frac{1}{10\sqrt{2\pi}}\int_{10}^\infty \exp \left [ -\frac{(x-5)^2}{200}\right ] \; dx$$

Observe your choices: P(X>10) = 1-P(X<10) = P(X<0) .... that last one looks like it could have easier limits... anyway: but I think the calculation would proceed the same way.

[edit]Krome beat me to it.
@ Krome - I hadn't noticed that quirk in wikipedia, I automatically read it as P~N(μ,σ) though I'm used to "norm(x,y)" meaning something different. I think a lot of older stats texts use the standard deviation version as standard.
 
Last edited:
  • #4
30
0
The mistake happens in the third line when you convert to polar coordinates. You are no longer integrating over the same region. In the Cartesian case (second line) the integration region is a semi-infinite square with top-right corner at coordinates (0.5,0.5). In the polar case (third line) the integration region is the region inside the circle of radius 0.5 centered at the origin.

It's a neat thought to use that trick to try to calculate the integral, but it is only useful when the integral runs over the entire real line. Otherwise, the integral has to be computed numerically or expressed in terms of the Gauss error function.

By the way, you probably have context for the notation norm(*,*) so you know perfectly well what it means. But, I don't think the notation is universal. For example, in Wikipedia and in my learning, the second number is the variance, not the standard deviation. So I actually thought that [itex]\delta = \sqrt{10}[/itex] rather than [itex]\delta = 10[/itex]. But, it's definitely [itex]\delta = 10[/itex] for this problem since that gives P(X>10)=0.3085.

Yeah I had a feeling that's where the issue was.

Thanks, I wasn't sure if it would work but I'd already tried integration by parts and that didn't seem to be going in the right direction. Guess I should scrap this solution.

And yes you're right, it should be X~N(5,100), at least according to my text. As I'm sure it's plain to see, I'm new to this notation.
 
  • #5
30
0
You want the probability of finding a result greater than ##\mu+\sigma /2## ?
That should certainly be less than 0.5.

1 - 0.6572 = 0.3428 ... closer right?
But not the expected answer (which I take-it is a model answer?)
It is suggestive though ... I'd check the limits when you change coordinates.

Note: I won't do the problem, but I will guide you in looking for the trouble with what you've done.


Aside:
I'm puzzled you didn't just go: $$P(X>10)=\frac{1}{10\sqrt{2\pi}}\int_{10}^\infty \exp \left [ -\frac{(x-5)^2}{200}\right ] \; dx$$

Observe your choices: P(X>10) = 1-P(X<10) = P(X<0) .... that last one looks like it could have easier limits... anyway: but I think the calculation would proceed the same way.

[edit]Krome beat me to it.
@ Krome - I hadn't noticed that quirk in wikipedia, I automatically read it as P~N(μ,σ) though I'm used to "norm(x,y)" meaning something different. I think a lot of older stats texts use the standard deviation version as standard.

Can you elaborate? How does P(X>10) = P(X<0)?

Also both of you mention changing the limits when going from Cartesian to Polar. I'm guessing going to polar coordinates isn't the right method since we're dealing with a rectangular area.
 
  • #6
30
0
You want the probability of finding a result greater than ##\mu+\sigma /2## ?
That should certainly be less than 0.5.

1 - 0.6572 = 0.3428 ... closer right?

Sorry, I think you misread my post. The minus sign almost looks like a dot when it's not in LaTeX tags. I've edited the original post to clarify things.
 
  • #7
37
4
[itex]P(X>10) = P(X<0)[/itex] in this specific case. It is due to the fact that the normal distribution is symmetric about the mean. Therefore, [itex]P(X> \mu + \epsilon) = P(X< \mu - \epsilon )[/itex] for arbitrary real [itex]\epsilon[/itex]. In this case, [itex]P(X>10) = P(X> \mu + 5) = P(X < \mu - 5) = P(X < 0)[/itex].
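The symmetry can also be checked directly from the integral. As a short sketch, write f for the normal density from the first post and substitute u = 2μ − x; since (x − μ)² is unchanged by that reflection, f(2μ − u) = f(u):

$$P(X>\mu+\epsilon)=\int_{\mu+\epsilon}^{\infty} f(x)\,dx=\int_{-\infty}^{\mu-\epsilon} f(2\mu-u)\,du=\int_{-\infty}^{\mu-\epsilon} f(u)\,du=P(X<\mu-\epsilon)$$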
 
  • #8
37
4
Can you elaborate? How does P(X>10) = P(X<0)?

[itex]P(X>10) = P(X<0)[/itex] in this specific case. It is due to the fact that the normal distribution is symmetric about the mean. Therefore, [itex]P(X> \mu + \epsilon) = P(X< \mu - \epsilon )[/itex] for arbitrary real [itex]\epsilon[/itex]. In this case, [itex]P(X>10) = P(X> \mu + 5) = P(X < \mu - 5) = P(X < 0)[/itex].

Also both of you mention changing the limits when going from Cartesian to Polar. I'm guessing going to polar coordinates isn't the right method since we're dealing with a rectangular area.

Correct.
 
  • #9
Simon Bridge
Science Advisor
Homework Helper
17,874
1,655
Can you elaborate? How does P(X>10) = P(X<0)?
What krome said :)

In the bad old days you could only do normal distribution calcs for X~N(0,1), and only for P(0<X<x), so we had to learn to exploit all kinds of properties of the Gaussian.

Also both of you mention changing the limits when going from Cartesian to Polar. I'm guessing going to polar coordinates isn't the right method since we're dealing with a rectangular area.

Although you can work a rectangular area in polar co-ords, it's not usually a pleasant experience. You translated the function and the area element correctly, but you got the wrong limits so it won't work.

Sorry, I think you misread my post. The minus sign almost looks like a dot when it's not in LaTeX tags. I've edited the original post to clarify things.
Don't see anything to change what I wrote ... since your result was bigger than 0.5, there was the possibility that you misplaced a minus sign or had one too many "1-"'s. "1-<your result>" being close to what you needed suggests this may be worth looking into.

The other source of mistakes is, of course, integrating with the wrong limits... you started out right, but several transformations down the track...
 
  • #10
30
0
[itex]P(X>10) = P(X<0)[/itex] in this specific case. It is due to the fact that the normal distribution is symmetric about the mean. Therefore, [itex]P(X> \mu + \epsilon) = P(X< \mu - \epsilon )[/itex] for arbitrary real [itex]\epsilon[/itex]. In this case, [itex]P(X>10) = P(X> \mu + 5) = P(X < \mu - 5) = P(X < 0)[/itex].

Ahh yes I see. That's a very handy property I would suspect, thanks.

Don't see anything to change what I wrote ... since your result was bigger than 0.5, there was the possibility that you misplaced a minus sign or had one too many "1-"'s. "1-<your result>" being close to what you needed suggests this may be worth looking into.

The other source of mistakes is, of course, integrating with the wrong limits... you started out right, but several transformations down the track...

So I was getting the impression I was going to have to solve this with the error function (which isn't mentioned in my text... or at the very least it's not in the index and I haven't come across it yet). So are you saying I can solve this problem with the line of reasoning I initially presented (minus the incorrect limits and any other arithmetic errors that might be present)?
 
  • #11
Ray Vickson
Science Advisor
Homework Helper
Dearly Missed
10,706
1,722
Ahh yes I see. That's a very handy property I would suspect, thanks.



So I was getting the impression I was going to have to solve this with the error function (which isn't mentioned in my text... or at the very least it's not in the index and I haven't come across it yet). So are you saying I can solve this problem with the line of reasoning I initially presented (minus the incorrect limits and any other arithmetic errors that might be present)?

Yes, you need to use the error function, or a normal cdf table, or (nowadays) a scientific calculator or a spreadsheet. You are asking for the probability that a normal random variable is more than 1/2 standard deviation above the mean, and there is no simple, closed-form formula for that.
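For readers without a table to hand, the same lookup can be sketched in a few lines of Python. This is only an illustration, assuming Python 3.8+ for `statistics.NormalDist`; it computes the tail probability once through the complementary error function and once through the standard library's normal distribution object.

```python
import math
from statistics import NormalDist

mu, sigma = 5.0, 10.0      # X ~ N(5, 10^2), i.e. standard deviation 10
z = (10.0 - mu) / sigma    # standardised value, here 0.5

# P(X > 10) = 1 - Phi(z) = 0.5 * erfc(z / sqrt(2))
p_erfc = 0.5 * math.erfc(z / math.sqrt(2.0))

# Same quantity from the standard library's normal distribution
p_lib = 1.0 - NormalDist(mu=mu, sigma=sigma).cdf(10.0)

print(round(p_erfc, 4), round(p_lib, 4))  # 0.3085 0.3085
```

Both routes agree with the model answer of 0.3085, and both are numerical evaluations rather than a closed form.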
 
Last edited:
  • #12
30
0
So that's what all those tables in the back of my text are for, lol. I guess the text I'm using assumes some level of familiarity with these tables as I haven't seen them mentioned yet. Well that makes things much easier. Still it's much more satisfying to work the problem than look it up.

In the future how would one know before attempting to find a closed-form, if one actually exists or not? I'm guessing like most things that have to do with integration it all comes down to experience.

Thanks for all the help, it may not have been what I was looking for initially but I've learned plenty!
 
  • #13
Simon Bridge
Science Advisor
Homework Helper
17,874
1,655
So that's what all those tables in the back of my text are for, lol. I guess the text I'm using assumes some level of familiarity with these tables as I haven't seen them mentioned yet. Well that makes things much easier. Still it's much more satisfying to work the problem than look it up.

In the future how would one know before attempting to find a closed-form, if one actually exists or not? I'm guessing like most things that have to do with integration it all comes down to experience.

Thanks for all the help, it may not have been what I was looking for initially but I've learned plenty!

FWIW: I've never done this without the error function ... which basically means some sort of numerical method.
Paying careful attention to the limits would have shown you why - it was a pedagogical ploy ;)

Those tables at the back of your book are the lifesaver here - learn to use them.
Aside: your approach was kinda neat though - it sure looks like it should work doesn't it? - I don't think I've seen anyone try that before.
 
  • #14
30
0
FWIW: I've never done this without the error function ... which basically means some sort of numerical method.
Paying careful attention to the limits would have shown you why - it was a pedagogical ploy ;)

Those tables at the back of your book are the lifesaver here - learn to use them.
Aside: your approach was kinda neat though - it sure looks like it should work doesn't it? - I don't think I've seen anyone try that before.

So, after some refreshing on integration in polar coordinates: the limits should have been something like pi to 3/2pi and 0 to infinity. This, however, would have left me with my centre at the origin, not at (0.5, 0.5), which is why when I integrate from the origin I get 0.5, which is less than 0.6915 (the actual area). I'm guessing there is no way to translate the integral to the origin. (Was that the goal of your ploy?) This is still not a proof that there is no closed form; could it lead to one?

In my quest to figure out how to prove a function has no closed form I found this paper:

http://projecteuclid.org/DPubS?service=UI&version=1.0&verb=Display&handle=euclid.pjm/1102991609

The classic integrals [itex]\int e^{z^2} dz[/itex], [itex]\int e^z/z \, dz[/itex] are not elementary, for they give rise to the equations 1 = a' + 2za and 1/z = a' + a, neither of which has a solution a in C(z), as one sees by looking at the partial fraction expression for a.

Could someone post the partial fraction this is referring to?
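One possible reading of the partial-fraction remark, offered as a sketch and not as the paper's own wording: any rational function a in C(z) can be written as a polynomial p(z) plus finitely many pole terms,

$$a(z)=p(z)+\sum_{i}\sum_{k=1}^{m_i}\frac{c_{ik}}{(z-z_i)^k}$$

If a had a pole of order m ≥ 1 at some point z_i, then a' would have a pole of order m + 1 there, while 2za has a pole of order at most m, so a' + 2za would still have a pole and could not equal 1. So a would have to be a polynomial, say of degree n. But then 2za has degree n + 1 while a' has degree at most n − 1, so a' + 2za has degree n + 1 ≥ 1 and again cannot equal 1 (and a = 0 gives 0, not 1). The second equation fails for the same kind of reason: a pole of order m in a forces a pole of order m + 1 ≥ 2 in a' + a, but 1/z has only a simple pole, and a polynomial a makes a' + a a polynomial.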
 
  • #15
Simon Bridge
Science Advisor
Homework Helper
17,874
1,655
So after some integration with polar coordinates refreshing the limits should have been something like pi to [3pi/2] and 0 to infinity.
... there's something nagging me about those limits... I'll figure it out later.
It may help with your overall objective to look at the overall geometry that you have constructed. eg. Is this a surface integral or are you looking at the volume between a surface and the x-y plane?

It should throw light on your "center" issue too ... should there be a lower limit on r, or maybe it should depend on theta?

give rise to the equations 1 = a' + 2za and 1/z = a' + a, neither of which has a solution a in C(z), as one sees by looking at the partial fraction expression for a.

assuming a' = da/dz then eg.

$$\frac{da}{dz}+2za = 1$$
Could someone post the partial fraction this is referring to?
Never mind that; books are always saying that this or that is obvious if you just look at the other. It drives me up the wall!
Have a go solving it - it's 1st order linear ODE right? You can do those!

Note "proving" and "understanding" are different things.
I don't know how to prove that there is no closed form.
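For anyone who wants to check their attempt at the ODE, here is a sketch using the standard integrating-factor method, assuming a' = da/dz as above. Multiplying through by [itex]e^{z^2}[/itex] turns the left side into a total derivative:

$$\frac{d}{dz}\left(a\,e^{z^2}\right)=e^{z^2}\quad\Rightarrow\quad a(z)=e^{-z^2}\left(\int e^{z^2}\,dz+C\right)$$

So solving the equation brings back the very integral in question. That does not prove anything by itself, but it does show that the ODE and the integral are two forms of the same problem.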
 
