Derivation of the Antiderivative of the Gaussian Distribution

Discussion Overview

The discussion revolves around the derivation of the antiderivative of the Gaussian distribution, exploring the use of the fundamental theorem of calculus in relation to statistical applications, particularly in calculating areas under the normal curve. Participants express varying levels of understanding and approaches to the topic.

Discussion Character

  • Exploratory
  • Technical explanation
  • Debate/contested
  • Mathematical reasoning

Main Points Raised

  • One participant expresses dissatisfaction with z-tables and seeks to derive the antiderivative of the Gaussian distribution using the fundamental theorem of calculus, emphasizing the need for accuracy.
  • Another participant confirms the correctness of the substitution method used in the derivation but questions the purpose of the derivation itself.
  • There is a mention of the cumulative distribution function, Φ(x), and the error function, erf(x), as related functions necessary for further calculations.
  • One participant points out that the initial work presented is merely a statement of the antiderivative rather than a full derivation, suggesting that new functions and evaluations are needed for practical use.
  • Participants discuss the empirical rule and provide approximate values for erf at specific points, indicating its relevance to the area calculations under the normal curve.
  • There is a suggestion that using calculators or software to compute erf may not provide a deeper understanding, and a question is raised about the feasibility of calculating it by hand or programming it into a calculator.
  • Another participant inquires about the type of calculator being used and mentions specific models that have built-in functions for the error function.

Areas of Agreement / Disagreement

Participants do not reach a consensus on the nature of the derivation presented. While some agree on the correctness of the substitution, others challenge the completeness of the derivation and the practical implications of using the antiderivative.

Contextual Notes

There are limitations regarding the definitions and interpretations of "derivation" and "area calculations," as well as the reliance on external resources for computing functions like erf.

Mandelbroth
I'm in a high school pre-calculus class and a statistics class. For the latter, we are given z-tables for some of our tests. I don't like these z-tables.

Thus, I decided that a more direct approach (fundamental theorem of calculus) would be more accurate and, more importantly, more fun. My teacher kind of looked at the work and said "looks right to me". I think it's right (empirically it works and Wolfram says so), but I have no basis to see if I made a mistake or not (I'm basically just using my dad's old copy of Calculus and Analytic Geometry by Edwards and Penney and random resources on the internet).

To be clear, I'm looking for if I made a mistake in the derivation, not the answer. I tend to be good manipulating numbers in an erroneous way but still getting the right answer, but I want to make sure that I can do this correctly before I move on to something more difficult.

\int \frac{e^{-\frac{1}{2}(\frac{x-\mu}{\sigma})^2}}{\sigma \sqrt{2\pi}} dx = \frac{1}{\sigma \sqrt{2\pi}}\int e^{\frac{-z^2}{2}} dx

Letting z = (x-μ)/σ (fittingly), it follows that dz = \frac{dx}{\sigma} \Rightarrow dx = \sigma dz.

∴\frac{1}{\sigma \sqrt{2\pi}}\int e^{\frac{-z^2}{2}} dx = \frac{1}{\sqrt{2\pi}} \int e^{-\frac{z^2}{2}} dz = \frac{1}{\sqrt{2\pi}} \int \sqrt{e^{-z^2}} dz = \frac{1}{\sqrt{\pi}} \int_{0}^{z/\sqrt{2}} e^{-t^2} dt + C, where the last step substitutes t = z/\sqrt{2} in the integrand, so that dz = \sqrt{2} dt.

Finally, we get \displaystyle \frac{1}{\sqrt{\pi}} \int_{0}^{(\frac{x-\mu}{\sigma\sqrt{2}})} e^{-t^2} dt + C= \int \frac{e^{-\frac{1}{2}(\frac{x-\mu}{\sigma})^2}}{\sigma \sqrt{2\pi}} dx.
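One way to sanity-check the result numerically is sketched below. It relies on the fact that Python's standard `math.erf(u)` computes \frac{2}{\sqrt{\pi}} \int_{0}^{u} e^{-t^2} dt, so the right-hand side above is `math.erf(u) / 2` with u = (x-μ)/(σ\sqrt{2}). The function names and the values μ = 100, σ = 15 are assumptions made for illustration only, and Simpson's rule is just one convenient way to integrate the density directly.

```python
import math

def normal_pdf(x, mu, sigma):
    """Gaussian density with mean mu and standard deviation sigma."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))

def antiderivative(x, mu, sigma):
    """The derived antiderivative (with C = 0), written via math.erf."""
    u = (x - mu) / (sigma * math.sqrt(2.0))
    return 0.5 * math.erf(u)

def simpson(f, a, b, n=2000):
    """Composite Simpson's rule on [a, b] with an even number of panels."""
    if n % 2:
        n += 1
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3.0

# Hypothetical example values, not taken from the thread.
mu, sigma = 100.0, 15.0
a, b = 85.0, 130.0

by_ftc = antiderivative(b, mu, sigma) - antiderivative(a, mu, sigma)
by_quadrature = simpson(lambda x: normal_pdf(x, mu, sigma), a, b)
print(by_ftc, by_quadrature)
```

If the substitution is right, the two printed numbers should agree to many decimal places.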
 
That is right, it is just a substitution, what are you deriving? To effect the integration you will need one of the two functions

\Phi(x) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^x e^{-t^2/2} \, dt

\operatorname{erf}(x) = \frac{2}{\sqrt{\pi}} \int_0^x e^{-t^2} \, dt

related by

\Phi (x) = \frac{1}{2}+ \frac{1}{2} \operatorname{erf} \left(x/ \sqrt{2}\right)

If your calculator or computer software does not have this built in, you will need to compute it. Wikipedia has some suggestions for approximations. You will also need to be able to calculate the inverse.

http://en.wikipedia.org/wiki/Normal...n#Numerical_approximations_for_the_normal_CDF
http://en.wikipedia.org/wiki/Error_function
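As a minimal sketch of the relation between Φ and erf, assuming a Python environment where `math.erf` is available: the names `phi` and `phi_inverse` are hypothetical, and the inverse uses plain bisection, chosen here for simplicity rather than speed.

```python
import math

def phi(x):
    """Standard normal CDF via Phi(x) = 1/2 + 1/2 * erf(x / sqrt(2))."""
    return 0.5 + 0.5 * math.erf(x / math.sqrt(2.0))

def phi_inverse(p, lo=-10.0, hi=10.0, iterations=200):
    """Invert phi by bisection; p must lie strictly between 0 and 1."""
    if not 0.0 < p < 1.0:
        raise ValueError("p must be strictly between 0 and 1")
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if phi(mid) < p:
            lo = mid   # root lies to the right of mid
        else:
            hi = mid   # root lies at or to the left of mid
    return 0.5 * (lo + hi)

print(phi(1.96))           # area to the left of z = 1.96
print(phi_inverse(0.975))  # z-score with 97.5% of the area to its left
```

Bisection works because Φ is strictly increasing. Probabilities extremely close to 0 or 1 would need a wider bracket or a more careful method.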
 
lurflurf said:
That is right, it is just a substitution, what are you deriving?
I think I put that in the title... ;)

The point is to use the fundamental theorem of calculus to get the area, rather than the table, as I'm sure you know from...

\Phi (x) = \frac{1}{2}+ \frac{1}{2} \operatorname{erf} \left(x/ \sqrt{2}\right)

However, the antiderivative is also useful for calculating things like the area between two points on a normal curve. For example, I was able to show that the empirical rule (I think some people call it the "68-95-99.7 rule") is true because you can calculate \displaystyle \int_{\mu-n\sigma}^{\mu+n\sigma} \frac{e^{-\frac{1}{2}(\frac{x-\mu}{\sigma})^2}}{\sigma \sqrt{2\pi}} \ dx = \frac{1}{\sqrt{\pi}} \left(\int_{0}^{n/\sqrt{2}} e^{-t^2} \ dt + \int_{-n/\sqrt{2}}^{0} e^{-t^2} \ dt\right). I switched the sign of the difference, simply because I like having the lower bound of the integral be less than the upper bound rather than calculating a negative area, but I think you get the point.
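The sum of the two integrals can be collapsed one step further. Because e^{-t^2} is an even function, the integral from -n/\sqrt{2} to 0 equals the integral from 0 to n/\sqrt{2}, and the result matches the definition of erf given earlier in the thread. A worked version of that step:

```latex
\begin{aligned}
\int_{\mu-n\sigma}^{\mu+n\sigma} \frac{e^{-\frac{1}{2}(\frac{x-\mu}{\sigma})^2}}{\sigma \sqrt{2\pi}} \, dx
  &= \frac{1}{\sqrt{\pi}} \left( \int_{0}^{n/\sqrt{2}} e^{-t^2} \, dt + \int_{-n/\sqrt{2}}^{0} e^{-t^2} \, dt \right) \\
  &= \frac{2}{\sqrt{\pi}} \int_{0}^{n/\sqrt{2}} e^{-t^2} \, dt \\
  &= \operatorname{erf}\left( n/\sqrt{2} \right)
\end{aligned}
```

So the area within n standard deviations of the mean depends only on n, not on μ or σ.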
 
You have just written down the Antiderivative of the Gaussian Distribution, not derived anything. To do anything with it you need to introduce new functions and evaluate them (and likely their inverses).

The empirical rule is
erf(1/sqrt(2))~.682689
erf(2/sqrt(2))~.954499
erf(3/sqrt(2))~.997300
erf(4/sqrt(2))~.999936
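These values can be reproduced with a few lines of Python, assuming the standard library's `math.erf` is acceptable as the black box. The helper name `area_within` is hypothetical.

```python
import math

def area_within(n):
    """Area under a normal curve between mu - n*sigma and mu + n*sigma."""
    return math.erf(n / math.sqrt(2.0))

for n in range(1, 5):
    print(f"n = {n}: {area_within(n):.6f}")
```

The printout rounds to six decimals, so the last digit may differ by one from values that were truncated instead.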

If you look up erf in a table you have gained nothing. If you use a calculator or computer you gain some digits and flexibility, but it is just as much of a magic black box. Do you propose to calculate it by hand or write your own program?
 
lurflurf said:
You have just written down the Antiderivative of the Gaussian Distribution, not derived anything.
Merriam-Webster defines derivation as "a sequence of statements (as in logic or mathematics) showing that a result is a necessary consequence of previously accepted statements". Though, you are correct, I have not "derived" anything in the sense that the word is normally (heh, normally) used.

lurflurf said:
The empirical rule is
erf(1/sqrt(2))~.682689
erf(2/sqrt(2))~.954499
erf(3/sqrt(2))~.997300
erf(4/sqrt(2))~.999936
Yes. It is. In other words, area between -σ and σ is approximately 0.682689, area between -2σ and 2σ is approximately 0.954499, etc. Was there something I missed?

lurflurf said:
If you look up erf in a table you have gained nothing. If you use a calculator or computer you gain some digits and flexibility, but it is just as much of a magic black box. Do you propose to calculate it by hand or write your own program?
I intend to program it into my calculator. I'll probably come to some clever conclusion about it later.
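One possible way to program erf without a built-in is its Maclaurin series, erf(x) = \frac{2}{\sqrt{\pi}} \sum_{n=0}^{\infty} \frac{(-1)^n x^{2n+1}}{n! \, (2n+1)}, which comes from integrating the power series of e^{-t^2} term by term. The sketch below is written in Python rather than in any calculator language, and the name `erf_series`, the tolerance, and the term limit are all assumptions for illustration.

```python
import math

def erf_series(x, tol=1e-15, max_terms=200):
    """Approximate erf(x) by summing its Maclaurin series."""
    x_squared = x * x
    power_term = x  # holds (-1)^n * x^(2n+1) / n!, starting at n = 0
    total = 0.0
    for n in range(max_terms):
        contribution = power_term / (2 * n + 1)
        total += contribution
        if abs(contribution) < tol:
            break
        # Step from n to n + 1: multiply by -x^2 / (n + 1).
        power_term *= -x_squared / (n + 1)
    return 2.0 / math.sqrt(math.pi) * total

# Compare with the library version at a z-score of 1.
print(erf_series(1.0 / math.sqrt(2.0)), math.erf(1.0 / math.sqrt(2.0)))
```

The series alternates in sign, so for large |x| the big intermediate terms cancel and precision is lost. For arguments of a few units, which covers typical z-scores, it should behave well.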
 
^What calculator are you using? Several, such as
HP 50g
TI-84 Plus
Casio cfx-9850G
have such a function built in; many others have it as an app or download.

If you make your own and are happy with the limitations of real numbers and 10^-7 accuracy, the Zelen & Severo (1964) approximation linked above is simple. See also formula 26.2.17 here: http://people.math.sfu.ca/~cbm/aands/page_932.htm
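A rough Python sketch of formula 26.2.17 follows. The constants are written as commonly quoted for that formula and should be checked against the linked page before being relied on. The name `phi_approx` is hypothetical, and the symmetry Φ(-x) = 1 - Φ(x) is used to handle negative arguments, since the polynomial form applies to x ≥ 0.

```python
import math

# Constants for Abramowitz & Stegun 26.2.17 (verify against the source).
P = 0.2316419
B = (0.319381530, -0.356563782, 1.781477937, -1.821255978, 1.330274429)

def phi_approx(x):
    """Approximate the standard normal CDF with a five-term polynomial in t."""
    if x < 0.0:
        return 1.0 - phi_approx(-x)  # symmetry of the normal curve
    t = 1.0 / (1.0 + P * x)
    poly = 0.0
    for b in reversed(B):
        poly = (poly + b) * t  # Horner form of b1*t + b2*t^2 + ... + b5*t^5
    density = math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)
    return 1.0 - density * poly

# Compare with the erf-based value at z = 1.96.
exact = 0.5 + 0.5 * math.erf(1.96 / math.sqrt(2.0))
print(phi_approx(1.96), exact)
```

Only one exponential, one division, and a handful of multiplications are needed, which is why this kind of formula suits a programmable calculator.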
 
