Derivation of the Antiderivative of the Gaussian Distribution

SUMMARY

The discussion centers on the derivation of the antiderivative of the Gaussian distribution using the fundamental theorem of calculus. The user expresses dissatisfaction with z-tables and seeks to validate their derivation process, which involves substituting variables and integrating the Gaussian function. Key functions mentioned include the cumulative distribution function, Φ(x), and the error function, erf(x), which are essential for calculating areas under the normal curve. The empirical rule is also referenced, demonstrating the practical application of these concepts in statistics.

PREREQUISITES
  • Understanding of the fundamental theorem of calculus
  • Familiarity with Gaussian distribution and its properties
  • Knowledge of the error function, erf(x)
  • Ability to perform variable substitution in integrals
NEXT STEPS
  • Study the derivation of the error function, erf(x)
  • Learn how to implement the cumulative distribution function, Φ(x), in programming
  • Explore numerical methods for approximating integrals of Gaussian functions
  • Investigate the empirical rule and its applications in statistics
USEFUL FOR

Students in high school or college-level mathematics and statistics courses, educators teaching calculus and statistics, and anyone interested in understanding the Gaussian distribution and its applications in data analysis.

Mandelbroth
I'm in a high school pre-calculus class and a statistics class. For the latter, we are given z-tables for some of our tests. I don't like these z-tables.

Thus, I decided that a more direct approach (fundamental theorem of calculus) would be more accurate and, more importantly, more fun. My teacher kind of looked at the work and said "looks right to me". I think it's right (empirically it works and Wolfram says so), but I have no basis to see if I made a mistake or not (I'm basically just using my dad's old copy of Calculus and Analytic Geometry by Edwards and Penney and random resources on the internet).

To be clear, I'm asking whether I made a mistake in the derivation, not in the answer. I tend to be good at manipulating numbers in an erroneous way and still getting the right answer, but I want to make sure that I can do this correctly before I move on to something more difficult.

\int \frac{e^{-\frac{1}{2}(\frac{x-\mu}{\sigma})^2}}{\sigma \sqrt{2\pi}} dx = \frac{1}{\sigma \sqrt{2\pi}}\int e^{\frac{-z^2}{2}} dx

Letting z = (x-μ)/σ (fittingly), it follows that dz = \frac{dx}{\sigma} \Rightarrow dx = \sigma dz.

∴ \frac{1}{\sigma \sqrt{2\pi}}\int e^{-\frac{z^2}{2}} dx = \frac{1}{\sqrt{2\pi}} \int e^{-\frac{z^2}{2}} dz = \frac{1}{\sqrt{2\pi}} \int \sqrt{e^{-z^2}} dz = \frac{1}{\sqrt{\pi}} \int_{0}^{z/\sqrt{2}} e^{-t^2} dt + C, where the last step substitutes t = z/\sqrt{2}, so that dz = \sqrt{2} dt.

Finally, we get \displaystyle \frac{1}{\sqrt{\pi}} \int_{0}^{(\frac{x-\mu}{\sigma\sqrt{2}})} e^{-t^2} dt + C= \int \frac{e^{-\frac{1}{2}(\frac{x-\mu}{\sigma})^2}}{\sigma \sqrt{2\pi}} dx.
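One way to sanity-check the result numerically is to compare a difference of the antiderivative at two points against a direct numerical integral of the density. The sketch below is only an illustration: it assumes Python's built-in math.erf, uses the identity (1/√π)∫₀ᵘ e^{-t²} dt = erf(u)/2, takes C = 0, and the names gaussian_pdf, antiderivative, simpson and check are made up for this example.

```python
import math

def gaussian_pdf(x, mu=0.0, sigma=1.0):
    """Normal density with mean mu and standard deviation sigma."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))

def antiderivative(x, mu=0.0, sigma=1.0):
    """(1/sqrt(pi)) * integral from 0 to u of e^(-t^2) dt, with C = 0.

    Here u = (x - mu) / (sigma * sqrt(2)); the integral equals erf(u) / 2.
    """
    u = (x - mu) / (sigma * math.sqrt(2.0))
    return 0.5 * math.erf(u)

def simpson(f, a, b, n=1000):
    """Composite Simpson's rule on [a, b] with n (even) subintervals."""
    if n % 2:
        n += 1
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3.0

def check(a, b, mu, sigma):
    """Return the area on [a, b] computed two ways: FTC and quadrature."""
    by_ftc = antiderivative(b, mu, sigma) - antiderivative(a, mu, sigma)
    by_quadrature = simpson(lambda x: gaussian_pdf(x, mu, sigma), a, b)
    return by_ftc, by_quadrature
```

If the substitution is right, the two numbers returned by check(90, 120, 100, 15) should agree to many decimal places. Agreement confirms the antiderivative but does not by itself prove each algebraic step.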
 
That is right, it is just a substitution, what are you deriving? To carry out the integration you will need one of these two functions:

\Phi(x) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^x e^{-t^2/2} \, dt

\operatorname{erf}(x) = \frac{2}{\sqrt{\pi}} \int_0^x e^{-t^2} \, dt

related by

\Phi (x) = \frac{1}{2}+ \frac{1}{2} \operatorname{erf} \left(x/ \sqrt{2}\right)
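Where erf is available, the relation above turns into a few lines of code. This is a minimal sketch assuming Python's standard math.erf; the function names phi and normal_cdf are illustrative and not from the thread.

```python
import math

def phi(x):
    """Standard normal CDF via Phi(x) = 1/2 + 1/2 * erf(x / sqrt(2))."""
    return 0.5 + 0.5 * math.erf(x / math.sqrt(2.0))

def normal_cdf(x, mu=0.0, sigma=1.0):
    """CDF of a normal with mean mu and standard deviation sigma."""
    return phi((x - mu) / sigma)
```

For example, phi(1.96) should come out close to 0.975, the familiar z-table entry.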

If your calculator or computer software does not have this built in, you will need to compute it yourself. Wikipedia has some suggestions for approximations. You will also need to be able to calculate the inverse.

http://en.wikipedia.org/wiki/Normal...n#Numerical_approximations_for_the_normal_CDF
http://en.wikipedia.org/wiki/Error_function
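For the inverse, one simple (if not the fastest) option is bisection, since Φ is strictly increasing. The sketch below assumes Python's math.erf for Φ; the name phi_inverse, the bracket [-10, 10] and the tolerance are choices made for this illustration, not something taken from the links above.

```python
import math

def phi(x):
    """Standard normal CDF via the error function."""
    return 0.5 + 0.5 * math.erf(x / math.sqrt(2.0))

def phi_inverse(p, lo=-10.0, hi=10.0, tol=1e-12):
    """Find z with phi(z) = p by bisection on [lo, hi].

    Assumes 0 < p < 1 and that the answer lies inside the bracket,
    which holds unless p is astronomically close to 0 or 1.
    """
    if not 0.0 < p < 1.0:
        raise ValueError("p must be strictly between 0 and 1")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if phi(mid) < p:
            lo = mid   # root is to the right of mid
        else:
            hi = mid   # root is at or to the left of mid
    return 0.5 * (lo + hi)
```

For example, phi_inverse(0.975) should land near 1.96, the usual two-sided 95% critical value.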
 
lurflurf said:
That is right, it is just a substitution, what are you deriving?
I think I put that in the title...:wink:

The point is to use the fundamental theorem of calculus to get the area, rather than the table, as I'm sure you know from...

\Phi (x) = \frac{1}{2}+ \frac{1}{2} \operatorname{erf} \left(x/ \sqrt{2}\right)

However, the antiderivative is also useful for calculating things like the area between 2 points on a normal curve. For example, I was able to show that the empirical rule (I think some people call it the "68-95-99.7 rule") is true because you can calculate \displaystyle \int_{\mu-n\sigma}^{\mu+n\sigma} \frac{e^{-\frac{1}{2}(\frac{x-\mu}{\sigma})^2}}{\sigma \sqrt{2\pi}} \ dx = \frac{1}{\sqrt{\pi}} \left(\int_{0}^{n/\sqrt{2}} e^{-t^2} \ dt + \int_{-n/\sqrt{2}}^{0} e^{-t^2} \ dt\right). I switched the sign of the difference, simply because I like having the lower bound of the integral be less than the upper bound rather than calculating a negative area, but I think you get the point.
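Because the integrand e^{-t²} is even, the two integrals on the right are equal, so the area within n standard deviations reduces to erf(n/√2). A short sketch of that calculation, assuming Python's math.erf (area_within and area_between are names invented for this example):

```python
import math

def area_within(n):
    """Area under a normal curve between mu - n*sigma and mu + n*sigma."""
    return math.erf(n / math.sqrt(2.0))

def area_between(a, b, mu=0.0, sigma=1.0):
    """Area under the normal curve between a and b (a <= b)."""
    s = sigma * math.sqrt(2.0)
    return 0.5 * (math.erf((b - mu) / s) - math.erf((a - mu) / s))

for n in (1, 2, 3):
    print(n, round(area_within(n), 4))
# prints:
# 1 0.6827
# 2 0.9545
# 3 0.9973
```

The result does not depend on μ or σ, which is why the 68-95-99.7 rule holds for every normal distribution.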
 
You have just written down the Antiderivative of the Gaussian Distribution, not derived anything. To do anything with it you need to introduce new functions and evaluate them (and likely their inverses).

The empirical rule is
erf(1/sqrt(2))~.682689
erf(2/sqrt(2))~.954499
erf(3/sqrt(2))~.997300
erf(4/sqrt(2))~.999936

If you look up erf in a table you have gained nothing. If you use a calculator or computer you gain some digits and flexibility, but it is just as much of a magic black box. Do you propose to calculate it by hand or write your own program?
 
lurflurf said:
You have just written down the Antiderivative of the Gaussian Distribution, not derived anything.
Merriam-Webster defines derivation as "a sequence of statements (as in logic or mathematics) showing that a result is a necessary consequence of previously accepted statements". Though, you are correct, I have not "derived" anything in the sense that the word is normally (heh, normally) used.

lurflurf said:
The empirical rule is
erf(1/sqrt(2))~.682689
erf(2/sqrt(2))~.954499
erf(3/sqrt(2))~.997300
erf(4/sqrt(2))~.999936
Yes. It is. In other words, area between -σ and σ is approximately 0.682689, area between -2σ and 2σ is approximately 0.954499, etc. Was there something I missed?

lurflurf said:
If you look up erf in a table you have gained nothing. If you use a calculator or computer you gain some digits and flexibility, but it is just as much of a magic black box. Do you propose to calculate it by hand or write your own program?
I intend to program it into my calculator. I'll probably come to some clever conclusion about it later.
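One way such a program might look, on a device with no built-in erf, is to sum the Maclaurin series erf(x) = (2/√π) Σ (-1)^n x^(2n+1) / (n! (2n+1)). This is a standard series rather than anything proposed in the thread, and the sketch below is written in Python purely for illustration; erf_series and its default of 60 terms are assumptions, and the series is only practical for moderate |x| (say up to about 3), because the alternating terms grow large before they shrink.

```python
import math

def erf_series(x, terms=60):
    """Approximate erf(x) by summing its Maclaurin series.

    Each term is (-1)^n * x^(2n+1) / (n! * (2n+1)); the factor
    (-1)^n * x^(2n+1) / n! is updated incrementally to avoid
    computing factorials and powers from scratch.
    """
    total = 0.0
    power_over_factorial = x  # (-1)^n * x^(2n+1) / n! for n = 0
    for n in range(terms):
        total += power_over_factorial / (2 * n + 1)
        power_over_factorial *= -x * x / (n + 1)
    return 2.0 / math.sqrt(math.pi) * total
```

For example, erf_series(1 / math.sqrt(2)) should reproduce the one-sigma area of about 0.6827.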
 
^What calculator are you using? Several, such as the
HP 50g
TI-84 Plus
Casio CFX-9850G
have such a function built in; many others have it as an app or download.

If you make your own and are happy with the limitations of real arguments and about 10^-7 accuracy, the Zelen & Severo (1964) approximation linked above is simple. It also appears as formula 26.2.17 here: http://people.math.sfu.ca/~cbm/aands/page_932.htm
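As a rough sketch of that approximation: formula 26.2.17 in Abramowitz and Stegun gives Φ(x) ≈ 1 - Z(x)(b1 t + b2 t² + b3 t³ + b4 t⁴ + b5 t⁵) for x ≥ 0, with t = 1/(1 + p x), Z the standard normal density, and a stated absolute error below 7.5 × 10^-8. The Python below is only an illustration; the function name phi_approx and the use of symmetry for negative x are choices made here.

```python
import math

# Constants from Abramowitz & Stegun formula 26.2.17 (Zelen & Severo).
P = 0.2316419
B = (0.319381530, -0.356563782, 1.781477937, -1.821255978, 1.330274429)

def phi_approx(x):
    """Approximate the standard normal CDF without calling erf."""
    if x < 0.0:
        # The formula is stated for x >= 0; use Phi(-x) = 1 - Phi(x).
        return 1.0 - phi_approx(-x)
    t = 1.0 / (1.0 + P * x)
    # Horner evaluation of b1*t + b2*t^2 + ... + b5*t^5.
    poly = 0.0
    for b in reversed(B):
        poly = (poly + b) * t
    density = math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)
    return 1.0 - density * poly
```

Only exp, a square root and a handful of multiplications are needed, which is what makes it a reasonable fit for a programmable calculator.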
 
