
Intrinsic Distributions

  1. Sep 26, 2005 #1
    The following is a crude derivation showing that a distribution such as the normal distribution is just one member of a family of similar distributions. I was originally going to post this in the new Independent Research forum, but the moderator thought it was better suited here. I am looking for feedback and thoughts from readers of this forum, and I would like to find someone to help me coauthor a paper on this subject. My technical writing skills aren't that great, and I don't claim to be a professional mathematician, only a recreational one. I do this stuff for fun. I am neither an academic nor a professor, so I am not under the gun to publish papers on a routine basis. I have a lot more work done than what I have shown here.

    The intent here is also to show how several common distributions can be manipulated into a form whose parameters are the mean and variance. Very few distributions are written in terms of the mean and variance; most are in a form that contains adjustment constants. From a practical perspective this is unsatisfactory, because generic constants require a certain amount of trial and error to fit a distribution to a given data set. Distributions such as the normal distribution require only the mean and variance, and the mean and variance are easily obtained from a data set.

    Derivation of Normal Intrinsic Distribution

    From the equation
    [itex]\frac{1}{{\sqrt {2 \cdot \pi } \cdot \sigma }} = \sqrt {\frac{1}{{2 \cdot \pi \cdot \sigma ^2 }}} [/itex]
    the normal distribution can be written in the form given here.
    [itex]P(q) = \frac{1}{{\sqrt {2 \cdot \pi } \cdot \sigma }} \cdot e^{ - \frac{1}{{2 \cdot \sigma ^2 }} \cdot (q - \mu )^2 } = \sqrt {\frac{1}{{2 \cdot \pi \cdot \sigma ^2 }}} \cdot e^{ - \pi \cdot \left( {\sqrt {\frac{1}{{2 \cdot \pi \cdot \sigma ^2 }}} } \right)^2 \cdot (q - \mu )^2 } [/itex]
    By letting
    [itex]P_{q_1 } = \sqrt {\frac{1}{{2 \cdot \pi \cdot \sigma ^2 }}} [/itex]
    the normal distribution takes the form.
    [itex]P(q) = P_{q_1 } \cdot e^{ - \pi \cdot \left(P_{q_1 }\right)^{2} \cdot (q - \mu )^2 } [/itex]
    Using the integral given here
    [itex]\alpha = \int\limits_{ - \infty }^\infty {e^{ - x^{2 \cdot k} } dx} = \frac{1}{k} \cdot \Gamma (\frac{1}{{2 \cdot k}}),k = 1,2,3,....,\infty[/itex]
    and evaluating k=1 produces the integral.
    [itex]\int\limits_{ - \infty }^\infty {e^{ - q^2 } dq} = \sqrt \pi [/itex]
    The relationship between the constant pi and the integral can be seen by writing
    [itex]P(q) = P_{q_1 } \cdot e^{ - (\sqrt \pi \cdot P_{q_1 } )^2 \cdot (q - \mu )^2 } = P_{q_1 } \cdot e^{ - \pi \cdot\left(P_{q_1}\right)^{2} \cdot (q - \mu )^2 } [/itex]
    Multiplying both sides of the unevaluated version of the integral by N and substituting the solution
    [itex]N\cdot \alpha = N\cdot \int\limits_{ - \infty }^\infty {e^{ - x^{2 \cdot k} } dx} = N\cdot \frac{1}{k} \cdot \Gamma (\frac{1}{{2 \cdot k}}),k = 1,2,3,....,\infty[/itex]
    produces the equation
    [itex]P(q) = P_q \cdot e^{ - \left[ {\frac{1}{k} \cdot \Gamma (\frac{1}{{2 \cdot k}}) \cdot N \cdot P_q (q - \mu )} \right]^{2 \cdot k} } ,k = 1,2,3,....,\infty . [/itex]
    For the case N=1 and [itex]k = 1,2,3,....,\infty[/itex]
    [itex]\int\limits_{ - \infty }^\infty {P(q)dq = 1} [/itex]
    and for the case N=2 and [itex]k = 1,2,3,....,\infty[/itex]
    [itex]\int\limits_{ - \infty }^\infty {P(q)dq = \frac{1}{2}} [/itex]
    Hence the Intrinsic Distribution can be written as below for all cases of [itex]N=1,2,3,....,\infty[/itex]
    [itex]P(q) = \sum\limits_{i = 1}^N {P_{q_i } \cdot e^{ - \left[ {\frac{1}{k} \cdot \Gamma (\frac{1}{{2 \cdot k}}) \cdot N \cdot P_{q_i } (q - q_i )} \right]^{2 \cdot k} } }= \sum\limits_{i = 1}^N {P_{q_i } \cdot e^{ - \left[ {\alpha \cdot N \cdot P_{q_i } \cdot \left( {q - q_i } \right)} \right]^{2 \cdot k} } } ,\quad k = 1,2,3,....,\infty ,\quad \int\limits_{ - \infty }^\infty {P(q)dq} = 1 [/itex]
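    The normalization claimed above can be checked numerically. With the substitution u = alpha * N * P_{q_i} * (q - q_i), each term of the sum integrates to 1/N whatever its peak height, so N terms integrate to 1. The following is a minimal sketch of that check, not part of the original derivation; the names `alpha`, `intrinsic_pdf` and `integrate` and the sample peak heights and centers are illustrative assumptions.

```python
import math

def alpha(k):
    """Integral of exp(-x**(2k)) over the real line: Gamma(1/(2k)) / k."""
    return math.gamma(1.0 / (2 * k)) / k

def intrinsic_pdf(q, peaks, centers, k=1):
    """Sum of N terms; peaks[i] plays the role of P_{q_i}, centers[i] of q_i."""
    n = len(peaks)
    a = alpha(k)
    return sum(p * math.exp(-((a * n * p * (q - c)) ** (2 * k)))
               for p, c in zip(peaks, centers))

def integrate(f, lo, hi, steps=40_000):
    """Midpoint rule (hypothetical helper, adequate for these smooth integrands)."""
    h = (hi - lo) / steps
    return h * sum(f(lo + (i + 0.5) * h) for i in range(steps))

if __name__ == "__main__":
    # Assumed example values: one peak with k=1, then two unequal peaks with k=2.
    one_peak = integrate(lambda q: intrinsic_pdf(q, [0.4], [0.0], k=1), -20, 20)
    two_peaks = integrate(lambda q: intrinsic_pdf(q, [0.3, 0.8], [-1.0, 2.0], k=2), -20, 20)
    print(one_peak, two_peaks)  # both should be close to 1
```

    Note that the peak heights can be chosen freely per term, but every term still carries the same total probability 1/N.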

    Derivation of Normal Distribution

    For the case N=1 and k=1
    [itex]P(q) = \sum\limits_{i = 1}^N {P_{q_i } \cdot e^{ - \left[ {\alpha \cdot N \cdot P_{q_i } \cdot \left( {q - q_i } \right)} \right]^{2 \cdot k} } } ,k = 1,2,3,....,\infty [/itex]
    the equation reduces to
    [itex]P(q) = P_{q_1 } \cdot e^{ - \left(\alpha \cdot P_{q_1 }\right)^{2} \cdot (q - q_1 )^2 } [/itex]
    Using the integral
    [itex]\alpha = \int\limits_{ - \infty }^\infty {e^{ - x^{2 \cdot k} } dx} = \frac{1}{k} \cdot \Gamma (\frac{1}{{2 \cdot k}}),k = 1,2,3,....,\infty [/itex]
    evaluated at k=1, where [itex]\alpha = \sqrt \pi [/itex], the equation takes the form
    [itex] P(q) = P_{q_1 } \cdot e^{ - \pi \cdot \left(P_{q_1 }\right)^{2} \cdot (q - q_1 )^2 }[/itex]
    Using the equation
    [itex]\mu = \int\limits_{ - \infty }^\infty {P(q) \cdot q \cdot dq = q_1 } [/itex]
    it can be seen that
    [itex]P(q) = P_{q_1 } \cdot e^{ - \pi \cdot \left(P_{q_1 }\right)^{2} \cdot (q - \mu )^2 } [/itex]
    and from
    [itex]\sigma ^2 = \int\limits_{ - \infty }^\infty {P(q) \cdot (q - \mu )^2 \cdot dq} [/itex]
    the following table is generated.
    [itex]\begin{array}{ccc}
    {P_{q_1 } = 1} & {\sigma ^2 = \frac{1}{{2 \cdot \pi }}} & {\sigma ^2 = \frac{1}{{2 \cdot 1^2 \cdot \pi }}} \\
    {P_{q_1 } = 2} & {\sigma ^2 = \frac{1}{{8 \cdot \pi }}} & {\sigma ^2 = \frac{1}{{2 \cdot 2^2 \cdot \pi }}} \\
    {P_{q_1 } = 3} & {\sigma ^2 = \frac{1}{{18 \cdot \pi }}} & {\sigma ^2 = \frac{1}{{2 \cdot 3^2 \cdot \pi }}} \\
    {P_{q_1 } = 4} & {\sigma ^2 = \frac{1}{{32 \cdot \pi }}} & {\sigma ^2 = \frac{1}{{2 \cdot 4^2 \cdot \pi }}} \\
    {P_{q_1 } = P_{q_1 } } & \Rightarrow & {\sigma ^2 = \frac{1}{{2 \cdot P_{q_1 } ^2 \cdot \pi }}} \\
    \end{array} [/itex]
    From this table the equation is found.
    [itex]\sigma ^2 = \frac{1}{{2 \cdot \pi \cdot (P_{q_1 } )^2 }} [/itex]
    Solving the equation for
    [itex]P_{q_1 } = \sqrt {\frac{1}{{2 \cdot \pi \cdot \sigma ^2 }}} = \frac{1}{{\sqrt {2 \cdot \pi } \cdot \sigma }}[/itex]
    and substituting into the equation
    [itex] P(q) = P_{q_1 } \cdot e^{ - \pi \cdot \left(P_{q_1 }\right)^{2} \cdot (q - \mu )^2 } [/itex]
    produces the normal distribution.
    [itex]P(q) = \frac{1}{{\sqrt {2 \cdot \pi } \cdot \sigma }} \cdot e^{ - \frac{1}{{2 \cdot \sigma ^2 }} \cdot \left( {q - \mu } \right){}^2} [/itex]
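    As a quick check of the steps above, the peak-height form and the usual normal density can be compared point by point, and the variance table reproduced from sigma^2 = 1 / (2 * pi * P^2). This is only a sketch; the function names and the test values of mu and sigma are assumptions made for illustration.

```python
import math

def normal_from_peak(q, peak, mu):
    """P(q) = P_{q_1} * exp(-pi * P_{q_1}**2 * (q - mu)**2)."""
    return peak * math.exp(-math.pi * peak ** 2 * (q - mu) ** 2)

def normal_pdf(q, mu, sigma):
    """Textbook normal density in terms of mean and standard deviation."""
    return math.exp(-((q - mu) ** 2) / (2 * sigma ** 2)) / (math.sqrt(2 * math.pi) * sigma)

def peak_from_sigma(sigma):
    """P_{q_1} = 1 / (sqrt(2 * pi) * sigma)."""
    return 1.0 / (math.sqrt(2 * math.pi) * sigma)

def variance_from_peak(peak):
    """sigma^2 = 1 / (2 * pi * P_{q_1}^2), the last row of the table."""
    return 1.0 / (2 * math.pi * peak ** 2)

if __name__ == "__main__":
    for peak in (1, 2, 3, 4):
        print(peak, variance_from_peak(peak))
    p = peak_from_sigma(1.5)
    print(normal_from_peak(2.0, p, 0.7), normal_pdf(2.0, 0.7, 1.5))  # the two values should agree
```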
    Multiple distributions can be generated in like manner from multiple values of k. You can also generate multimodal distributions from different values of N. For example, for N=2 and k=1 a true bimodal normal distribution could be generated. You are not restricted to Gaussian distributions only. By using the same methods as above you could generate several different types of Intrinsic Distribution. An example shown here is the Cauchy Intrinsic Distribution.
    [itex]P(q) = \sum\limits_{i = 1}^N {\frac{{P_{q_i } }}{{((\left( {q - q_i } \right) \cdot P_{q_i } \cdot N \cdot \varsigma )^{2 \cdot k} + 1)^k }}} ,\quad \varsigma = \int\limits_{ - \infty }^\infty {(\frac{1}{{(x^{2 \cdot k} + 1)^k }})} dx = \frac{1}{k} \cdot \beta (\frac{1}{{2 \cdot k}},\frac{{2 \cdot k^2 - 1}}{{2 \cdot k}}),\quad k = 1,2,3,....,\infty [/itex]
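    The Cauchy form can be sanity-checked the same way. The sketch below assumes the beta function is computed from gamma functions, B(a, b) = Gamma(a) * Gamma(b) / Gamma(a + b); the function names and the sample peaks are illustrative and not from the post. For k=1 the constant reduces to pi, and N=1 with peak height 1/pi gives the standard Cauchy density.

```python
import math

def beta(a, b):
    """Beta function via gamma functions."""
    return math.gamma(a) * math.gamma(b) / math.gamma(a + b)

def varsigma(k):
    """Integral of 1 / (x**(2k) + 1)**k over the real line."""
    return beta(1.0 / (2 * k), (2 * k * k - 1) / (2.0 * k)) / k

def cauchy_intrinsic_pdf(q, peaks, centers, k=1):
    """Sum of N Cauchy-like terms; peaks[i] is P_{q_i}, centers[i] is q_i."""
    n = len(peaks)
    s = varsigma(k)
    return sum(p / (((q - c) * p * n * s) ** (2 * k) + 1) ** k
               for p, c in zip(peaks, centers))

def integrate(f, lo, hi, steps=60_000):
    """Midpoint rule (hypothetical helper)."""
    h = (hi - lo) / steps
    return h * sum(f(lo + (i + 0.5) * h) for i in range(steps))

if __name__ == "__main__":
    print(varsigma(1), math.pi)  # the two values should agree
    # k=2 has tails that decay fast enough for a finite-range numerical check.
    total = integrate(lambda q: cauchy_intrinsic_pdf(q, [0.5, 0.9], [-1.0, 2.0], k=2), -60, 60)
    print(total)  # should be close to 1
```

    For k=1 the tails fall off only like 1/q^2, so a finite-range numerical integral converges slowly; the closed form is the more reliable check in that case.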
    Many distributions are incomplete and can be finished by putting them into a form that contains measurable quantities such as mean and variance. You can use the same method shown above to complete many different distributions, such as the logistic distribution shown here. The Logistic Distribution typically takes the form
    [itex]P(q) = \frac{{e^{ - (q - m)/b} }}{{b \cdot \left[ {1 + e^{ - (q - m)/b} } \right]^2 }} [/itex]
    and its distribution function
    [itex]D(q) = \frac{1}{{1 + e^{ - (q - m)/b} }} [/itex]
    The complete form is
    [itex]P(q) = 4 \cdot \sqrt {\frac{{\pi ^2 }}{{\sigma ^2 \cdot 48}}} \cdot \left( {\frac{{e^{\left( {\left( {q - \mu } \right) \cdot 4 \cdot \sqrt {\frac{{\pi ^2 }}{{\sigma ^2 \cdot 48}}} } \right)} }}{{\left( {1 + e^{\left( {\left( {q - \mu } \right) \cdot 4 \cdot \sqrt {\frac{{\pi ^2 }}{{\sigma ^2 \cdot 48}}} } \right)} } \right)^2 }}} \right) [/itex]
    and its complete distribution function
    [itex] D(q) = 1 - \frac{1}{{\left( {1 + e^{\left( {\frac{1}{3} \cdot \left( {q - \mu } \right) \cdot \sqrt 3 \cdot \pi \cdot \sqrt {\frac{1}{{\sigma ^2 }}} } \right)} } \right)}}[/itex]
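    The logistic case can be checked numerically as well. The sketch below assumes the standard logistic variance pi^2 * b^2 / 3, so that 1/b = pi / (sqrt(3) * sigma), which is the same quantity as 4 * sqrt(pi^2 / (48 * sigma^2)) in the formula above. It uses the standard distribution function 1 / (1 + e^{-(q - m)/b}) with that substitution. All function names and test values are illustrative.

```python
import math

def logistic_rate(sigma):
    """1/b written via the standard deviation: pi / (sqrt(3) * sigma)."""
    return math.pi / (math.sqrt(3.0) * sigma)

def logistic_pdf(q, mu, sigma):
    """Logistic density parameterized by mean and standard deviation."""
    s = logistic_rate(sigma)
    e = math.exp(-abs(s * (q - mu)))  # density is symmetric, so -|z| avoids overflow
    return s * e / (1.0 + e) ** 2

def logistic_cdf(q, mu, sigma):
    """Standard logistic distribution function, evaluated without overflow."""
    z = logistic_rate(sigma) * (q - mu)
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def integrate(f, lo, hi, steps=40_000):
    """Midpoint rule (hypothetical helper)."""
    h = (hi - lo) / steps
    return h * sum(f(lo + (i + 0.5) * h) for i in range(steps))

if __name__ == "__main__":
    mu, sigma = 1.0, 2.0  # assumed example values
    var = integrate(lambda q: (q - mu) ** 2 * logistic_pdf(q, mu, sigma), mu - 100, mu + 100)
    print(var, sigma ** 2)  # the two values should agree closely
```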

    Thousands of different distributions that have never been seen or studied before can be generated using techniques similar to those shown above. One example is the Hyperbolic Distribution below.
    [itex]P(q) ={\sqrt {\frac{{\pi ^2 }}{{48^2 \cdot \sigma ^2 }}} } \cdot \cosh (2 \cdot \sqrt {\frac{{\pi ^2 }}{{48^2 \cdot

    \sigma ^2 }}} \cdot (q - \mu )^{ - 2} ) [/itex]
    Multivariate Intrinsic Distributions are also achievable. The Multivariate Cauchy Distribution is shown below.
    [itex]P(q_1 ,q_2 ,q{}_3,....,q_j ) = \prod\limits_{j = 1}^m {\sum\limits_{i = 1}^N {\frac{{P_{q_{i,j} }

    }}{{((N^{\frac{1}{m}} \cdot \varsigma \cdot (q_j - q_{i,j} ) \cdot P_{q_{i,j} } )^{2 \cdot k} + 1)^k }}(q_j - q_{i,j} )}

    },k = 1,2,3,....,\infty [/itex]

    [itex] \varsigma = \int\limits_{ - \infty }^\infty {(\frac{1}{{(x^{2 \cdot k} + 1)^k }})} dx = \frac{1}{k} \cdot \beta

    (\frac{1}{{2 \cdot k}},\frac{{2 \cdot k^2 - 1}}{{2 \cdot k}}),k = 1,2,3,....,\infty [/itex]
    Last edited: Sep 26, 2005
  2. jcsd
  3. Sep 27, 2005 #2



    Would it be fair to say that the main thrust of your work here is to rewrite various probability density functions and cumulative distribution functions in a form such that the free variables are the mean and variance?

    And beyond that, you have generalized slightly to an expression that represents an equally weighted combination of the individual distributions?

    An example of what I mean by that is your "bimodal" normal distribution: it would correspond to an experiment like:

    Flip a coin.
    If heads, generate a normal variable with mean 5 and variance 1
    If tails, generate a normal variable with mean 3 and variance 2

    but not be applicable to

    Roll a 6-sided die.
    If 1, generate a normal variable with mean 5 and variance 1
    If 2-6, generate a normal variable with mean 3 and variance 2
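    The difference between the two experiments can be made concrete with a small sketch. It treats both as two-component normal mixtures, with weights (1/2, 1/2) for the coin and (1/6, 5/6) for the die, and computes the mixture mean and variance from the component values given above. The function names are illustrative assumptions.

```python
import math
import random

def mixture_mean_variance(weights, means, variances):
    """Mean and variance of a mixture from its component weights, means, variances."""
    mean = sum(w * m for w, m in zip(weights, means))
    second_moment = sum(w * (v + m * m) for w, m, v in zip(weights, means, variances))
    return mean, second_moment - mean * mean

def sample_mixture(weights, means, variances, rng):
    """Pick a component by weight, then draw a normal variable from it."""
    i = rng.choices(range(len(weights)), weights=weights)[0]
    return rng.gauss(means[i], math.sqrt(variances[i]))

if __name__ == "__main__":
    means, variances = [5.0, 3.0], [1.0, 2.0]
    print(mixture_mean_variance([0.5, 0.5], means, variances))      # coin flip
    print(mixture_mean_variance([1 / 6, 5 / 6], means, variances))  # six-sided die
    rng = random.Random(0)
    draws = [sample_mixture([1 / 6, 5 / 6], means, variances, rng) for _ in range(20_000)]
    print(sum(draws) / len(draws))  # sample mean of the die experiment
```

    An expression in which every component carries the same total probability can represent the first case but not the second.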

    And you've introduced a new parameter (k) that defines some apparent relative of the normal distribution.

    Anyways, your opening remarks turned me off somewhat -- I figure it would be good to know the reasons for that.

    The parameters to the various distributions were not arbitrarily chosen -- many (all?) are highly convenient parameters, and often the formulae are simpler in the original variables.

    For example, I would generally use the chi-squared distribution with n degrees of freedom because I have a theoretical reason for n degrees of freedom, not because I want a distribution with mean n and variance 2n. (Ick, I hope I have them right from memory)

    It almost sounds like you're suggesting that the "best" way to select the parameters for a given distribution is to pick the parameters for which the mean and variance are equal to the sample mean and sample variance -- while that sounds good in theory, such idealistic approaches are rarely optimal in the real world. I'll try and remember to ping a statistician at work tomorrow to get a more informed opinion on this. :smile:
  4. Sep 27, 2005 #3

    No. The sample mean and variance are only estimates of the distribution's mean and variance.
  5. Sep 27, 2005 #4



    Then I entirely don't understand this passage.

    What is the benefit of writing, say, the gamma distribution in terms of its mean and variance, instead of in terms of α and β?

    And what is the point of mentioning that the mean and variance are easily obtained from a data set?
  6. Sep 28, 2005 #5
    Poor Terminology


    I am sorry for the incorrect usage of the terms mean and variance. The statement should have read: the mean and variance can be estimated from the sample mean and sample variance. Having things like that brought to my attention was my intent in posting this. The gamma distribution typically describes the waiting times between events in a natural process whose events follow a Poisson distribution. If I had a collection of data from a process that I knew was best described by the gamma distribution, would it not be simpler to estimate the mean and variance of the data from its sample mean and sample variance and simply put those numbers into the gamma distribution, rather than trying to determine the parameters alpha and beta? I may be missing something here, as I have not worked with the gamma distribution much, but I cannot recall whether the parameters alpha and beta can be related directly to a data set through some mathematical relationship such as an equation. I can estimate the mean easily from [itex]m = \frac{1}{N} \cdot \sum\limits_{i = 1}^N {x_i }[/itex], but how do you estimate or determine alpha directly through the use of an equation? One last thing: it is not always practical to use an estimate of the mean and variance. For example, the Cauchy distribution has an infinite variance. I appreciate your feedback; keep it coming.
    Last edited: Sep 28, 2005
  7. Oct 17, 2005 #6



    What you have "reinvented" is the method of parameter estimation known as "matching of moments."
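    For the gamma distribution discussed earlier in the thread, matching moments gives the parameters in closed form. The sketch below assumes the shape/scale parameterization, where the mean is alpha * beta and the variance is alpha * beta^2, so alpha = mean^2 / variance and beta = variance / mean. Under a rate parameterization the second parameter would be inverted. The function names are illustrative.

```python
import random
import statistics

def gamma_from_moments(mean, variance):
    """Shape and scale of a gamma distribution with the given mean and variance."""
    shape = mean * mean / variance
    scale = variance / mean
    return shape, scale

def fit_gamma(data):
    """Moment-matching fit: plug the sample mean and sample variance into the formulas."""
    return gamma_from_moments(statistics.fmean(data), statistics.variance(data))

if __name__ == "__main__":
    rng = random.Random(1)
    # random.gammavariate takes shape (alpha) and scale (beta) in that order.
    data = [rng.gammavariate(3.0, 2.0) for _ in range(50_000)]
    print(fit_gamma(data))  # estimates of the shape and scale used to generate the data
```

    Moment matching is simple, but as noted earlier in the thread it is not necessarily the best estimator for a given problem.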