
Calculating a probability given a point for a continuous distribution?

  1. Jul 24, 2014 #1
    I thought I understood the theory quite well and sat down to begin coding, until I realized that when applying Bayes' rule with a normal distribution you can't simply plug a point into the normal pdf and treat the result as a probability, since that value is a density. How do you approach this from a numerical standpoint, or am I incorrect? My guess is that you need to use the cumulative distribution function instead and calculate the probability over some small delta around the point.

    My broader problem is that, after having gone through a good part of probability and statistics and being able to wield the P notation quite deftly on paper, I'm finding that translating the theory into actual computation has its own challenges. I'm wondering how this is generally approached, or whether I'm completely misunderstanding something. I'm fairly certain that P(x) doesn't mean evaluating the pdf f(x) and just taking that value.

    Appreciate any guidance.
     
  3. Jul 24, 2014 #2

    Simon Bridge


    Pretty much, except you don't need the cumulative probability.

    If X~N, P(X=x)=0
    There is no such thing as an arbitrarily precise measurement.
    When we say that someone is 183cm tall, we really mean they are between 182.5cm and 183.5cm or something like that. So the probability of someone being 183cm tall, in the same sense, is actually P(182.5<x<183.5).

    That is correct. P(x) has no formal meaning where a pdf is concerned. You can only compute P(a<x<b) ... i.e. you can only find a non-zero probability for a continuous random variable falling within a range of values.

    In general we can say: $$\lim_{b-a\rightarrow 0}P(a<x<b)=0$$

    You can work it out for yourself: $$P(x=a)=\lim_{\epsilon\rightarrow 0}P(a-\epsilon<x<a+\epsilon)=\int_a^a p(x)\; dx=0$$
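
    A minimal numerical sketch of that limit, assuming a standard normal and using only Python's `math` module. The helper names (`normal_pdf`, `normal_cdf`, `interval_prob`) and the point a = 1.0 are made up for illustration. It shows the interval probability shrinking toward 0 as ε shrinks, while the ratio P/(2ε) settles on the density value p(a):

```python
import math

def normal_pdf(x, mu=0.0, sigma=1.0):
    """Density of N(mu, sigma^2) at x. This is a density, not a probability."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))

def normal_cdf(x, mu=0.0, sigma=1.0):
    """P(X <= x) for X ~ N(mu, sigma^2), written in terms of the error function."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))

def interval_prob(a, eps, mu=0.0, sigma=1.0):
    """P(a - eps < X < a + eps)."""
    return normal_cdf(a + eps, mu, sigma) - normal_cdf(a - eps, mu, sigma)

a = 1.0  # illustrative point, not from the thread
for eps in (0.5, 0.1, 0.01, 0.001):
    p = interval_prob(a, eps)
    # p goes to 0 with eps; p / (2 * eps) approaches normal_pdf(a)
    print(f"eps={eps:<6} P={p:.6e}  P/(2*eps)={p / (2 * eps):.6f}  pdf={normal_pdf(a):.6f}")
```

    So for a small enough interval, P(a-ε<x<a+ε) ≈ p(a)·2ε, which is the sense in which the density value is useful even though P(x=a)=0.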
     
  4. Jul 24, 2014 #3

    Stephen Tashi


    Adding to what Simon Bridge said, if you are trying to generate samples from a normal distribution then your computer program will produce a specific number - an "exact" real number one might say. But this type of sampling is only an approximation of the idea of sampling from a continuous distribution. The program is actually picking a number from a finite population of numbers - those that it can represent.
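
    A small sketch of that point, assuming Python 3.9 or later (for `math.ulp`) and IEEE double-precision floats. The seed is arbitrary. The "sample" is one specific representable number, and its nearest neighbours are a finite gap away:

```python
import math
import random

random.seed(0)  # arbitrary seed, only so the run is repeatable

# One pseudo-random draw from N(0, 1): a single representable float.
x = random.gauss(0.0, 1.0)

# Spacing between x and the next representable float of larger magnitude.
gap = math.ulp(x)

print("sample:", repr(x))
print("gap to neighbouring float:", gap)
```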

    Are you familiar with the typical algorithms for generating pseudo-random samples?
     
  5. Jul 24, 2014 #4

    FactChecker


    If you can integrate the PDF from a to b, then you can use the PDF to calculate P(a < X < b). In practice, the integration can be very difficult. The CDF has already done the integration from -∞ up to any point, so P(a < X < b) = CDF(b) - CDF(a). There are very accurate tables for the CDFs of most of the common distributions.
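
    As a rough sketch of both routes in Python, using `statistics.NormalDist` from the standard library (3.8+): the mean of 175 cm and standard deviation of 7 cm are invented for illustration, and `midpoint_integral` is a hypothetical helper, not a library function. The interval is the 182.5 to 183.5 cm example from earlier in the thread.

```python
from statistics import NormalDist

# Hypothetical height distribution in cm; these parameters are assumptions.
heights = NormalDist(mu=175.0, sigma=7.0)

a, b = 182.5, 183.5

# Route 1: difference of CDF values.
p_cdf = heights.cdf(b) - heights.cdf(a)

# Route 2: integrate the PDF numerically (midpoint rule).
def midpoint_integral(f, lo, hi, n=1000):
    """Approximate the integral of f over [lo, hi] with n midpoint slices."""
    h = (hi - lo) / n
    return h * sum(f(lo + (i + 0.5) * h) for i in range(n))

p_int = midpoint_integral(heights.pdf, a, b)

print("CDF(b) - CDF(a):", p_cdf)
print("integral of PDF:", p_int)
```

    The two numbers should agree to many decimal places; the CDF route just skips the integration work.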
     
  6. Jul 25, 2014 #5
    I think Simon's point was helpful in confirming and clearing up some of my confusion about density functions and how to interpret the value of a density function. The other part I needed to clear up was the generalization of Bayes' Theorem to continuous distributions. I went back and reviewed it and then applied it to my current problem, which, because of that generalization, does after all come down to calculating values directly from the PDF and summing them. I also needed to distinguish when it is safe to use the density value directly and when it doesn't make sense.
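
    For anyone finding this later, here is a minimal sketch of what I mean, assuming two classes with normal class-conditional densities. The class labels, priors, and parameters are all made up for illustration, and `posterior` is my own helper name. The density values can be used directly in Bayes' rule because the same small dx would multiply every term in the numerator and denominator and cancel:

```python
from statistics import NormalDist

def posterior(x, priors, likelihoods):
    """P(class | x) from priors P(class) and class-conditional densities f(x | class)."""
    # prior * density for each class; the common dx factor cancels in the ratio
    weighted = {k: priors[k] * likelihoods[k].pdf(x) for k in priors}
    evidence = sum(weighted.values())  # marginal density of x
    return {k: w / evidence for k, w in weighted.items()}

# Hypothetical two-class setup; all numbers are assumptions.
priors = {"A": 0.5, "B": 0.5}
likelihoods = {"A": NormalDist(0.0, 1.0), "B": NormalDist(2.0, 1.0)}

post = posterior(1.0, priors, likelihoods)
print(post)  # x = 1.0 is midway between the two means, so the classes tie

# A density on its own is not a probability: it can exceed 1.
narrow_peak = NormalDist(0.0, 0.1).pdf(0.0)
print(narrow_peak)
```

    So the value is safe to use inside a ratio like this, but not as a probability by itself.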
     