Calculating a probability given a point for a continuous distribution?

TheOldHag
I thought I understood all the theory quite well and sat down to begin coding, until I realized that when applying Bayes' rule with a normal distribution you can't simply plug a point into the normal density and take that value as a probability, since the pdf gives a density. How do you approach this from a numerical standpoint, or am I incorrect? My guess is that you need to leverage the cumulative distribution function instead and calculate the probability over some small delta around the point.

My broader problem is that, after having gone through a good part of probability and statistics and being able to wield the P notation quite deftly on paper, I'm finding that translating those theories into actual computation has its own challenges, and I'm wondering how this is generally approached, or whether I'm completely misunderstanding something. I'm fairly certain that P(x) doesn't mean evaluating the pdf f(x) and just taking that value.

Appreciate any guidance.
 
My guess is that you need to leverage the cumulative distribution function instead and calculate the probability over some small delta around the point.
Pretty much, except you don't need the cumulative probability.

If X ~ N, then P(X = x) = 0.
There is no such thing as an arbitrarily precise measurement.
When we say that someone is 183cm tall, we really mean they are between 182.5cm and 183.5cm or something like that. So the probability of someone being 183cm tall, in the same sense, is actually P(182.5<x<183.5).
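In code, a minimal sketch of that height example (assuming, purely for illustration, heights distributed N(175, 7²) in cm; those numbers are not from this thread):

```python
from scipy.stats import norm

mu, sigma = 175.0, 7.0                      # hypothetical mean/std of heights in cm
p = norm.cdf(183.5, mu, sigma) - norm.cdf(182.5, mu, sigma)
print(p)                                    # P(182.5 < X < 183.5): a genuine probability

# For a narrow interval this is close to density * width, which is why the pdf
# value still carries useful information even though it is not a probability:
print(norm.pdf(183.0, mu, sigma) * 1.0)
```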

I'm fairly certain that P(x) doesn't mean evaluating the pdf f(x) and just taking that value.
That is correct. P(x) has no formal meaning where a pdf is concerned. You can only compute P(a<x<b) ... i.e. you can only find a non-zero probability for a continuous random variable falling within a range of values.

In general we can say: $$\lim_{b-a\rightarrow 0}P(a<x<b) = 0$$

You can work it out for yourself: $$P(X=a)=\lim_{\epsilon\rightarrow 0}P(a-\epsilon<X<a+\epsilon)=\lim_{\epsilon\rightarrow 0}\int_{a-\epsilon}^{a+\epsilon} p(x)\; dx=\int_a^a p(x)\; dx = 0$$
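For a quick numerical check of that limit (taking a standard normal and a = 1 purely as illustrative choices):

```python
from scipy.stats import norm

a = 1.0                                            # arbitrary point
for eps in (1.0, 0.1, 0.01, 0.001):
    prob = norm.cdf(a + eps) - norm.cdf(a - eps)   # P(a - eps < X < a + eps)
    print(eps, prob, 2 * eps * norm.pdf(a))        # probability ~ 2*eps*f(a), shrinking to 0
```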
 
Adding to what Simon Bridge said, if you are trying to generate samples from a normal distribution then your computer program will produce a specific number - an "exact" real number one might say. But this type of sampling is only an approximation of the idea of sampling from a continuous distribution. The program is actually picking a number from a finite population of numbers - those that it can represent.

Are you familiar with the typical algorithms for generating pseudo-random samples?
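For instance, one common approach (by no means the only one) is the Box-Muller transform, which builds approximately normal samples from uniform pseudo-random draws; the sketch below is illustrative rather than a production implementation:

```python
import math
import random

def box_muller(mu=0.0, sigma=1.0):
    """One N(mu, sigma^2) sample built from two U(0, 1) pseudo-random draws."""
    u1 = 1.0 - random.random()               # shift into (0, 1] so log(u1) is defined
    u2 = random.random()
    z = math.sqrt(-2.0 * math.log(u1)) * math.cos(2.0 * math.pi * u2)
    return mu + sigma * z

samples = [box_muller() for _ in range(5)]
print(samples)    # each sample is a specific, finitely representable float
```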
 
If you can integrate the PDF from a to b, then you can use the PDF to calculate P(a < X < b). In practice, the integration can be very difficult. The CDF has already done the integration from -∞ up to any point, so P(a < X < b) = CDF(b) - CDF(a). There are very accurate tables for the CDFs of most common distributions.
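For example, a short sketch comparing the two routes for a standard normal (the interval endpoints here are arbitrary):

```python
from scipy.integrate import quad
from scipy.stats import norm

a, b = 0.5, 2.0                              # arbitrary interval
via_pdf, _ = quad(norm.pdf, a, b)            # integrate the PDF numerically
via_cdf = norm.cdf(b) - norm.cdf(a)          # the CDF has the integral built in
print(via_pdf, via_cdf)                      # the two agree to high precision
```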
 
I think Simon's point helped confirm and clear up some of my confusion about density functions and how to interpret the value of a density function. The other part I needed to clear up was to go back and review the generalization of Bayes' theorem to continuous distributions, and then apply that to my current problem, which does after all involve calculating and summing values taken directly from the PDF because of that generalization. I also needed to distinguish when it is safe to use the density value directly and when it doesn't make sense.
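For anyone finding this later, here is a hedged sketch of that last point: in Bayes' theorem with a continuous likelihood, the pdf values f(x | theta) can be used directly because the normalization over hypotheses cancels the "density, not probability" issue. All the numbers below (the observation, candidate means, and prior) are invented for illustration:

```python
from scipy.stats import norm

x_obs = 1.2                                  # a single hypothetical observation
thetas = [0.0, 1.0, 2.0]                     # candidate means (the hypotheses)
prior = [1/3, 1/3, 1/3]                      # flat prior over the hypotheses

# The likelihood uses the density value f(x_obs | theta); that is fine here
# because the common "dx" factor cancels when we normalize over hypotheses.
unnorm = [p * norm.pdf(x_obs, th, 1.0) for p, th in zip(prior, thetas)]
posterior = [u / sum(unnorm) for u in unnorm]
print(posterior)                             # sums to 1: a genuine probability distribution
```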
 