Continuous random variable: Zero probability

  • B
  • Thread starter Biker
  • #1
416
51

Main Question or Discussion Point

I just have a couple of questions about how it can be zero probability.

Suppose you have a continuous cumulative distribution function whose derivative is nonzero at every point. This means that every point has a different value from every other point, which means that every point contributes to the probability.
Now I know you can't assign a finite positive value to each point, because the total would go to infinity, and you can't assign zero, because that would certainly mean that the derivative is zero.
However, they use: zero almost surely...
Which means that an event can happen even if it has zero probability, which is fine, but why not say that the probability is an infinitesimal (a hyperreal, is that possible?) and keep zero for events that are impossible?

Is it zero just to stay within the real numbers, or is it exactly zero?

Of course the area under the curve of a probability density function at a single point is zero, but that doesn't mean that it has probability zero.
 

Answers and Replies

  • #2
FactChecker
Science Advisor
Gold Member
5,581
2,059
The probability of a continuous distribution having an exact pre-specified value (like 2.100000000...) is zero. The only thing with a positive probability is a range (e.g. from 2.0 to 2.02) or a large set of values (e.g. irrational numbers). Of course, any particular result DOES have an exact value but that exact value had 0 probability and will not happen again.
 
  • #3
416
51
The probability of a continuous distribution having an exact pre-specified value (like 2.100000000...) is zero. The only thing with a positive probability is a range (e.g. from 2.0 to 2.02) or a large set of values (e.g. irrational numbers). Of course, any particular result DOES have an exact value but that exact value had 0 probability and will not happen again.
Sorry I didn't understand the last part.

My point is: if the probability of every exact value is zero and the cumulative probability is continuous on R, then how can any point differ from another?
How can I interpret the curve?
 
  • #4
34,370
10,445
then how can any point differ from another?
They don't. Every point has the same probability.
 
  • #5
jbriggs444
Science Advisor
Homework Helper
2019 Award
8,767
3,528
As I understand the concern, it is that the probability of obtaining any specific result value is zero. So one would expect the value of the cumulative probability immediately before that value to be the same as the value of the cumulative probability immediately after that value. And so on, so that one would expect the CDF to necessarily be a constant function.

One difficulty with that reasoning is that there is no such thing as a point either immediately before or immediately after another on the real line. For every pair of distinct points, there is a non-zero interval between them. The probability density function integrated over a non-zero interval can have a strictly positive result -- the probability of obtaining a result in that interval can be non-zero.

Another difficulty is with the notion of transitivity being applied over an uncountable set. Ordinary mathematical induction cannot extend it that far.
 
  • #6
FactChecker
Science Advisor
Gold Member
5,581
2,059
Sorry I didn't understand the last part.

My point is: if the probability of every exact value is zero and the cumulative probability is continuous on R, then how can any point differ from another?
How can I interpret the curve?
A continuous PDF does not directly give a "probability" for a single value. You are trying to use the PDF in a way that it cannot be used. For one thing, the PDF at a point can be much greater than 1, so clearly it is not a probability. Probabilities are only defined as the integral of the PDF over a measurable set of values. You can get the probability of a single point as the limit of integrals over intervals that narrow down to that point, but that is not the same as the value of the PDF at that point. No matter how large the PDF is at a point, when the interval narrows down to that point the integral goes to 0.
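As a rough numerical illustration of this point (a sketch added here, not part of the original reply): assume a normal distribution with mean 0 and standard deviation 0.1, chosen so that the PDF at 0 is well above 1. The helper names `normal_pdf`, `normal_cdf` and `interval_probability` are made up for this example. The probability of landing within `eps` of 0 shrinks toward 0 as `eps` shrinks, while that probability divided by the interval width approaches the PDF value.

```python
import math

SIGMA = 0.1  # assumed standard deviation, chosen so the PDF peak exceeds 1


def normal_pdf(x, mu=0.0, sigma=SIGMA):
    """Density of a normal distribution at x."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


def normal_cdf(x, mu=0.0, sigma=SIGMA):
    """Cumulative distribution function of a normal distribution at x."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))


def interval_probability(x0, eps):
    """P(x0 - eps <= X <= x0 + eps), computed as a difference of CDF values."""
    return normal_cdf(x0 + eps) - normal_cdf(x0 - eps)


if __name__ == "__main__":
    x0 = 0.0
    print("PDF at x0:", normal_pdf(x0))
    for eps in (1e-1, 1e-2, 1e-3, 1e-4):
        p = interval_probability(x0, eps)
        # p is a probability (always <= 1); p / (2 * eps) is a density estimate.
        print(f"eps={eps:g}  P={p:.6g}  P/(2*eps)={p / (2 * eps):.6g}")
```

The PDF value is a rate (probability per unit length), which is why it can exceed 1 while every interval probability stays between 0 and 1.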
 
  • #7
FactChecker
Science Advisor
Gold Member
5,581
2,059
Of course the area under the curve of a probability density function at a single point is zero, but that doesn't mean that it has probability zero.
Yes it does. Suppose I tell you that I did an experiment with a continuous PDF and got EXACTLY X=2.11111111111111... How likely would you say that was? An infinite number of '1's? I would say that the likelihood was 0. Of course, the result of the experiment was some EXACT number, so things do happen all the time where the pre-experiment likelihood was 0.
 
  • #8
416
51
As I understand the concern, it is that the probability of obtaining any specific result value is zero. So one would expect the value of the cumulative probability immediately before that value to be the same as the value of the cumulative probability immediately after that value. And so on, so that one would expect the CDF to necessarily be a constant function.

One difficulty with that reasoning is that there is no such thing as a point either immediately before or immediately after another on the real line. For every pair of distinct points, there is a non-zero interval between them. The probability density function integrated over a non-zero interval can have a strictly positive result -- the probability of obtaining a result in that interval can be non-zero.

Another difficulty is with the notion of transitivity being applied over an uncountable set. Ordinary mathematical induction cannot extend it that far.
That is exactly what I meant. If you choose a particular value, then every point after it should be the same. What led me to this was that I was trying to make a CDF. You take a bunch of data and then approximate it, but approximating it with a continuous function strictly says (if the derivative is nonzero) that every point has a different value, which produces this contradiction. How can I correctly interpret this?

It is similar to the problem of a line being made out of zero-width points.

mfb, could you please elaborate?

And thank you, FactChecker.
 
  • #9
FactChecker
Science Advisor
Gold Member
5,581
2,059
You need to be careful here. I think you are implying that a zero probability of X=x0 means that the PDF is 0 at x0. That is not true. The PDF is NOT a probability. The PDF is the slope of the CDF. If the PDF, f(x) at the point x0 is zero, then the slope of the CDF at that point is zero. For a continuous random variable, the probability of any exact single value is zero no matter what the value of the PDF is.
 
  • #10
416
51
You need to be careful here. I think you are implying that a zero probability of X=x0 means that the PDF is 0 at x0. That is not true. The PDF is NOT a probability. The PDF is the slope of the CDF. If the PDF, f(x) at the point x0 is zero, then the slope of the CDF at that point is zero. For a continuous random variable, the probability of any exact single value is zero no matter what the value of the PDF is.
I am not. I know what each one represents.
I am talking about the cumulative distribution as if it were a sum of points, because it is continuous everywhere. If you somehow say that the probability of an exact value is zero, then it follows that the slope of the CDF must be equal to zero. But no, any interval has some probability. So the probability must be neither zero nor finite. Jbriggs explained what I meant above.
It is a matter of how I can interpret a continuous CDF while the probability of an exact value is zero.
The same thing can be applied to a line: say you take out a point, does that make a difference to the length?
The whole problem is about how a sum of zeros can result in a finite number.

And I want to know how you can interpret a continuous CDF.

PS: I know I can't talk about points in a continuous distribution, but it just seems like a contradiction that you somehow have a continuous CDF but zero probability for each value.
 
  • #11
FactChecker
Science Advisor
Gold Member
5,581
2,059
I am not. I know what each one represents.
I am talking about the cumulative distribution as if it were a sum of points, because it is continuous everywhere. If you somehow say that the probability of an exact value is zero, then it follows that the slope of the CDF must be equal to zero.
Not true. The slope does not have to be 0. That is all I can say.
But no, any interval has some probability. So the probability must be neither zero nor finite. Jbriggs explained what I meant above.
It is a matter of how I can interpret a continuous CDF while the probability of an exact value is zero.
The same thing can be applied to a line: say you take out a point, does that make a difference to the length?
The whole problem is about how a sum of zeros can result in a finite number.

And I want to know how you can interpret a continuous CDF.
The interpretation of a continuous CDF is P( X∈[x1, x2] ) = CDF(x2) - CDF(x1) for x1 ≤ x2. So P( X=x1) = P( X∈[x1, x1] ) = CDF(x1) - CDF(x1) = 0, for any continuous CDF with any slope.
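A minimal sketch of this interpretation, applied to the "build a CDF from a bunch of data" situation raised earlier in the thread. The assumptions here are not from the thread: the data are simulated from an exponential distribution with rate 1, and `exact_cdf`, `empirical_cdf` and `interval_probability` are hypothetical helper names. Interval probabilities come out as differences of CDF values, and a zero-width interval gets probability 0 from the continuous CDF.

```python
import bisect
import math
import random


def exact_cdf(x, rate=1.0):
    """CDF of an exponential distribution with the given rate."""
    return 1.0 - math.exp(-rate * x) if x > 0 else 0.0


def empirical_cdf(sorted_data, x):
    """Fraction of observations that are <= x (data must be sorted)."""
    return bisect.bisect_right(sorted_data, x) / len(sorted_data)


def interval_probability(cdf, x1, x2):
    """P(x1 < X <= x2) as a difference of CDF values, for x1 <= x2."""
    return cdf(x2) - cdf(x1)


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so the run is repeatable
    data = sorted(rng.expovariate(1.0) for _ in range(100_000))

    x1, x2 = 0.5, 1.5
    exact = interval_probability(exact_cdf, x1, x2)
    approx = interval_probability(lambda x: empirical_cdf(data, x), x1, x2)
    print("exact    P(0.5 < X <= 1.5):", exact)
    print("sampled  P(0.5 < X <= 1.5):", approx)
    print("exact    P(X = 0.5):", interval_probability(exact_cdf, x1, x1))
```

For a continuous distribution it makes no difference whether the endpoints are included, which is another way of saying that single points carry no probability.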
 
  • #12
Stephen Tashi
Science Advisor
7,162
1,314
I just have a couple of questions about how it can be zero probability.
It should be emphasized that your questions concern the application of probability theory, not the mathematical theory of probability, because you are asking about physical events - whether something actually happens. Mathematical probability theory does not define whether an event that has been assigned a probability will (or can) actually happen. It does not even have an axiom that says it is possible to take random samples from a distribution - in the sense of forcing a random variable to "actually" take on a particular value. The assumption that we can do random sampling is an assumption that is made when applying probability theory to particular problems.

Mathematical probability theory only assumes that there is a "measure space" in which events can be assigned probabilities. It doesn't comment on whether these events actually happen.

For example, if you are measuring people's weights, you may choose to assume that you are sampling from a continuous distribution such as a lognormal distribution. You will only be able to measure a person's weight to a finite precision, so you can't experimentally prove or disprove that people's weights have a continuous distribution. If you assume people's weights have a lognormal distribution and treat your measurement of someone's weight as being exactly the correct weight, then you do have the awkward situation that an event with zero probability has happened. However, that awkwardness can't be resolved by mathematical probability theory - the awkwardness involves the desire of people who apply probability theory to interpret "zero probability" as meaning "can't actually happen".

The mathematical problem of assigning continuous probabilities is similar to the problem of assigning a mass to part of an object that has a continuously varying density. We use a mass density function to describe the varying density in the object, but "at a point" in the object, there is no mass. So we may have a mass density of 120 lbs per cubic inch at a point, but we don't say the point itself has any mass.

People working problems about mass densities don't face the task of taking a sample of an object consisting of a mathematical point and putting it on a slide to examine under a microscope. In contrast, people who apply probability theory often use language that suggests they (or Nature) are accomplishing the feat of having an event with zero probability actually happen. Some people may take that idea seriously and others may regard it as merely a convenient fiction that approximates the way things work. Mathematical probability theory doesn't comment on the issue.
 
  • #13
StoneTemplePython
Science Advisor
Gold Member
2019 Award
1,156
561
However, they use: zero almost surely...
Which means that an event can happen even if it has zero probability, which is fine, but why not say that the probability is an infinitesimal (a hyperreal, is that possible?) and keep zero for events that are impossible?

Is it zero just to stay within the real numbers, or is it exactly zero?

Of course the area under the curve of a probability density function at a single point is zero, but that doesn't mean that it has probability zero.
To answer your question -- modern probability theory comes from Kolmogorov, using measure theory. There are other rigorous formulations -- e.g. Nelson's Radically Elementary Probability Theory which does use nonstandard analysis -- the preface humorously suggests that even high school students can understand the book (it is not an easy read). You can read it here: https://web.math.princeton.edu/~nelson/books/rept.pdf

For the most part you will not see people use infinitesimals in standard probability theory -- except when they 'tap out' and feel that it is needed to convey what they want to say. (In particular I've seen a lot of people start talking about infinitesimal generators of continuous Markov chains, but otherwise they won't use infinitesimals.)

Once you get used to the idea that zero probability events can happen (but not zero density ones), I suspect you'll be ok with the terminology. It took me a while.
- - - -
Sometimes coming at this from a different angle is illuminating. Here's a very relevant problem from Fifty Challenging Problems in Probability

>> What is the probability that the quadratic equation: ##x^2 + 2bx + c = 0## has real roots?

Note that your domain is the reals: b and c are independently sampled from ##(-\infty,\infty)##, though it might be helpful to think of them coming from ##(-3, 3)## or ##(-n, n)## and to then consider the limiting behavior.

I used to not like this "zero probability but still possible" stuff -- but eventually I got over it, and this problem helped that process move along.
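For readers who want to experiment with this problem, here is a small Monte Carlo sketch. It assumes b and c are drawn independently and uniformly from ##(-n, n)##, as the hint suggests; the function names `real_root_fraction` and `exact_fraction` are made up for this example, and the closed form in `exact_fraction` is an area calculation added here for comparison, not something stated in the thread. The equation ##x^2 + 2bx + c = 0## has real roots exactly when ##b^2 \ge c##, so the code counts how often that holds.

```python
import math
import random


def real_root_fraction(n, trials=200_000, seed=0):
    """Estimate P(x^2 + 2bx + c = 0 has real roots) for b, c ~ Uniform(-n, n)."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        b = rng.uniform(-n, n)
        c = rng.uniform(-n, n)
        # The discriminant is 4b^2 - 4c, so real roots require b^2 >= c.
        if b * b >= c:
            hits += 1
    return hits / trials


def exact_fraction(n):
    """Area-based value, valid for n >= 1: 1 - 1 / (3 * sqrt(n))."""
    return 1.0 - 1.0 / (3.0 * math.sqrt(n))


if __name__ == "__main__":
    for n in (1, 9, 100, 10_000):
        print(n, real_root_fraction(n), exact_fraction(n))
```

Under these assumptions the fraction creeps toward 1 as n grows, even though pairs with ##c > b^2## remain possible for every n. That is the sense in which the complementary event ends up with limiting probability zero without being impossible.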
 
  • #14
mathman
Science Advisor
7,822
433
One idea, apart from probability, which might clarify things. The unit interval has length 1. Each point on the interval has length 0. Add the points together to get the unit interval, but you cannot add up lengths.
 
  • #15
Svein
Science Advisor
Insights Author
2,080
669
One idea, apart from probability, which might clarify things. The unit interval has length 1. Each point on the interval has length 0. Add the points together to get the unit interval, but you cannot add up lengths.
And then we end up in measure theory...
 
  • #16
mathman
Science Advisor
7,822
433
The foundation of measure theory and the foundation of probability are quite similar.
 
  • #17
3,379
943
If the probability of something is less than the probability of the Universe existing, that will do for me as a definition of 'Zero'.
 
  • #18
PeroK
Science Advisor
Homework Helper
Insights Author
Gold Member
13,407
5,929
- - - -
Sometimes coming at this from a different angle is illuminating. Here's a very relevant problem from Fifty Challenging Problems in Probability

>> What is the probability that the quadratic equation: ##x^2 + 2bx + c = 0## has real roots?

Note that your domain is the reals: b and c are independently sampled from ##(-\infty,\infty)##, though it might be helpful to think of them coming from ##(-3, 3)## or ##(-n, n)## and to then consider the limiting behavior.

I used to not like this "zero probability but still possible" stuff -- but eventually I got over it, and this problem helped that process move along.
I guess your argument is that if I am asked to produce a quadratic equation of that form and I go for, say, ##b = 5, c=1##, i.e.:

##x^2 + 10x + 1##

Then, as this quadratic has real roots, something with 0 probability has actually happened?

Not to mention that my picking two integers for the coefficients also had 0 probability!
 
  • #19
chiro
Science Advisor
4,790
132
The probability is zero because it is meant to be impossible for the event to occur unless you have infinitely many tries of a continuous stochastic process.

A continuous stochastic process has infinitely many values in its state space and you will never actually realize that state with a normal continuous PDF stochastic process [like with a Normal distribution or some other analytic PDF].

However - you can have processes that have many values that can be realized and the way that this is studied is through what is called pure probability.

If you want to understand this more, then you will have to look at graduate statistics which includes measure theory, analysis, and probability related subjects like sigma algebras and stochastic processes.

I'd wait until you get to that if you are an undergraduate, but those subjects will help answer the questions in your original post.
 
  • #20
FactChecker
Science Advisor
Gold Member
5,581
2,059
Throw a dart at a number line to get a number between 0 and 1. The PDF is 1 on [0,1] and 0 otherwise. The CDF is y=x, 0≤x≤1.
You know that some number must occur from your experiment. Suppose the number is 0.1592653589793238462643383279502884197169399375105820974944592307816406286 208998628034825342117067982148086513282306647093844609550582231725359408128481 117450284102701938521105559644622948954930381964428810975665933446128475648233 786783165271201909145648566923460348610454326648213393607260249141273724587006 606315588174881520920962829254091715364367892590360011330530548820466521384146 951941511609433057270365759591953092186117381932611793105118548074462379962749 567351885752724891227938183011949129833673362440656643086021394946395224737190 702179860943702770539217176293176752384674818467669405132000568127145263560827 785771342757789609173637178721468440901224953430146549585371050792279689258923 542019956112129021960864034418159813629774771309960518707211349999998372978049 951059731732816096318595024459455346908302642522308253344685035261931188171010 003137838752886587533208381420617177669147303598253490428755468731159562863882 353787593751957781857780532171226806613001927876611195909216420198938095257201 065485863278865936153381827968230301952035301852968995773622599413891249721775 283479131515574857242454150695950829533116861727855889075098381754637464939319 255060400927701671139009848824012858361603563707660104710181942955596198946767 837449448255379774726847104047534646208046684259069491293313677028989152104752 162056966024058038150193511253382430035587640247496473263914199272604269922796....

I can add another thousand digits, and millions after that.
What were the odds of that number being the result? Some number must occur, but the pre-experiment probability of that exact number occurring was 0. And that exact number will never happen again in a billion billion tries.
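The dart experiment can be imitated on a computer, with a caveat: `random.random()` returns one of finitely many floating-point values in [0, 1), so this only approximates a continuous uniform distribution. In the sketch below, the target value 0.5 and the interval [0.2, 0.3] are arbitrary choices for illustration, and `throw_darts` is a made-up name. In a run of this size one would expect the interval to collect roughly 10% of the throws and the exact target to collect essentially none.

```python
import random


def throw_darts(trials, target, low, high, seed=0):
    """Count throws that equal `target` exactly and throws inside [low, high]."""
    rng = random.Random(seed)
    exact_hits = 0
    interval_hits = 0
    for _ in range(trials):
        x = rng.random()  # pseudo-random float in [0, 1)
        if x == target:
            exact_hits += 1
        if low <= x <= high:
            interval_hits += 1
    return exact_hits, interval_hits


if __name__ == "__main__":
    trials = 1_000_000
    exact_hits, interval_hits = throw_darts(trials, target=0.5, low=0.2, high=0.3)
    print("throws equal to 0.5 exactly:", exact_hits)
    print("fraction in [0.2, 0.3]:", interval_hits / trials)
```

The interval count scales with the length of the interval, which is what the PDF of 1 on [0, 1] encodes; the exact-match count does not scale with anything.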
 
  • #21
PeroK
Science Advisor
Homework Helper
Insights Author
Gold Member
13,407
5,929
I can add another thousand digits, and millions after that.
What were the odds of that number being the result? Some number must occur, but the pre-experiment probability of that exact number occurring was 0. And that exact number will never happen again in a billion billion tries.
In any such experiment, only a finite number of results is possible.
 
  • #22
FactChecker
Science Advisor
Gold Member
5,581
2,059
In any such experiment, only a finite number of results is possible.
Really? What is that finite number?
EDIT: I am trying to illustrate a mathematical concept with an easily understood physical example. Some abstraction must be applied.
 
  • #23
PeroK
Science Advisor
Homework Helper
Insights Author
Gold Member
13,407
5,929
Really? What is that finite number?
It depends on the experiment.

Given any number, you could devise an experiment with more than that number of possible results. But, the number of results of any experiment is finite.
 
  • #24
jbriggs444
Science Advisor
Homework Helper
2019 Award
8,767
3,528
I guess your argument is that if I am asked to produce a quadratic equation of that form and I go for, say, ##b = 5, c=1##, i.e.:

##x^2 + 10x + 1##

Then, as this quadratic has real roots, something with 0 probability has actually happened?
I believe that you are misinterpreting the idea being presented. We are asked to pick a, b and c from a uniform distribution over an interval centered on zero. We are asked to assess the a priori probability that the resulting polynomial has real roots. That is to say, we are asked for the probability that the discriminant, ##b^2 - 4ac## will be greater than zero.

In order to avoid concerns with the impossibility of a uniform distribution over the real numbers we are asked to take the limit of this probability as the length of the interval increases without bound.

A key observation is that the computed probability is independent of the size of the interval. So evaluating the limit is trivial. Just evaluate the result for an interval of one's choosing. It is clear that the probability is neither zero nor one. [And, accordingly, has precious little to do with the subject matter of this thread].

Not to mention that my picking two integers for the coefficients also had 0 probability!
It is fairly clear that you did not pick those coefficients at random using a continuous PDF.
 
  • #25
FactChecker
Science Advisor
Gold Member
5,581
2,059
It depends on the experiment.

Given any number, you could devise an experiment with more than that number of possible results. But, the number of results of any experiment is finite.
Maybe in the physical world. But the mathematical concepts are not limited to that. Some abstract thought must be applied to answer the OP.
 
