Continuous random variable: Zero probability

Biker · Apr 12, 2017

I just have a couple of questions about how it can be zero probability.

Suppose you have a continuous cumulative probability distribution whose derivative is nonzero at every point. This means that every point has a different value from the others, which means that every point contributes to the probability.
Now I know you can't assign a finite positive value to each point, because the total would go to infinity, and you can't assign zero, because that would surely mean that the derivative is zero.
However, they use: Zero almost surely...
Which means that an event can happen even if it has zero probability, which is fine, but why not say that it is an infinitesimal (a hyperreal, is that possible?) and keep the notion of zero for events that are impossible?

Is it just zero to keep it in the real numbers or is it exactly zero?

Of course the area under the curve of a probability density function at a single point is zero, but that doesn't mean that it has probability zero.

FactChecker · Apr 12, 2017

The probability of a continuous distribution having an exact pre-specified value (like 2.100000000...) is zero. The only thing with a positive probability is a range (e.g. from 2.0 to 2.02) or a large set of values (e.g. irrational numbers). Of course, any particular result DOES have an exact value but that exact value had 0 probability and will not happen again.

Biker · Apr 13, 2017

FactChecker said:

The probability of a continuous distribution having an exact pre-specified value (like 2.100000000...) is zero. The only thing with a positive probability is a range (e.g. from 2.0 to 2.02) or a large set of values (e.g. irrational numbers). Of course, any particular result DOES have an exact value but that exact value had 0 probability and will not happen again.

Sorry I didn't understand the last part.

It is that if every probability at an exact value is zero and that the cumulative probability is continuous on R then how can any point differ than another?
How can I interpret the curve?

mfb · Apr 13, 2017

Biker said:

then how can any point differ than another?

They don't. Every point has the same probability.

jbriggs444 · Apr 13, 2017

As I understand the concern, it is that the probability of obtaining any specific result value is zero. So one would expect the value of the cumulative probability immediately before that value to be the same as the value of the cumulative probability immediately after that value. And so on -- so that one would expect the CDF to necessarily be a constant function.

One difficulty with that reasoning is that there is no such thing as a point either immediately before or immediately after another on the real line. For every pair of distinct points, there is a non-zero interval between them. The probability density function integrated over a non-zero interval can have a strictly positive result -- the probability of obtaining a result in that interval can be non-zero.

Another difficulty is with the notion of transitivity being applied over an uncountable set. Ordinary mathematical induction cannot extend it that far.

FactChecker · Apr 13, 2017

Biker said:

Sorry I didn't understand the last part.

It is that if every probability at an exact value is zero and that the cumulative probability is continuous on R then how can any point differ than another?
How can I interpret the curve?

A continuous PDF does not directly give a "probability" for a single value. You are trying to use the PDF in a way that it can not be used. For one thing, the PDF at a point can be much greater than 1, so clearly it is not a probability. The probabilities are only defined as the integral of the PDF over a measurable set of values. You can get the probability as the limit of integrals that narrow down to a single point, but that is not the same as the value of the PDF at that point. No matter how large the PDF is at a point, when the integral narrows down to that point the integral goes to 0.
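A minimal sketch of this point, assuming a normal distribution with mean 0 and standard deviation 0.1 (the distribution and the function names are chosen only for illustration). The density at the mean is greater than 1, yet the probability of landing within eps of the mean shrinks toward 0 as eps shrinks.

```python
import math

def normal_pdf(x, mu=0.0, sigma=0.1):
    """Density of a normal distribution; this is NOT a probability."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))

def normal_cdf(x, mu=0.0, sigma=0.1):
    """CDF of a normal distribution, written with the error function."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))

def prob_within(x0, eps, mu=0.0, sigma=0.1):
    """P(x0 - eps <= X <= x0 + eps): the integral of the PDF over the interval."""
    return normal_cdf(x0 + eps, mu, sigma) - normal_cdf(x0 - eps, mu, sigma)

# The density at the mean exceeds 1, so it cannot be a probability.
print("pdf at 0:", normal_pdf(0.0))

# Narrowing the interval around 0 drives the probability toward 0.
for eps in (1e-1, 1e-2, 1e-3, 1e-4, 0.0):
    print("eps =", eps, " P(|X| <= eps) =", prob_within(0.0, eps))
```

The last iteration uses eps = 0, which is the single point itself, and the integral over it is exactly 0.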

FactChecker · Apr 13, 2017

Biker said:

Of course the area under the curve of a probability density function in a single point is zero that doesn't mean that it has probability zero

Yes it does. Suppose I tell you that I did an experiment with a continuous PDF and got EXACTLY X=2.11111111111111... How likely would you say that was? An infinite number of '1's? I would say that the likelihood was 0. Of course, the result of the experiment was some EXACT number, so things do happen all the time where the pre-experiment likelihood was 0.

Biker · Apr 13, 2017

jbriggs444 said:

As I understand the concern, it is that the probability of obtaining any specific result value is zero. So one would expect the value of the cumulative probability immediately before that value to be the same as the value of the cumulative probability immediately after that value. And so on -- so that one would expect the CDF to necessarily be a constant function.

One difficulty with that reasoning is that there is no such thing as a point either immediately before or immediately after another on the real line. For every pair of distinct points, there is a non-zero interval between them. The probability density function integrated over a non-zero interval can have a strictly positive result -- the probability of obtaining a result in that interval can be non-zero.

Another difficulty is with the notion of transitivity being applied over an uncountable set. Ordinary mathematical induction cannot extend it that far.

That is exactly what I meant. If you choose a particular value then every point after it should be the same. The thing that led me to this was that I was trying to make a CDF. You take a bunch of data and then approximate it, but approximating with a continuous function strictly says (if the derivative is nonzero) that every point has a different value, which creates this contradiction. How can I correctly interpret this?

It is similar to the problem of a line which is made out of zero-width points.

mfb, could you please elaborate?

And thank you, FactChecker.

FactChecker · Apr 13, 2017

You need to be careful here. I think you are implying that a zero probability of X=x₀ means that the PDF is 0 at x₀. That is not true. The PDF is NOT a probability. The PDF is the slope of the CDF. If the PDF, f(x) at the point x₀ is zero, then the slope of the CDF at that point is zero. For a continuous random variable, the probability of any exact single value is zero no matter what the value of the PDF is.

Biker · Apr 13, 2017

FactChecker said:

You need to be careful here. I think you are implying that a zero probability of X=x₀ means that the PDF is 0 at x₀. That is not true. The PDF is NOT a probability. The PDF is the slope of the CDF. If the PDF, f(x) at the point x₀ is zero, then the slope of the CDF at that point is zero. For a continuous random variable, the probability of any exact single value is zero no matter what the value of the PDF is.

I am not. I know what each one represents.
I am talking about the cumulative distribution as if it were a sum of points, because it is continuous everywhere. If you somehow say that the exact probability is zero, then it follows that the slope of the CDF must be equal to zero. But no, any interval has some probability. So the probability must be neither zero nor finite. Jbriggs explained what I meant above.
It is a matter of how I can interpret a continuous CDF while the probability of an exact value is zero.
The same thing can be applied to a line: say you take out a point, does that make a difference to the length?
The whole problem is about how a sum of zeros can result in a finite number.

And I want to know how you can interpret a continuous CDF.

PS: I know I can't talk about points in a continuous distribution, but it is just a contradiction that you somehow have a continuous CDF but zero probability for each value.

FactChecker · Apr 13, 2017

Biker said:

I am not. I know what each one represents.
I am talking about the cumulative distribution as if it were a sum of points, because it is continuous everywhere. If you somehow say that the exact probability is zero, then it follows that the slope of the CDF must be equal to zero.

Not true. The slope does not have to be 0. That is all I can say.

But no, any interval has some probability. So the probability must be neither zero nor finite. Jbriggs explained what I meant above.
It is a matter of how I can interpret a continuous CDF while the probability of an exact value is zero.
The same thing can be applied to a line: say you take out a point, does that make a difference to the length?
The whole problem is about how a sum of zeros can result in a finite number.

And I want to know how you can interpret a continuous CDF.

The interpretation of a continuous CDF is P( X∈[x1, x2] ) = CDF(x2) - CDF(x1) for x1 ≤ x2. So P( X=x1) = P( X∈[x1, x1] ) = CDF(x1) - CDF(x1) = 0, for any continuous CDF with any slope.
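As a rough illustration of that formula, the sketch below assumes an exponential distribution with rate 1, whose CDF is 1 - exp(-x) for x > 0 (the distribution and the helper names are assumptions made for the example). The slope of the CDF at x1 is estimated numerically and is clearly nonzero, while P(X = x1) computed from the formula is exactly 0.

```python
import math

def exp_cdf(x, rate=1.0):
    """CDF of an exponential distribution (an assumed example distribution)."""
    return 1.0 - math.exp(-rate * x) if x > 0 else 0.0

def prob_interval(cdf, x1, x2):
    """P(X in [x1, x2]) = CDF(x2) - CDF(x1) for x1 <= x2."""
    if x1 > x2:
        raise ValueError("need x1 <= x2")
    return cdf(x2) - cdf(x1)

def slope(cdf, x, h=1e-5):
    """Central-difference estimate of the CDF's slope (the PDF) at x."""
    return (cdf(x + h) - cdf(x - h)) / (2.0 * h)

x1 = 1.0
print("slope of CDF at x1:", slope(exp_cdf, x1))               # nonzero
print("P(X = x1):", prob_interval(exp_cdf, x1, x1))            # degenerate interval
print("P(1 <= X <= 2):", prob_interval(exp_cdf, 1.0, 2.0))     # positive
```

The degenerate interval [x1, x1] always gives CDF(x1) - CDF(x1), so the result does not depend on how steep the CDF is there.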

Stephen Tashi · Apr 13, 2017

Biker said:

I just have a couple of questions about how it can be zero probability.

It should be emphasized that your questions concern the application of probability theory, not the mathematical theory of probability, because you are asking about physical events - whether something actually happens. Mathematical probability theory does not define whether an event that has been assigned a probability will (or can) actually happen. It does not even have an axiom that says it is possible to take random samples from a distribution, in the sense of forcing a random variable to "actually" take on a particular value. The assumption that we can do random sampling is an assumption that is made when applying probability theory to particular problems.

Mathematical probability theory only assumes that there is a "measure space" in which events can be assigned probabilities. It doesn't comment on whether these events actually happen.

For example, if you are measuring people's weights, you may choose to assume that you are sampling from a continuous distribution such as a lognormal distribution. You will only be able to measure a person's weight to a finite precision, so you can't experimentally prove or disprove that people's weights have a continuous distribution. If you assume people's weights have a lognormal distribution and treat your measurement of someone's weight as being exactly the correct weight, then you do have the awkward situation that an event with zero probability has happened. However, that awkwardness can't be resolved by mathematical probability theory - the awkwardness involves the desire of people who apply probability theory to interpret "zero probability" as meaning "can't actually happen".

The mathematical problem of assigning continuous probabilities is similar to the problem of assigning a mass to part of an object that has a continuously varying density. We use a mass density function to describe the varying density in the object, but "at a point" in the object, there is no mass. So we may have a mass density of 120 lbs per cubic inch at a point, but we don't say the point itself has any mass.

People working problems about mass densities don't face the task of taking a sample of an object consisting of a mathematical point and putting it on a slide to examine under a microscope. In contrast, people who apply probability theory often use language that suggests they (or Nature) are accomplishing the feat of having an event with zero probability actually happen. Some people may take that idea seriously and others may regard it as merely a convenient fiction that approximates the way things work. Mathematical probability theory doesn't comment on the issue.

StoneTemplePython · Apr 13, 2017

Biker said:

However, They use: Zero almost surely...
Which means that an event can happen even if it has zero probability, which is fine, but why not say that it is an infinitesimal (a hyperreal, is that possible?) and keep the notion of zero for events that are impossible?

Is it just zero to keep it in the real numbers or is it exactly zero?

Of course the area under the curve of a probability density function in a single point is zero that doesn't mean that it has probability zero

To answer your question -- modern probability theory comes from Kolmogorov, using measure theory. There are other rigorous formulations -- e.g. Nelson's Radically Elementary Probability Theory which does use nonstandard analysis -- the preface humorously suggests that even high school students can understand the book (it is not an easy read). You can read it here: https://web.math.princeton.edu/~nelson/books/rept.pdf

For the most part you will not see people use infinitesimals in standard probability theory -- except when they 'tap out' and feel like it is needed to convey what they want to say. (In particular I've seen a lot of people start talking about infinitesimal generators of continuous Markov chains, but otherwise they won't use infinitesimals.)

Once you get used to the idea that zero probability events can happen (but not zero density ones), I suspect you'll be ok with the terminology. It took me a while.
- - - -
Sometimes coming at this from a different angle is illuminating. Here's a very relevant problem from Fifty Challenging Problems in Probability

>> What is the probability that the quadratic equation: ##x^2 + 2bx + c = 0## has real roots?

note your domain is reals, b and c are independently sampled from ##(-\infty,\infty)##, though it might be helpful to think of them coming from ##(-3, 3)## or ##(-n, n)## and to then consider the limiting behavior.

I used to not like this "zero probability but still possible" stuff -- but eventually I got over it, and this problem helped that process move along.

mathman · Apr 13, 2017

One idea, apart from probability, which might clarify things. The unit interval has length 1. Each point on the interval has length 0. Add the points together to get the unit interval, but you cannot add up lengths.

Svein · Apr 14, 2017

mathman said:

One idea, apart from probability, which might clarify things. The unit interval has length 1. Each point on the interval has length 0. Add the points together to get the unit interval, but you cannot add up lengths.

And then we end up in measure theory...

mathman · Apr 15, 2017

The foundation of measure theory and the foundation of probability are quite similar.

rootone · Apr 15, 2017

If the probability of something is less than the probability of the Universe existing, that will do for me as a definition of 'Zero'.

PeroK · Apr 16, 2017

StoneTemplePython said:

- - - -
Sometimes coming at this from a different angle is illuminating. Here's a very relevant problem from Fifty Challenging Problems in Probability

>> What is the probability that the quadratic equation: ##x^2 + 2bx + c = 0## has real roots?

note your domain is reals, b and c are independently sampled from ##(-\infty,\infty)##, though it might be helpful to think of them coming from ##(-3, 3)## or ##(-n, n)## and to then consider the limiting behavior.

I used to not like this "zero probability but still possible" stuff -- but eventually I got over it, and this problem helped that process move along.

I guess your argument is that if I am asked to produce a quadratic equation of that form and I go for, say, ##b = 5, c=1##, i.e.:

##x^2 + 10x + 1##

Then, as this quadratic has real roots, something with 0 probability has actually happened?

Not to mention that my picking two integers for the coefficients also had 0 probability!

chiro · Apr 16, 2017

The probability is zero because it is meant to be impossible for an event to occur unless you have infinitely many tries of a continuous stochastic process.

A continuous stochastic process has infinitely many values in its state space and you will never actually realize that state with a normal continuous PDF stochastic process [like with a Normal distribution or some other analytic PDF].

However - you can have processes that have many values that can be realized and the way that this is studied is through what is called pure probability.

If you want to understand this more, then you will have to look at graduate statistics which includes measure theory, analysis, and probability related subjects like sigma algebras and stochastic processes.

I'd wait until you get to that if you are in undergraduate, but those subjects will help answer your questions in regard to your original post.

FactChecker · Apr 16, 2017

Throw a dart at a number line to get a number between 0 and 1. The PDF is 1 on [0,1] and 0 otherwise. The CDF is y=x, 0≤x≤1.
You know that some number must occur from your experiment. Suppose the number is 0.1592653589793238462643383279502884197169399375105820974944592307816406286 208998628034825342117067982148086513282306647093844609550582231725359408128481 117450284102701938521105559644622948954930381964428810975665933446128475648233 786783165271201909145648566923460348610454326648213393607260249141273724587006 606315588174881520920962829254091715364367892590360011330530548820466521384146 951941511609433057270365759591953092186117381932611793105118548074462379962749 567351885752724891227938183011949129833673362440656643086021394946395224737190 702179860943702770539217176293176752384674818467669405132000568127145263560827 785771342757789609173637178721468440901224953430146549585371050792279689258923 542019956112129021960864034418159813629774771309960518707211349999998372978049 951059731732816096318595024459455346908302642522308253344685035261931188171010 003137838752886587533208381420617177669147303598253490428755468731159562863882 353787593751957781857780532171226806613001927876611195909216420198938095257201 065485863278865936153381827968230301952035301852968995773622599413891249721775 283479131515574857242454150695950829533116861727855889075098381754637464939319 255060400927701671139009848824012858361603563707660104710181942955596198946767 837449448255379774726847104047534646208046684259069491293313677028989152104752 162056966024058038150193511253382430035587640247496473263914199272604269922796...

I can add another thousand digits, and millions after that.
What were the odds of that number being the result? Some number must occur, but the pre-experiment probability of that exact number occurring was 0. And that exact number will never happen again in a billion billion tries.
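A rough simulation of the dart experiment, under the loud assumption that Python's pseudo-random floats stand in for real numbers (they are in fact a finite set, so this only approximates the continuous model). The target value and the interval [0.15, 0.17] are arbitrary choices for the example. The exact pre-specified value is not hit, while the interval is hit with a frequency close to its length, 0.02.

```python
import random

def simulate_darts(n_throws, target, lo, hi, seed=0):
    """Throw n_throws uniform darts at [0, 1).

    Returns (number of throws equal to target, fraction of throws in [lo, hi]).
    """
    rng = random.Random(seed)  # seeded so the run is reproducible
    exact_hits = 0
    interval_hits = 0
    for _ in range(n_throws):
        x = rng.random()
        if x == target:
            exact_hits += 1
        if lo <= x <= hi:
            interval_hits += 1
    return exact_hits, interval_hits / n_throws

# Target: the first digits of the number quoted above, truncated to a float.
exact, frac = simulate_darts(200_000, 0.1592653589793238, 0.15, 0.17)
print("exact hits:", exact)
print("fraction in [0.15, 0.17]:", frac)
```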

PeroK · Apr 16, 2017

FactChecker said:

I can add another thousand digits, and millions after that.
What were the odds of that number being the result? Some number must occur, but the pre-experiment probability of that exact number occurring was 0. And that exact number will never happen again in a billion billion tries.

In any such experiment, only a finite number of results is possible.

FactChecker · Apr 16, 2017

PeroK said:

In any such experiment, only a finite number of results is possible.

Really? What is that finite number?
EDIT: I am trying to illustrate a mathematical concept with an easily understood physical example. Some abstraction must be applied.

PeroK · Apr 16, 2017

FactChecker said:

Really? What is that finite number?

It depends on the experiment.

Given any number, you could devise an experiment with more than that number of possible results. But, the number of results of any experiment is finite.

jbriggs444 · Apr 16, 2017

PeroK said:

I guess your argument is that if I am asked to produce a quadratic equation of that form and I go for, say, ##b = 5, c=1##, i.e.:

##x^2 + 10x + 1##

Then, as this quadratic has real roots, something with 0 probability has actually happened?

I believe that you are misinterpreting the idea being presented. We are asked to pick a, b and c from a uniform distribution over an interval centered on zero. We are asked to assess the a priori probability that the resulting polynomial has real roots. That is to say, we are asked for the probability that the discriminant, ##b^2 - 4ac## will be greater than zero.

In order to avoid concerns with the impossibility of a uniform distribution over the real numbers we are asked to take the limit of this probability as the length of the interval increases without bound.

A key observation is that the computed probability is independent of the size of the interval. So evaluating the limit is trivial. Just evaluate the result for an interval of one's choosing. It is clear that the probability is neither zero nor one. [And, accordingly, has precious little to do with the subject matter of this thread].

Not to mention that my picking two integers for the coefficients also had 0 probability!

It is fairly clear that you did not pick those coefficients at random using a continuous PDF.

FactChecker · Apr 16, 2017

PeroK said:

It depends on the experiment.

Given any number, you could devise an experiment with more than that number of possible results. But, the number of results of any experiment is finite.

Maybe in the physical world. But the mathematical concepts are not limited to that. Some abstract thought must be applied to answer the OP.

PeroK · Apr 16, 2017

jbriggs444 said:

I believe that you are misinterpreting the idea being presented. We are asked to pick a, b and c from a uniform distribution over an interval centered on zero. We are asked to assess the a priori probability that the resulting polynomial has real roots. That is to say, we are asked for the probability that the discriminant, ##b^2 - 4ac## will be greater than zero.

In order to avoid concerns with the impossibility of a uniform distribution over the real numbers we are asked to take the limit of this probability as the length of the interval increases without bound.

A key observation is that the computed probability is independent of the size of the interval. So evaluating the limit is trivial. Just evaluate the result for an interval of one's choosing. It is clear that the probability is neither zero nor one. [And, accordingly, has precious little to do with the subject matter of this thread].

It is fairly clear that you did not pick those coefficients at random using a continuous PDF.

Exactly. Will post a fuller analysis of the issue.

PeroK · Apr 16, 2017

FactChecker said:

Maybe in the physical world. But the mathematical concepts are not limited to that. Some abstract thought must be applied to answer the OP.

Yes, but then you cannot use the mathematical result to make a claim about the physical world, unless you can apply that mathematics to the physical world.

FactChecker · Apr 16, 2017

PeroK said:

Yes, but then you cannot use the mathematical result to make a claim about the physical world, unless you can apply that mathematics to the physical world.

The concept should be understood before it can be applied to anything, even as an approximation. The OP did not mention any particular application. The mathematical concept is trivial and the misconception of the OP was basic. Time to move on.

PeroK · Apr 16, 2017

StoneTemplePython said:

Sometimes coming at this from a different angle is illuminating. Here's a very relevant problem from Fifty Challenging Problems in Probability

>> What is the probability that the quadratic equation: ##x^2 + 2bx + c = 0## has real roots?

note your domain is reals, b and c are independently sampled from ##(-\infty,\infty)##, though it might be helpful to think of them coming from ##(-3, 3)## or ##(-n, n)## and to then consider the limiting behavior.

I used to not like this "zero probability but still possible" stuff -- but eventually I got over it, and this problem helped that process move along.

Here is a better and fuller analysis of the problem with this statement:

1) How can you generate a quadratic equation at random?

You can have a finite number of discrete options for each coefficient and a uniform distribution.

You can have a finite interval of options and a uniform (density) distribution on that interval.

But:

You cannot have an infinite number of discrete options and a uniform distribution.

You cannot have an infinite interval and a uniform density.

2) Given any interval ##[-n, n]## and a uniform density distribution, you can calculate the probability that the quadratic equation has real roots.

3) The limit as ##n \rightarrow \infty## equals 0.

But, there is no probability distribution represented by this limit. In other words, the limit of a sequence of pdf's is not necessarily a pdf.

Therefore, there is no mathematical or physical process by which you can choose any real coefficients at random, uniformly distributed.

In order to allow any real coefficients to be chosen, you must abandon the uniform distribution and then the probability of the quadratic having real roots depends on the distribution.

StoneTemplePython · Apr 16, 2017

PeroK said:

I guess your argument is that if I am asked to produce a quadratic equation of that form and I go for, say, ##b = 5, c=1##, i.e.:

##x^2 + 10x + 1##

Then, as this quadratic has real roots, something with 0 probability has actually happened?

Not to mention that my picking two integers for the coefficients also had 0 probability!

I was actually suggesting the opposite of this. If you work through the math, as n grows large, you actually get real roots with probability one (i.e. the probability of complex root is zero). Yet we all know that complex roots are possible over real quadratic equations -- it is where quite a few of us first ran into complex numbers, I think.

- - - -
For avoidance of doubt, there is no ##a## term in ##x^2 + 2bx + c = 0## (or put differently, ##a## is fixed at one).
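The claim can be checked with a short sketch. It assumes b and c are independent and uniform on (-n, n) with n >= 1; the function names are hypothetical. The roots of ##x^2 + 2bx + c = 0## are complex exactly when ##c > b^2##, and integrating that region over the square gives area ##\frac{4}{3} n^{3/2}## out of ##4n^2##, so the probability of complex roots is ##1/(3\sqrt{n})##. That closed form is a worked calculation, not something stated in the thread, and the Monte Carlo estimate is there to cross-check it.

```python
import math
import random

def p_complex_exact(n):
    """P(c > b^2) for b, c independent and uniform on (-n, n), assuming n >= 1."""
    if n < 1:
        raise ValueError("this closed form assumes n >= 1")
    return 1.0 / (3.0 * math.sqrt(n))

def p_complex_monte_carlo(n, samples=200_000, seed=0):
    """Estimate the probability of complex roots by random sampling."""
    rng = random.Random(seed)  # seeded so the run is reproducible
    complex_count = 0
    for _ in range(samples):
        b = rng.uniform(-n, n)
        c = rng.uniform(-n, n)
        # x^2 + 2bx + c has discriminant 4(b^2 - c); negative means complex roots.
        if b * b - c < 0:
            complex_count += 1
    return complex_count / samples

for n in (1, 3, 100, 10_000):
    print(n, p_complex_exact(n), p_complex_monte_carlo(n))
```

For every finite n the probability of complex roots is positive; it only tends to 0 as n grows.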

PeroK · Apr 16, 2017

StoneTemplePython said:

I was actually suggesting the opposite of this. If you work through the math, as n grows large, you actually get real roots with probability one (i.e. the probability of complex root is zero). Yet we all know that complex roots are possible over real quadratic equations -- it is where quite a few of us first ran into complex numbers, I think.

- - - -
For avoidance of doubt, there is no ##a## term in ##x^2 + 2bx + c = 0## (or put differently, ##a## is fixed at one).

Yes, you're right, it's the probability of having complex roots that tends to ##0##. But, the point remains, the limit where this probability is ##0## is not represented by any pdf for the coefficients. Your limiting pdf is identically zero.

Nothing with probability 0 ever happens, by definition.

StoneTemplePython · Apr 16, 2017

PeroK said:

Yes, you're right, it's the probability of having complex roots that tends to ##0##. But, the point remains, the limit where this probability is ##0## is not represented by any pdf for the coefficients. Your limiting pdf is identically zero.

(This may have been mentioned elsewhere in the thread) The distribution works just fine as an improper prior and could be useful as the starting point in a bayesian inference problem. Yes -- without any (satisfactory) likelihood function applied to it, we can't get a satisfactory normalizing constant over the full infinite interval.

If you think of the limiting process, it gives you some interesting asymptotic information: as n grows larger and larger, the probability of a complex root gets vanishingly close to zero. (You can always look at the rate of this convergence for a simple problem like this for more info.)

Puzzles can sometimes help people build insight -- if they don't help, then no need to use them.

PeroK said:

Nothing with probability 0 ever happens, by definition.

The whole point of this thread is the exact opposite of this statement. There are well understood and accepted definitions in probability theory that do not agree with this statement. When people come up with their own private definitions this is not helpful. Perhaps you mean that nothing with density 0 ever happens?

PeroK · Apr 16, 2017

StoneTemplePython said:

(This may have been mentioned elsewhere in the thread) The distribution works just fine as an improper prior and could be useful as the starting point in a bayesian inference problem. Yes -- without any (satisfactory) likelihood function applied to it, we can't get a satisfactory normalizing constant over the full infinite interval.

If you think of the limiting process, it gives you some interesting asymptotic information: as n grows larger and larger, the probability of a complex root gets vanishingly close to zero. (You can always look at the rate of this convergence for a simple problem like this for more info.)

Puzzles can sometimes help people build insight -- if they don't help, then no need to use them.
The whole point of this thread is the exact opposite of this statement. There are well understood and accepted definitions in probability theory that do not agree with this statement. When people come up with their own private definitions this is not helpful. Perhaps you mean that nothing with density 0 ever happens?

An event with a probability of 0 cannot happen, by definition. Something in mathematics doesn't "happen". If I say, for example:

Let ##f(x) = \sin(x)##, then nothing has happened. It doesn't mean that we actually have an infinite line anywhere with an infinite sine function.

But, also, your logic of assuming that a property shared by the elements of a sequence must be present in the limit is false. And this false logic leads to paradoxes like the one you have quoted.

Each pdf that represents a uniform distribution on ##[-n, n]## is indeed a pdf. But, the limit of this sequence of pdf's is not itself a pdf: it is the zero function. That's where your paradox about quadratics comes from and why some quadratics have real roots and some have complex roots. It's not because things of zero probability actually happen.
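Written out, with ##f_n## used here as a label for the uniform density on ##[-n, n]## (the notation is introduced only for this illustration):

```latex
f_n(x) =
\begin{cases}
  \dfrac{1}{2n}, & -n \le x \le n, \\[4pt]
  0, & \text{otherwise,}
\end{cases}
\qquad
\int_{-\infty}^{\infty} f_n(x)\,dx = 1 \quad \text{for every } n \ge 1 .

\lim_{n \to \infty} f_n(x) = 0 \quad \text{for every fixed } x,
\qquad\text{so}\qquad
\int_{-\infty}^{\infty} \lim_{n \to \infty} f_n(x)\,dx = 0
\;\ne\; 1 = \lim_{n \to \infty} \int_{-\infty}^{\infty} f_n(x)\,dx .
```

Each ##f_n## integrates to 1, but the pointwise limit integrates to 0, so the limit is not a pdf.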

Stephen Tashi · Apr 16, 2017

PeroK said:

Nothing with probability 0 ever happens, by definition.

No definition of probability in terms of measure theory deals with whether events happen. If there is a definition saying an event with probability zero never happens, it isn't a definition from mathematical probability theory.

PeroK · Apr 16, 2017

Stephen Tashi said:

No definition of probability in terms of measure theory deals with whether events happen. If there is a definition saying an event with probability zero never happens, it isn't a definition from mathematical probability theory.

How would you make something happen that has 0 probability?
