The meaning, if any, of probability -1 or i

Loren Booda · Feb 17, 2012

Is there a significance to the probability value -1 or i?

micromass · Feb 17, 2012

Uuuh, you mean that an event happens with probability -1 or i??

All probabilities must normally be real numbers between 0 and 1. So a probability of -1 or i is classically not allowed.

Can you tell us where you saw these things??

Loren Booda · Feb 17, 2012

I'm sorry that I cannot recall where I saw this concept, but it may have arisen from quantum mechanics.

lavinia · Feb 17, 2012

Loren Booda said:

I'm sorry that I cannot recall where I saw this concept, but it may have arisen from quantum mechanics.

In Quantum Mechanics amplitudes are complex numbers and are similar to probabilities. The wave function for a free particle follows a complex diffusion process that is mathematically similar to the diffusion of heat.

But actual probabilities are fractions (percentages, in effect) and so must lie between 0 and 1.

chiro · Feb 17, 2012

Recall that the PDF is generated by multiplying the wave function by its complex conjugate, which always gives a real, non-negative value.

Specifically:

http://en.wikipedia.org/wiki/Wave_function#Definition
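To make the recipe concrete, here is a minimal Python sketch (not taken from the linked article) for a discrete state rather than a continuous wave function. The function name `born_probabilities` and the two example amplitudes are assumptions chosen for illustration; the amplitudes are deliberately proportional to i and -1, the values from the original question.

```python
import math

def born_probabilities(amplitudes):
    """Return |a|^2 for each complex amplitude, after normalizing the state."""
    # a * conj(a) is always real and non-negative.
    weights = [(a * a.conjugate()).real for a in amplitudes]
    total = sum(weights)
    if total == 0:
        raise ValueError("state has zero norm")
    return [w / total for w in weights]

# Hypothetical two-outcome state with amplitudes i/sqrt(2) and -1/sqrt(2).
amplitudes = [1j / math.sqrt(2), -1 / math.sqrt(2)]
probabilities = born_probabilities(amplitudes)
print(probabilities)
```

Neither amplitude is a valid probability, yet each product with its conjugate is real and non-negative, and the two results sum to 1.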

Stephen Tashi · Feb 18, 2012

In standard probability theory, probabilities are numbers in the interval [0,1]. There, that felt comfortable to say! - but I'm wondering if probability density functions might stray outside that interval.

I think the topic of complex-valued probabilities has been studied (and I'm talking about that topic, not the wave functions of quantum mechanics). I recall seeing documents with that title, but I probably didn't understand them. Besides, the fact that something has already been studied and worked out shouldn't deter anyone from speculating about it on the internet.

I don't think the objection that an event "can't happen" with probability i or -1 is a clear objection. The fact that an event happens indicates it has probability 1 in the probability space where we know it happened. In a space where we aren't sure it happens, it has a probability different than 1.

A difficulty of making sense of i or -1 as a probability might come in dealing with P(A|B) = P(A and B)/ P(B) when P(A|B) = 1. But perhaps clever people have figured out how to make the ratio come out correctly with complex valued probabilities.

The unfortunate property of probability theory is that it has such a tenuous connection with things that are definite and real. People very much desire to connect probabilities with the actual observed frequencies of events. Theorems of statistics and probability that say anything about actual frequencies only talk about the probabilities of actual frequencies. (This is a rather circular situation!) The "law of large numbers" is about the only thing you can get a hold of that will connect to reality. It involves a limit of probabilities-of-actual-frequencies, but the limit is 1. If there is a fundamental objection to complex valued probabilities and the fact that (in some probability space) an event either happens or doesn't, I'd think that the Law Of Large Numbers would have to come into play to support the objection.
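The law of large numbers can at least be watched in action. The following is a small simulation sketch, assuming a hypothetical event of probability 0.6 and Python's pseudo-random generator standing in for "reality"; `empirical_frequency` is an illustrative name, not a library function.

```python
import random

def empirical_frequency(p, n_trials, seed=0):
    """Fraction of n_trials simulated trials in which an event of probability p occurs."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    hits = sum(1 for _ in range(n_trials) if rng.random() < p)
    return hits / n_trials

# Observed frequency for increasing numbers of trials.
for n in (10, 1_000, 100_000):
    print(n, empirical_frequency(0.6, n))
```

The frequency tends to settle near 0.6 as the number of trials grows, but the guarantee behind that statement is itself only probabilistic, which is the circularity described above.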

lavinia · Feb 18, 2012

Stephen Tashi said:

A difficulty of making sense of i or -1 as a probability might come in dealing with P(A|B) = P(A and B)/ P(B) when P(A|B) = 1. But perhaps clever people have figured out how to make the ratio come out correctly with complex valued probabilities.

Conditional amplitudes are well defined just as are conditional probabilities.

The amplitude that a quantum particle is in a given state is the sum, over all possible previous states, of the amplitude that it was in that previous state times the amplitude that, given it was in that previous state, it transitions to the given state. This is really a Markov process where probabilities are replaced by complex number "amplitudes" and transition probabilities are replaced by transition amplitudes. Feynman's derivation of the Schrödinger equation from a discrete Markov-like chain of complex amplitudes makes this clear.

But it should be no surprise that anything satisfying the Schrödinger equation is a diffusion-like process, since the equation is mathematically identical to a heat equation with a complex coefficient.
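The analogy, and the one place where it breaks, can be sketched with a two-state chain. This is a toy illustration under stated assumptions, not Feynman's derivation: the matrices `stochastic` and `unitary` are made-up examples, and `step` is a hypothetical helper that applies the sum-over-previous-states rule.

```python
import math

def step(matrix, vector):
    """One step of the chain: new[j] = sum over k of matrix[j][k] * vector[k]."""
    return [sum(m_jk * v_k for m_jk, v_k in zip(row, vector)) for row in matrix]

# Classical chain: transition probabilities, each column sums to 1.
stochastic = [[0.5, 0.5],
              [0.5, 0.5]]
p0 = [1.0, 0.0]                        # start in state 0 with certainty
p2 = step(stochastic, step(stochastic, p0))

# Amplitude chain: transition amplitudes forming a unitary matrix (assumed example).
s = 1 / math.sqrt(2)
unitary = [[s, 1j * s],
           [1j * s, s]]
a0 = [1 + 0j, 0j]                      # start in state 0 with amplitude 1
a2 = step(unitary, step(unitary, a0))
q2 = [abs(a) ** 2 for a in a2]         # probabilities from squared moduli

print("classical probabilities after two steps:", p2)
print("amplitudes after two steps:", a2)
print("probabilities from amplitudes:", q2)
```

With probabilities, the two paths into state 0 add up. With amplitudes, the same two paths cancel, so after two steps the system is found in state 1 with probability (approximately) 1.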

The unfortunate property of probability theory is that it has such a tenuous connection with things that are definite and real. People very much desire to connect probabilities with the actual observed frequencies of events. Theorems of statistics and probability that say anything about actual frequencies only talk about the probabilities of actual frequencies. (This is a rather circular situation!) The "law of large numbers" is about the only thing you can get a hold of that will connect to reality. It involves a limit of probabilities-of-actual-frequencies, but the limit is 1. If there is a fundamental objection to complex valued probabilities and the fact that (in some probability space) an event either happens or doesn't, I'd think that the Law Of Large Numbers would have to come into play to support the objection.

It seems to me that, since the Schrödinger equation predicts observations exactly, the probabilities of observing a free particle at some point in space are not tenuous connections to reality but exact descriptions of it.

Stephen Tashi · Feb 18, 2012

lavinia said:

Conditional amplitudes are well defined just as are conditional probabilities.

I'm not sure how the existence of complex-valued things whose modulus is used to compute real-valued probabilities bears on the question of whether there could be a sensible theory of probability that allowed complex values for probabilities. Is your point that we could create a complex-valued probability theory (for ordinary macroscopic situations like coin tossing, etc.) by making a theory that parallels the formalism of quantum mechanics? (I've always wondered why the Quantum Mechanics books never seem to worry about things like measure theory.)

It seems to me that, since the Schrödinger equation predicts observations exactly, the probabilities of observing a free particle at some point in space are not tenuous connections to reality but exact descriptions of it.

But what does it mean for a theory to predict a probability "exactly"? Don't the tests of such theories boil down to applying the law of large numbers? Or were you referring to some other testable thing that the Schrödinger equation predicts?

kai_sikorski · Feb 18, 2012

Stephen Tashi said:

But what does it mean for a theory to predict a probability "exactly"? Don't the tests of such theories boil down to applying the law of large numbers? Or were you referring to some other testable thing that the Schrödinger equation predicts?

And how do you interpret statements like "There is a 60% chance of rain tomorrow"?

I think any attempt to verify a theoretical probability with real world data is based around Bayesian Inference in one way or another; which is totally unsatisfying because there isn't really a good way to justify the prior probabilities.

chiro · Feb 18, 2012

kai_sikorski said:

And how do you interpret statements like "There is a 60% chance of rain tomorrow"?

I think any attempt to verify a theoretical probability with real world data is based around Bayesian Inference in one way or another; which is totally unsatisfying because there isn't really a good way to justify the prior probabilities.

That is pretty much what a large chunk of Bayesian Thinking is all about.

Also this is getting into a kind of philosophical territory. As mathematicians not only do we want to quantify things but we want to quantify things in the most explicit and methodical way possible.

We have a lot of definitions for probability including the law of large numbers approach, but what if you don't have this luxury? What if you need to do an experiment with a limit on resources and you have to ask an expert for some prior probabilities?

The expert will give you some probabilities but despite having a good intuitive grasp of their domain, may not be able to really flesh out an explicit enough reason for why they chose that particular answer.

It's definitely a good thing to discuss because a lot of the methodology of not only stating but also understanding probabilities in the context of the model and its domain of application is not that well developed (and probability/statistics is quite a young field as well).

micromass · Feb 18, 2012

kai_sikorski said:

And how do you interpret statements like "There is a 60% chance of rain tomorrow"?

I actually mailed some meteorologists about that.

It means: if you look at all the days with the same initial conditions, then on 60% of those days it will rain. So the weather people will look at all the data from previous days, search a computer for days with very similar conditions, and see on which of those days it rained.

kai_sikorski · Feb 18, 2012

I think the only principle for assigning priors that I've seen that is even remotely objective is maximum entropy. However, even there you would often seek a maximum entropy distribution satisfying certain constraints, and picking those constraints can again be subjective. Still, the fact that basically all of the famous probability distributions are actually obtainable as maximum entropy distributions subject to some simple constraint is somehow very satisfying to me.

chiro · Feb 18, 2012

kai_sikorski said:

I think the only principle for assigning priors that I've seen that is even remotely objective is maximum entropy. However, even there you would often seek a maximum entropy distribution satisfying certain constraints, and picking those constraints can again be subjective. Still, the fact that basically all of the famous probability distributions are actually obtainable as maximum entropy distributions subject to some simple constraint is somehow very satisfying to me.

Maximum entropy basically implies a uniform distribution which ends up giving results that are more 'classical' since the prior function is proportional to a constant.

The idea of Bayesian analysis is that the information supplied lowers the entropy which has a positive effect on the posterior.

What I mean by that is that in, say, experimental design, you can use expert data to minimize, say, the number of trials that you need to get an appropriate credible interval. The assumption is of course that the priors are a good reflection or approximation of the underlying process; if they aren't, there's not much point in using them.

Statistics is not an exact science and in many respects it's like a computer: garbage in, garbage out.
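As a rough sketch of both points (a good prior tightens the answer, a bad one misleads), consider a Beta prior on a success probability updated with binomial data. The conjugate Beta-Binomial update is standard; the specific priors and the 12-out-of-20 data set below are invented for illustration.

```python
def beta_posterior(prior_a, prior_b, successes, failures):
    """Conjugate update: Beta(a, b) prior plus binomial data gives Beta(a + s, b + f)."""
    return prior_a + successes, prior_b + failures

def beta_mean(a, b):
    """Mean of a Beta(a, b) distribution."""
    return a / (a + b)

def beta_variance(a, b):
    """Variance of a Beta(a, b) distribution."""
    return a * b / ((a + b) ** 2 * (a + b + 1))

successes, failures = 12, 8  # invented data: 12 successes in 20 trials

priors = {
    "flat Beta(1, 1)": (1, 1),
    "expert Beta(30, 20)": (30, 20),    # assumed well-informed prior, mean 0.6
    "bad expert Beta(5, 45)": (5, 45),  # assumed badly wrong prior, mean 0.1
}

for name, (a, b) in priors.items():
    post_a, post_b = beta_posterior(a, b, successes, failures)
    print(name, round(beta_mean(post_a, post_b), 3), round(beta_variance(post_a, post_b), 5))
```

The well-informed prior gives a narrower posterior from the same 20 trials, while the badly wrong prior drags the posterior mean far below the observed frequency of 0.6.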

kai_sikorski · Feb 18, 2012

micromass said:

I actually mailed some meteorologists about that.

It means: if you look at all the days with the same initial conditions, then on 60% of those days it will rain. So the weather people will look at all the data from previous days, search a computer for days with very similar conditions, and see on which of those days it rained.

Yeah, how I would interpret it is this: the current state of the world is ω, but since our information about the world is imperfect we only know some many-to-one function of this state, σ = f(ω). If we look at all the states of the world η consistent with σ = f(η), then we could in principle calculate what percentage of those states will lead to rain tomorrow. That's sort of what the number they tell us for the probability of rain means.
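A toy version of this interpretation can be written down directly. Everything here is a made-up assumption for illustration: the three-bit "world", the observation function `observe` (playing the role of f), the deterministic rule `rains_tomorrow`, and the further assumption that all states consistent with an observation are equally likely.

```python
from itertools import product

# Hypothetical toy world: a state is (humid, low_pressure, windy), each True or False.
states = list(product([False, True], repeat=3))

def observe(state):
    """Many-to-one observation f: we see humidity and pressure, but not wind."""
    humid, low_pressure, _windy = state
    return (humid, low_pressure)

def rains_tomorrow(state):
    """Assumed deterministic rule, invented purely for illustration."""
    humid, low_pressure, windy = state
    return humid and (low_pressure or windy)

def chance_of_rain(observation):
    """Fraction of states consistent with the observation that lead to rain."""
    consistent = [s for s in states if observe(s) == observation]
    return sum(rains_tomorrow(s) for s in consistent) / len(consistent)

for obs in sorted(set(observe(s) for s in states)):
    print(obs, chance_of_rain(obs))
```

The underlying world is fully deterministic in this sketch; the probability appears only because the observation does not pin down the state.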

kai_sikorski · Feb 18, 2012

chiro said:

Maximum entropy basically implies a uniform distribution which ends up giving results that are more 'classical' since the prior function is proportional to a constant.

No, that's only true if the only restriction you put on the distribution is the range over which it can be positive.

If for example instead you restrict only the mean and variance, then the maximum entropy distribution is Normal. If you specify it has to be positive and have a certain mean then you get the Exponential distribution.

You can specify any number of moments and get different distributions.
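These claims can be spot-checked with closed-form differential entropies. The sketch below compares a few textbook distributions sharing the same constraint; it checks particular competitors only and is not a proof of maximality. The function names are illustrative.

```python
import math

def entropy_normal(sigma):
    """Differential entropy (nats) of a Normal with standard deviation sigma."""
    return 0.5 * math.log(2 * math.pi * math.e * sigma ** 2)

def entropy_laplace(sigma):
    """Laplace with scale b has variance 2*b^2 and entropy 1 + ln(2*b)."""
    b = sigma / math.sqrt(2)
    return 1 + math.log(2 * b)

def entropy_uniform(sigma):
    """Uniform on an interval of width w has variance w^2/12 and entropy ln(w)."""
    width = sigma * math.sqrt(12)
    return math.log(width)

def entropy_exponential(mean):
    """Exponential with the given mean has entropy 1 + ln(mean)."""
    return 1 + math.log(mean)

def entropy_uniform_from_zero(mean):
    """Uniform on [0, 2*mean] has the given mean and entropy ln(2*mean)."""
    return math.log(2 * mean)

sigma, mean = 1.0, 2.0
print("same variance:", entropy_normal(sigma), entropy_laplace(sigma), entropy_uniform(sigma))
print("positive, same mean:", entropy_exponential(mean), entropy_uniform_from_zero(mean))
```

With the variance fixed, the Normal has the largest entropy of the three; with positivity and the mean fixed, the Exponential beats the uniform distribution on [0, 2·mean].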

chiro · Feb 18, 2012

kai_sikorski said:

No, that's only true if the only restriction you put on the distribution is the range over which it can be positive.

If for example instead you restrict only the mean and variance, then the maximum entropy distribution is Normal. If you specify it has to be positive and have a certain mean then you get the Exponential distribution.

You can specify any number of moments and get different distributions.

What I was trying to say is that if you wanted to describe a distribution with maximum entropy, whether it's continuous or not, it has to be uniform.

This is for the prior and not the posterior. You could restrict the prior to certain values and get the kind of results you are talking about, but I'm not talking about the posterior; I'm talking about a simple prior where there is no restriction.

You can use optimization techniques to show that a continuous uniform distribution maximizes entropy.

http://en.wikipedia.org/wiki/Maximu...n#Uniform_and_piecewise_uniform_distributions

After looking at the page I think we are focusing on completely different scenarios with different constraints.
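A discrete analogue of the unconstrained case is easy to check numerically. This sketch assumes a finite set of four outcomes and no constraint other than normalization; `shannon_entropy` is an illustrative helper and the non-uniform distributions are arbitrary examples.

```python
import math

def shannon_entropy(probs):
    """Shannon entropy in nats; outcomes with probability 0 contribute nothing."""
    return sum(-p * math.log(p) for p in probs if p > 0)

uniform = [0.25, 0.25, 0.25, 0.25]
skewed = [0.7, 0.1, 0.1, 0.1]
certain = [1.0, 0.0, 0.0, 0.0]

for name, dist in [("uniform", uniform), ("skewed", skewed), ("certain", certain)]:
    print(name, round(shannon_entropy(dist), 4))
```

Among these examples the uniform distribution has the largest entropy, ln 4, which matches the unconstrained scenario; once a mean or variance is imposed the maximizer changes.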

kai_sikorski · Feb 18, 2012

chiro,

I don't actually work in this field so I only have a cursory familiarity with it. This section of the wikipedia entry on prior distributions is along the lines that I was thinking but it seems that you're right and basically in almost all cases if you're trying to get an objective prior using maximum entropy you'll end up with a uniform distribution.

http://en.wikipedia.org/wiki/Prior_probability#Uninformative_priors

The cases I mentioned are mentioned in the article as well, but since you have to specify the mean or mean and variance or other such parameter, this still seems like a subjective prior to me.

chiro · Feb 18, 2012

kai_sikorski said:

chiro,

I don't actually work in this field so I only have a cursory familiarity with it. This section of the wikipedia entry on prior distributions is along the lines that I was thinking but it seems that you're right and basically in almost all cases if you're trying to get an objective prior using maximum entropy you'll end up with a uniform distribution.

http://en.wikipedia.org/wiki/Prior_probability#Uninformative_priors

The cases I mentioned are mentioned in the article as well, but since you have to specify the mean or mean and variance or other such parameter, this still seems like a subjective prior to me.

Yeah, we're basically talking about different situations, so we're comparing red apples to green apples, which is close enough to talking about different things.

With my comment I am talking about maximizing entropy of a general continuous distribution. You are talking about maximizing entropy in a specific way for a specific kind of distribution with specific properties.

So yeah we're both talking about different things.

lavinia · Feb 19, 2012

Stephen Tashi said:

I'm not sure how the existence of complex-valued things whose modulus is used to compute real-valued probabilities bears on the question of whether there could be a sensible theory of probability that allowed complex values for probabilities. Is your point that we could create a complex-valued probability theory (for ordinary macroscopic situations like coin tossing, etc.) by making a theory that parallels the formalism of quantum mechanics? (I've always wondered why the Quantum Mechanics books never seem to worry about things like measure theory.)

The theory is different than a probability theory but is completely analogous. That is all I was saying. In quantum theory, ordinary probabilities arise as the squared norm (modulus) of the complex amplitudes. I suppose if the amplitudes were restricted to the positive real numbers then the theory would become a probability theory.

But what does it mean for a theory to predict a probability "exactly"? Don't the tests of such theories boil down to applying the law of large numbers? Or were you referring to some other testable thing that the Schrödinger equation predicts?

The Schrödinger equation describes the evolution of the wave function, and thus also of its squared norm, which is a probability distribution. This is mathematically exactly like the heat equation, which also predicts a probability distribution, except that while the heat equation, though incredibly accurate, is an empirical equation, the Schrödinger equation describes a Law of Nature.

alan2 · Feb 19, 2012

The short answer to the original question is that the three axioms of classical probability theory permit only real probabilities between zero and one. Quantum mechanics satisfies classical probability theory. Although wave functions may be complex, all probabilities calculated using those functions are real.
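For reference, and assuming the "three axioms" meant here are Kolmogorov's, they can be written for a real-valued probability measure P on a sample space Ω as follows, together with the short argument for the upper bound of 1.

```latex
\begin{align*}
&\text{(1) Non-negativity:} && P(A) \ge 0 \quad \text{for every event } A, \\
&\text{(2) Normalization:} && P(\Omega) = 1, \\
&\text{(3) Countable additivity:} && P\Big(\bigcup_{k=1}^{\infty} A_k\Big) = \sum_{k=1}^{\infty} P(A_k)
  \quad \text{for pairwise disjoint } A_1, A_2, \dots \\[4pt]
&\text{Consequence:} && 1 = P(\Omega) = P(A) + P(A^{c}) \ge P(A),
  \quad \text{so } 0 \le P(A) \le 1.
\end{align*}
```

A value of -1 violates axiom (1) directly, and i is excluded because P is real-valued by definition.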

I have seen imaginary probabilities in the context of what are called "fractional probabilities". I don't know much about it but I think that these imaginary probabilities represent unrealizable states. For example, the Chapman-Kolmogorov equations involve summation over all intermediate states. The theory permits a summation over unrealizable states with imaginary probabilities as a mathematical tool. The end result still satisfies classical probability theory. Maybe someone else is familiar with this.
