# The meaning, if any, of probability -1 or i

by Loren Booda
Tags: meaning, probability
 P: 3,408 Is there a significance to the probability value -1 or i?
 Mentor P: 18,346 Uuuh, you mean that an event happens with probability -1 or i?? All probabilities must normally be real numbers between 0 and 1. So a probability of -1 or i is classically not allowed. Can you tell us where you saw these things??
 P: 3,408 I'm sorry that I cannot recall where I saw this concept, but it may have arisen from quantum mechanics.
P: 1,719
 Quote by Loren Booda I'm sorry that I cannot recall where I saw this concept, but it may have arisen from quantum mechanics.
In Quantum Mechanics amplitudes are complex numbers and are similar to probabilities. The wave function for a free particle follows a complex diffusion process that is mathematically similar to the diffusion of heat.

But actual probabilities are real numbers and so must lie between zero and 1.
 P: 4,577 Recall that the PDF is generated by multiplying the wave function by its conjugate. Specifically: http://en.wikipedia.org/wiki/Wave_function#Definition
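A quick numerical sketch of that recipe (the Gaussian wave packet and its parameters sigma, k0 are invented for illustration, not taken from the thread): multiplying a complex wave function by its conjugate always yields a real, non-negative density that integrates to 1.

```python
import numpy as np

# A numerical sketch: the probability density is the wave function
# times its complex conjugate.  The Gaussian wave packet and its
# parameters (sigma, k0) are invented for illustration.
x = np.linspace(-20, 20, 4001)
dx = x[1] - x[0]
sigma, k0 = 1.5, 2.0
psi = ((2 * np.pi * sigma**2) ** -0.25
       * np.exp(-x**2 / (4 * sigma**2))   # Gaussian envelope
       * np.exp(1j * k0 * x))             # plane-wave phase factor

# psi* psi is real and non-negative even though psi is complex.
pdf = (psi.conj() * psi).real

print(bool(np.all(pdf >= 0)))             # True
print(round(float(np.sum(pdf) * dx), 6))  # 1.0 (normalized)
```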
 Sci Advisor P: 3,313 In standard probability theory, probabilities are numbers in the interval [0,1]. There, that felt comfortable to say! - but I'm wondering if probability density functions might stray outside that interval. I think the topic of complex valued probabilities has been studied (and I'm talking about that topic, not the wave functions of quantum mechanics). I recall seeing documents with that title, but I probably didn't understand them. Besides, the fact that something has already been studied and worked out shouldn't deter anyone from speculating about it on the internet.

I don't think the objection that an event "can't happen" with probability i or -1 is a clear objection. The fact that an event happens indicates it has probability 1 in the probability space where we know it happened. In a space where we aren't sure it happens, it has a probability different than 1. A difficulty of making sense of i or -1 as a probability might come in dealing with P(A|B) = P(A and B)/P(B) when P(A|B) = 1. But perhaps clever people have figured out how to make the ratio come out correctly with complex valued probabilities.

The unfortunate property of probability theory is that it has such a tenuous connection with things that are definite and real. People very much desire to connect probabilities with the actual observed frequencies of events. Theorems of statistics and probability that say anything about actual frequencies only talk about the probabilities of actual frequencies. (This is a rather circular situation!) The "law of large numbers" is about the only thing you can get a hold of that will connect to reality. It involves a limit of probabilities-of-actual-frequencies, but the limit is 1. If there is a fundamental objection to complex valued probabilities and the fact that (in some probability space) an event either happens or doesn't, I'd think that the Law Of Large Numbers would have to come into play to support the objection.
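The law-of-large-numbers connection described above can at least be illustrated in simulation (the seed and trial counts below are arbitrary choices): the observed frequency of an event approaches its probability as the number of trials grows.

```python
import random

# Illustration of the law of large numbers: the observed frequency
# of an event approaches its probability as the number of trials
# grows.  Seed and trial counts are arbitrary choices.
random.seed(0)

def empirical_freq(p, n):
    """Fraction of successes in n independent Bernoulli(p) trials."""
    return sum(random.random() < p for _ in range(n)) / n

for n in (100, 10_000, 1_000_000):
    print(n, empirical_freq(0.5, n))
```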
P: 1,719
 Quote by Stephen Tashi A difficulty of making sense of i or -1 as a probability might come in dealing with P(A|B) = P(A and B)/ P(B) when P(A|B) = 1. But perhaps clever people have figured out how to make the ratio come out correctly with complex valued probabilities.
Conditional amplitudes are well defined just as are conditional probabilities.

The amplitude that a quantum particle is in a given state is the sum over all possible states of the amplitudes that it was previously in any of the other states times the amplitude that, given that it is in a previous state, it will transition to the given state. This is really a Markov process where probabilities are replaced by complex number "amplitudes" and transition probabilities are replaced by transition amplitudes. Feynman's derivation of the Schrödinger equation from a discrete Markov-like chain of complex amplitudes makes this clear.

But it should be no surprise that anything satisfying the Schrödinger equation should be a diffusion-like process, since it is mathematically identical to a heat equation with a complex coefficient.
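A minimal sketch of that Markov-like picture (the lattice Hamiltonian, step size, and initial state below are toy choices, not anything from Feynman's actual construction): the state vector is updated by complex transition amplitudes rather than real transition probabilities, yet the squared moduli always form an honest probability distribution.

```python
import numpy as np

# Toy "Markov-like chain of complex amplitudes": a particle hops on
# a 1-D lattice of n sites, and each time step multiplies the state
# vector by complex transition amplitudes (a unitary matrix) instead
# of real transition probabilities.
n, dt, steps = 50, 0.1, 100
H = 2 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)  # discrete Laplacian

# One-step propagator U = exp(-i H dt), built from the eigendecomposition
# of the real symmetric H (so U is unitary up to round-off).
evals, evecs = np.linalg.eigh(H)
U = evecs @ np.diag(np.exp(-1j * evals * dt)) @ evecs.T

psi = np.zeros(n, dtype=complex)
psi[n // 2] = 1.0                  # start localized at the middle site
for _ in range(steps):
    psi = U @ psi                  # complex "diffusion" step

pdf = np.abs(psi) ** 2             # squared moduli: real probabilities
print(round(float(pdf.sum()), 6))  # 1.0 -- total probability is conserved
```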

 The unfortunate property of probability theory is that it has such a tenuous connection with things that are definite and real. People very much desire to connect probabilities with the actual observed frequencies of events. Theorems of statistics and probability that say anything about actual frequencies only talk about the probabilities of actual frequencies. (This is a rather circular situation!) The "law of large numbers" is about the only thing you can get a hold of that will connect to reality. It involves a limit of probabilities-of-actual-frequencies, but the limit is 1. If there is a fundamental objection to complex valued probabilities and the fact that (in some probability space) an event either happens or doesn't, I'd think that the Law Of Large Numbers would have to come into play to support the objection.
It seems to me that, since the Schrödinger equation predicts observations exactly, the probabilities of observing a free particle at some point in space are not tenuous connections to reality but exact descriptions of it.
P: 3,313
 Quote by lavinia Conditional amplitudes are well defined just as are conditional probabilities.
I'm not sure how the existence of complex valued things whose modulus is used to compute real valued probabilities bears on the question of whether there could be a sensible theory of probability that allowed complex values for probabilities. Is your point that we could create a complex valued probability theory (for ordinary macroscopic situations like coin tossing, etc.) by making a theory that parallels the formalism of quantum mechanics? (I've always wondered why the Quantum Mechanics books never seem to worry about things like measure theory.)

 It seems to me that, since the Schrödinger equation predicts observations exactly, the probabilities of observing a free particle at some point in space are not tenuous connections to reality but exact descriptions of it.
But what does it mean for a theory to predict a probability "exactly"? Don't the tests of such theories boil down to applying the law of large numbers? Or were you referring to some other testable thing that the Schrödinger equation predicts?
PF Gold
P: 162
 Quote by Stephen Tashi But what does it mean for a theory to predict a probability "exactly"? Don't the tests of such theories boil down to applying the law of large numbers? Or were you referring to some other testable thing that the Schrödinger equation predicts?
And how do you interpret statements like "There is a 60% chance of rain tomorrow"?

I think any attempt to verify a theoretical probability with real world data is based around Bayesian Inference in one way or another, which is totally unsatisfying because there isn't really a good way to justify the prior probabilities.
P: 4,577
 Quote by kai_sikorski And how do you interpret statements like "There is a 60% chance of rain tomorrow"? I think any attempt to verify a theoretical probability with real world data is based around Bayesian Inference in one way or another, which is totally unsatisfying because there isn't really a good way to justify the prior probabilities.
That is pretty much what a large chunk of Bayesian Thinking is all about.

Also this is getting into a kind of philosophical territory. As mathematicians not only do we want to quantify things but we want to quantify things in the most explicit and methodical way possible.

We have a lot of definitions for probability including the law of large numbers approach, but what if you don't have this luxury? What if you need to do an experiment with a limit on resources and you have to ask an expert for some prior probabilities?

The expert will give you some probabilities but, despite having a good intuitive grasp of their domain, may not be able to flesh out an explicit enough reason for why they chose that particular answer.

It's definitely a good thing to discuss because a lot of the methodology of not only stating but also understanding probabilities in the context of the model and its domain of application is not that well developed (and probability/statistics is quite a young field as well).
Mentor
P: 18,346
 Quote by kai_sikorski And how do you interpret statements like "There is a 60% chance of rain tomorrow"?
I actually mailed some meteorologists about that.

It means: if you look at all the days with the same initial conditions, then on 60% of those days it will rain. So the weather people look at the data from previous days, search a computer for days with very similar conditions, and see on which of those days it rained or not.
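That "analog" reading of a forecast can be sketched directly (the weather records and the similarity threshold below are entirely made up for illustration): find past days whose conditions resemble today's, and report the fraction of those days on which it rained.

```python
import math

# Toy analog forecast: find past days whose conditions resemble
# today's and report the fraction of those days on which it rained.
# The records and the similarity threshold are invented.
history = [
    # (pressure hPa, humidity %, rained?)
    (1005, 85, True), (1021, 40, False), (1008, 80, True),
    (1019, 45, False), (1006, 90, True), (1010, 75, False),
]

def rain_chance(pressure, humidity, threshold=10.0):
    similar = [rained for p, h, rained in history
               if math.hypot(p - pressure, h - humidity) < threshold]
    return sum(similar) / len(similar) if similar else None

print(rain_chance(1007, 82))  # 0.75 -- a "75% chance of rain"
```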
 PF Gold P: 162 I think the only principle that I've seen for assigning priors that I've heard of that is even remotely objective is maximum entropy. However even there you would often seek a maximum entropy distribution satisfying certain constraints, and picking those constraints can again be subjective. However the fact that basically all of the famous probability distributions are actually obtainable as maximum entropy distributions subject to some simple constraint is somehow very satisfying to me.
P: 4,577
 Quote by kai_sikorski I think the only principle that I've seen for assigning priors that I've heard of that is even remotely objective is maximum entropy. However even there you would often seek a maximum entropy distribution satisfying certain constraints, and picking those constraints can again be subjective. However the fact that basically all of the famous probability distributions are actually obtainable as maximum entropy distributions subject to some simple constraint is somehow very satisfying to me.
Maximum entropy basically implies a uniform distribution which ends up giving results that are more 'classical' since the prior function is proportional to a constant.

The idea of Bayesian analysis is that the information supplied lowers the entropy which has a positive effect on the posterior.

What I mean by that is that in, say, experimental design, you can use expert data to minimize the number of trials that you need to get an appropriate credible interval. The assumption is of course that the priors are a good reflection or approximation of the underlying process; if they aren't, there's not much point using them.

Statistics is not an exact science and in many respects its like a computer: garbage in, garbage out.
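The point about an informative prior reducing the data you need can be sketched with conjugate Beta-Binomial updating (the "expert" prior parameters and the data below are invented for illustration): the posterior is pulled toward the prior, so a well-chosen prior gets you a sharper answer from the same trials.

```python
# Conjugate Beta-Binomial updating: a Beta(a, b) prior plus observed
# successes/failures gives a Beta(a + s, b + f) posterior.  The
# "expert" prior and the data below are invented for illustration.
def beta_posterior(a, b, successes, failures):
    return a + successes, b + failures

data = (6, 4)  # 6 successes, 4 failures observed

for name, (a, b) in [("flat Beta(1,1)  ", (1, 1)),
                     ("expert Beta(8,2)", (8, 2))]:
    a2, b2 = beta_posterior(a, b, *data)
    print(name, "posterior mean:", round(a2 / (a2 + b2), 3))
# flat prior   -> posterior mean 0.583
# expert prior -> posterior mean 0.7
```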
PF Gold
P: 162
 Quote by micromass I actually mailed some meteorologists about that. It means: if you look at all the days with the same initial conditions, then on 60% of those days it will rain. So the weather people look at the data from previous days, search a computer for days with very similar conditions, and see on which of those days it rained or not.
Yeah, how I would interpret it is this: the current state of the world is ω, but since our information about the world is imperfect we only know some many-to-one function of this state, σ = f(ω). If we look at all the states of the world η consistent with σ = f(η), then we could in principle calculate what percentage of those states will lead to rain tomorrow. That's sort of what the number they tell us for the probability of rain means.
PF Gold
P: 162
 Quote by chiro Maximum entropy basically implies a uniform distribution which ends up giving results that are more 'classical' since the prior function is proportional to a constant.
No, that's only true if the only restriction you put on the distribution is the range over which it can be positive.

If for example instead you restrict only the mean and variance, then the maximum entropy distribution is Normal. If you specify it has to be positive and have a certain mean then you get the Exponential distribution.

You can specify any number of moments and get different distributions.
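This can be checked numerically on a finite support (the support, target mean, and comparison distribution below are arbitrary choices): with a mean constraint, the maximum-entropy distribution takes an exponential form, p_i ∝ exp(λx_i), and it beats any other distribution with the same mean.

```python
import math

xs = list(range(10))       # finite support {0, ..., 9}
target_mean = 2.0

def expfam(lam):
    """Maximum-entropy form under a mean constraint: p_i ~ exp(lam*x_i)."""
    w = [math.exp(lam * x) for x in xs]
    z = sum(w)
    return [wi / z for wi in w]

def mean(p):
    return sum(x * pi for x, pi in zip(xs, p))

def entropy(p):
    return -sum(pi * math.log(pi) for pi in p if pi > 0)

# Solve for lam by bisection (the mean is increasing in lam).
lo, hi = -5.0, 5.0
for _ in range(200):
    mid = (lo + hi) / 2
    lo, hi = (mid, hi) if mean(expfam(mid)) < target_mean else (lo, mid)
p_maxent = expfam((lo + hi) / 2)

# Any other distribution with the same mean has lower entropy,
# e.g. half the mass on 0 and half on 4 (mean 2.0 as well).
p_other = [0.5, 0, 0, 0, 0.5, 0, 0, 0, 0, 0]

print(entropy(p_maxent) > entropy(p_other))  # True
```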
P: 4,577
 Quote by kai_sikorski No, that's only true if the only restriction you put on the distribution is the range over which it can be positive. If for example instead you restrict only the mean and variance, then the maximum entropy distribution is Normal. If you specify it has to be positive and have a certain mean then you get the Exponential distribution. You can specify any number of moments and get different distributions.
What I was trying to say is that if you want to describe a distribution with maximum entropy, whether it's continuous or not, it has to be uniform.

This is for the prior and not the posterior. You could restrict the prior to certain values and get the kind of results you are talking about, but I'm not talking about the posterior; I'm talking about a simple prior where there is no restriction.

You can use optimization techniques to show that a continuous uniform distribution maximizes entropy.

http://en.wikipedia.org/wiki/Maximum..._distributions

After looking at the page I think we are focusing on completely different scenarios with different constraints.
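The unconstrained case is easy to check numerically (the skewed comparison distribution below is an arbitrary example): with no constraint other than the support, the uniform distribution has the largest possible entropy, ln(n).

```python
import math

def entropy(p):
    """Shannon entropy (nats) of a discrete distribution."""
    return -sum(pi * math.log(pi) for pi in p if pi > 0)

# With no constraint beyond the support, the uniform distribution on
# n points has the largest possible entropy, ln(n); the skewed
# comparison distribution is an arbitrary example.
n = 6
uniform = [1 / n] * n
skewed = [0.5, 0.2, 0.1, 0.1, 0.05, 0.05]

print(round(entropy(uniform), 4))          # 1.7918 == ln(6)
print(entropy(uniform) > entropy(skewed))  # True
```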
 PF Gold P: 162 chiro, I don't actually work in this field so I only have a cursory familiarity with it. This section of the wikipedia entry on prior distributions is along the lines that I was thinking but it seems that you're right and basically in almost all cases if you're trying to get an objective prior using maximum entropy you'll end up with a uniform distribution. http://en.wikipedia.org/wiki/Prior_p...rmative_priors The cases I mentioned are mentioned in the article as well, but since you have to specify the mean or mean and variance or other such parameter, this still seems like a subjective prior to me.
P: 4,577
 Quote by kai_sikorski chiro, I don't actually work in this field so I only have a cursory familiarity with it. This section of the wikipedia entry on prior distributions is along the lines that I was thinking but it seems that you're right and basically in almost all cases if you're trying to get an objective prior using maximum entropy you'll end up with a uniform distribution. http://en.wikipedia.org/wiki/Prior_p...rmative_priors The cases I mentioned are mentioned in the article as well, but since you have to specify the mean or mean and variance or other such parameter, this still seems like a subjective prior to me.
Yeah, we're basically talking about different situations, so we're comparing red apples to green apples.

With my comment I am talking about maximizing entropy of a general continuous distribution. You are talking about maximizing entropy in a specific way for a specific kind of distribution with specific properties.

So yeah we're both talking about different things.
