# Joint probability from conditional probability?

by Demystifier
Tags: conditional, joint, probability
 Sci Advisor P: 4,630 Hi, I am a quantum physicist who needs practical help from mathematicians. The physical problem I have can be reduced to the following mathematical one: Assume we have two correlated variables a and b, and that we know the conditional probabilities P(a|b) and P(b|a) for all possible values of a and b. What I want to know are the joint probabilities P(a,b); a priori, they are not given. My questions are the following: What is the best I can conclude about P(a,b) from knowledge of P(a|b) and P(b|a)? Are there special cases (other than the trivial case in which a and b are independent) in which P(a,b) can be determined uniquely? Any further suggestions? Thank you in advance!
 Sci Advisor HW Helper P: 2,481 Denoting random variables with capital letters and omitting the Prob{.} part: A|B = AB/B B|A = AB/A where AB is the joint probability. Since you know A|B and B|A, you have 2 equations in 3 unknowns (AB, A, B); you need a 3rd equation; for example: A = f(B). Or, without the shorthand notation, Prob{A} = f(Prob{B}). See also: http://en.wikipedia.org/wiki/Copula_(statistics)
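The underdetermination can be made concrete with a minimal numeric sketch (the values of P(A|B), P(B|A), and P(A) below are hypothetical, chosen only for illustration): fixing P(A) picks out one member of a one-parameter family of joints, all consistent with both given conditionals.

```python
# One-parameter family of solutions for single events A, B:
# two equations (the conditionals) in three unknowns (AB, A, B).
x = 0.6  # hypothetical value of P(A|B)
y = 0.3  # hypothetical value of P(B|A)

for p_a in (0.2, 0.4):        # P(A) is a free parameter
    p_ab = y * p_a            # P(A,B) = P(B|A) * P(A)
    p_b = p_ab / x            # P(B)   = P(A,B) / P(A|B)
    # Both conditionals are reproduced exactly, whatever P(A) we chose:
    assert abs(p_ab / p_b - x) < 1e-12   # P(A|B) check
    assert abs(p_ab / p_a - y) < 1e-12   # P(B|A) check
```

Both choices of P(A) yield valid probabilities here, so the conditionals alone cannot pin down the joint; a third relation such as Prob{A} = f(Prob{B}) is needed.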
 P: 71 Well, since $$p(a,b) = p(b|a) p(a) = p(a|b) p(b)$$ where $$p(a) = \int p(a,b)db$$ $$p(b) = \int p(a,b)da$$ we have that $$\frac{p(a)}{p(b)} = \frac{p(a|b)}{p(b|a)}$$ Since $p(a)/p(b)$ is obviously the product of an $a$-dependent factor $p(a)$ and a $b$-dependent factor $1/p(b)$, the ratio $p(a|b)/p(b|a)$ must admit the same factorization. Once it is factored, it is trivial to identify $p(a)$ and $1/p(b)$ from that expression. If $p(a|b)/p(b|a)$ is not separable into an $a$-dependent and a $b$-dependent factor, then there is an inconsistency between $p(a|b)$ and $p(b|a)$, and they cannot be conditional distributions of the same joint distribution. Good luck, -Emanuel
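For discrete variables, this separation recipe can be carried out mechanically: every column of the elementwise ratio p(a|b)/p(b|a) is proportional to p(a). A self-contained sketch (the 3x3 joint is randomly generated, purely for illustration; it is strictly positive, so no zero-division issues arise):

```python
import numpy as np

# Build a random strictly positive joint, derive its conditionals,
# then recover the joint from the conditionals alone.
rng = np.random.default_rng(0)
joint = rng.random((3, 3))      # rows index a, columns index b
joint /= joint.sum()

p_a = joint.sum(axis=1)              # marginal p(a)
p_b = joint.sum(axis=0)              # marginal p(b)
P_a_given_b = joint / p_b            # column j holds p(a | b_j)
P_b_given_a = joint / p_a[:, None]   # row i holds p(b | a_i)

# Elementwise, p(a|b)/p(b|a) = p(a)/p(b), so any single column
# of the ratio is proportional to p(a); normalize it.
ratio = P_a_given_b / P_b_given_a
p_a_rec = ratio[:, 0] / ratio[:, 0].sum()

# Reassemble the joint: p(a,b) = p(b|a) p(a)
joint_rec = P_b_given_a * p_a_rec[:, None]
```

Here `joint_rec` matches `joint` up to floating-point error, confirming the recovery.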
 P: 270 Joint probability from conditional probability? I don't think that will always work, winterfors. For one thing, what happens if one of the denominators is 0? Consider the example where a is a Gaussian variable with mean zero and unit variance, and b is exactly equal to a. The marginal density of b is, naturally, also a unit Gaussian, and the joint density is degenerate (it is like a scalar Gaussian concentrated on the diagonal of the (a,b)-plane). The conditional density is then $P(a|b) = 1_b(a)$, and vice versa. The ratio of the conditional densities is therefore 1 when a = b, and undefined otherwise. That is enough to see that a and b are actually the same variable, and so have the same marginal, but it gives us no idea what that marginal is. I.e., the whole thing would work out exactly the same if a were given a different marginal distribution.
P: 71
 Quote by quadraphonics I don't think that will always work
You're absolutely right.

There are situations where $p(a)/p(b)$ is undefined because both $p(a|b)$ and $p(b|a)$ are zero. In that case, there is no way to deduce a joint distribution without additional information. The expression $p(a|b)/p(b|a)$ may also simply be too complicated to separate easily into an $a$-dependent and a $b$-dependent factor.

An even more common problem is that $p(a|b)$ and $p(b|a)$ may be derived from different sources, and it may in such cases be incorrect to view them as conditionals of the same joint distribution $p(a,b)$.

-Emanuel
 P: 270 Isn't it also a problem if just one of the conditional distributions assigns zero probability to some region where the corresponding marginal has nonzero probability? I.e., you would be trying to infer something about the marginal from a condition that eliminates all information about it. And, after all, division by zero is undefined... But I think the approach should be fine, in principle, if you add the restriction that all of the distributions in question are nonzero on the support of the pertinent random variables (which is probably the case most people are interested in). It may still be impractical to actually work out the expressions for the distributions (and there may be no closed-form expression, as they will typically require normalization), but it should all be well defined... For exponential-family distributions, my intuition is that this should always work out nicely, due to the exponents playing nicely with the ratio (i.e., it turns into a linear separation of functions of a and b, instead of a ratio separation). Also note that exponential-family distributions tend to fulfill the nonzero requirement up front, except for a few boundary cases (which should be degenerate anyway, if my intuition holds...)
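The exponential-family intuition can be checked in a worked special case (a standard bivariate Gaussian with correlation $\rho$, $|\rho| < 1$; this example is mine, not from the posts above). The conditionals are

$$p(a|b) \propto \exp\left(-\frac{(a-\rho b)^2}{2(1-\rho^2)}\right), \qquad p(b|a) \propto \exp\left(-\frac{(b-\rho a)^2}{2(1-\rho^2)}\right)$$

so, the shared normalization constants cancelling,

$$\log\frac{p(a|b)}{p(b|a)} = \frac{(b-\rho a)^2 - (a-\rho b)^2}{2(1-\rho^2)} = \frac{(1-\rho^2)(b^2 - a^2)}{2(1-\rho^2)} = -\frac{a^2}{2} - \left(-\frac{b^2}{2}\right)$$

The ratio separates additively in the log, exactly as predicted, and exponentiating identifies $p(a) \propto e^{-a^2/2}$ and $p(b) \propto e^{-b^2/2}$: the standard normal marginals, as expected.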
 HW Helper P: 1,377 "An even more common problem is that $$p(a \mid b)$$ and $$p(b \mid a)$$ may be derived from different sources, and it may in such cases be incorrect to view them as conditionals of the same joint distribution." I'm not sure what you mean by this.
P: 71
 Quote by quadraphonics Isn't it also a problem if just one of the conditional distributions assigns zero probability to some region where the corresponding marginal has nonzero probability?
If just one of $$p(a|b)$$ and $$p(b|a)$$ is zero, one can simply invert both sides of

$$\frac{p(a)}{p(b)} = \frac{p(a|b)}{p(b|a)}$$

and get a well-defined equation.
P: 71
 Quote by statdad "An even more common problem is that $$p(a \mid b)$$ and $$p(b \mid a)$$ may be derived from different sources, and it may in such cases be incorrect to view them as conditionals of the same joint distribution." I'm not sure what you mean by this.
Uhm, it gets a bit technical in terms of what information sources have been used to construct each of the two conditional probability distributions. The short answer is that if they come from completely different sources, one has to assume a marginal distribution for each of them separately, and these cannot be derived from the other conditional distribution.

One could proceed like this: let's call our two conditional distributions $$q(b | a)$$ and $$r(a | b)$$.

We can construct two separate joint distributions by using two non-informative priors $$q(a)$$ and $$r(b)$$ :

$$q(a,b) = q(b | a)q(a)$$
$$r(a,b) = r(a | b)r(b)$$

These can then be combined into a third, joint probability distribution

$$p(a,b) = K \frac{q(a,b) r(a,b)}{\mu(a)\mu(b)}$$ ,

where $\mu(a)$ and $\mu(b)$ are homogeneous probability densities and $$K$$ is a normalization constant

$$\frac{1}{K} = \int \int \frac{q(a,b) r(a,b)}{\mu(a)\mu(b)} da db$$
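On a discrete grid, this conjunction of the two joint distributions can be sketched as follows (the 4x4 distributions q and r are randomly generated placeholders standing in for the two independently sourced joints, and the homogeneous densities $\mu$ are taken uniform, which is their natural discrete analogue):

```python
import numpy as np

rng = np.random.default_rng(1)
q = rng.random((4, 4)); q /= q.sum()   # q(a,b) built from one source
r = rng.random((4, 4)); r /= r.sum()   # r(a,b) built from the other source

mu_a = np.full(4, 0.25)  # homogeneous density over a (uniform here)
mu_b = np.full(4, 0.25)  # homogeneous density over b (uniform here)

# p(a,b) = K * q(a,b) r(a,b) / (mu(a) mu(b)),
# with K fixed by normalization over the whole grid.
unnorm = q * r / np.outer(mu_a, mu_b)
p = unnorm / unnorm.sum()
```

The division by `unnorm.sum()` plays the role of the constant K; the result `p` is a proper joint distribution (nonnegative, summing to 1).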

 I'm not sure this makes things any clearer for you; if you're interested in this kind of combination of probability distributions, you can have a look at the book "Inverse Problem Theory and Methods for Model Parameter Estimation" by Albert Tarantola, around pages 13 and 32.
 HW Helper P: 1,377 Thank you winterfors - I understand your calculations (although I've never seen $$\mu$$ used to represent a density rather than a distribution measure - merely a matter of notation), and I am actually aware of their basis. I am guilty of one of two things: * either taking away an incomplete understanding of the o.p.'s question, or * noting that you were referring to a more general situation than the one in the current discussion.