Combining Conditional Probability Distributions

Discussion Overview

The discussion revolves around the estimation of a combined probability distribution, h(x|b,c), given two separate conditional probability distributions, f(x|b) and g(x|c), where b and c are discrete events and x is a continuous variable. The context includes considerations of independence between events and the implications of having limited data for simultaneous occurrences of b and c.

Discussion Character

  • Exploratory
  • Technical explanation
  • Conceptual clarification
  • Debate/contested

Main Points Raised

  • Some participants suggest that the interpretation of the atomicity of events is crucial for understanding how to combine the distributions f(x|b) and g(x|c).
  • One participant proposes averaging the two distributions as a potential method for estimating h(x|b,c), provided both distributions are valid probability density functions (PDFs) and share the same domain.
  • Another participant clarifies that f(x|B) represents the distribution of x given B and either C or not C, while g(x|C) represents the distribution of x given C and either B or not B.
  • Concerns are raised about the independence of events b and c, with assumptions stated as p(C|B) = p(C) and p(B|C) = p(B).
  • One participant mentions the challenge of inferring h(x|B,C) due to a small sample size where both events have occurred simultaneously, noting the low probabilities of occurrence for both events.
  • A suggestion is made to explore Bayesian statistics and Markov-Chain Monte-Carlo (MCMC) methods as a way to handle the limited data and infer distributions.

Areas of Agreement / Disagreement

Participants express varying interpretations of the events and their implications for combining distributions, indicating that there is no consensus on the best approach or methodology for estimating h(x|b,c). The discussion remains unresolved regarding the optimal strategies and assumptions needed.

Contextual Notes

Participants highlight the importance of understanding the specific meanings of events b and c, as well as the implications of their independence. Limitations include the small sample size for simultaneous occurrences and the need for clear definitions of the events involved.

Whenry:

Hi all,

My question is the following. Let's say I have two conditional probability distributions:

f(x|b) and g(x|c)

b and c are discrete events, while x is a continuous variable. For example: when button b is pressed, there is some distribution for the amount of rainfall the next day, x; when button c is pressed, there is a different distribution of rainfall the next day. Are there any strategies for estimating the distribution of rainfall if both buttons are pressed,

h(x|b,c)?

And what assumptions do those strategies rest on?

Thank you in advance,

Will
 
Whenry said:

Hey Whenry and welcome to the forums.

The subtlety with this kind of problem is one of interpretation, and it boils down to the atomicity of events.

In probability, we usually break things down into events that cannot be broken down any further (atomic) and that are completely disjoint from every other event.

In your situation, you have to interpret what these atomic events refer to: your first description implies that your 'b' and 'c' events are disjoint, but the fact that 'both' buttons can be pressed breaks that assumption if the double keypress corresponds to a real event.

One way you can deal with this is to form a distribution that is an 'average' of the two distributions, which will be a proper distribution with respect to the Kolmogorov axioms provided both distributions are valid PDFs over the same domain. If they are not, you will need to account for that.

But again, it's more important to consider what the events refer to than to fudge things mathematically. If you want to consider three events (B only, C only, B and C), that needs an interpretation; if you want to consider only two (B only, C only), that needs an interpretation too.

Without an interpretation and a subsequent understanding thereof, you have a mathematical model with no basis for understanding.
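The 'average' suggestion above can be sketched as an equal-weight mixture of the two conditional PDFs. The normal shapes and parameters below are made up purely for illustration; the real f and g would come from the rainfall data:

```python
import math

def normal_pdf(x, mu, sigma):
    return math.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * math.sqrt(2 * math.pi))

def f(x):      # stand-in for f(x|b)
    return normal_pdf(x, 2.0, 1.0)

def g(x):      # stand-in for g(x|c)
    return normal_pdf(x, 4.0, 1.5)

def h_avg(x):  # equal-weight mixture: 0.5*f + 0.5*g
    return 0.5 * f(x) + 0.5 * g(x)

# An equal-weight mixture of two valid PDFs on the same domain integrates to 1,
# so it satisfies the Kolmogorov axioms. Check numerically with the trapezoid rule:
xs = [-10.0 + 0.01 * i for i in range(3001)]   # grid from -10 to 20
area = sum(0.01 * (h_avg(a) + h_avg(b)) / 2.0 for a, b in zip(xs, xs[1:]))
```

Whether equal weights (rather than, say, weights reflecting how informative each button is) are appropriate is exactly the interpretation question raised above.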
 
chiro said:

Thank you Chiro,

I apologize for the lack of clarity. I mean the following cases (I am not sure of the proper notation): f(x|B) means the distribution of x given B and (C or not C); g(x|C) means the distribution of x given C and (B or not B); h(x|B,C) means the distribution of x given B and C.

So f(x|B) is the PDF of x over (C or not C) and B.

In my original analogy, this would be the distribution of rainfall x when B is definitely pressed and C may or may not be pressed. The probability of C being pressed is assumed to be independent of B, p(C|B) = p(C), and vice versa, p(B|C) = p(B).

I can relate this to a more realistic example where B and C are not buttons but distinct weather patterns: B represents a distinct pattern over Greenland, C represents a distinct pattern over the Atlantic Ocean, and x is rainfall over England. I have enough data to reasonably determine f(x|B) and g(x|C), but I would like to infer something about h(x|B,C). Unfortunately, I have a very small sample of data where both B and C have occurred simultaneously; the probabilities of B and C occurring are relatively small, p(B) ≈ 0.05 and p(C) ≈ 0.05.

I hope that helps. I appreciate your feedback.
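Under this reading, the law of total probability connects the quantities, and shows why f and g alone do not determine h. A numeric sketch with entirely made-up distributions (the component q(x|B, not C) is hypothetical and is precisely the missing piece):

```python
import math

# Under the reading above, f(x|B) marginalizes over C. With p(C|B) = p(C),
# the law of total probability gives
#     f(x|B) = p(C) * h(x|B,C) + (1 - p(C)) * q(x | B, not C)
# where q is the distribution of x given B but not C. So f and g alone do not
# pin down h unless q (or a model for it) is also known.
def normal_pdf(x, mu, sigma):
    return math.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * math.sqrt(2 * math.pi))

p_C = 0.05                               # the poster's p(C)
h = lambda x: normal_pdf(x, 6.0, 1.0)    # hypothetical "B and C" distribution
q = lambda x: normal_pdf(x, 2.0, 1.0)    # hypothetical "B and not C" distribution

# The observable f(x|B) is then the mixture:
f_given_B = lambda x: p_C * h(x) + (1 - p_C) * q(x)

# Knowing q and p(C), h can be recovered by rearranging the mixture:
h_recovered = lambda x: (f_given_B(x) - (1 - p_C) * q(x)) / p_C

err = max(abs(h(x) - h_recovered(x)) for x in [0.1 * i for i in range(101)])
```

With p(C) as small as 0.05, h contributes only 5% of the mass of f(x|B), which is why estimating h directly from the f and g data is so fragile.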
 
Whenry said:

I misunderstood what B and C were referring to: it seems that these are three different events with clear and distinct meanings, which is what you need.

If you want to infer an entire distribution, that is a little more complicated than making an inference on, say, a mean or a variance.

I recommend you look into something along the lines of a Markov-Chain Monte-Carlo (MCMC) scheme in the Bayesian setting. Bayesian statistics is very useful, especially in the context of not having a lot of data.

There is a program called WinBUGS:

http://www.mrc-bsu.cam.ac.uk/bugs/

It can generate posterior distributions from given priors, likelihoods, and observed data, and from those you can obtain means, variances, and so on.

The key thing, of course, is specifying the model, and you will need to understand Bayesian statistics and the MCMC method.

Have you had experience with this kind of thing before? Have you been exposed to Bayesian inference?
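As a toy illustration of what such tools automate, a bare-bones random-walk Metropolis sampler can infer a posterior for, say, a mean rainfall level from only a handful of "both B and C" observations. The three data points, the Normal(mu, 1) likelihood, and the Normal(3, 2) prior are all invented for this sketch:

```python
import math
import random

# Minimal random-walk Metropolis sampler: a toy version of what WinBUGS-style
# MCMC automates. Data, likelihood, and prior are invented for illustration.
random.seed(0)
data = [4.1, 5.3, 3.8]                       # hypothetical rainfall on B-and-C days

def log_post(mu):
    lp = -0.5 * ((mu - 3.0) / 2.0) ** 2      # log prior: mu ~ Normal(3, 2)
    lp += sum(-0.5 * (x - mu) ** 2 for x in data)  # log likelihood: x_i ~ Normal(mu, 1)
    return lp

mu, samples = 3.0, []
for step in range(20000):
    prop = mu + random.gauss(0.0, 0.5)       # propose a nearby value
    if random.random() < math.exp(min(0.0, log_post(prop) - log_post(mu))):
        mu = prop                            # accept with Metropolis probability
    if step >= 5000:                         # discard burn-in, keep the rest
        samples.append(mu)

post_mean = sum(samples) / len(samples)      # posterior mean estimate for mu
```

With so few observations, the prior pulls the posterior mean noticeably below the sample mean, which is exactly the behaviour (and the risk) of Bayesian inference on small samples that chiro mentions.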
 
chiro said:

I do have experience coding naive Bayes binomial classifiers, but that is where my experience ends. I certainly have no experience using Bayesian inference to arrive at PDFs of continuous variables, such as x in the example above, nor do I have any experience with MCMC.

I will need to find some crash course with examples, as I need to make some quick decisions about how to find a reasonable estimate of h(x|b,c).

Any more pointers or advice would be very appreciated.

thank you,

Will
 
Whenry said:

I guess the only advice would be to know the limits of your data and the other assumptions that will be used to generate simulated distributions with MCMC.

Understanding the limitations of your prior, and how you specify it, will also be important, as will the consequences of the choice of prior, especially with few data points.

This kind of thing, though, is really application- and domain-specific, and you ultimately have the expert knowledge that I don't have a chance of having.
 
chiro said:

Thank you chiro, I appreciate your feedback. I have been doing some investigating into Bayesian inference, and it seems that I will need some data points within the distribution h(x|b,c) in order to infer the parameters of the distribution. Unfortunately, I will have very few to none of these data points, especially considering that the full application will involve more conditions than only b and c, i.e. h(x|b,c,d,e,f,...). I think the best strategy may be to discretize the random variable x into categories, e.g. (in the example above) "0-2 inches of rain", "2-5 inches of rain", "5-7 inches of rain". Then I can use a multinomial naive Bayes network to model the relative probabilities of each category, and fit a distribution to that (?).
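This discretize-and-combine idea can be sketched as follows. Every probability below is invented, and the combining step rests on the naive-Bayes assumption that B and C are conditionally independent given the rainfall category:

```python
# Bin the rainfall, estimate binned conditionals from the data, then combine
# them naive-Bayes style: assuming p(B, C | bin) = p(B | bin) * p(C | bin),
# Bayes' rule gives
#     p(bin | B, C)  proportional to  p(bin|B) * p(bin|C) / p(bin).
# All numbers here are invented for illustration.
bins = ["0-2 in", "2-5 in", "5-7 in"]
p_bin         = [0.5, 0.3, 0.2]    # marginal p(bin) from the full record
p_bin_given_B = [0.2, 0.5, 0.3]    # binned stand-in for f(x|B)
p_bin_given_C = [0.1, 0.4, 0.5]    # binned stand-in for g(x|C)

unnorm = [pb * pc / p for pb, pc, p in zip(p_bin_given_B, p_bin_given_C, p_bin)]
total = sum(unnorm)
p_bin_given_BC = [u / total for u in unnorm]   # normalized binned estimate of h
```

The appeal of this scheme is that it only needs the marginal bin frequencies and the per-condition binned conditionals, which the poster says are well estimated; it sidesteps the scarce joint B-and-C data, at the cost of assuming the conditional independence above.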
 
Whenry said:

Hi Whenry,

If you don't have an underlying model for h, or a theoretical background that helps you guess its behavior, or any other information about it, then you are left with your little data, and that's all you have.

The way you express the problem makes it seem as if f and g should tell you how h behaves, but it is you who has to assess, with your experience in the field, what that relationship is. If there are no grounds for any relationship among f, g, h, B, or C, then you simply need more data.
 
