Combining Conditional Probability Distributions

In summary: f(x|B) means the distribution of x given B and (C or not C); g(x|C) means the distribution of x given C and (B or not B); h(x|B,C) means the distribution of x given B and C. When the button B is pressed, there is one distribution of rainfall the next day, x; when the button C is pressed, there is another. The thread asks how to estimate h(x|B,C), the distribution when both buttons are pressed.
  • #1
Whenry
Hi all,

My question is the following. Let's say I have two probability distributions;

[tex]f(x|b)\,g(x|c)[/tex]

b and c are discrete events while x is a continuous variable, i.e. when the button b is pressed there is some distribution for the amount of rainfall the next day, x. When the button c is pressed there is a different distribution of rainfall the next day, x. Are there any strategies for estimating the distribution of rainfall if both buttons are pressed,

[tex]h(x|b,c)\,?[/tex]

And, what assumptions do those strategies rest on?

Thank you in advance,

Will
 
  • #2
Whenry said:
Hi all,

My question is the following. Let's say I have two probability distributions;

[tex]f(x|b)\,g(x|c)[/tex]

b and c are discrete events while x is a continuous variable, i.e. when the button b is pressed there is some distribution for the amount of rainfall the next day, x. When the button c is pressed there is a different distribution of rainfall the next day, x. Are there any strategies for estimating the distribution of rainfall if both buttons are pressed,

[tex]h(x|b,c)\,?[/tex]

And, what assumptions do those strategies rest on?

Thank you in advance,

Will

Hey Whenry and welcome to the forums.

The subtlety with this kind of problem is one of interpretation and it boils down to the atomicity of events.

In probability, we usually break things down in a way that lets us identify events that cannot be broken down any further (atomic) and that are completely disjoint from every other event.

In your situation, you have to interpret what these atomic events refer to, because the first formulation implies that your 'b' and 'c' events are disjoint; mentioning that 'both' can be pressed breaks that assumption if the double keypress corresponds to a real event.

One way you can deal with this is to take a distribution that is an 'average' of the two distributions, which will be a proper distribution with respect to the Kolmogorov axioms if both distributions are defined on the same domain and are both valid PDFs. If they are not, then you need to account for this.
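That averaging idea can be sketched numerically. Everything below is hypothetical (the two densities are made-up normals, not from any data in this thread); the point is only that an equal-weight mixture of two valid PDFs on the same domain is itself a valid PDF:

```python
import numpy as np

def norm_pdf(x, mu, sigma):
    """Normal density, standing in for the two hypothetical rainfall PDFs."""
    return np.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * np.sqrt(2 * np.pi))

# Hypothetical rainfall grid and densities (all parameters invented)
x = np.linspace(0.0, 20.0, 2001)
dx = x[1] - x[0]
f = norm_pdf(x, 5.0, 1.5)   # distribution of next-day rainfall after button b
g = norm_pdf(x, 9.0, 2.0)   # distribution of next-day rainfall after button c

# Equal-weight mixture: a valid PDF whenever f and g share the same domain
h = 0.5 * f + 0.5 * g

# The mixture still integrates to (approximately) 1 over the domain
print((h * dx).sum())
```

The weights need not be 0.5/0.5; any non-negative weights summing to 1 give a valid mixture, and choosing them is exactly the interpretation question raised above.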

But again, it's more important that you consider what the events refer to rather than just trying to fudge things mathematically. If you want to consider three events (B only, C only, B and C) then this will need an interpretation. If you want to consider only two (B only, C only) then this will have an interpretation.

Without an interpretation and a subsequent understanding thereof, you have a mathematical model that really has no basis for understanding.
 
  • #3
chiro said:
Hey Whenry and welcome to the forums.

The subtlety with this kind of problem is one of interpretation and it boils down to the atomicity of events.

In probability, we usually break things down in a way that lets us identify events that cannot be broken down any further (atomic) and that are completely disjoint from every other event.

In your situation, you have to interpret what these atomic events refer to, because the first formulation implies that your 'b' and 'c' events are disjoint; mentioning that 'both' can be pressed breaks that assumption if the double keypress corresponds to a real event.

One way you can deal with this is to take a distribution that is an 'average' of the two distributions, which will be a proper distribution with respect to the Kolmogorov axioms if both distributions are defined on the same domain and are both valid PDFs. If they are not, then you need to account for this.

But again, it's more important that you consider what the events refer to rather than just trying to fudge things mathematically. If you want to consider three events (B only, C only, B and C) then this will need an interpretation. If you want to consider only two (B only, C only) then this will have an interpretation.

Without an interpretation and a subsequent understanding thereof, you have a mathematical model that really has no basis for understanding.

Thank you Chiro,

I apologize for the lack of clarity. I mean the following cases (I am not sure of the proper notation): [itex] f(x|B) [/itex] means the distribution of x given B and (C or not C). [itex] g(x|C) [/itex] means the distribution of x given C and (B or not B). [itex] h(x|B,C) [/itex] means the distribution of x given B and C.

So, [itex] f(x|B) [/itex] is the PDF of x over (C or not C) and B.

In my original analogy, this would be the distribution of rainfall x when B is definitely pressed and C may or may not be pressed. The probability of C being pressed is assumed to be independent of B, [itex] p(C|B) = p(C) [/itex], and vice versa, [itex] p(B|C) = p(B) [/itex].

I can relate this to a more realistic example where B and C are not buttons but distinct weather patterns, i.e. B represents a distinct pattern over Greenland, C represents a distinct pattern over the Atlantic Ocean, and x is rainfall over England. I have enough data to reasonably determine [itex] f(x|B)[/itex] and [itex] g(x|C)[/itex], but I would like to infer something about [itex] h(x|B,C)[/itex]. Unfortunately, I have a very small sample of data where both B and C have occurred simultaneously. The probabilities of B and C occurring are relatively small: [itex]p(B)≈0.05[/itex] and [itex]p(C)≈0.05[/itex].

I hope that helps. I appreciate your feedback.
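For what it's worth, one common way to formalize this setup (an extra assumption, beyond anything stated in the thread) is to take B and C as conditionally independent given x; Bayes' theorem then gives h(x|B,C) ∝ f(x|B) g(x|C) / p(x), where p(x) is the marginal (climatological) rainfall distribution. A numerical sketch with entirely invented densities:

```python
import numpy as np

def norm_pdf(x, mu, sigma):
    """Normal density used as a stand-in for the hypothetical rainfall PDFs."""
    return np.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * np.sqrt(2 * np.pi))

# Hypothetical densities on a rainfall grid (all parameters invented)
x = np.linspace(0.0, 30.0, 3001)
dx = x[1] - x[0]
prior = norm_pdf(x, 6.0, 3.0)   # p(x): overall rainfall distribution
f = norm_pdf(x, 8.0, 2.0)       # f(x|B)
g = norm_pdf(x, 10.0, 2.5)      # g(x|C)

# If B and C are conditionally independent given x, Bayes' theorem gives
#   h(x|B,C) ∝ f(x|B) g(x|C) / p(x),
# so form the product ratio on the grid and normalize numerically.
h = f * g / prior
h /= (h * dx).sum()

mode = x[np.argmax(h)]
print(mode)   # mode of the combined distribution
```

Whether conditional independence given x is defensible for two weather patterns is exactly the domain question chiro raises below; this is a modeling choice, not a mathematical consequence of having f and g.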
 
  • #4
Whenry said:
Thank you Chiro,

I apologize for the lack of clarity. I mean the following cases (I am not sure of the proper notation): [itex] f(x|B) [/itex] means the distribution of x given B and (C or not C). [itex] g(x|C) [/itex] means the distribution of x given C and (B or not B). [itex] h(x|B,C) [/itex] means the distribution of x given B and C.

So, [itex] f(x|B) [/itex] is the PDF of x over (C or not C) and B.

In my original analogy, this would be the distribution of rainfall x when B is definitely pressed and C may or may not be pressed. The probability of C being pressed is assumed to be independent of B, [itex] p(C|B) = p(C) [/itex], and vice versa, [itex] p(B|C) = p(B) [/itex].

I can relate this to a more realistic example where B and C are not buttons but distinct weather patterns, i.e. B represents a distinct pattern over Greenland, C represents a distinct pattern over the Atlantic Ocean, and x is rainfall over England. I have enough data to reasonably determine [itex] f(x|B)[/itex] and [itex] g(x|C)[/itex], but I would like to infer something about [itex] h(x|B,C)[/itex]. Unfortunately, I have a very small sample of data where both B and C have occurred simultaneously. The probabilities of B and C occurring are relatively small: [itex]p(B)≈0.05[/itex] and [itex]p(C)≈0.05[/itex].

I hope that helps. I appreciate your feedback.

I misunderstood what B and C were referring to: it seems that these are three different events with clear and distinct meanings which is what you need.

If you want to infer something like a whole distribution, this is a little more complicated than making an inference on, say, a mean, a group of means, or a variance.

I recommend you look into something along the lines of a Markov chain Monte Carlo (MCMC) scheme in the Bayesian setting. Bayesian statistics is very useful, especially when you do not have a lot of data.

There is a program called WinBUGS:

http://www.mrc-bsu.cam.ac.uk/bugs/

This can generate distributions based on given priors, likelihoods, and specific data, and from these it produces means, variances, and so on.

The key thing of course is specifying the model parameters and you will need to understand Bayesian statistics and the MCMC method.
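As a taste of what such a scheme does, here is a minimal random-walk Metropolis sampler (a hand-rolled sketch, not WinBUGS) targeting the posterior of a mean rainfall μ under a normal prior, with a small invented data set and a known observation spread:

```python
import math
import random

random.seed(1)

# Hypothetical setup: a few rainfall observations from days when both
# patterns occurred; we infer the posterior of the mean rainfall mu.
data = [7.2, 9.1, 8.4]
sigma = 2.0           # assumed-known observation spread
mu0, tau = 6.0, 4.0   # normal prior on mu

def log_posterior(mu):
    lp = -0.5 * ((mu - mu0) / tau) ** 2                        # log prior
    lp += sum(-0.5 * ((xi - mu) / sigma) ** 2 for xi in data)  # log likelihood
    return lp

# Random-walk Metropolis: propose, then accept with the Metropolis ratio
mu, samples = 8.0, []
for _ in range(20000):
    proposal = mu + random.gauss(0.0, 0.8)
    if math.log(random.random()) < log_posterior(proposal) - log_posterior(mu):
        mu = proposal
    samples.append(mu)

# Discard burn-in, then summarize the chain
posterior_mean = sum(samples[5000:]) / len(samples[5000:])
print(posterior_mean)   # conjugate answer for this prior/data is about 8.06
```

This toy model is conjugate, so MCMC is overkill here; the point is only to show the propose/accept loop that WinBUGS automates for models where no closed form exists.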

Have you had experience with this kind of thing before? Have you been exposed to Bayesian inference?
 
  • #5
chiro said:
I misunderstood what B and C were referring to: it seems that these are three different events with clear and distinct meanings which is what you need.

If you want to infer something like a whole distribution, this is a little more complicated than making an inference on, say, a mean, a group of means, or a variance.

I recommend you look into something along the lines of a Markov chain Monte Carlo (MCMC) scheme in the Bayesian setting. Bayesian statistics is very useful, especially when you do not have a lot of data.

There is a program called WinBUGS:

http://www.mrc-bsu.cam.ac.uk/bugs/

This can generate distributions based on given priors, likelihoods, and specific data, and from these it produces means, variances, and so on.

The key thing of course is specifying the model parameters and you will need to understand Bayesian statistics and the MCMC method.

Have you had experience with this kind of thing before? Have you been exposed to Bayesian inference?

I do have experience coding naive Bayes binomial classifiers, but that is where my experience ends. I certainly have no experience using Bayesian inference to arrive at PDFs of continuous variables, as x is in the above example. Nor do I have experience with MCMC.

I will need to find some crash course with examples, as I need to make some quick decisions about how to find a reasonable estimate of [itex] h(x|b,c) [/itex].

Any more pointers or advice would be very appreciated.

thank you,

Will
 
  • #6
Whenry said:
I do have experience coding naive Bayes binomial classifiers, but that is where my experience ends. I certainly have no experience using Bayesian inference to arrive at PDFs of continuous variables, as x is in the above example. Nor do I have experience with MCMC.

I will need to find some crash course with examples, as I need to make some quick decisions about how to find a reasonable estimate of [itex] h(x|b,c) [/itex].

Any more pointers or advice would be very appreciated.

thank you,

Will

I guess the only advice would be to know the limits of your data and the other assumptions that will be used to generate simulated distributions using MCMC.

Understanding the limitations of your prior, and how you describe it, will be important, as will the consequences of using priors, especially with few data points.

This kind of thing though is really application and domain specific, and you are ultimately going to have the expert knowledge that I don't have a chance of having.
 
  • #7
chiro said:
I guess the only advice would be to know the limits of your data and the other assumptions that will be used to generate simulated distributions using MCMC.

Understanding the limitations of your prior, and how you describe it, will be important, as will the consequences of using priors, especially with few data points.

This kind of thing though is really application and domain specific, and you are ultimately going to have the expert knowledge that I don't have a chance of having.

Thank you chiro, I appreciate your feedback. I have been doing some investigating into Bayesian inference, and it seems that I will have to have some data points within the distribution [itex] h(x|b,c) [/itex] in order to infer the parameters of the distribution. Unfortunately, I will have very few to none of these data points, especially considering that the full application will involve more conditions than only b and c, i.e. [itex] h(x|b,c,d,e,f,...) [/itex]. I think the best strategy may be to discretize the random variable x into categories, e.g. (in our above example) "0-2 inches of rain", "2-5 inches of rain", "5-7 inches of rain"; then I can use a multinomial naive Bayes network to model the relative probabilities of each category, and fit a distribution to that (?).
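That binned scheme can be sketched directly. The naive-Bayes combination p(k|B,C) ∝ p(k|B) p(k|C) / p(k) rests on the same conditional-independence assumption as the classifier mentioned above; every number below is invented for illustration:

```python
# Hypothetical sketch of discretize-and-combine: rainfall is binned into
# categories k, and per-condition category probabilities are combined
# naive-Bayes style:  p(k|B,C) ∝ p(k|B) p(k|C) / p(k).

bins = ["0-2 inches", "2-5 inches", "5-7 inches"]
p_k   = [0.5, 0.3, 0.2]   # marginal category probabilities p(k), invented
p_k_B = [0.2, 0.4, 0.4]   # p(k|B), as if estimated from the B-only data
p_k_C = [0.3, 0.3, 0.4]   # p(k|C), as if estimated from the C-only data

# Product ratio per category, then normalize so the result sums to 1
unnorm = [b * c / m for b, c, m in zip(p_k_B, p_k_C, p_k)]
total = sum(unnorm)
p_k_BC = [u / total for u in unnorm]

for name, p in zip(bins, p_k_BC):
    print(f"{name}: {p:.3f}")
```

With more conditions d, e, f, ..., the same formula just gains more factors in the numerator and higher powers of p(k) in the denominator, which is why the conditional-independence assumption becomes increasingly load-bearing as conditions are added.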
 
  • #8
Whenry said:
Thank you Chiro,

I apologize for the lack of clarity. I mean the following cases (I am not sure of the proper notation): [itex] f(x|B) [/itex] means the distribution of x given B and (C or not C). [itex] g(x|C) [/itex] means the distribution of x given C and (B or not B). [itex] h(x|B,C) [/itex] means the distribution of x given B and C.

So, [itex] f(x|B) [/itex] is the PDF of x over (C or not C) and B.

In my original analogy, this would be the distribution of rainfall x when B is definitely pressed and C may or may not be pressed. The probability of C being pressed is assumed to be independent of B, [itex] p(C|B) = p(C) [/itex], and vice versa, [itex] p(B|C) = p(B) [/itex].

I can relate this to a more realistic example where B and C are not buttons but distinct weather patterns, i.e. B represents a distinct pattern over Greenland, C represents a distinct pattern over the Atlantic Ocean, and x is rainfall over England. I have enough data to reasonably determine [itex] f(x|B)[/itex] and [itex] g(x|C)[/itex], but I would like to infer something about [itex] h(x|B,C)[/itex]. Unfortunately, I have a very small sample of data where both B and C have occurred simultaneously. The probabilities of B and C occurring are relatively small: [itex]p(B)≈0.05[/itex] and [itex]p(C)≈0.05[/itex].

I hope that helps. I appreciate your feedback.

Hi Whenry,

If you don't have an underlying model for h, a theoretical background that helps you guess its behavior, or any other information about it, then you are left with your little data, and that's all you have.

The way you express the problem makes it seem as if f and g should tell you how h behaves, but it is you who has to assess, with the experience you have in the field, what that relationship is. If there are no grounds for any relationship among f, g, h, B, or C, then you simply need more data.
 

1. What is the concept of combining conditional probability distributions?

Combining conditional probability distributions is the process of using multiple conditional probability distributions to calculate the probability of an event occurring. It involves understanding the relationship between different variables and their impact on the final outcome.

2. Why is it important to combine conditional probability distributions?

Combining conditional probability distributions allows for a more accurate prediction of the likelihood of an event. By taking into account multiple factors, the resulting probability is more comprehensive and reliable.

3. How do you combine conditional probability distributions?

To combine conditional probability distributions, you multiply the individual probabilities together. This is known as the product rule and rests on the fact that the probability of two independent events occurring together is the product of their individual probabilities.

4. What are some common applications of combining conditional probability distributions?

Combining conditional probability distributions is commonly used in fields such as finance, economics, and machine learning. It is also used in risk assessment and prediction models in various industries.

5. Are there any limitations to combining conditional probability distributions?

One limitation of combining conditional probability distributions is that it assumes the events are independent, which may not always be the case. Additionally, it can become more complex and difficult to calculate with a large number of variables.
