On concave functions over spaces of probability distributions

In summary, the thread argues that the function H[X] must be concave on the space of probability distributions, and that the joint density obtained by keeping p(y|x) fixed must be separable, in order for H[X|Y] to also be concave in this case.
  • #1
bayesian
Given two (dependent) random variables [itex]X[/itex] and [itex]Y[/itex] with joint PDF [itex]p(x,y)[/itex] [itex]=p(x|y)p(y)[/itex] [itex]=p(y|x)p(x)[/itex], let [itex]H[X][/itex] be a real-valued concave function of [itex]p(x)[/itex], and let [itex]H[X|Y][/itex] be the expectation of [itex]H[/itex] applied to [itex]p(x|y)[/itex], taken with respect to [itex]p(y)[/itex].

Examples of possible functions [itex]H[/itex] include the entropy of [itex]X[/itex], or its variance.

The concavity of [itex]H[/itex] implies that [itex]H[X]-H[X|Y]≥0[/itex] (through Jensen's inequality).
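
One way to see this, under the setup above: the marginal [itex]p(x)[/itex] is a mixture of the conditionals [itex]p(x|y)[/itex] with weights [itex]p(y)[/itex], so concavity of [itex]H[/itex] gives

[tex]H[X] \;=\; H\!\left[\int p(\cdot|y)\,p(y)\,dy\right] \;\ge\; \int H\!\left[p(\cdot|y)\right]p(y)\,dy \;=\; H[X|Y].[/tex]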

Question:

What additional conditions (if any) are imposed on [itex]H[/itex] if we further require that [itex]H[X]-H[X|Y][/itex] also be concave with respect to [itex]p(x)[/itex], with [itex]p(y|x)[/itex] held fixed?
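
For the special case where [itex]H[/itex] is the Shannon entropy, [itex]H[X]-H[X|Y][/itex] is the mutual information [itex]I(X;Y)[/itex], which is known to be concave in [itex]p(x)[/itex] for a fixed channel [itex]p(y|x)[/itex]. Below is a minimal numerical sketch (Python/NumPy) that checks this along a segment between two marginals; the names (objective, channel) and the randomly generated distributions are purely illustrative.

[code]
import numpy as np

def entropy(p):
    """Shannon entropy of a discrete distribution (natural log), skipping zeros."""
    p = np.asarray(p, dtype=float)
    nz = p > 0
    return -np.sum(p[nz] * np.log(p[nz]))

def objective(px, channel):
    """F(p_x) = H[X] - H[X|Y] for a fixed channel p(y|x).

    channel[i, j] = p(y=j | x=i); each row sums to 1.
    """
    joint = px[:, None] * channel            # p(x, y) = p(x) p(y|x)
    py = joint.sum(axis=0)                   # marginal p(y)
    h_x_given_y = 0.0                        # H[X|Y] = sum_y p(y) H(p(.|y))
    for j, pyj in enumerate(py):
        if pyj > 0:
            h_x_given_y += pyj * entropy(joint[:, j] / pyj)
    return entropy(px) - h_x_given_y

rng = np.random.default_rng(0)
nx, ny = 4, 3
channel = rng.dirichlet(np.ones(ny), size=nx)        # fixed p(y|x)
p, q = rng.dirichlet(np.ones(nx)), rng.dirichlet(np.ones(nx))

for lam in np.linspace(0.0, 1.0, 11):
    mix = lam * p + (1 - lam) * q
    lhs = objective(mix, channel)
    rhs = lam * objective(p, channel) + (1 - lam) * objective(q, channel)
    assert lhs >= rhs - 1e-10, "concavity violated"
print("H[X] - H[X|Y] behaved concavely in p(x) along this segment")
[/code]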
 
  • #2
Hey bayesian and welcome to the forums.

If you want to do a rigorous proof of this, then the way I would approach it is to show what needs to hold for concavity of h when h = f*g. You can use the definition of a concave function for this, simplify, and then collect terms.

We also know that p(x,y) and q(x,y) are separable: if p(y|x) is the same for the two distributions corresponding to your two cases (the first involving concavity in p(y), the second involving concavity in p(x) as well as p(y)), then p(x,y)/p(x) = q(x,y)/q(x), which rearranges to q(x)/p(x) = q(x,y)/p(x,y), which means both p(x,y) and q(x,y) are separable.

So once you show what is needed for concavity to hold, you can work out what conditions the joint distributions p(x,y) and q(x,y) must satisfy (or vice versa), and hopefully answer your question.

This is only a suggestion, but I think those steps would be more rigorous, although it probably would be complicated.
 
  • #3
chiro, I do not understand your suggestion: why should I consider products of concave functions? I cannot see how such products arise from the problem I stated.

Just to make things clear: in my problem, H is a function of the probability distribution p(x) (e.g. the variance) and concave with respect to the space of probability distributions.
 
  • #4
bayesian said:
chiro, I do not understand your suggestion: why should I consider products of concave functions? I cannot see how such products arise from the problem I stated.

Just to make things clear: in my problem, H is a function of the probability distribution p(x) (e.g. the variance) and concave with respect to the space of probability distributions.

By a "function of the probability distribution" you mean that H(X) could be V(X) where X is a random variable? Because if so V(X) is not a function but a number. I don't think yo mean that, but I am not sure what you mean either.
 
  • #5
bayesian said:
chiro, I do not understand your suggestion: why should I consider products of concave functions? I cannot see how such products arise from the problem I stated.

Just to make things clear: in my problem, H is a function of the probability distribution p(x) (e.g. the variance) and concave with respect to the space of probability distributions.

I'm a little confused, like viraltux, but if I understand correctly then H[X] will be a number, while H[X|Y] will be a function, and this will have to be concave. Is this right?

If this is concave and you end up getting H[X|Y] = Z(X) for some Y, then you want to show that this is concave as well. For H[X|Y], we get an integral of the form ∫ p(x|y) p(y) dy, and the result of this will be concave (a function of x).

Now you can show that, using your condition of keeping p(y|x) constant between changes, the joint density takes the form p(x,y) = b(x)c(y) [separable].

From this you can take b(x) outside the integral (which is only over y, as in the dy), and then this implies that b(x) is concave, which in turn implies that H[X|Y], as a function, is concave.
 

1. What is a concave function?

A concave function is a mathematical function whose graph lies on or above the line segment connecting any two points on the graph. In other words, the function "curves downward" and has a non-increasing rate of change.
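
In symbols, a function [itex]f[/itex] defined on a convex set (such as the set of probability distributions) is concave if, for all points [itex]p, q[/itex] in the set and all [itex]0 \le \lambda \le 1[/itex],

[tex]f(\lambda p + (1-\lambda) q) \;\ge\; \lambda f(p) + (1-\lambda) f(q).[/tex]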

2. How are concave functions used in spaces of probability distributions?

In spaces of probability distributions, concave functions such as the Shannon entropy are commonly used to quantify the uncertainty of a distribution. A closely related (but convex) quantity, the Kullback-Leibler divergence, measures the "distance" between two probability distributions and is useful in various statistical and machine learning applications.
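
For reference, the discrete forms of these two quantities are

[tex]H(p) = -\sum_x p(x)\log p(x), \qquad D_{\mathrm{KL}}(p\,\|\,q) = \sum_x p(x)\log\frac{p(x)}{q(x)}.[/tex]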

3. Can all functions over probability distributions be concave?

No, not all functions over probability distributions are concave. The Shannon entropy, commonly used in information theory, is concave, but its negative is convex, and other functionals (for example the sum of squared probabilities) are convex rather than concave; many others are neither.

4. How can concave functions be optimized over spaces of probability distributions?

Concave functions can be maximized over spaces of probability distributions using standard numerical optimization techniques, such as (projected or mirror) gradient ascent or Newton-type methods. These methods iteratively adjust the distribution while keeping it valid (non-negative entries summing to one).
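
As a minimal sketch of one such method, the following Python snippet assumes the goal is to maximize the Shannon entropy over the probability simplex using an exponentiated-gradient (mirror-ascent) update, which keeps the iterate a valid distribution at every step; the function names are illustrative, not a standard API.

[code]
import numpy as np

def entropy(p):
    nz = p > 0
    return -np.sum(p[nz] * np.log(p[nz]))

def entropy_grad(p, eps=1e-12):
    # d/dp_i [ -sum_j p_j log p_j ] = -(log p_i + 1)
    return -(np.log(p + eps) + 1.0)

def exponentiated_gradient_ascent(grad, p0, steps=200, eta=0.5):
    """Maximize a concave function over the probability simplex.

    The multiplicative update p <- p * exp(eta * grad(p)) / Z keeps the
    iterate non-negative and normalized at every step.
    """
    p = np.asarray(p0, dtype=float)
    for _ in range(steps):
        p = p * np.exp(eta * grad(p))
        p = p / p.sum()
    return p

p0 = np.array([0.7, 0.2, 0.05, 0.05])
p_star = exponentiated_gradient_ascent(entropy_grad, p0)
print(p_star)            # should approach the uniform distribution [0.25, ...]
print(entropy(p_star))   # should approach log(4) ≈ 1.386
[/code]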

5. What are some real-world applications of concave functions over spaces of probability distributions?

Concave functions over spaces of probability distributions have many real-world applications, such as in finance for portfolio optimization, in economics for utility theory, and in machine learning for clustering and classification tasks. They are also commonly used in data compression and information theory.
