Why do you need a convex/concave function to do a Legendre transform?

Click For Summary

Discussion Overview

The discussion centers around the necessity of convex or concave functions for performing a Legendre transform. Participants explore the implications of bijectivity in the context of the transformation and the relationship between the function and its derivative.

Discussion Character

  • Exploratory
  • Technical explanation
  • Debate/contested

Main Points Raised

  • Some participants propose that the Legendre transform requires a convex function to ensure that the inverse function is bijective, which is necessary for the transformation to be valid.
  • Others argue that without a convex or concave function, there may be points that do not map to any output, leading to a loss of information in the transformation.
  • A participant questions the necessity of maximization in the definition of the Legendre transform, suggesting that alternative forms could be defined without it, yet still face bijectivity issues.
  • Another participant emphasizes that the maximization aspect is crucial for establishing a unique mapping between points on the function and their corresponding tangent lines, which is essential for the transformation.
  • Some participants discuss the geometric interpretation of the Legendre transform, relating it to the selection of slopes and lines, and how this relates to the bijectivity of the functions involved.

Areas of Agreement / Disagreement

Participants express differing views on the necessity of convexity or concavity for the Legendre transform, with some supporting its importance while others question the definitions and assumptions involved. The discussion remains unresolved regarding the necessity of maximization in the transformation process.

Contextual Notes

Participants highlight the dependence on definitions and the implications of bijectivity in the context of the Legendre transform. There are unresolved questions about the role of maximization and the potential for alternative definitions of the transform.

Prologue
Messages
183
Reaction score
1
I have been trying to figure this out for a couple weeks now. Why does the Legendre transform require that the function be convex?

Is it because g(x) has to be solved to get x(g) and finding this inverse means g(x) should be bijective? (And if g is bijective then dg/dx will always be positive or always be negative [or perhaps one spot be zero] which means always convex (or always concave).)
 
Physics news on Phys.org
Prologue said:
I have been trying to figure this out for a couple weeks now. Why does the Legendre transform require that the function be convex?

Is it because g(x) has to be solved to get x(g) and finding this inverse means g(x) should be bijective? (And if g is bijective then dg/dx will always be positive or always be negative [or perhaps one spot be zero] which means always convex (or always concave).)

You need a concave (or convex) function or else you'll have points that map to nowhere!

The Legendre transformation is a special type of transformation where there is a relationship between the transform variable and the function variable [if your function is f(x), and the transform is h(p), then the relationship between the variables is df(x)/dx=p for a Legendre transform]. If you have two distinct points x such that df(x)/dx=p, then only one of them will get mapped to p, thus destroying this relationship.

You can still Legendre transform your function if it's not concave or convex, but then some points x will not participate in the transform at all! And if these x don't participate in the transform, then you can't possibly invert your transform as your transform doesn't know about those x. Therefore your transform will lose information and you don't want that.
 
Thank you for the reply! I think I know what you are saying but can you clarify what you mean by 'participate in the transform'.
 
Prologue said:
Thank you for the reply! I think I know what you are saying but can you clarify what you mean by 'participate in the transform'.

If you go to Wikipedia and type in Legendre transform, the Legendre transform of the function f(x) is given by:

h(p)=max{px-f(x)}

So what is the Legendre transform at the point p=2, i.e., h(2)?

Well it is the maximum value of 2*x-f(x).

So you plug in every single value of x into 2x-f(x), and find which x maximizes this expression. Say this x value is x=5. That means 2*5-f(5) is greater than 2*6-f(6), 2*7-f(7), etc.

So h(2)=2*5-f(5), and p=2 corresponds to the point x=5.

The question then becomes how to find the pairs (x,p) in general?

Well, the maximum occurs where dh/dx=0, or p-f'(x)=0, or p=f'(x).

So in the example above, 2=f'(5).

But what if f(x) has two different x such that 2=f'(x), then max{px-f(x)} will only select one of the x, the one that produces the maximum.

So say that both x=5 and x=-3 satisfy 2=f'(x) [i.e., 2=f'(5)=f'(-3)].

Then your Legendre transform at h(2) is only determined by one of them, whichever one has the greater expression for max{p*x-f(x)}. Say that 2*5-f(5) is greater than 2*(-3)-f(-3). That means h(2) is determined only by the point x=5. The point x=-3 does not contribute to h(2), and in fact it does not contribute to h(anything) since f'(-3)=2 so it can only contribute to h(2) and not say h(3). But h(2) was determined by x=5.

So really the point x=-3 has no say at all in the Legendre transform. If that is the case, then if you try to invert the Legendre transform h(p) to get f(x), then you can't possibly get f(-3).
 
Thanks for the detailed reply. I understand it well in the context of that specific definition of 'maximizing'. But the definition of the maximization seems a bit arbitrary to me, why define it that way? What if we define a new thing called the Legarden transform that is F(g)=gx(g)-f(x(g)), no maximization? Don't we still run into a problem with bijectivity of the g(x),x(g) functions? It seems to me that the convexivity stems from the assertion that you must attain each transform in terms of the correct variables, x or g. So where does this maximization business come from?
 
Prologue said:
Thanks for the detailed reply. I understand it well in the context of that specific definition of 'maximizing'. But the definition of the maximization seems a bit arbitrary to me, why define it that way? What if we define a new thing called the Legarden transform that is F(g)=gx(g)-f(x(g)), no maximization? Don't we still run into a problem with bijectivity of the g(x),x(g) functions? It seems to me that the convexivity stems from the assertion that you must attain each transform in terms of the correct variables, x or g. So where does this maximization business come from?

If you have the expression: px-f(x), then that doesn't tell you anything. Given an f(x), how do you find a new function?

px-f(x) is written in terms of 2 variables. You want it only in terms of one variable. The Legendre transform solves it by saying max{px-f(x)}. This is a function of one variable h(p).

If you want to define another transform without using max{}, that's perfectly fine. So you want to have a transform of this form: F(g)=gx(g)-f(x(g)). That's perfectly fine, but you need to define an expression for x(g). As you mentioned, if it's bijective, then that's good, because then it has a chance of being invertible, i.e., you can recover f(x) back given just F(g).

The reason for maximization is the Legendre transform wants to transform points into lines. That's a nice geometric intepretation you can find on Wikipedia. You had the variables x, which are points. And you had the transform variable p=f'(x), which is a slope or a line. In order to transform a point to a unique line, then the function has to be convex or concave, or else two points x will have the same slopes, but only one line can be selected. By having a function that is not convex or concave, you lose this bijective mapping between points on a function and tangent lines of the function.
 
So by selecting a slope g, you are in effect picking out a line mx+b, where g=m, x(g) and -f(x(g))=b. And the same thing for the inverse but with f->F, x->g. It still seems to me that the key lies in g(x) being bijective but I think what we are saying may be equivalent, so I'll leave it at that. Thanks for the insight.
 

Similar threads

  • · Replies 4 ·
Replies
4
Views
3K
  • · Replies 1 ·
Replies
1
Views
3K
  • · Replies 4 ·
Replies
4
Views
2K
  • · Replies 3 ·
Replies
3
Views
2K
  • · Replies 5 ·
Replies
5
Views
2K
Replies
8
Views
2K
  • · Replies 6 ·
Replies
6
Views
3K
  • · Replies 2 ·
Replies
2
Views
2K
  • · Replies 8 ·
Replies
8
Views
2K
  • · Replies 15 ·
Replies
15
Views
2K