Proof of convex conjugate identity

Click For Summary
The discussion centers on proving the conjugate of the function g(x) = f(Ax + b) as g*(y) = f*(A^{-T}y) - b^TA^{-T}y, where A is a nonsingular n x n matrix and b is in R^n. Participants explore the application of known identities from convex optimization, particularly focusing on the definitions of conjugate functions. A key point raised is the confusion regarding the notation A^{-T} and the dimensions of matrix A, which was clarified to be n x n. The original poster ultimately found a proof using the supremum definition, resolving their query. The thread concludes with the poster expressing satisfaction with the assistance received.
Zerox5f3759df

Homework Statement


Prove that the conjugate of ##g(x) = f(Ax + b)## is ## g^*(y) = f^*(A^{-T}y) - b^TA^{-T}y ## where A is nonsingular nXm matrix in R, and b is in ##R^n##.

Homework Equations


This is from chapter 3 of Boyd's Convex Optimization.

1. The conjugate function is defined as ## f^*(y) = \sup_{x\in dom f} (y^Tx - f(x))##

2. The differentiable conjugate is given as ## f^*(y) = x^{*T} \nabla f(x^*) - f(x^*)##

3. We also have that for arbitrary ##z \in R^n ## and ##y = \nabla f(z)## we have ## f^*(y) = z^T \nabla f(z) - f(z) ##

The Attempt at a Solution


This question/relation is stated right after explaining the differential conjugate relationships above, so I suspect I need to use these identities versus the definition using the supremum. However, I'm having trouble applying the known identities to the function ##g(x) = f(Ax + b)##. If the function was ##f(x) = Ax+b##, I'd calculate ##\nabla f(x)## as A, and plug that into (2) above.

I tried using (3) by letting ##z = Ax + b##. From this I have that ##y = \nabla f(z) = A## and then ## f^*(y) = z^T \nabla f(z) - f(z) ##. But expanding this out doesn't get me anything that looks promising.

Am I misunderstanding an identity here, or computing something wrong? Any pointers would be greatly appreciated.
 
Physics news on Phys.org
Zerox5f3759df said:

Homework Statement


Prove that the conjugate of ##g(x) = f(Ax + b)## is ## g^*(y) = f^*(A^{-T}y) - b^TA^{-T}y ## where A is nonsingular nXm matrix in R, and b is in ##R^n##.

Homework Equations


This is from chapter 3 of Boyd's Convex Optimization.

1. The conjugate function is defined as ## f^*(y) = \sup_{x\in dom f} (y^Tx - f(x))##

2. The differentiable conjugate is given as ## f^*(y) = x^{*T} \nabla f(x^*) - f(x^*)##

3. We also have that for arbitrary ##z \in R^n ## and ##y = \nabla f(z)## we have ## f^*(y) = z^T \nabla f(z) - f(z) ##

The Attempt at a Solution


This question/relation is stated right after explaining the differential conjugate relationships above, so I suspect I need to use these identities versus the definition using the supremum. However, I'm having trouble applying the known identities to the function ##g(x) = f(Ax + b)##. If the function was ##f(x) = Ax+b##, I'd calculate ##\nabla f(x)## as A, and plug that into (2) above.

I tried using (3) by letting ##z = Ax + b##. From this I have that ##y = \nabla f(z) = A## and then ## f^*(y) = z^T \nabla f(z) - f(z) ##. But expanding this out doesn't get me anything that looks promising.

Am I misunderstanding an identity here, or computing something wrong? Any pointers would be greatly appreciated.

How can an ##m \times n## matrix be nonsingular if ##m \neq n##? And if you happen to have ##m=n## and a nonsingular ##A##, what does the notation ##A^{-T}## mean?
 
  • Like
Likes Zerox5f3759df
Hello, and thanks for taking a look. The n x m is a typo, my apologies, it should be n x n. I do also have a proof, which actually does just use the supremum definition. Here it is

## g^*(y) = \sup_{x \in dom(g)} (y^Tx - g(x)) ##
## = \sup_{x \in dom(g)} (y^Tx - f(Ax + b)) ##
## = \sup_{x \in dom(g)} (y^Tx + y^TA^{-1}b - y^TA^{-1}b - f(Ax + b)) ##
## = \sup_{x \in dom(g)} (y^TA^{-1}(Ax + b) - f(Ax + b) - y^TA^{-1}b) ##
## = \sup_{x \in dom(g)} (y^TA^{-1}x - f(x) - y^TA^{-1}b) ##
## = f^*(A^{-T}y) - b^TA^{-T}y ##

As far as the notation ## A^{-T} ## that is the inverse of the transpose of A. I apologize on the notation and typo confusion. We are given questions pulled from all over, and I suspect there are oddities in the material due to that. Thanks so much for taking the time to review my question!
 
Hi Zerox5f3759df! Welcome to PF! :)

Do you still have a question?
 
Nope, I think I am good on this one. Thanks for checking!
 
Question: A clock's minute hand has length 4 and its hour hand has length 3. What is the distance between the tips at the moment when it is increasing most rapidly?(Putnam Exam Question) Answer: Making assumption that both the hands moves at constant angular velocities, the answer is ## \sqrt{7} .## But don't you think this assumption is somewhat doubtful and wrong?

Similar threads

Replies
8
Views
2K
  • · Replies 26 ·
Replies
26
Views
2K
  • · Replies 8 ·
Replies
8
Views
2K
  • · Replies 9 ·
Replies
9
Views
2K
Replies
2
Views
1K
  • · Replies 14 ·
Replies
14
Views
2K
Replies
2
Views
1K
Replies
5
Views
2K
  • · Replies 8 ·
Replies
8
Views
2K
  • · Replies 11 ·
Replies
11
Views
2K