Derivative (minimization) of matrix trace


Discussion Overview

The discussion revolves around finding the derivative of a function involving the trace of matrices, specifically the function f(X) = Tr(X'AX) - 2Tr(X'BC). Participants are exploring the process of deriving this function with respect to the matrix X, focusing on the mathematical steps involved in the differentiation and the implications for minimization.

Discussion Character

  • Technical explanation
  • Homework-related
  • Mathematical reasoning

Main Points Raised

  • One participant presents the function f(X) and expresses uncertainty about how to proceed with finding its derivative.
  • Another participant, having misread the second term as Tr(X'BX), uses the linearity of the trace to write f(X) = Tr(X' (A-2B) X) and differentiates it, obtaining a linear equation involving the symmetric matrix (A+A'-2B-2B').
  • A later reply seeks clarification on the derivation steps, specifically requesting a detailed breakdown of the first derivative of both terms in the function.
  • Another participant acknowledges a misunderstanding of the original function and provides a corrected derivative, f'(X) = (A+A')X - 2BC, suggesting it leads to a linear equation for solving.
  • One participant mentions the use of index notation for verifying the correctness of transposes in the derivation process.
  • Another participant admits to the homework nature of the question and expresses a desire for a rationale for the coding procedure related to the derivation.
  • One participant offers to share a PDF containing a detailed derivation using index notation for further assistance.

Areas of Agreement / Disagreement

Participants express varying levels of understanding regarding the derivation process, with some providing corrections to earlier claims. There is no consensus on a single approach or final answer, as participants are still exploring the derivation steps and their implications.

Contextual Notes

Some participants note the potential for confusion regarding the original function and its terms, as well as the importance of verifying transposes through index notation. The discussion reflects a range of assumptions and interpretations that have not been fully resolved.

onako
Given a function
f(X)= Tr(X'AX) - 2Tr(X'BC), with X' denoting matrix transpose, I'm supposed to find the expression used to minimize the function with respect to X. The derivatives should be used, but I'm not sure how to proceed.
Any help is appreciated.
 
Using linearity of the trace you get
$$f(X) = \operatorname{Tr}\left(X^t (A-2B) X\right)$$

Then you can take the matrix derivative using the product rule
$$\frac{\partial f(X)}{\partial X} = (A-2B)X + (A-2B)^t X = (A+A^t-2B-2B^t)X$$
This can be checked using index notation if you're not sure of how the derivatives work.

Finally, you have just a linear equation to solve in order to find the stationary points of f(X).
The columns of X must be null vectors of the symmetric matrix (A+A'-2B-2B').

Here's some Mathematica code that lets you check it for any dimension n.

Code:
In[1]:= (n=2;X=Array[x,{n,n}];A=Array[a,{n,n}];B=Array[b,{n,n}]);
In[2]:= Tr[X\[Transpose].A.X]-2Tr[X\[Transpose].B.X]==Tr[X\[Transpose].(A-2B).X]//Expand
Out[2]= True
In[3]:= D[X,{X}]==Array[Boole[#1==#3 && #2==#4]&,{n,n,n,n}]
Out[3]= True
In[4]:= D[Tr[X\[Transpose].A.X],{X}]==Array[D[Tr[X\[Transpose].A.X],x[##]]&,{n,n}]
Out[4]= True
In[5]:= D[Tr[X\[Transpose].A.X],{X}]== (A+A\[Transpose]).X//Expand
Out[5]= True

--------------------------------------
The LaTeX image generation wasn't working, so here are the first two equations:
f(X) = Tr(X' (A-2B) X)
f'(X) = (A-2B) X + (A-2B)' X = (A+A'-2B-2B') X
 
Thanks.
I'm not sure I understand this derivation.
Given f(X)= Tr(X'AX) - 2Tr(X'BC), X' denoting transpose of X,
I'm supposed to find the first derivative of f(X).
Could you go through the procedure step by step: first the derivative of Tr(X'AX), and then the derivative of 2Tr(X'BC)?
The final derivative should have A, B, C, and X.
 
Sorry about that, I misread your question as
f(X)= Tr(X'AX) - 2Tr(X'BX)

------

So, taking the derivative of your actual expression gives

f'(X)= (A+A')X - 2BC

Setting this to zero gives a linear equation you can solve for X: (A+A')X = 2BC.
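
For anyone who wants to check this the same way as the earlier Mathematica session, here is a minimal sketch. The shapes are an assumption (X and BC are n x m, A and B are n x n), the name Cm is a stand-in for C (C is a protected symbol in Mathematica), and the numeric matrices An, Bn, Cn are made up purely for illustration.

```mathematica
(* Assumed shapes: X and B.Cm are n x m; A and B are n x n. *)
(* Cm stands in for C, which is a protected symbol in Mathematica. *)
n = 2; m = 3;
X = Array[x, {n, m}]; A = Array[a, {n, n}];
B = Array[b, {n, n}]; Cm = Array[c, {n, m}];

f = Tr[Transpose[X].A.X] - 2 Tr[Transpose[X].B.Cm];

(* Symbolic check of the gradient; this should evaluate to True. *)
D[f, {X}] == (A + Transpose[A]).X - 2 B.Cm // Expand

(* Numeric sketch with made-up data: solve (A + A') X = 2 B C for X. *)
An = {{2, 1}, {0, 3}}; Bn = {{1, 0}, {1, 1}}; Cn = {{1, 2, 0}, {0, 1, 1}};
Xn = LinearSolve[An + Transpose[An], 2 Bn.Cn]
```

The solution of the linear equation is only a stationary point in general. It is a minimizer when A+A' is positive semidefinite, since f is then convex in X; the made-up An above was chosen so that An+An' is positive definite.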

------

The best (I imagine that's subjective) way to be sure that you got all of the transposes correct is to use index notation. E.g. https://www.physicsforums.com/showthread.php?t=198712. Then you basically just take derivatives wrt the components, instead of the whole matrix.

I've written up a detailed derivation, but since I have a suspicion that this is a homework question, and you haven't shown any working so far, I might hold off posting it.
Read the notes I linked to and have a go at proving the result. Post what you get here for either congratulations or some more help!
 
Thanks a lot. I need a rationale for the procedure I'm supposed to code. So, in a sense, this (as everything) is a 'homework' question. But, for real homeworks, I'm a bit too old. :)
 
Fair enough!

I've attached a pdf that contains a derivation using index notation.
If you have any questions, just ask.
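
The attachment itself is not reproduced here. As a rough sketch of how an index-notation argument can go (not a copy of the PDF; the index names i, j, k, p, q are chosen here for illustration), write both traces in components:

$$\operatorname{Tr}(X^t A X) = \sum_{i,j,k} X_{ji} A_{jk} X_{ki}, \qquad \operatorname{Tr}(X^t B C) = \sum_{i,j} X_{ji} (BC)_{ji}$$

Differentiate with respect to a single component using $\partial X_{ji} / \partial X_{pq} = \delta_{jp}\delta_{iq}$:

$$\frac{\partial}{\partial X_{pq}} \operatorname{Tr}(X^t A X) = \sum_{k} A_{pk} X_{kq} + \sum_{j} A_{jp} X_{jq} = (AX)_{pq} + (A^t X)_{pq}$$

$$\frac{\partial}{\partial X_{pq}} \operatorname{Tr}(X^t B C) = (BC)_{pq}$$

Putting the two pieces together gives

$$\frac{\partial f}{\partial X_{pq}} = \left((A + A^t) X - 2BC\right)_{pq}$$

which agrees with f'(X) = (A+A')X - 2BC.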
 
