Derivative (minimization) of matrix trace

In summary: the function to minimize is f(X) = Tr(X'AX) - 2Tr(X'BC), which is quadratic (not linear) in X. Its derivative with respect to X is (A+A')X - 2BC, so the stationary points satisfy the linear equation (A+A')X = 2BC.
  • #1
onako
Given a function
f(X)= Tr(X'AX) - 2Tr(X'BC), with X' denoting the matrix transpose, I'm supposed to find the expression used to minimize the function with respect to X. The derivatives should be used, but I'm not sure how to proceed.
Any help is appreciated.
 
  • #2
Using linearity of the trace you get
[TEX]
f(X) = Tr(X^t (A-2B) X)
[/TEX]

Then you can take the matrix derivative using the chain rule
[TEX]
\frac{\partial f(X)}{\partial X} = (A-2B)X + (A-2B)^t X = (A+A^t-2B-2B^t)X
[/TEX]
This can be checked using index notation if you're not sure of how the derivatives work.

Finally, you have just a linear equation to solve in order to find the stationary points of f(X).
The columns of X must be null vectors of the symmetric matrix (A+A'-2B-2B').

Here's some Mathematica code that lets you check it for any dimension n.

Code:
In[1]:= (n=2;X=Array[x,{n,n}];A=Array[a,{n,n}];B=Array[b,{n,n}]);
In[2]:= Tr[X\[Transpose].A.X]-2Tr[X\[Transpose].B.X]==Tr[X\[Transpose].(A-2B).X]//Expand
Out[2]= True
In[3]:= D[X,{X}]==Array[Boole[#1==#3 && #2==#4]&,{n,n,n,n}]
Out[3]= True
In[4]:= D[Tr[X\[Transpose].A.X],{X}]==Array[D[Tr[X\[Transpose].A.X],x[##]]&,{n,n}]
Out[4]= True
In[5]:= D[Tr[X\[Transpose].A.X],{X}]== (A+A\[Transpose]).X//Expand
Out[5]= True

--------------------------------------
The LaTeX image generation wasn't working, so here are the first two equations:
f(X) = Tr(X' (A-2B) X)
f'(X) = (A-2B) X + (A-2B)' X = (A+A'-2B-2B') X
 
  • #3
Thanks.
I'm not sure I understand this derivation.
Given f(X)= Tr(X'AX) - 2Tr(X'BC), X' denoting transpose of X,
I'm supposed to find the first derivative of f(X).
Could you go through the procedure step by step: first the derivative of Tr(X'AX), and then the derivative of 2Tr(X'BC)?
The final derivative should have A, B, C, and X.
 
  • #4
Sorry about that, I misread your question as
f(X)= Tr(X'AX) - 2Tr(X'BX)

------

So, taking the derivative of your actual expression gives

f'(X)= (A+A')X - 2BC

Setting f'(X) = 0 gives the linear equation (A+A')X = 2BC, which you can solve for X.
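
A quick way to gain confidence in the derivative above is to compare it with finite differences. The following is a minimal Python sketch and not part of the original post. It uses only the standard library; the helper names (matmul, grad_formula, grad_numeric, random_matrix) and the matrix sizes n, m, p are assumptions chosen for illustration, with X of size n x m, A of size n x n, B of size n x p and C of size p x m.

```python
import random

def transpose(P):
    return [list(col) for col in zip(*P)]

def matmul(P, Q):
    # Plain triple-loop matrix product; len(Q) is the shared inner dimension.
    return [[sum(P[i][k] * Q[k][j] for k in range(len(Q)))
             for j in range(len(Q[0]))] for i in range(len(P))]

def trace(P):
    return sum(P[i][i] for i in range(len(P)))

def f(X, A, B, C):
    # f(X) = Tr(X'AX) - 2 Tr(X'BC)
    Xt = transpose(X)
    return trace(matmul(Xt, matmul(A, X))) - 2.0 * trace(matmul(Xt, matmul(B, C)))

def grad_formula(X, A, B, C):
    # Closed form from the thread: (A + A')X - 2BC
    At = transpose(A)
    S = [[A[i][j] + At[i][j] for j in range(len(A))] for i in range(len(A))]
    SX = matmul(S, X)
    BC = matmul(B, C)
    return [[SX[i][j] - 2.0 * BC[i][j] for j in range(len(X[0]))]
            for i in range(len(X))]

def grad_numeric(X, A, B, C, h=1e-5):
    # Central difference in each entry X[i][j], restoring X afterwards.
    G = [[0.0] * len(X[0]) for _ in X]
    for i in range(len(X)):
        for j in range(len(X[0])):
            old = X[i][j]
            X[i][j] = old + h
            f_plus = f(X, A, B, C)
            X[i][j] = old - h
            f_minus = f(X, A, B, C)
            X[i][j] = old
            G[i][j] = (f_plus - f_minus) / (2.0 * h)
    return G

def random_matrix(rows, cols, rng):
    return [[rng.uniform(-1.0, 1.0) for _ in range(cols)] for _ in range(rows)]

rng = random.Random(0)   # fixed seed so the check is repeatable
n, m, p = 3, 2, 4        # illustrative sizes (assumed, not from the thread)
A = random_matrix(n, n, rng)
B = random_matrix(n, p, rng)
C = random_matrix(p, m, rng)
X = random_matrix(n, m, rng)

G_formula = grad_formula(X, A, B, C)
G_numeric = grad_numeric(X, A, B, C)
max_error = max(abs(G_formula[i][j] - G_numeric[i][j])
                for i in range(n) for j in range(m))
print("max |formula - finite difference| =", max_error)
```

Because f is quadratic in X, the central difference is exact up to rounding, so the printed error should be tiny if the transposes in the formula are right.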

------

The best (I imagine that's subjective) way to be sure that you got all of the transposes correct is to use index notation. E.g. https://www.physicsforums.com/showthread.php?t=198712. Then you basically just take derivatives wrt the components, instead of the whole matrix.

I've written up a detailed derivation, but since I have a suspicion that this is a homework question, and you haven't shown any working so far, I might hold off posting it.
Read the notes I linked to and have a go at proving the result. Post what you get here for either congratulations or some more help!
 
  • #5
Thanks a lot. I need a rationale for the procedure I'm supposed to code. So, in a sense, this (as everything) is a 'homework' question. But, for real homeworks, I'm a bit too old. :)
 
  • #6
Fair enough!

I've attached a pdf that contains a derivation using index notation.
If you have any questions, just ask.
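
For readers who cannot open the attachments, here is a minimal sketch of how an index-notation derivation can go. It is not a copy of the attached PDF; the index names and the convention that repeated indices are summed are assumptions made for this sketch.

[TEX]
\begin{aligned}
f(X) &= X_{ki} A_{kl} X_{li} - 2 X_{ki} B_{kp} C_{pi}, \qquad \frac{\partial X_{ki}}{\partial X_{ab}} = \delta_{ka}\delta_{ib} \\
\frac{\partial f}{\partial X_{ab}} &= \delta_{ka}\delta_{ib} A_{kl} X_{li} + X_{ki} A_{kl} \delta_{la}\delta_{ib} - 2 \delta_{ka}\delta_{ib} B_{kp} C_{pi} \\
&= A_{al} X_{lb} + A_{ka} X_{kb} - 2 B_{ap} C_{pb} \\
&= \left[ (A + A^t) X - 2BC \right]_{ab}
\end{aligned}
[/TEX]

This agrees with f'(X) = (A+A')X - 2BC from post #4. Setting it to zero gives (A+A')X = 2BC, and the stationary point is a minimum when A+A' is positive semidefinite.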
 

Attachments

  • MatrixDerivative-PhysicsForums.nb (14.9 KB)
  • MatrixDerivative.pdf (55.7 KB)

Related to Derivative (minimization) of matrix trace

1. What is the matrix trace?

The matrix trace is the sum of the elements on the main diagonal of a square matrix. It is often denoted as tr(A) or Tr(A).

2. Why is minimizing the matrix trace important?

Objectives written as matrix traces, such as the f(X) = Tr(X'AX) - 2Tr(X'BC) discussed in this thread, are common in optimization and machine learning. For example, the squared Frobenius norm of a matrix M equals Tr(M'M), so least-squares problems can be written as trace minimizations.

3. How do you calculate the derivative of the matrix trace?

The derivative of a trace expression with respect to a matrix X is the matrix whose (i,j) entry is the partial derivative with respect to the element X_ij. It can be computed using the linearity of the trace and the product rule. For example, the derivative of Tr(X'AX) is (A+A')X, and the derivative of Tr(X'BC) is BC.

4. What are some applications of minimizing the matrix trace?

Minimizing the matrix trace can be used in a variety of applications, such as finding the minimum cost of a linear system, minimizing the error in a machine learning model, and optimizing the performance of a control system.

5. Are there any limitations to using the matrix trace minimization technique?

Yes. Setting the derivative to zero only finds stationary points. For f(X) = Tr(X'AX) - 2Tr(X'BC), a stationary point is a minimum only when A+A' is positive semidefinite, and the linear equation (A+A')X = 2BC has a unique solution only when A+A' is invertible. Solving that equation can also be computationally expensive for large matrices.
