Is There a Discrepancy in Matrix Trace Derivative Rules?

In summary, the thread discusses an apparent inconsistency between two matrix trace derivative rules. The original poster assumes that A is diagonal (or at least symmetric) and that X is anti-symmetric, then uses the cyclic property of the trace to argue that the two traces are equal up to a minus sign, so their derivatives should be as well. When evaluated, however, the two rules return different results. The only reply notes that notation for derivatives with respect to a matrix is not fully standardized and suggests stating the notation being used in order to get a clearer answer.
  • #1
em12
Hope this is the right section. I'm having trouble ironing out an apparent inconsistency in matrix trace derivative rules.

Two particular rules for matrix trace derivatives are

[tex]\frac{\partial}{\partial\mathbf{X}} Tr(\mathbf{X}^2\mathbf{A})=(\mathbf{X} \mathbf{A}+\mathbf{A} \mathbf{X})^T[/tex]

and

[tex] \frac{\partial}{\partial\mathbf{X}} Tr(\mathbf{X}\mathbf{A}\mathbf{X}^T)=\mathbf{X} \mathbf{A}^T+\mathbf{X}\mathbf{A}[/tex]

Now assume that [tex]\mathbf{A}[/tex] is diagonal (or maybe even just symmetric) and [tex] \mathbf{X}[/tex] is anti-symmetric. Then by the cyclic property of the trace, [tex]-Tr(\mathbf{X}^2\mathbf{A})=Tr(\mathbf{X}\mathbf{A}\mathbf{X}^T)[/tex]. So the two derivatives should be equal up to a minus sign, no?

However, the first rule returns the derivative

[tex]- (\mathbf{X}\mathbf{A}+\mathbf{A}\mathbf{X})[/tex]

and the second returns

[tex] 2\mathbf{X}\mathbf{A}[/tex].


Am I missing something?
 
  • #2
I don't know the answer to your question, but it prompted me to look at the "Matrix calculus" article on Wikipedia. The comments on that article's "discussion" page suggest (to me) that the notation for taking the derivative with respect to a matrix is not completely standardized. If you explain the system of notation that you are using, perhaps someone will answer your question.
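
One possible reading of the mismatch in post #1, offered as a sketch and not as a settled answer: both quoted rules treat every entry of [tex]\mathbf{X}[/tex] as independent, while the identity [tex]-Tr(\mathbf{X}^2\mathbf{A})=Tr(\mathbf{X}\mathbf{A}\mathbf{X}^T)[/tex] holds only for anti-symmetric [tex]\mathbf{X}[/tex]. Under that assumption the two unconstrained results, [tex]\mathbf{X}\mathbf{A}+\mathbf{A}\mathbf{X}[/tex] (the first rule with its sign flipped) and [tex]2\mathbf{X}\mathbf{A}[/tex], would only need to agree after projection onto anti-symmetric matrices, [tex]\mathbf{G}\mapsto(\mathbf{G}-\mathbf{G}^T)/2[/tex]. The pure-Python check below tests this numerically; the helper names `matmul`, `transpose`, `combine`, and `max_abs` are hypothetical and not from the thread.

```python
import random

def matmul(P, Q):
    """Product of two matrices stored as lists of rows."""
    return [[sum(P[i][k] * Q[k][j] for k in range(len(Q)))
             for j in range(len(Q[0]))] for i in range(len(P))]

def transpose(P):
    return [list(col) for col in zip(*P)]

def combine(P, Q, a=1.0, b=1.0):
    """Entrywise a*P + b*Q."""
    return [[a * p + b * q for p, q in zip(rp, rq)] for rp, rq in zip(P, Q)]

def max_abs(P):
    return max(abs(v) for row in P for v in row)

random.seed(0)
n = 4
R = [[random.uniform(-1, 1) for _ in range(n)] for _ in range(n)]
S = [[random.uniform(-1, 1) for _ in range(n)] for _ in range(n)]
A = combine(R, transpose(R), 0.5, 0.5)    # symmetric A (assumption from post #1)
X = combine(S, transpose(S), 0.5, -0.5)   # anti-symmetric X (assumption from post #1)

XA, AX = matmul(X, A), matmul(A, X)
g_first = combine(XA, AX)     # first rule with the sign flipped: XA + AX
g_second = combine(XA, XA)    # second rule: 2 XA

# Project each unconstrained result onto anti-symmetric matrices: (G - G^T) / 2.
proj_first = combine(g_first, transpose(g_first), 0.5, -0.5)
proj_second = combine(g_second, transpose(g_second), 0.5, -0.5)

raw_gap = max_abs(combine(g_first, g_second, 1.0, -1.0))
projected_gap = max_abs(combine(proj_first, proj_second, 1.0, -1.0))
print("gap before projection:", raw_gap)
print("gap after projection: ", projected_gap)
```

If this reading is right, the gap before projection is the commutator [tex]\mathbf{A}\mathbf{X}-\mathbf{X}\mathbf{A}[/tex], which is generally nonzero, while the gap after projection vanishes up to rounding.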
 

1. What is a matrix trace derivative?

A matrix trace derivative is the derivative of a scalar function written as a trace, such as [tex]Tr(\mathbf{X}^2\mathbf{A})[/tex], with respect to the entries of a matrix [tex]\mathbf{X}[/tex]. The trace itself is the sum of a square matrix's diagonal elements. Such derivatives are commonly used in multivariate calculus and linear algebra.

2. How is a matrix trace derivative calculated?

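To calculate a matrix trace derivative, differentiate the scalar trace expression with respect to each entry [tex]X_{ij}[/tex], treating all entries as independent, and collect the results into a matrix with the same shape as [tex]\mathbf{X}[/tex] (or its transpose, depending on the layout convention). In practice this is done with the chain rule, the cyclic property of the trace, or tabulated matrix differentiation rules such as the two quoted in the thread.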
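
As a minimal sketch of an entry-by-entry calculation, the code below approximates each partial derivative of [tex]Tr(\mathbf{X}^2\mathbf{A})[/tex] with respect to a single entry of an unconstrained [tex]\mathbf{X}[/tex] by a central difference and compares the result with the first rule from the thread, [tex](\mathbf{X}\mathbf{A}+\mathbf{A}\mathbf{X})^T[/tex]. It assumes the convention that entry (i, j) of the derivative is the partial derivative with respect to entry (i, j) of [tex]\mathbf{X}[/tex]; the function names are hypothetical.

```python
import random

def matmul(P, Q):
    """Product of two matrices stored as lists of rows."""
    return [[sum(P[i][k] * Q[k][j] for k in range(len(Q)))
             for j in range(len(Q[0]))] for i in range(len(P))]

def trace_x2a(X, A):
    """f(X) = Tr(X^2 A)."""
    M = matmul(matmul(X, X), A)
    return sum(M[i][i] for i in range(len(M)))

def numeric_gradient(func, X, h=1e-6):
    """Central-difference estimate of the matrix with entries df/dX_ij."""
    G = [[0.0] * len(X[0]) for _ in X]
    for i in range(len(X)):
        for j in range(len(X[0])):
            plus = [row[:] for row in X]
            minus = [row[:] for row in X]
            plus[i][j] += h    # perturb one entry, keep all others fixed
            minus[i][j] -= h
            G[i][j] = (func(plus) - func(minus)) / (2 * h)
    return G

random.seed(1)
n = 3
X = [[random.uniform(-1, 1) for _ in range(n)] for _ in range(n)]  # no symmetry imposed
A = [[random.uniform(-1, 1) for _ in range(n)] for _ in range(n)]

G_num = numeric_gradient(lambda M: trace_x2a(M, A), X)

XA, AX = matmul(X, A), matmul(A, X)
# (XA + AX)^T: entry (i, j) of the transpose is entry (j, i) of the sum.
G_rule = [[XA[j][i] + AX[j][i] for j in range(n)] for i in range(n)]

max_err = max(abs(G_num[i][j] - G_rule[i][j])
              for i in range(n) for j in range(n))
print("largest entrywise difference:", max_err)
```

Because the function is quadratic in the entries of [tex]\mathbf{X}[/tex], the central difference has no truncation error here, so any remaining difference comes from floating-point rounding.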

3. What is the importance of matrix trace derivatives?

Matrix trace derivatives are used in physics, engineering, and computer science, mainly to obtain gradients of scalar objectives written as traces, for example when solving optimization problems over matrix variables.

4. Are there any properties of matrix trace derivatives?

Yes, there are several properties of matrix trace derivatives, including linearity, the product rule, and the chain rule. These properties allow for the simplification and manipulation of matrix trace derivatives in mathematical calculations.
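
A short sketch of how such properties combine, using the first rule quoted in the thread and treating the entries of [tex]\mathbf{X}[/tex] as independent. The differential [tex]d\mathbf{X}[/tex] is notation assumed for this sketch and does not appear in the thread:

[tex]d\,Tr(\mathbf{X}^2\mathbf{A}) = Tr(d\mathbf{X}\,\mathbf{X}\mathbf{A}) + Tr(\mathbf{X}\,d\mathbf{X}\,\mathbf{A}) = Tr\big((\mathbf{X}\mathbf{A}+\mathbf{A}\mathbf{X})\,d\mathbf{X}\big)[/tex]

Linearity and the product rule give the first equality, and the cyclic property of the trace gives the second. Under the convention that the derivative [tex]\mathbf{G}[/tex] is defined by [tex]df = Tr(\mathbf{G}^T d\mathbf{X})[/tex], this reads off as [tex]\mathbf{G} = (\mathbf{X}\mathbf{A}+\mathbf{A}\mathbf{X})^T[/tex], in agreement with the rule as quoted.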

5. Can matrix trace derivatives be applied to non-square matrices?

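Yes. The trace is only defined for square matrices, so the expression inside the trace must be square, but the matrix being differentiated against need not be. For example, [tex]Tr(\mathbf{X}\mathbf{A}\mathbf{X}^T)[/tex] is defined for an m-by-n matrix [tex]\mathbf{X}[/tex] and an n-by-n matrix [tex]\mathbf{A}[/tex], and its derivative with respect to [tex]\mathbf{X}[/tex] is again m-by-n.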
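
As a hedged illustration, the second rule from the thread can be derived entry by entry without assuming that [tex]\mathbf{X}[/tex] is square. Take [tex]\mathbf{X}[/tex] to be m-by-n and [tex]\mathbf{A}[/tex] to be n-by-n (the index names are chosen for this sketch), with all entries of [tex]\mathbf{X}[/tex] independent:

[tex]Tr(\mathbf{X}\mathbf{A}\mathbf{X}^T) = \sum_{i,k,l} X_{ik} A_{kl} X_{il}[/tex]

[tex]\frac{\partial}{\partial X_{pq}} Tr(\mathbf{X}\mathbf{A}\mathbf{X}^T) = \sum_{l} A_{ql} X_{pl} + \sum_{k} X_{pk} A_{kq} = (\mathbf{X}\mathbf{A}^T)_{pq} + (\mathbf{X}\mathbf{A})_{pq}[/tex]

The result [tex]\mathbf{X}\mathbf{A}^T+\mathbf{X}\mathbf{A}[/tex] has the same m-by-n shape as [tex]\mathbf{X}[/tex]; only the product inside the trace, which is m-by-m, has to be square.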
