Is There a Discrepancy in Matrix Trace Derivative Rules?

SUMMARY

The discussion centers on an apparent inconsistency between two matrix trace derivative rules, those for Tr(\mathbf{X}^2\mathbf{A}) and Tr(\mathbf{X}\mathbf{A}\mathbf{X}^T). If \mathbf{A} is diagonal or symmetric and \mathbf{X} is anti-symmetric, the cyclic property of the trace gives -Tr(\mathbf{X}^2\mathbf{A})=Tr(\mathbf{X}\mathbf{A}\mathbf{X}^T), so the user expects the two derivatives to agree up to a minus sign. Under those assumptions, however, the first rule yields -(\mathbf{X}\mathbf{A}+\mathbf{A}\mathbf{X}) while the second yields 2\mathbf{X}\mathbf{A}. The discussion also notes that the notation for matrix derivatives is not standardized, which may contribute to the confusion.

PREREQUISITES
  • Matrix calculus
  • Understanding of trace properties
  • Familiarity with symmetric and anti-symmetric matrices
  • Knowledge of derivative notation for matrices
NEXT STEPS
  • Review the "Matrix calculus" article on Wikipedia for insights on notation
  • Study the cyclic property of the trace in matrix operations
  • Explore the implications of matrix symmetry and anti-symmetry on derivatives
  • Investigate standardized notation for matrix derivatives in academic literature
USEFUL FOR

Mathematicians, researchers in applied mathematics, and students studying linear algebra or matrix calculus will benefit from this discussion, particularly those interested in the nuances of matrix derivatives.

em12
Hope this is the right section. I'm having trouble ironing out an apparent inconsistency in matrix trace derivative rules.

Two particular rules for matrix trace derivatives are

\frac{\partial}{\partial\mathbf{X}} Tr(\mathbf{X}^2\mathbf{A})=(\mathbf{X} \mathbf{A}+\mathbf{A} \mathbf{X})^T

and

\frac{\partial}{\partial\mathbf{X}} Tr(\mathbf{X}\mathbf{A}\mathbf{X}^T)=\mathbf{X} \mathbf{A}^T+\mathbf{X}\mathbf{A}

Now assume that \mathbf{A} is diagonal (or maybe even just symmetric) and \mathbf{X} is anti-symmetric. Then by the cyclic property of the trace, -Tr(\mathbf{X}^2\mathbf{A})=Tr(\mathbf{X}\mathbf{A}\mathbf{X}^T). So the two derivatives should be equal up to a minus sign, no?

However, the first rule returns the derivative

- (\mathbf{X}\mathbf{A}+\mathbf{A}\mathbf{X})

and the second returns

2\mathbf{X}\mathbf{A}.


Am I missing something?
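
One way to probe this numerically is sketched below. It assumes the convention that entry (i, j) of the derivative is the partial derivative with respect to X_ij, with every entry of \mathbf{X} treated as independent; that convention is an assumption, since the thread does not state one. The helper names `num_grad` and `skew` are illustrative only. The sketch checks both rules against finite differences at an anti-symmetric \mathbf{X}, then compares the two gradients directly and after keeping only their anti-symmetric parts.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 4

def num_grad(f, X, eps=1e-6):
    """Central-difference gradient, varying each entry of X independently."""
    G = np.zeros_like(X)
    for i in range(X.shape[0]):
        for j in range(X.shape[1]):
            E = np.zeros_like(X)
            E[i, j] = eps
            G[i, j] = (f(X + E) - f(X - E)) / (2 * eps)
    return G

def skew(G):
    """Anti-symmetric part of G."""
    return (G - G.T) / 2

A = np.diag(rng.standard_normal(n))   # diagonal, hence symmetric
M = rng.standard_normal((n, n))
X = M - M.T                           # anti-symmetric test point

f1 = lambda X: np.trace(X @ X @ A)    # Tr(X^2 A)
f2 = lambda X: np.trace(X @ A @ X.T)  # Tr(X A X^T)

g1 = (X @ A + A @ X).T                # first rule from the post
g2 = X @ A.T + X @ A                  # second rule from the post

# Both rules agree with finite differences over independent entries.
rules_hold = (np.allclose(num_grad(f1, X), g1, atol=1e-5)
              and np.allclose(num_grad(f2, X), g2, atol=1e-5))

# The function values satisfy -Tr(X^2 A) = Tr(X A X^T) at anti-symmetric X.
values_match = np.isclose(-f1(X), f2(X))

# Compare the gradients as given, and after projecting onto anti-symmetric matrices.
raw_match = np.allclose(-g1, g2)
projected_match = np.allclose(skew(-g1), skew(g2))

print(rules_hold, values_match, raw_match, projected_match)
```

In this sketch the raw gradients differ while their anti-symmetric parts coincide, which suggests the two rules only need to agree along directions that keep \mathbf{X} anti-symmetric.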
 
I don't know the answer to your question, but it prompted me to look at the "Matrix calculus" article on Wikipedia. If you look at the "discussion" page for that article, you will see some interesting comments that suggest (to me) that the notation for taking the derivative with respect to a matrix is not completely standardized. If you explain the system of notation that you are using, perhaps someone will be able to answer your question.
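
As a possible reading, not confirmed anywhere in the thread: suppose both rules use the convention that entry (i, j) of the derivative is the partial with respect to X_{ij}, with all entries of \mathbf{X} varied independently. Under that assumption the two functions coincide only on the anti-symmetric matrices, so their gradients need only agree when paired with anti-symmetric variations. The symbols G_1, G_2 and skew below are introduced here for illustration.

```latex
% Assumptions: A = A^T, X^T = -X, and variations restricted to dX^T = -dX.
\begin{align*}
G_1 &= -\frac{\partial}{\partial\mathbf{X}} Tr(\mathbf{X}^2\mathbf{A})
     = \mathbf{X}\mathbf{A}+\mathbf{A}\mathbf{X}, \qquad
G_2 = \frac{\partial}{\partial\mathbf{X}} Tr(\mathbf{X}\mathbf{A}\mathbf{X}^T)
     = 2\mathbf{X}\mathbf{A}, \\
% Only the anti-symmetric part of G contributes to Tr(G^T dX) when dX is anti-symmetric.
\mathrm{skew}(G) &= \tfrac{1}{2}\left(G - G^T\right), \\
\mathrm{skew}(G_1) &= \tfrac{1}{2}\left[(\mathbf{X}\mathbf{A}+\mathbf{A}\mathbf{X})
     + (\mathbf{A}\mathbf{X}+\mathbf{X}\mathbf{A})\right]
     = \mathbf{X}\mathbf{A}+\mathbf{A}\mathbf{X}, \\
\mathrm{skew}(G_2) &= \tfrac{1}{2}\left(2\mathbf{X}\mathbf{A} - 2\mathbf{A}^T\mathbf{X}^T\right)
     = \mathbf{X}\mathbf{A}+\mathbf{A}\mathbf{X}.
\end{align*}
```

On this reading the two results differ only by a symmetric matrix, \mathbf{X}\mathbf{A}-\mathbf{A}\mathbf{X}, which has zero trace pairing with every anti-symmetric variation.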