- #1
em12
Hope this is the right section. I'm having trouble ironing out an apparent inconsistency in matrix trace derivative rules.
Two particular rules for matrix trace derivatives are
[tex]\frac{\partial}{\partial\mathbf{X}} Tr(\mathbf{X}^2\mathbf{A})=(\mathbf{X} \mathbf{A}+\mathbf{A} \mathbf{X})^T[/tex]
and
[tex] \frac{\partial}{\partial\mathbf{X}} Tr(\mathbf{X}\mathbf{A}\mathbf{X}^T)=\mathbf{X} \mathbf{A}^T+\mathbf{X}\mathbf{A}[/tex]
Now assume that [tex]\mathbf{A}[/tex] is diagonal (or maybe even just symmetric) and [tex] \mathbf{X}[/tex] is anti-symmetric, so that [tex]\mathbf{X}^T=-\mathbf{X}[/tex]. Then, using the cyclic property of the trace, [tex]-Tr(\mathbf{X}^2\mathbf{A})=Tr(\mathbf{X}\mathbf{A}\mathbf{X}^T)[/tex]. So the two derivatives should be equal up to a minus sign, no?
However, the first rule returns the derivative
[tex]- (\mathbf{X}\mathbf{A}+\mathbf{A}\mathbf{X})[/tex]
and the second returns
[tex] 2\mathbf{X}\mathbf{A}[/tex].
Am I missing something?
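For anyone who wants to poke at this numerically, here is a small pure-Python sketch of the comparison. The matrices are arbitrary examples chosen for illustration, and the helper names (`matmul`, `combine`, `antisym_part`) are made up for this post. It rests on one assumption that I have not confirmed from the source of the rules: that both formulas treat every entry of [tex]\mathbf{X}[/tex] as an independent variable. If that is the case, the fair comparison for anti-symmetric [tex]\mathbf{X}[/tex] may be between the anti-symmetric parts [tex](\mathbf{G}-\mathbf{G}^T)/2[/tex] of the two results, and the sketch computes both the raw and the projected gap.

```python
def matmul(P, Q):
    """Plain matrix product of two square matrices given as nested lists."""
    return [[sum(p * q for p, q in zip(row, col)) for col in zip(*Q)] for row in P]

def transpose(P):
    return [list(col) for col in zip(*P)]

def combine(P, Q, a=1.0, b=1.0):
    """Entrywise a*P + b*Q."""
    return [[a * p + b * q for p, q in zip(rp, rq)] for rp, rq in zip(P, Q)]

def trace(P):
    return sum(P[i][i] for i in range(len(P)))

def max_abs_diff(P, Q):
    return max(abs(p - q) for rp, rq in zip(P, Q) for p, q in zip(rp, rq))

def antisym_part(G):
    """Projection onto anti-symmetric matrices: (G - G^T) / 2."""
    return combine(G, transpose(G), 0.5, -0.5)

# Example data (assumed for illustration): A diagonal, X anti-symmetric.
A = [[1.0, 0.0, 0.0], [0.0, 2.0, 0.0], [0.0, 0.0, 3.0]]
X = [[0.0, 1.0, 2.0], [-1.0, 0.0, 3.0], [-2.0, -3.0, 0.0]]

# The two functions, which coincide when X^T = -X.
f1 = -trace(matmul(matmul(X, X), A))            # -Tr(X^2 A)
f2 = trace(matmul(matmul(X, A), transpose(X)))  # Tr(X A X^T)

XA, AX = matmul(X, A), matmul(A, X)

# Rule 1 applied to -Tr(X^2 A): -(XA + AX)^T
g1 = combine(transpose(XA), transpose(AX), -1.0, -1.0)
# Rule 2 applied to Tr(X A X^T): X A^T + X A
g2 = combine(matmul(X, transpose(A)), XA)

# Gap between the two rules as stated, and after projecting both results
# onto the anti-symmetric subspace.
gap_raw = max_abs_diff(g1, g2)
gap_projected = max_abs_diff(antisym_part(g1), antisym_part(g2))

print("f1 =", f1, " f2 =", f2)
print("max |g1 - g2| (raw)       =", gap_raw)
print("max |g1 - g2| (projected) =", gap_projected)
```

If the projected gap vanishes while the raw gap does not, that would suggest the two rules only disagree in directions that leave the anti-symmetric subspace, which is where the identity between the two traces stops holding anyway.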