Solving Matrix Derivatives: Theorems & Examples

Homework Help Overview

The discussion revolves around the derivatives of matrices, specifically focusing on the trace of a product of matrices and the associated theorems. The original poster expresses confusion regarding the derivation of a specific formula related to matrix derivatives, seeking clarification and additional theorems on the topic.

Discussion Character

  • Exploratory, Conceptual clarification, Mathematical reasoning

Approaches and Questions Raised

  • Participants discuss the use of indices to express the trace of a matrix product and how to differentiate it with respect to matrix elements. There are attempts to clarify the relationship between the trace and the components of the matrices involved. Questions arise about the definitions and properties of the matrices in question, particularly regarding their dimensions and the implications for the trace operation.

Discussion Status

The discussion is ongoing, with participants exploring different interpretations of the trace and its derivative. Some guidance has been provided regarding the use of indices and the summation convention, but no consensus has been reached on the original poster's question about the derivation.

Contextual Notes

There is a mention of the matrices being square, which may affect the interpretation of the trace and its components. The original poster indicates a lack of experience with matrix derivatives, which may contribute to the confusion in the discussion.

olds442
I have encountered some problems that have to do with the derivatives of matrices... I have NO experience with these and had little luck finding any theorems. I looked on Wikipedia for some help and found a few definitions, but I am still unclear about how this is proven or obtained. Here is an example from Wikipedia:

d tr(AXB)/ dX = A^T B^T

My question is: how are they getting that?!? I seem to be having a big mental block with this.

Any theorems about how to take derivatives of vectors or matrices would be great!

any help would be appreciated!
 
Think indices. With repeated indices summed over, tr(AXB)=A_ij*X_jk*B_ki. The lm component of the derivative matrix is the derivative of that with respect to X_lm. The only terms that contribute are those with j=l and k=m. Dropping the X_lm, since it is what we differentiate with respect to, leaves A_il*B_mi (still summed over i). That's (A^T)_li*(B^T)_im=(A^T*B^T)_lm. So the lm component of d(tr(AXB))/dX is the same as the lm component of A^T*B^T, which means the two are equal as matrices.
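One way to convince yourself of the index argument is to check it numerically. The sketch below is only an illustration, not part of the original reply: it assumes square 3x3 matrices with random entries, and the helper names (`matmul`, `transpose`, `trace`, `trace_axb`, `numerical_gradient`) are made up for this example. It perturbs each X_lm in turn, estimates the derivative of tr(AXB) by a central difference, and compares the result with A^T*B^T.

```python
import random

def matmul(P, Q):
    """Product of two matrices stored as lists of rows."""
    return [[sum(P[i][k] * Q[k][j] for k in range(len(Q)))
             for j in range(len(Q[0]))] for i in range(len(P))]

def transpose(P):
    """Transpose of a matrix stored as a list of rows."""
    return [list(col) for col in zip(*P)]

def trace(P):
    """Sum of the diagonal entries (a single number)."""
    return sum(P[i][i] for i in range(len(P)))

def trace_axb(A, X, B):
    """The scalar function being differentiated: tr(AXB)."""
    return trace(matmul(matmul(A, X), B))

def numerical_gradient(A, X, B, h=1e-6):
    """Central-difference estimate of d tr(AXB) / dX_lm for every l, m."""
    rows, cols = len(X), len(X[0])
    grad = [[0.0] * cols for _ in range(rows)]
    for l in range(rows):
        for m in range(cols):
            X_plus = [row[:] for row in X]
            X_minus = [row[:] for row in X]
            X_plus[l][m] += h
            X_minus[l][m] -= h
            grad[l][m] = (trace_axb(A, X_plus, B) - trace_axb(A, X_minus, B)) / (2 * h)
    return grad

def random_matrix(n):
    """n x n matrix with entries drawn uniformly from [-1, 1]."""
    return [[random.uniform(-1, 1) for _ in range(n)] for _ in range(n)]

random.seed(0)  # fixed seed so the check is repeatable
n = 3           # assumed size; any square size works
A, X, B = random_matrix(n), random_matrix(n), random_matrix(n)

numeric = numerical_gradient(A, X, B)
analytic = matmul(transpose(A), transpose(B))  # the claimed result A^T * B^T

# Largest entrywise disagreement between the two matrices.
max_error = max(abs(numeric[l][m] - analytic[l][m])
                for l in range(n) for m in range(n))
print(max_error)
```

Because tr(AXB) is linear in X, the central difference should agree with A^T*B^T up to floating-point rounding, so `max_error` should come out tiny.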
 
Hmmm... if they are defined as square matrices, tr(AXB) would be given by A_ii*X_ii*B_ii, so that tr(AXB) is a square matrix whose diagonal elements are all AXB, correct? If not, there is definitely something here that I am missing...
 
(AXB)_ij=A_ik*X_kl*B_lj. Repeated indices are summed over (I don't think I emphasized that). To get the trace, just set i=j and sum over it: tr(AXB)=A_ik*X_kl*B_li, which is a single number, not a matrix. Leave the summed dummy indices alone!
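To make the summation convention concrete, here is a small sketch that writes out the sums explicitly. The 2x2 matrices are arbitrary values chosen for this illustration, not anything from the thread. It builds (AXB)_ij from the index formula, gets the trace by setting j=i and summing, and compares that with the diagonal-only expression A_ii*X_ii*B_ii from the question.

```python
# Arbitrary 2x2 example matrices (assumed values, for illustration only).
A = [[1, 2], [3, 4]]
X = [[0, 1], [1, 0]]
B = [[2, 0], [1, 3]]
n = 2

# (AXB)_ij = A_ik * X_kl * B_lj, with the repeated indices k and l summed.
AXB = [[sum(A[i][k] * X[k][l] * B[l][j] for k in range(n) for l in range(n))
        for j in range(n)] for i in range(n)]

# Trace: set j = i and sum over i as well, leaving k and l summed as before.
tr = sum(A[i][k] * X[k][l] * B[l][i]
         for i in range(n) for k in range(n) for l in range(n))

# The diagonal-only expression A_ii * X_ii * B_ii, summed over i.
diag_only = sum(A[i][i] * X[i][i] * B[i][i] for i in range(n))

print(AXB)        # [[5, 3], [11, 9]]
print(tr)         # 14
print(diag_only)  # 0
```

The trace is the single number 14, equal to the sum of the diagonal of AXB (5 + 9), while the diagonal-only expression gives 0 here, so the two are not the same thing in general.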
 
