Solving Matrix Derivatives: Theorems & Examples

The discussion focuses on understanding the derivatives of matrices, specifically the theorem related to the derivative of the trace of a matrix product, expressed as d tr(AXB)/dX = A^T B^T. The user expresses confusion about how this result is derived and seeks clarification on the underlying principles and theorems related to matrix derivatives. A detailed explanation is provided, breaking down the derivative process using index notation and emphasizing the importance of repeated indices in summation. The conversation highlights the need for a clearer understanding of matrix operations and their derivatives, particularly in the context of trace functions. Overall, the thread serves as a resource for those struggling with matrix calculus concepts.
olds442
I have run into some problems involving derivatives of matrices. I have NO experience with these and had little luck finding any theorems. I looked on Wikipedia for help and found a few definitions, but I am still unclear about how this result is proven or derived. Here is an example from Wikipedia:

d tr(AXB)/ dX = A^T B^T

My question is: how are they getting that? I seem to be having a big mental block with this.

Any theorems about how to take derivatives of vectors or matrices would be great!

Any help would be appreciated!
 
Think indices. tr(AXB)=A_ij*X_jk*B_ki, with every repeated index summed over. The lm component of the derivative matrix is the derivative of that sum with respect to X_lm. The only terms that contribute are the ones where j=l and k=m. Removing the X_lm, since it is what gets differentiated, leaves A_il*B_mi (still summed over i). That's (A^T)_li*(B^T)_im=(A^T*B^T)_lm. So the lm component of d(tr(AXB))/dX is the same as the lm component of A^T*B^T for every l and m, which means the two are equal as matrices.
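The index argument above can be checked numerically. The following is a minimal sketch in plain Python, not part of the original reply: the helper names (matmul, transpose, trace, f, numerical_gradient), the matrix sizes, and the step size h are all assumptions chosen for illustration. It perturbs each X_lm in turn, estimates d tr(AXB)/dX_lm with a central difference, and compares the result with the lm entry of A^T*B^T.

```python
import random

def matmul(P, Q):
    """Product of two matrices stored as lists of rows."""
    return [[sum(P[i][k] * Q[k][j] for k in range(len(Q)))
             for j in range(len(Q[0]))] for i in range(len(P))]

def transpose(P):
    """Transpose of a matrix stored as a list of rows."""
    return [list(col) for col in zip(*P)]

def trace(P):
    """Sum of the diagonal entries of a square matrix."""
    return sum(P[i][i] for i in range(len(P)))

def f(A, X, B):
    """The scalar function tr(AXB)."""
    return trace(matmul(matmul(A, X), B))

def numerical_gradient(A, X, B, h=1e-6):
    """Central difference estimate of d tr(AXB)/dX_lm for every l, m."""
    rows, cols = len(X), len(X[0])
    G = [[0.0] * cols for _ in range(rows)]
    for l in range(rows):
        for m in range(cols):
            X_plus = [row[:] for row in X]
            X_minus = [row[:] for row in X]
            X_plus[l][m] += h
            X_minus[l][m] -= h
            G[l][m] = (f(A, X_plus, B) - f(A, X_minus, B)) / (2 * h)
    return G

def random_matrix(rows, cols):
    return [[random.uniform(-1, 1) for _ in range(cols)] for _ in range(rows)]

random.seed(0)
n, p, q = 2, 3, 4          # assumed sizes: A is n x p, X is p x q, B is q x n
A = random_matrix(n, p)
X = random_matrix(p, q)
B = random_matrix(q, n)

numeric = numerical_gradient(A, X, B)
analytic = matmul(transpose(A), transpose(B))   # (p x n)(n x q) = p x q, same shape as X

max_error = max(abs(numeric[l][m] - analytic[l][m])
                for l in range(p) for m in range(q))
print(max_error < 1e-6)
```

Because tr(AXB) is linear in X, the central difference is exact up to rounding, so the two matrices should agree to many digits. The rectangular sizes are deliberate: the formula only needs AXB to be square, and the derivative has the same shape as X.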
 
hmmm.. if they are defined as square matrices, the tr(AXB) would be given by A_ii*X_ii*B_ii so that tr(AXB) is a square matrix whose diagonal elements are all AXB correct? If not, there is definitely something here that I am missing...
 
(AXB)_ij=A_ik*X_kl*B_lj. Repeated indices are summed over (I don't think I emphasized that). To get the trace, just set i=j and sum over it, which gives tr(AXB)=A_ik*X_kl*B_li. That is a single number, not a matrix. Leave the summed dummy indices alone!
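To make the difference concrete, here is a small sketch in plain Python (the function names and the 2x2 matrices are made up for illustration). It evaluates tr(AXB) as the full sum over i, k and l, and separately evaluates the expression A_ii*X_ii*B_ii summed over i, which ties all the indices together instead of leaving the dummy indices free.

```python
def trace_by_indices(A, X, B):
    """tr(AXB) = sum over i, k, l of A_ik * X_kl * B_li (square matrices)."""
    n = len(A)
    return sum(A[i][k] * X[k][l] * B[l][i]
               for i in range(n) for k in range(n) for l in range(n))

def diagonal_only(A, X, B):
    """Sum over i of A_ii * X_ii * B_ii, which is NOT the trace in general."""
    return sum(A[i][i] * X[i][i] * B[i][i] for i in range(len(A)))

A = [[1, 2], [3, 4]]
X = [[0, 1], [1, 0]]
B = [[1, 0], [0, 1]]   # identity, so AXB = AX

# AX = [[2, 1], [4, 3]], so tr(AXB) = 2 + 3 = 5.
full_sum = trace_by_indices(A, X, B)
diag_sum = diagonal_only(A, X, B)
print(full_sum)   # 5
print(diag_sum)   # 0
```

The two results differ because the off-diagonal entries of A and X contribute to the trace of the product, and the diagonal-only expression throws them away.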
 