Derivatives of functions with matrices


Discussion Overview

The discussion revolves around the calculation of derivatives of functions involving matrices, specifically focusing on derivatives with respect to a single variable. Participants explore the application of the chain rule and product rule in this context, particularly when matrices depend on a variable.

Discussion Character

  • Technical explanation
  • Debate/contested
  • Mathematical reasoning

Main Points Raised

  • One participant questions whether the ordinary chain rule and product rule can be applied when differentiating functions of matrices that depend on a variable.
  • Another participant provides an example with k=2, stating that the derivative involves terms that account for noncommutativity, leading to a more complex expression than a single term.
  • A different viewpoint suggests using the trace of the product of two matrix functions and applying standard rules, although there is a clarification needed regarding the indices of the components involved.
  • One participant emphasizes that the derivative of a matrix-valued function is simply the matrix of derivatives of its component functions, reducing the problem to basic operations with real or complex numbers.
  • Another participant counters this by stating that certain matrix operations, such as the derivative of the exponential of a matrix, require knowledge of matrix operations, indicating that the simplification may overlook necessary complexities.

Areas of Agreement / Disagreement

Participants express differing views on the applicability of the chain rule and the implications of noncommutativity in matrix differentiation. There is no consensus on the best approach to take when differentiating matrix functions, and the discussion remains unresolved.

Contextual Notes

Participants note the importance of noncommutativity in matrix operations and the potential complications that arise when matrices do not commute with their derivatives. There are also references to specific formulas that may require deeper understanding of matrix operations.

Leo321
I am trying to understand how to calculate derivatives of functions that contain matrices.
To start with, I am looking at derivatives with respect to a single variable.
I have x=f(t) and I want to calculate [tex]\frac{dx}{dt}[/tex]. The caveat is that f contains matrices that depend on t. Can I use the ordinary chain rule and product rule, and if not, what can I use instead?
What for example would be [tex]\frac{d}{dt}Tr(M^kA)[/tex]? Assume M is a function of t and A is constant. Would it be [tex]kTr(M^{k-1}\frac{dM}{dt}A)[/tex], like it would have been for a scalar?
 
Take as an example k=2. Then

[tex]\frac{d}{dt}M(t)M(t)=\dot{M}M+M\dot{M}[/tex]

If [tex]M[/tex] and [tex]\dot{M}[/tex] do not commute, that is all you can have. If your A does not commute with M and its derivative, then the trace will not help.

So, yes, the chain rule applies, but noncommutativity needs to be taken into account: differentiating M^k you get k terms (with [tex]\dot{M}[/tex] in k different places), not just one.
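
A quick numerical sanity check of the k-term rule, as a minimal sketch rather than anything from the thread: the 2×2 matrices `M(t)` and `A` below are hypothetical choices (picked so that M and its derivative do not commute), and the helpers `matmul`, `power` and `trace` are written in plain Python only so the example runs without extra packages. It compares a central-difference estimate of the derivative of Tr(M^k A) with the k-term sum and with the scalar-style guess from the question.

```python
# Plain-Python 2x2 helpers (illustrative; a numerical library would normally be used).
def matmul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def power(X, n):
    result = [[1.0, 0.0], [0.0, 1.0]]  # identity matrix
    for _ in range(n):
        result = matmul(result, X)
    return result

def trace(X):
    return X[0][0] + X[1][1]

def M(t):
    # Hypothetical M(t), chosen so that M and dM/dt do not commute.
    return [[1.0, t], [t * t, 2.0]]

def dM(t):
    # Entrywise derivative of M(t).
    return [[0.0, 1.0], [2.0 * t, 0.0]]

A = [[0.0, 1.0], [3.0, -1.0]]  # constant matrix, arbitrary choice

def f(t, k):
    # f(t) = Tr(M(t)^k A)
    return trace(matmul(power(M(t), k), A))

def k_term_rule(t, k):
    # Sum over j of Tr(M^j dM M^(k-1-j) A): dM sits in each of the k positions.
    total = 0.0
    for j in range(k):
        term = matmul(matmul(power(M(t), j), dM(t)), power(M(t), k - 1 - j))
        total += trace(matmul(term, A))
    return total

def scalar_style_rule(t, k):
    # k Tr(M^(k-1) dM A): the guess carried over from the scalar case.
    return k * trace(matmul(matmul(power(M(t), k - 1), dM(t)), A))

t, k, h = 0.7, 2, 1e-6
central_difference = (f(t + h, k) - f(t - h, k)) / (2 * h)
print("central difference:", central_difference)
print("k-term rule:       ", k_term_rule(t, k))
print("scalar-style guess:", scalar_style_rule(t, k))
```

The first two numbers should agree up to finite-difference error. For k=2 the scalar-style guess differs from them by Tr((ṀM − MṀ)A), which vanishes only in special cases, for instance when A commutes with M.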
 
Just write

[tex]\operatorname{Tr}(A(t)B(t))=\sum_{i=1}^n(A(t)B(t))_{ii}=\sum_{i=1}^n\sum_{j=1}^n A(t)_{ij} B(t)_{ji}[/tex]

and use the usual rules on the right-hand side.
 
Fredrik said:
Just write

[tex]\operatorname{Tr}(A(t)B(t))=\sum_{i=1}^n a_i(t) b_i(t)[/tex]

and use the usual rules on the right-hand side.

What are [tex]a_i[/tex] and [tex]b_i[/tex] on the RHS?
 
arkajad said:
What are [tex]a_i[/tex] and [tex]b_i[/tex] on the RHS?
Oops, they were supposed to be the components of the matrices A and B. Total brain fart. They obviously need two indices. I will edit my post right away.

I have edited it now. You may need to refresh the page to see it.
 
Your formula gives:

[tex]\frac{d}{dt}(AB)=\frac{dA}{dt}B+A\frac{dB}{dt}[/tex], that is, the Leibniz rule, but only under the trace. The rule is also valid without the trace.
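
To see why the rule holds without the trace, one can apply the scalar product rule to each entry of the product. This is a sketch, assuming A and B are n×n matrices with differentiable entries and writing a dot for d/dt:

[tex]\frac{d}{dt}(AB)_{ik}=\frac{d}{dt}\sum_{j=1}^n A_{ij}B_{jk}=\sum_{j=1}^n\left(\dot{A}_{ij}B_{jk}+A_{ij}\dot{B}_{jk}\right)=(\dot{A}B+A\dot{B})_{ik}[/tex]

Setting k=i and summing over i recovers the statement under the trace. The order of the factors is preserved in every term, which is all that noncommutativity requires.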
 
There are two things I think Leo needs to understand here:

1. The derivative of a matrix-valued function defined on a subset of the real numbers is just the matrix of derivatives of the component functions.

2. Matrix multiplication and many other operations (like the trace) are defined using only addition and multiplication of real (or complex) numbers.

These two facts reduce problems of the sort described in #1 to problems that require no knowledge of matrices.
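
A small illustration of this reduction, as a sketch with made-up data: the entries of the 2×2 matrices `A(t)` and `B(t)` below are hypothetical, chosen only for the example. The trace is written as the double sum over components, the ordinary scalar product rule is applied to each term, and the result is compared with a central-difference estimate.

```python
def A(t):
    # Hypothetical matrix-valued function, stored as its component functions.
    return [[t, t * t], [1.0, 2.0 * t]]

def dA(t):
    # Matrix of derivatives of the components of A.
    return [[1.0, 2.0 * t], [0.0, 2.0]]

def B(t):
    return [[2.0, t], [t ** 3, 1.0]]

def dB(t):
    return [[0.0, 1.0], [3.0 * t * t, 0.0]]

def trace_AB(t):
    # Tr(AB) = sum_i sum_j A_ij B_ji, using only scalar arithmetic.
    a, b = A(t), B(t)
    return sum(a[i][j] * b[j][i] for i in range(2) for j in range(2))

def d_trace_AB(t):
    # Scalar product rule applied to every term of the double sum.
    a, b, da, db = A(t), B(t), dA(t), dB(t)
    return sum(da[i][j] * b[j][i] + a[i][j] * db[j][i]
               for i in range(2) for j in range(2))

t, h = 0.5, 1e-6
central_difference = (trace_AB(t + h) - trace_AB(t - h)) / (2 * h)
print("central difference:  ", central_difference)
print("term-by-term formula:", d_trace_AB(t))
```

For these particular entries Tr(AB) works out to 5t + t^5, so both printed values should be close to 5 + 5t^4 at the chosen t.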
 
Fredrik said:
These two facts reduce problems of the sort described in #1 to problems that require no knowledge of matrices.

Not really, because we also have nice formulas like this:

[tex]\frac{d}{dt}\exp (At)=A\exp(At)[/tex]

or this:

[tex]\frac{d}{dt}( A(t)^{-1})=-A(t)^{-1}(\frac{d}{dt}A(t))A(t)^{-1}[/tex]

I do not know how you would derive such a formula without knowing about operations with matrices. It would be rather tedious...
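
For reference, here is a sketch of how both formulas follow once matrix operations are used directly, assuming A(t) is invertible and differentiable for the inverse and A is constant for the exponential. Differentiating [tex]A(t)A(t)^{-1}=I[/tex] with the product rule gives

[tex]\dot{A}A^{-1}+A\frac{d}{dt}\left(A^{-1}\right)=0\quad\Rightarrow\quad\frac{d}{dt}\left(A^{-1}\right)=-A^{-1}\dot{A}A^{-1}[/tex]

For the exponential, differentiating the power series term by term gives

[tex]\frac{d}{dt}\exp(At)=\frac{d}{dt}\sum_{n=0}^{\infty}\frac{A^n t^n}{n!}=\sum_{n=1}^{\infty}\frac{A^n t^{n-1}}{(n-1)!}=A\exp(At)[/tex]

In the first result the order of the factors matters: [tex]A^{-1}[/tex] and [tex]\dot{A}[/tex] need not commute, so the expression does not in general collapse to the scalar-style [tex]-A^{-2}\dot{A}[/tex].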
 
arkajad said:
Take as an example k=2. Then

[tex]\frac{d}{dt}M(t)M(t)=\dot{M}M+M\dot{M}[/tex]

If [tex]M[/tex] and [tex]\dot{M}[/tex] do not commute, that is all you can have. If your A does not commute with M and its derivative, then the trace will not help.

So, yes, the chain rule applies, but noncommutativity needs to be taken into account: differentiating M^k you get k terms (with [tex]\dot{M}[/tex] in k different places), not just one.

Thanks!
 
