# Derivative of a function whose variable is a matrix

• lus1450
In summary, for ##f(X) = X^2## on ##n \times n## matrices, expanding ##f(X+h) - f(X) = (X+h)(X+h) - X^2 = Xh + hX + h^2## shows that the Fréchet derivative at ##X## is the linear map ##h \mapsto Xh + hX##.

## Homework Statement

Let ##f: M_{n \times n} \rightarrow M_{n \times n}## with ##f(X) = X^2##, where ##M_{n \times n}## denotes the vector space of ##n \times n## matrices. Show ##f## is differentiable and find its differential.

## The Attempt at a Solution

So far, I've been looking at the difference quotient in order to "guess" a linear transformation ##A## that will satisfy it. We have:
$$\lim_{|h|\to 0} \frac{|f(X+h) - f(X) - Ah|}{|h|} = \lim_{|h|\to 0} \frac{|Xh + hX + h^2 - Ah|}{|h|}$$
after a little simplifying.
And I was thinking for a fixed ##X##, I have ##A(h) = Xh + hX##, as I wanted to get rid of the ##Xh + hX## term in the quotient. However, I'm stuck now since it's ##Ah## and not just ##A##. I was thinking of throwing in an ##h^{-1}## to my ##A##, but that would make it non-linear. Any suggestions?

The concept you need is that of the Fréchet derivative; see http://en.wikipedia.org/wiki/Fréchet_derivative or
http://www.maths.lse.ac.uk/Courses/MA409/Notes-Part2.pdf . If ##h## is a matrix, then for ##f(X) = X^2## we have
$$f(X+h) - f(X) = (X+h)(X+h) - X^2 = hX + Xh + h^2,$$
so the linear operator ##D_X: M_{n \times n} \rightarrow M_{n \times n}##, defined as ##D_X(h) = Xh + hX##, is the Fréchet derivative: the remainder ##h^2## satisfies ##|h^2| \le |h|^2## in any submultiplicative norm (the Frobenius norm, for example), so ##|h^2|/|h| \le |h| \to 0## as ##|h| \to 0##. Since ##D_X## is linear, it will always be possible to write ##D_X(h)## as ##A_X h## for some matrix ##A_X##, if you really want to, but I am not sure if your question really requires that you do so.
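The limit in the definition can also be checked numerically. The following is only a sketch, assuming NumPy, the Frobenius norm, and an arbitrary random ##X## and direction; the function names are made up for illustration. It shrinks ##h## along a fixed direction and prints the ratio ##|f(X+h) - f(X) - D_X(h)| / |h|##, which should shrink in proportion to ##|h|##.

```python
import numpy as np

def f(X):
    """The map f(X) = X^2 (matrix product, not elementwise)."""
    return X @ X

def frechet_derivative(X, h):
    """Candidate derivative D_X(h) = Xh + hX from the thread."""
    return X @ h + h @ X

def remainder_ratio(X, h):
    """|f(X+h) - f(X) - D_X(h)| / |h| in the Frobenius norm."""
    remainder = f(X + h) - f(X) - frechet_derivative(X, h)
    return np.linalg.norm(remainder) / np.linalg.norm(h)

# Arbitrary test data (assumption: any X and direction would do).
rng = np.random.default_rng(0)
X = rng.standard_normal((3, 3))
direction = rng.standard_normal((3, 3))

# Shrink h = t * direction and watch the ratio go to zero.
scales = [1e-1, 1e-2, 1e-3, 1e-4]
ratios = [remainder_ratio(X, t * direction) for t in scales]
for t, r in zip(scales, ratios):
    print(f"t = {t:g}, ratio = {r:.3e}")
```

A numerical check like this is not a proof, but it catches algebra slips such as dropping the ##hX## term, in which case the ratio would stall instead of decreasing.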

Ah okay, thank you! I just realized I was swapping linear transformation and matrix in my thought process, since in my linear algebra class "A" is normally a matrix while in my analysis class, "A" is a linear transformation. As such, it is saying A(h), not A*h. My confusion is cleared up =]

I take back what I said about it always being possible to write ##D_X(h)## as ##A_Xh## for some matrix ##A_X##. I forgot that ##h## is a matrix, not a vector. If it were a vector, that re-write would be possible, but maybe not if it is a matrix. (Actually, ##h## is an ##n^2##-dimensional vector, so if we are willing to re-write all matrices as ##n^2##-vectors, there would, indeed, be an ##n^2 \times n^2## matrix giving us what we need, but that seems excessive.)
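For anyone who does want that ##n^2 \times n^2## matrix, here is a minimal sketch, assuming NumPy and the column-stacking convention for turning a matrix into a vector (the helper names `vec` and `derivative_matrix` are made up for illustration). It relies on the standard identity ##\mathrm{vec}(AHB) = (B^T \otimes A)\,\mathrm{vec}(H)##, which gives ##A_X = I \otimes X + X^T \otimes I##.

```python
import numpy as np

def vec(M):
    """Stack the columns of M into one long vector (column-major order)."""
    return M.flatten(order="F")

def derivative_matrix(X):
    """n^2 x n^2 matrix A_X with A_X @ vec(h) == vec(X h + h X).

    Uses vec(A H B) = (B^T kron A) vec(H) for column-stacking vec.
    """
    n = X.shape[0]
    I = np.eye(n)
    return np.kron(I, X) + np.kron(X.T, I)

# Arbitrary test data (assumption: any square X and h would do).
rng = np.random.default_rng(1)
X = rng.standard_normal((3, 3))
h = rng.standard_normal((3, 3))

A_X = derivative_matrix(X)
lhs = A_X @ vec(h)               # matrix-times-vector form
rhs = vec(X @ h + h @ X)         # operator form D_X(h), then vectorized
print(A_X.shape)
print(np.allclose(lhs, rhs))
```

For ##n = 3## the matrix is already ##9 \times 9##, which illustrates why the operator form ##D_X(h) = Xh + hX## is usually the more convenient one to work with.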

## 1. What is a derivative of a function whose variable is a matrix?

The derivative of a function of a matrix variable at a point ##X## is a linear map (the Fréchet derivative) that gives the best linear approximation to the change in the function for a small change ##h## in ##X##. For ##f(X) = X^2## it is the map ##h \mapsto Xh + hX##. The concept belongs to multivariable analysis and matrix calculus, and it is used in fields such as physics, engineering, and machine learning.

## 2. How is the derivative of a matrix function calculated?

The same principles apply as for scalar functions: expand ##f(X+h) - f(X)##, keeping the order of matrix products, and collect the terms that are linear in ##h##. For a scalar-valued function this is equivalent to taking the partial derivative with respect to each entry of the input matrix, which gives a gradient matrix of the same shape as ##X##.
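The entry-by-entry recipe applies most directly to scalar-valued functions. As a rough sketch, assuming NumPy and a hypothetical example ##g(X) = \mathrm{tr}(X^2)## that is not part of the thread, the code below estimates each partial derivative by central differences and compares the result with the analytic gradient ##2X^T##, which follows from ##\mathrm{tr}(Xh + hX) = 2\,\mathrm{tr}(Xh)##.

```python
import numpy as np

def g(X):
    """Hypothetical scalar-valued example: g(X) = trace(X^2)."""
    return np.trace(X @ X)

def numerical_gradient(func, X, eps=1e-6):
    """Central-difference estimate of d func / d X[i, j] for every entry."""
    grad = np.zeros_like(X)
    for i in range(X.shape[0]):
        for j in range(X.shape[1]):
            E = np.zeros_like(X)
            E[i, j] = eps          # perturb a single entry of X
            grad[i, j] = (func(X + E) - func(X - E)) / (2 * eps)
    return grad

# Arbitrary test data (assumption: any square X would do).
rng = np.random.default_rng(2)
X = rng.standard_normal((3, 3))

numeric = numerical_gradient(g, X)
analytic = 2 * X.T                 # gradient of trace(X^2)
print(np.allclose(numeric, analytic, atol=1e-6))
```

Note the transpose in the analytic gradient: differentiating with respect to the entry ##X_{ij}## picks out ##2X_{ji}##, a detail that is easy to miss when treating the matrix like a scalar.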

## 3. What is the purpose of calculating the derivative of a matrix function?

The derivative of a matrix function is used to understand how the output of the function changes when the input matrix is varied. This information is crucial in optimization problems, where the goal is to find the input matrix that maximizes or minimizes the output of the function.

## 4. Can the derivative of a matrix function be a matrix itself?

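It can be represented by one. For a matrix-valued function such as ##f(X) = X^2##, the derivative at ##X## is a linear map on ##M_{n \times n}##; once matrices are rewritten as ##n^2##-vectors, that map is given by an ##n^2 \times n^2## matrix. For a scalar-valued function, the derivative can be identified with a gradient matrix whose entries are the partial derivatives with respect to the corresponding entries of the input matrix.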

## 5. What are some real-world applications of the derivative of a matrix function?

The derivative of a matrix function has many real-world applications, such as in image processing, data compression, and computer graphics. It is also used in machine learning algorithms, such as backpropagation in neural networks, to optimize the weights and biases of the model.
