Partial Derivatives of Matrix/Vector Function: An Easier Way?

In summary, the conversation discusses two ways to compute the Jacobian of a function built from matrices multiplied by a vector: expand the product and differentiate element by element, or apply the product/chain rule directly. Both approaches work; differentiating with respect to one component of the vector at a time keeps every term a matrix or vector, so no 3rd order tensors are needed. The resulting 3x3 Jacobian may still require further simplification.
  • #1
datahead8888
I was working on a pde, and I needed to compute a Jacobian for it.

Suppose we have a function consisting of a series of matrices multiplied by a vector:
f(X) = A * B * b
--where X is a vector containing elements that appear within A, B, and/or b,
--A is a matrix, B is a matrix, and b is a vector

Each matrix and the vector are written out in terms of further variables, i.e...
X = (x1, x2, x3)

A =
[ x1 + y1   y4        y7      ]
[ y2        x2 + y5   y8      ]
[ y3        y6        x3 + y9 ]

B =
[ y1        x2 + y4   x3 + y7 ]
[ x1 + y2   y5        y8      ]
[ y3        y6        y9      ]

b = [y1 y2 y3]' (' means transposed)

Now we want to find the Jacobian of f - ie the partial derivative of f wrt X.

One way to do this is to multiply the two matrices and then multiply the result by the vector, creating one 3x1 vector in which each element is an algebraic expression resulting from the matrix multiplication. The partial derivative could then be computed element by element to form a 3x3 Jacobian. This would be feasible in the above example, but the one I'm working on is a lot more complicated (and I would also have to look for patterns in order to simplify it afterwards).
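In case it is useful, here is a minimal sketch of that brute-force route for the toy A, B, b above, assuming SymPy is available (the math is exactly the per-element differentiation just described):

import sympy as sp

x1, x2, x3 = sp.symbols('x1 x2 x3')
y1, y2, y3, y4, y5, y6, y7, y8, y9 = sp.symbols('y1:10')

X = sp.Matrix([x1, x2, x3])

A = sp.Matrix([[x1 + y1, y4,      y7],
               [y2,      x2 + y5, y8],
               [y3,      y6,      x3 + y9]])

B = sp.Matrix([[y1,      x2 + y4, x3 + y7],
               [x1 + y2, y5,      y8],
               [y3,      y6,      y9]])

b = sp.Matrix([y1, y2, y3])

f = A * B * b          # 3x1 vector of scalar expressions
J = f.jacobian(X)      # 3x3 Jacobian, differentiated element by element
print(sp.simplify(J))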

I wanted to try using the chain rule and/or the product rule for partial derivatives if possible. However, with the product rule you end up with A' * B * b + A * B' * b + A * B * b', where each primed term is the derivative wrt the vector X. I understand that the derivative of a matrix wrt a vector is actually a 3rd order tensor, which is not easy to deal with. Even if that is not correct, the terms still have to evaluate to compatible shapes in order for the addition to be valid. If I use the chain rule instead, I still end up with the derivative of a matrix wrt a vector.

Is there an easier way to break down a matrix calculus problem like this? I've scoured the web and cannot seem to find a good direction.
 
  • #2
datahead8888 said:
Is there an easier way to break down a matrix calculus problem like this? I've scoured the web and cannot seem to find a good direction.

Hi datahead8888! :)

Your analysis is flawless.
Both methods would work perfectly.

Don't worry about 3rd order tensors too much though.
If it seems complicated, just start with the derivative with respect to the first variable.
No tensor in sight.
Then go on with the 2nd variable etcetera and... there you go! ;)
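
Spelled out, that column-by-column approach uses the ordinary product rule one variable at a time (here b depends only on the y's, so its derivative term drops out):

df/dx_k = (dA/dx_k) * B * b + A * (dB/dx_k) * b + A * B * (db/dx_k),   k = 1, 2, 3

Each dA/dx_k and dB/dx_k is just an ordinary 3x3 matrix (mostly zeros in the example above), so every term on the right is a 3x1 vector. Stacking the three results side by side as columns gives the 3x3 Jacobian.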
 

1. What is a partial derivative of a matrix/vector function?

A partial derivative of a matrix/vector function is a way to measure how much the output of the function changes with respect to a specific input variable, while holding all other variables constant. It is a useful tool in multivariable calculus and is commonly used in fields such as physics, engineering, and economics.

2. How is a partial derivative of a matrix/vector function calculated?

The calculation of a partial derivative of a matrix/vector function involves taking the derivative of the function with respect to the variable of interest while keeping all other variables constant. This can be done by treating the other variables as constants and using the standard rules of differentiation.
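For example, if f(x, y) = x^2 * y, then holding y constant gives ∂f/∂x = 2*x*y, while holding x constant gives ∂f/∂y = x^2.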

3. What is the purpose of calculating partial derivatives of a matrix/vector function?

Partial derivatives of a matrix/vector function have many applications in science and engineering. They can be used to optimize functions, solve differential equations, and analyze the sensitivity of a system to changes in certain variables. They also play a crucial role in the development of advanced mathematical models.

4. Can a partial derivative of a matrix/vector function be negative?

Yes, a partial derivative of a matrix/vector function can be negative. A negative value means that the output of the function decreases as that particular input variable increases, with all other variables held constant.

5. Is there an easier way to calculate partial derivatives of a matrix/vector function?

Yes, there are several techniques and shortcuts that can make calculating partial derivatives of a matrix/vector function easier. One method is to use the chain rule, which allows for the decomposition of complex functions into simpler ones. Another approach is to use the properties of matrix/vector operations, such as linearity and the product rule, to simplify the calculation process.
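For instance, for matrix-valued functions A(t) and B(t) of a scalar variable, the product rule keeps its familiar form, d(A*B)/dt = (dA/dt)*B + A*(dB/dt) (with the order of the factors preserved), which is how the Jacobian in the thread above can be assembled one column at a time.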
