MHB Partial Derivatives of Matrix/Vector Function: An Easier Way?

Summary:
The discussion focuses on computing the Jacobian of a function defined by the multiplication of matrices and a vector, specifically f(X) = A * B * b. The user explores using the product and chain rules for partial derivatives but finds the resulting expressions complex, particularly due to the involvement of third-order tensors. A suggestion is made to simplify the process by taking derivatives with respect to each variable sequentially, avoiding tensor complications. The conversation emphasizes that both methods for deriving the Jacobian are valid and encourages breaking down the problem into manageable parts. Overall, the thread highlights the challenges of matrix calculus while offering practical advice for simplification.
datahead8888:
I was working on a pde, and I needed to compute a Jacobian for it.

Suppose we have a function consisting of a series of matrices multiplied by a vector:
f(X) = A * B * b
--where X is a vector whose elements appear in A, B, and/or b,
--A is a matrix, B is a matrix, and b is a vector

Each matrix and the vector are expressed in terms of further variables, i.e.
X = (x1, x2, x3)

A =
[ x1 + y1   y4        y7      ]
[ y2        x2 + y5   y8      ]
[ y3        y6        x3 + y9 ]

B =
[ y1        x2 + y4   x3 + y7 ]
[ x1 + y2   y5        y8      ]
[ y3        y6        y9      ]

b = [y1 y2 y3]' (where ' denotes transpose)

Now we want to find the Jacobian of f, i.e. the partial derivative of f with respect to X.

One way to do this is to multiply the two matrices, then multiply the result by the vector, producing a single 3x1 vector in which each element is an algebraic expression. The partial derivatives can then be computed element by element to form a 3x3 Jacobian. This would be feasible in the example above, but the problem I'm actually working on is a lot more complicated, so I would also have to look for patterns in order to simplify the result afterwards.
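(For reference, this brute-force route is easy to automate symbolically. Here is a minimal SymPy sketch using the example matrices above, with the y_i treated as constants; it is only an illustration of the element-by-element approach, not anyone's production code.)

```python
import sympy as sp

# x1..x3 are the differentiation variables; y1..y9 are treated as constants.
x1, x2, x3 = sp.symbols('x1 x2 x3')
y = sp.symbols('y1:10')  # y[0]..y[8] correspond to y1..y9

A = sp.Matrix([[x1 + y[0], y[3],      y[6]],
               [y[1],      x2 + y[4], y[7]],
               [y[2],      y[5],      x3 + y[8]]])

B = sp.Matrix([[y[0],      x2 + y[3], x3 + y[6]],
               [x1 + y[1], y[4],      y[7]],
               [y[2],      y[5],      y[8]]])

b = sp.Matrix([y[0], y[1], y[2]])

f = A * B * b                 # 3x1 vector of algebraic expressions
X = sp.Matrix([x1, x2, x3])
J = f.jacobian(X)             # 3x3 Jacobian, differentiated element by element
print(sp.simplify(J))
```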

I wanted to try to use the chain rule and/or the product rule for partial derivatives if possible. With the product rule, however, you end up with A' * B * b + A * B' * b + A * B * b', where each prime here denotes a derivative with respect to the vector X (not a transpose). I understand that the derivative of a matrix with respect to a vector is actually a 3rd order tensor, which is not easy to deal with. Even if that is not correct, the other terms would still have to evaluate to matrices for the matrix addition to be valid. If I use the chain rule instead, I still end up with the derivative of a matrix with respect to a vector.
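(For concreteness, the entrywise form of that product rule, written out for this example in standard index notation, is:

$$\frac{\partial f_i}{\partial x_k}
= \sum_{j,l}\frac{\partial A_{ij}}{\partial x_k}\,B_{jl}\,b_l
+ \sum_{j,l}A_{ij}\,\frac{\partial B_{jl}}{\partial x_k}\,b_l
+ \sum_{j,l}A_{ij}\,B_{jl}\,\frac{\partial b_l}{\partial x_k}$$

The object $\partial A_{ij}/\partial x_k$ does carry three indices, which is the third-order tensor in question; but once $k$ is fixed, each term is an ordinary matrix-vector product.)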

Is there an easier way to break down a matrix calculus problem like this? I've scoured the web and cannot seem to find a good direction.
 

Hi datahead8888! :)

Your analysis is flawless, and both methods would work perfectly.

Don't worry too much about 3rd order tensors, though. If it seems complicated, just start with the derivative with respect to the first variable: the product rule gives ∂f/∂x1 = (∂A/∂x1) B b + A (∂B/∂x1) b + A B (∂b/∂x1), and every term there is an ordinary 3x1 vector. No tensor in sight. That vector is simply the first column of the Jacobian. Then go on with the 2nd variable, etcetera, and... there you go! ;)
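(To illustrate this column-by-column route in code, here is a small continuation of the SymPy sketch above, reusing the A, B, b, X, and J defined there; it assembles the Jacobian one variable at a time and checks that it matches the brute-force version.)

```python
# Build the Jacobian one column at a time via the product rule:
#   df/dx_k = (dA/dx_k) * B * b + A * (dB/dx_k) * b + A * B * (db/dx_k)
# Each term is a plain 3x1 vector, so no third-order tensor ever appears.
cols = [A.diff(xk) * B * b + A * B.diff(xk) * b + A * B * b.diff(xk)
        for xk in X]
J_cols = sp.Matrix.hstack(*cols)

# Sanity check: the column-by-column result equals the brute-force Jacobian.
assert sp.simplify(J_cols - J) == sp.zeros(3, 3)
```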
 
