Using the product rule for the partial derivative of a vector function

In summary: to compute the Jacobian of f(X) = A * B * b, apply the product rule for differentiation together with the usual rules for differentiating matrices and vectors with respect to a vector; the remaining elements of the Jacobian follow by substituting the appropriate derivatives. The approach generalizes to more complicated problems as well.
  • #1
datahead8888
I was working on a PDE for a project and needed to compute a Jacobian for it.

Suppose we have a function consisting of a series of matrices multiplied by a vector:
f(X) = A * B * b
--where X is a vector whose elements appear within A, B, and/or b,
--A is a matrix, B is a matrix, and b is a vector

Each matrix and the vector are expressed in terms of more variables, i.e...
X = (x1, x2, x3)

A =
[ x1 + y1    y4         y7      ]
[ y2         x2 + y5    y8      ]
[ y3         y6         x3 + y9 ]

B =
[ y1         x2 + y4    x3 + y7 ]
[ x1 + y2    y5         y8      ]
[ y3         y6         y9      ]

b = [y1 y2 y3]' (' means transposed)

Now we want to find the Jacobian of f - ie the partial derivative of f wrt X.

One way to do this is to multiply the two matrices and then multiply that by the vector, creating one 3x1 vector in which each element is an algebraic expression resulting from matrix multiplication. The partial derivative could then be computed per element to form a 3x3 Jacobian. This would be feasible in the above example, but the one I'm working on is a lot more complicated (and so I would also have to look for patterns in order to simplify it afterwards).
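The brute-force approach described above can be sketched symbolically with SymPy, which multiplies everything out and differentiates element by element (a minimal sketch; the y's here are treated as fixed constants and arbitrarily set to 1 for brevity):

```python
# Brute-force Jacobian: expand f(X) = A*B*b into a 3x1 vector of
# expressions, then differentiate each element wrt each x_k.
import sympy as sp

x1, x2, x3 = sp.symbols('x1 x2 x3')
y = [sp.Integer(1)] * 9  # y1..y9 treated as constants; values arbitrary

A = sp.Matrix([[x1 + y[0], y[3],      y[6]],
               [y[1],      x2 + y[4], y[7]],
               [y[2],      y[5],      x3 + y[8]]])
B = sp.Matrix([[y[0],      x2 + y[3], x3 + y[6]],
               [x1 + y[1], y[4],      y[7]],
               [y[2],      y[5],      y[8]]])
b = sp.Matrix([y[0], y[1], y[2]])

f = A * B * b                   # 3x1 vector of algebraic expressions
J = f.jacobian([x1, x2, x3])    # 3x3 Jacobian, computed per element
```

For a small system this is fine; the point of the question is that for larger systems the expanded expressions become unwieldy, which is where the product-rule decomposition below helps.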

I was wanting to try to use the chain rule and/or the product rule for partial derivatives if possible. However, with the product rule you end up with (dA/dX) * B * b + A * (dB/dX) * b + A * B * (db/dX), where each derivative is wrt the vector X. I understand that the derivative of a matrix wrt a vector is actually a 3rd order tensor, which is not easy to deal with. If this is not correct, the other terms still have to evaluate to matrices in order for matrix addition to be valid. If I use the chain rule instead, I still end up with the derivative of a matrix wrt a vector.
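To make the tensor bookkeeping concrete: dA/dX can be stored as a 3x3x3 array T with T[i,j,k] = dA[i,j]/dx_k, and the product-rule term (dA/dX) * B * b contracts against the vector B*b to leave an ordinary 3x3 matrix. A minimal NumPy sketch with random placeholder data in place of the actual derivatives:

```python
# The derivative of a matrix wrt a vector is a 3rd-order tensor; the
# product-rule term contracts it back down to a 3x3 matrix.
import numpy as np

rng = np.random.default_rng(0)
dA = rng.standard_normal((3, 3, 3))  # dA[i,j,k] = d A[i,j] / d x_k
B = rng.standard_normal((3, 3))
b = rng.standard_normal(3)

# term[i,k] = sum_j dA[i,j,k] * (B @ b)[j]
# i.e. column k of `term` is (dA/dx_k) @ B @ b.
term = np.einsum('ijk,j->ik', dA, B @ b)
```

Slicing the tensor along k recovers the component-at-a-time view: column k of the result is just the ordinary matrix (dA/dx_k) times the vector B*b.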

Is there an easier way to break down a matrix calculus problem like this? I've scoured the web and cannot seem to find a good direction.
 
  • #2
Any help would be greatly appreciated.

One way to approach this is to use the matrix-vector product rule for differentiation: the derivative of a matrix-vector product equals the derivative of the matrix multiplied by the vector, plus the matrix multiplied by the derivative of the vector. Applied to f(X) = A * B * b, the derivative wrt a single component x_k is

d(f(X))/dx_k = (dA/dx_k) * B * b + A * (dB/dx_k) * b + A * B * (db/dx_k)

Differentiating wrt one scalar component at a time avoids the third-order tensor: each dA/dx_k and dB/dx_k is an ordinary 3x3 matrix. For the matrices above,

dA/dx1 = [1 0 0; 0 0 0; 0 0 0]

and similarly for x2 and x3 (a single 1 in the (2,2) and (3,3) positions, respectively). Since x1 appears in B only in the (2,1) entry, dB/dx1 = [0 0 0; 1 0 0; 0 0 0], and likewise dB/dx2 and dB/dx3 have a single 1 in the (1,2) and (1,3) positions. Note that b = [y1 y2 y3]' contains no x's at all, so db/dx_k = 0 and the last term drops out.

Each df/dx_k is a 3x1 vector, and it forms the k-th column of the 3x3 Jacobian. For example, the first column is given by:

d(f(X))/dx1 = (dA/dx1) * B * b + A * (dB/dx1) * b
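The column-by-column product rule can be sanity-checked numerically. A minimal NumPy sketch (the helper names make_matrices and jacobian are made up for this example, and the y's are treated as fixed constants):

```python
import numpy as np

def make_matrices(x, y):
    # Build A, B, b from the thread's example; y is a length-9 array
    # of constants y1..y9.
    x1, x2, x3 = x
    A = np.array([[x1 + y[0], y[3],      y[6]],
                  [y[1],      x2 + y[4], y[7]],
                  [y[2],      y[5],      x3 + y[8]]])
    B = np.array([[y[0],      x2 + y[3], x3 + y[6]],
                  [x1 + y[1], y[4],      y[7]],
                  [y[2],      y[5],      y[8]]])
    b = np.array([y[0], y[1], y[2]])
    return A, B, b

def jacobian(x, y):
    A, B, b = make_matrices(x, y)
    # dA/dx_k and dB/dx_k are constant 0/1 matrices marking where each
    # x_k appears; db/dx_k = 0 since b contains only y's.
    dA = [np.zeros((3, 3)) for _ in range(3)]
    dB = [np.zeros((3, 3)) for _ in range(3)]
    dA[0][0, 0] = dA[1][1, 1] = dA[2][2, 2] = 1.0
    dB[0][1, 0] = dB[1][0, 1] = dB[2][0, 2] = 1.0
    # Column k of the Jacobian is df/dx_k via the product rule.
    cols = [dA[k] @ B @ b + A @ dB[k] @ b for k in range(3)]
    return np.column_stack(cols)
```

Comparing the columns against finite differences of A @ B @ b confirms the decomposition; for a larger problem, the same pattern applies with more (still sparse) derivative matrices.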
 

1. What is the product rule for the partial derivative of a vector function?

The product rule for the partial derivative of a vector function states that the partial derivative of a product of two vector functions is equal to the first function multiplied by the partial derivative of the second function, plus the second function multiplied by the partial derivative of the first function.
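This identity is easy to verify symbolically for the dot product of two vector functions (a minimal SymPy sketch; the particular u and v are arbitrary examples):

```python
import sympy as sp

t = sp.symbols('t')
u = sp.Matrix([t, t**2, sp.Integer(1)])
v = sp.Matrix([sp.sin(t), t, sp.cos(t)])

# Product rule for the dot product: (u . v)' = u' . v + u . v'
lhs = sp.diff(u.dot(v), t)
rhs = sp.diff(u, t).dot(v) + u.dot(sp.diff(v, t))
```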

2. Why is the product rule important in vector calculus?

The product rule is important in vector calculus because it allows us to find the rate of change of a vector function that is a product of two other vector functions. This is useful in many applications, such as in physics and engineering.

3. How is the product rule used to find the gradient of a scalar function?

The gradient of a scalar function is a vector that points in the direction of the steepest increase of the function; it is formed by taking the partial derivatives with respect to each variable and combining them into a vector. The product rule enters when the scalar function is itself a product of two functions: the gradient of f*g equals f times the gradient of g plus g times the gradient of f.
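The gradient identity grad(f*g) = f*grad(g) + g*grad(f) can be checked symbolically (a minimal SymPy sketch; the particular f and g are arbitrary examples):

```python
import sympy as sp

x, y = sp.symbols('x y')
f = x**2 + y
g = sp.sin(x) * y

def grad(h):
    # Gradient: column vector of partials wrt each variable.
    return sp.Matrix([sp.diff(h, x), sp.diff(h, y)])

lhs = grad(f * g)
rhs = f * grad(g) + g * grad(f)
```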

4. Can the product rule be applied to functions with more than two variables?

Yes, the product rule can be applied to functions with any number of variables. The general formula for the product rule in this case is similar to the one used for functions with two variables, but it includes all the partial derivatives of the individual functions with respect to each variable.

5. Are there any special cases where the product rule does not apply?

The product rule requires both functions to be differentiable at the point in question. Also, each kind of vector product (dot product, cross product, matrix-vector product) has its own form of the product rule, and for non-commutative products such as the cross product and matrix multiplication the order of the factors must be preserved.
