Matrix dimensions are not matching after differentiation

In summary, the original poster is differentiating a three-layer neural network equation with respect to its weight matrices using the chain rule, and the resulting expression produces matrix dimension mismatches when evaluated. A reply asks which definition of the derivative of a function with respect to a matrix is being used, and notes that Wikipedia has a talk page on matrix calculus but no article to go with it.
  • #1
sakian
I've been doing some work with neural networks lately and I'm having trouble with this seemingly simple equation.

The equation describing the network is:
[itex]y = \psi(W_3 \, \psi(W_2 \, \psi(W_1 I)))[/itex]

Where:
y (scalar) is the output value
W1 (2x2 matrix) holds the 1st layer weights
W2 (2x2 matrix) holds the 2nd layer weights
W3 (1x2 matrix) holds the output layer weights
I (2x1 vector) is the input vector
[itex]\psi[/itex] is the activation function (log sigmoid), applied elementwise

I'm trying to differentiate the equation with respect to the weight matrices (using the chain rule), but I'm getting expressions that don't work. When I differentiate with respect to W1 I get:

[itex]\frac{dy}{dW_1} = \psi'(W_3 \, \psi(W_2 \, \psi(W_1 I))) \, W_3 \, \psi'(W_2 \, \psi(W_1 I)) \, W_2 \, \psi'(W_1 I) \, I[/itex]

When I try to evaluate this, I get matrix dimension mismatches. Am I doing something wrong?
 
  • #2
I don't know the answer to your question, but I'm curious what definition you are using for the derivative of a function with respect to a matrix. Also, do you know a link that gives a rule for the derivative of a product of matrices with respect to a matrix?

I find it interesting that the current Wikipedia has a discussion page that brings up some of these issues ( http://en.wikipedia.org/wiki/Talk:Matrix_calculus ) but there is no article to go with it!
 

1. What does it mean when I receive an error message saying "Matrix dimensions are not matching after differentiation"?

In the context of this thread, it means that a chain-rule expression has been written as a product of factors whose shapes are not conformable, so the product cannot be evaluated. It usually indicates that the factors were combined in the wrong order, or with an ordinary matrix product where the chain rule actually calls for an elementwise product or a transpose.

2. How can I fix this error?

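Write down the shape of every intermediate quantity and check that each matrix product has matching inner dimensions: an m x n matrix can only multiply an n x p matrix. The derivative of an elementwise activation function enters as an elementwise (Hadamard) product or, equivalently, as a diagonal matrix, not as an ordinary product with a column vector. Under the common convention that the derivative of a scalar with respect to an m x n matrix is itself m x n, the final result should have the same shape as the weight matrix, which typically requires transposes and an outer product with that layer's input.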
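
For the network in the original post, the shape bookkeeping can be sketched as follows. This is one standard way to organise the chain rule (backpropagation), not necessarily the derivation the original poster had in mind. The intermediate names [itex]a_k[/itex], [itex]h_k[/itex] and [itex]\delta_k[/itex] are introduced here for illustration and do not appear in the thread, and the sketch assumes the convention that the derivative of a scalar with respect to a matrix has the same shape as that matrix. Here [itex]\odot[/itex] denotes the elementwise product.

[tex]a_1 = W_1 I, \quad h_1 = \psi(a_1), \quad a_2 = W_2 h_1, \quad h_2 = \psi(a_2), \quad a_3 = W_3 h_2, \quad y = \psi(a_3)[/tex]

[tex]\delta_3 = \psi'(a_3) \quad (1 \times 1)[/tex]

[tex]\delta_2 = (W_3^T \delta_3) \odot \psi'(a_2) \quad (2 \times 1)[/tex]

[tex]\delta_1 = (W_2^T \delta_2) \odot \psi'(a_1) \quad (2 \times 1)[/tex]

[tex]\frac{\partial y}{\partial W_3} = \delta_3 h_2^T \; (1 \times 2), \qquad \frac{\partial y}{\partial W_2} = \delta_2 h_1^T \; (2 \times 2), \qquad \frac{\partial y}{\partial W_1} = \delta_1 I^T \; (2 \times 2)[/tex]

Each gradient has the same shape as the weight matrix it belongs to. For the log sigmoid, [itex]\psi'(z) = \psi(z)(1 - \psi(z))[/itex].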

3. Why do the dimensions of my matrices need to match for differentiation?

The chain rule for vector-valued functions multiplies Jacobian matrices, and a matrix product is only defined when the number of columns of the left factor equals the number of rows of the right factor. If a factor has the wrong shape, the product is either undefined or yields something that is not the derivative being sought.
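
To make the conformability requirement concrete, here is a minimal Python sketch that traces shapes through a left-to-right product. The helper `product_shape` and the two shape lists are hypothetical names introduced for this illustration. The first list is a reading of the factor shapes in the expression from post #1, and the second replaces each elementwise derivative with a 2x2 diagonal matrix, which is an assumption about how the chain rule should be applied here.

```python
def product_shape(shapes):
    """Return the shape of a left-to-right matrix product of the given shapes.

    A 1x1 factor is treated as a scalar that can multiply anything, which
    mirrors how such an expression is usually evaluated on paper.
    Raises ValueError at the first pair of factors that is not conformable.
    """
    result = shapes[0]
    for nxt in shapes[1:]:
        if result == (1, 1):
            result = nxt                      # scalar times matrix
        elif nxt == (1, 1):
            pass                              # matrix times scalar
        elif result[1] != nxt[0]:
            raise ValueError(f"cannot multiply {result} by {nxt}")
        else:
            result = (result[0], nxt[1])      # (m x n)(n x p) -> (m x p)
    return result


# Factor shapes as written in post #1:
# psi'(a3), W3, psi'(a2), W2, psi'(a1), I
attempt = [(1, 1), (1, 2), (2, 1), (2, 2), (2, 1), (2, 1)]

# Same chain up to dy/da1, with each psi' written as a 2x2 diagonal matrix:
# psi'(a3), W3, diag(psi'(a2)), W2, diag(psi'(a1))
corrected = [(1, 1), (1, 2), (2, 2), (2, 2), (2, 2)]

try:
    product_shape(attempt)
except ValueError as err:
    print("attempt fails:", err)

print("dy/da1 has shape", product_shape(corrected))
```

The corrected chain gives a 1x2 row vector for the derivative of y with respect to the first layer's pre-activation. Transposing it and multiplying by the transposed input (a 2x1 by 1x2 outer product) gives a 2x2 result, the same shape as W1.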

4. Can I use matrices with different dimensions for differentiation?

Yes. The matrices in a chain-rule product routinely have different shapes; in the network above, W3 is 1x2 while W1 and W2 are 2x2. What matters is that adjacent factors are conformable for multiplication, which may require transposing a factor or writing an elementwise derivative as a diagonal matrix.

5. Are there any common mistakes that could lead to this error?

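Yes. Common mistakes include treating the elementwise derivative of the activation function as an ordinary matrix factor, omitting transposes, multiplying the factors in the wrong order (matrix multiplication is not commutative), and overlooking that the derivative with respect to a weight matrix is an outer product of a backpropagated vector with that layer's input.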
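
One way to catch such mistakes is to compare an analytic gradient against a finite-difference estimate. The sketch below does this in plain Python for the 2-2-1 network from post #1, assuming the log sigmoid is applied elementwise and that the gradient with respect to W1 should have the same 2x2 shape as W1. All function names and the numeric weight and input values are made up for illustration.

```python
import math


def sigmoid(z):
    """Log-sigmoid activation; its derivative is sigmoid(z) * (1 - sigmoid(z))."""
    return 1.0 / (1.0 + math.exp(-z))


def matvec(W, v):
    """Multiply a matrix (list of rows) by a vector (list)."""
    return [sum(w * x for w, x in zip(row, v)) for row in W]


def forward(W1, W2, W3, I):
    """Return the hidden activations h1, h2 and the scalar output y."""
    h1 = [sigmoid(a) for a in matvec(W1, I)]
    h2 = [sigmoid(a) for a in matvec(W2, h1)]
    y = sigmoid(matvec(W3, h2)[0])
    return h1, h2, y


def grad_W1(W1, W2, W3, I):
    """Analytic dy/dW1 as a 2x2 list of rows, via backpropagation."""
    h1, h2, y = forward(W1, W2, W3, I)
    d3 = y * (1.0 - y)                                   # psi'(a3), scalar
    # (W3^T d3) times psi'(a2), elementwise: a 2-vector
    d2 = [W3[0][j] * d3 * h2[j] * (1.0 - h2[j]) for j in range(2)]
    # (W2^T d2) times psi'(a1), elementwise: a 2-vector
    d1 = [sum(W2[j][k] * d2[j] for j in range(2)) * h1[k] * (1.0 - h1[k])
          for k in range(2)]
    # Outer product of d1 with the input gives the 2x2 gradient
    return [[d1[k] * I[m] for m in range(2)] for k in range(2)]


def numeric_grad_W1(W1, W2, W3, I, eps=1e-6):
    """Central finite-difference estimate of dy/dW1, one entry at a time."""
    grad = [[0.0, 0.0], [0.0, 0.0]]
    for r in range(2):
        for c in range(2):
            plus = [row[:] for row in W1]
            minus = [row[:] for row in W1]
            plus[r][c] += eps
            minus[r][c] -= eps
            y_plus = forward(plus, W2, W3, I)[2]
            y_minus = forward(minus, W2, W3, I)[2]
            grad[r][c] = (y_plus - y_minus) / (2.0 * eps)
    return grad


# Arbitrary example values (not from the thread)
W1 = [[0.1, -0.2], [0.4, 0.3]]
W2 = [[0.5, 0.6], [-0.7, 0.8]]
W3 = [[0.9, -1.0]]
I = [1.0, 2.0]

analytic = grad_W1(W1, W2, W3, I)
numeric = numeric_grad_W1(W1, W2, W3, I)
max_diff = max(abs(analytic[r][c] - numeric[r][c])
               for r in range(2) for c in range(2))
print("largest difference between analytic and numeric gradient:", max_diff)
```

If the two gradients agree to within the finite-difference error, the transposes and elementwise products are most likely in the right places. A large discrepancy in a single row or column often points to a missing transpose.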
