sakian

The equation describing the network is:

[itex]y = \psi(\mathbf{W3} \times \psi(\mathbf{W2} \times \psi(\mathbf{W1} \times \mathbf{I})))[/itex]

Where:

y (scalar) is the output value

**W1** (2x2 matrix) are the 1st layer weights

**W2** (2x2 matrix) are the 2nd layer weights

**W3** (1x2 matrix) are the output layer weights

**I** (2x1 vector) is the input vector

[itex]\psi[/itex] is the activation function (log sigmoid)
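To make the shapes concrete, here is a minimal NumPy sketch of the forward pass under the definitions above. The random weight values, the seed, and the names `logsig`, `a1`, and `a2` are assumptions made for illustration only.

```python
import numpy as np

def logsig(z):
    """Log-sigmoid activation, applied elementwise."""
    return 1.0 / (1.0 + np.exp(-z))

# Illustrative random values; only the shapes come from the definitions above.
rng = np.random.default_rng(0)
W1 = rng.standard_normal((2, 2))  # 1st layer weights
W2 = rng.standard_normal((2, 2))  # 2nd layer weights
W3 = rng.standard_normal((1, 2))  # output layer weights
I = rng.standard_normal((2, 1))   # input vector

a1 = logsig(W1 @ I)   # shape (2, 1)
a2 = logsig(W2 @ a1)  # shape (2, 1)
y = logsig(W3 @ a2)   # shape (1, 1), i.e. a scalar

print(a1.shape, a2.shape, y.shape)
```

Every intermediate activation is a 2x1 column and the output is 1x1, so the forward pass itself is dimensionally consistent.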

I'm trying to differentiate the equation with respect to the weight matrices (using the chain rule), but I'm getting equations that don't work. When I differentiate with respect to **W1** I get:

[itex]\frac{dy}{d\mathbf{W1}} = \psi'(\mathbf{W3} \times \psi(\mathbf{W2} \times \psi(\mathbf{W1} \times \mathbf{I}))) \times \mathbf{W3} \times \psi'(\mathbf{W2} \times \psi(\mathbf{W1} \times \mathbf{I})) \times \mathbf{W2} \times \psi'(\mathbf{W1} \times \mathbf{I}) \times \mathbf{I}[/itex]

When I try to evaluate this, I get matrix dimension mismatches. Am I doing something wrong?
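For comparison, here is a hedged sketch of one shape-consistent way to write the chain rule for **W1**. It assumes that the activation derivative is applied elementwise, so it enters as an elementwise product rather than a matrix product, and that the weight matrices are transposed when the derivative is passed backwards. The names `dlogsig`, `forward`, `grad_W1`, and `numeric_grad_W1` are illustrative, and the result is checked against central finite differences rather than taken on trust.

```python
import numpy as np

def logsig(z):
    """Log-sigmoid activation, applied elementwise."""
    return 1.0 / (1.0 + np.exp(-z))

def dlogsig(z):
    """Derivative of the log-sigmoid, applied elementwise."""
    s = logsig(z)
    return s * (1.0 - s)

def forward(W1, W2, W3, I):
    """Return the pre-activations z1, z2, z3 and the output y."""
    z1 = W1 @ I
    z2 = W2 @ logsig(z1)
    z3 = W3 @ logsig(z2)
    return z1, z2, z3, logsig(z3)

def grad_W1(W1, W2, W3, I):
    """dy/dW1 as a 2x2 matrix, built from transposes and elementwise products."""
    z1, z2, z3, _ = forward(W1, W2, W3, I)
    d3 = dlogsig(z3)                # (1, 1)
    d2 = (W3.T @ d3) * dlogsig(z2)  # (2, 1), elementwise product with psi'
    d1 = (W2.T @ d2) * dlogsig(z1)  # (2, 1), elementwise product with psi'
    return d1 @ I.T                 # (2, 2) outer product with the input

def numeric_grad_W1(W1, W2, W3, I, eps=1e-6):
    """Central finite-difference estimate of dy/dW1, entry by entry."""
    g = np.zeros_like(W1)
    for i in range(W1.shape[0]):
        for j in range(W1.shape[1]):
            Wp = W1.copy()
            Wm = W1.copy()
            Wp[i, j] += eps
            Wm[i, j] -= eps
            y_plus = forward(Wp, W2, W3, I)[3].item()
            y_minus = forward(Wm, W2, W3, I)[3].item()
            g[i, j] = (y_plus - y_minus) / (2.0 * eps)
    return g

# Illustrative random values; only the shapes come from the post.
rng = np.random.default_rng(0)
W1 = rng.standard_normal((2, 2))
W2 = rng.standard_normal((2, 2))
W3 = rng.standard_normal((1, 2))
I = rng.standard_normal((2, 1))

analytic = grad_W1(W1, W2, W3, I)
numeric = numeric_grad_W1(W1, W2, W3, I)
print(np.max(np.abs(analytic - numeric)))  # largest entrywise discrepancy
```

The result has the same 2x2 shape as **W1**, which is what a derivative of a scalar with respect to a 2x2 matrix should have. Multiplying the factors left to right as plain matrix products does not give this shape.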