sakian
Jun21-11, 03:52 PM
I've been doing some work with neural networks lately, and I'm having trouble with this seemingly simple equation.

The equation describing the network is:
y = [itex]\psi(W_3 \, \psi(W_2 \, \psi(W_1 I)))[/itex]
Where:
y (scalar) is the output value
W1 (2x2 matrix) is the 1st-layer weight matrix
W2 (2x2 matrix) is the 2nd-layer weight matrix
W3 (1x2 matrix) is the output-layer weight matrix
I (2x1 vector) is the input vector
[itex]\psi[/itex] is the activation function (log sigmoid)
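For concreteness, here is a minimal NumPy sketch of the forward pass with the stated shapes (the weight values are arbitrary placeholders, not from the original post):

```python
import numpy as np

# Log-sigmoid activation, applied elementwise.
def psi(z):
    return 1.0 / (1.0 + np.exp(-z))

# Placeholder weights with the shapes given above.
W1 = np.array([[0.1, 0.2],
               [0.3, 0.4]])   # 2x2, 1st-layer weights
W2 = np.array([[0.5, 0.6],
               [0.7, 0.8]])   # 2x2, 2nd-layer weights
W3 = np.array([[0.9, 1.0]])   # 1x2, output-layer weights
I  = np.array([[1.0],
               [2.0]])        # 2x1 input vector

a1 = psi(W1 @ I)   # 2x1 first-layer activation
a2 = psi(W2 @ a1)  # 2x1 second-layer activation
y  = psi(W3 @ a2)  # 1x1, i.e. the scalar output
print(y.shape)     # (1, 1)
```

Every product here is conformable, so the forward pass itself is dimensionally consistent.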
I'm trying to differentiate the equation with respect to the weight matrices (using the chain rule), but I'm getting expressions that don't work. When I differentiate with respect to W1 I get:
dy/dW1 = [itex]\psi'(W_3 \, \psi(W_2 \, \psi(W_1 I))) \, W_3 \, \psi'(W_2 \, \psi(W_1 I)) \, W_2 \, \psi'(W_1 I) \, I[/itex]
When I try to evaluate this numerically, I get matrix dimension mismatches. Am I doing something wrong?
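To pin down where the shapes clash, one can build each factor of the attempted dy/dW1 and multiply them left to right; the sketch below (shapes only, placeholder weights, not the original poster's data) records the first non-conformable product:

```python
import numpy as np

def psi(z):
    return 1.0 / (1.0 + np.exp(-z))

def dpsi(z):
    s = psi(z)
    return s * (1.0 - s)  # derivative of the log sigmoid

# Placeholder weights with the shapes from the post.
W1 = np.ones((2, 2)); W2 = np.ones((2, 2)); W3 = np.ones((1, 2))
I  = np.ones((2, 1))

# Factors of the attempted dy/dW1, in left-to-right order.
factors = [dpsi(W3 @ psi(W2 @ psi(W1 @ I))),  # shape (1, 1)
           W3,                                # shape (1, 2)
           dpsi(W2 @ psi(W1 @ I)),            # shape (2, 1)
           W2,                                # shape (2, 2)
           dpsi(W1 @ I),                      # shape (2, 1)
           I]                                 # shape (2, 1)

prod = factors[0]
fail_at = None
for k, f in enumerate(factors[1:], start=1):
    try:
        prod = prod @ f
    except ValueError:
        fail_at = k  # first factor that cannot be multiplied in
        break

print(fail_at)  # 3: a (1, 1) product cannot left-multiply the 2x2 W2
```

Running this confirms the reported mismatch: the chained product breaks down partway through, consistent with the poster's observation that the expression cannot be evaluated as written.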