Single layer neural network. What am I doing wrong?

GProgramer · Dec 27, 2011

I'm trying to implement this example: http://www.cs.bham.ac.uk/~jxb/INC/l3.pdf (page 15).
I'm using iterative training, but the results always converge to a steady error that is too large to be acceptable: the outputs cluster around 0 when they should be close to -1 and +1.

I don't know whether there's something wrong with the code or whether I've misunderstood the training concept.

Code:

	close all;clc;
	M=3;N=1;

	X=[-1 1.0 0.1;-1 2.0 0.2; -1 0.1 0.3; -1 2.0 0.3; -1 0.2 0.4; -1 3.0 0.4; -1 0.1 0.5; -1 1.5 0.5; -1 0.5 0.6; -1 1.6 0.7];
	X=X';
	d=[-1;-1;1;-1;1;-1;1;-1;1;1];

	Wp=rand([M,N]);
	Wp=Wp'/sum(Wp(:));  % theta is 1 so sum of Wp and W needs to be <1
	W=rand([M,N]);
	W=W'/sum(W(:));
	V1=zeros(1,10);  %Pre allocating for speed
	Y1=zeros(1,10);
	e=zeros(1,10);

	while(1)
		i=randi(length(X),1);

		%---------------Feed forward---------------%
		V1(i)=W*X(:,i);
		Y1(i)=tanh(V1(i)/2);
		e(i)=d(i)-Y1(i);

		%------------Backward propagation---------%
		delta1=e(i)*0.5*(1+Y1(i))*(1-Y1(i));

		Wn(1,1)=W(1,1) + 0.1*(W(1,1)-Wp(1,1)) + 0.1*delta1*Y1(i);
		Wn(1,2)=W(1,2) + 0.1*(W(1,2)-Wp(1,2)) + 0.1*delta1*Y1(i);
		Wn(1,3)=W(1,3) + 0.1*(W(1,3)-Wp(1,3)) + 0.1*delta1*Y1(i);

		Wp=W;
		W=Wn;

		figure(1);
		stem(Y1);
		axis([1 10 -1 1]);
		drawnow;
	end

D H · Dec 27, 2011

First off, your backprop algorithm is just wrong. Do you have a reference for the algorithm you are using?

Even if you correct your update algorithm, a single layer perceptron is going to have a very, very hard time with this dataset. Make a scatter plot of this dataset. For example, make a graph with mass on the x-axis and speed on the y-axis. Mark each fighter with an F and each bomber with a B. There are three clusters. Near the y-axis there's a cluster of four light, fast fighters. Near the x-axis there's a cluster of four slow, heavy bombers. The third cluster is going to be problematic for a single layer perceptron: the fighter with mass=1.6, speed=0.7 is very similar to the bomber with mass=1.5, speed=0.5.
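
The scatter plot described above could be produced with something like the following sketch. The mass and speed values are columns 2 and 3 of X in the original post. The mapping +1 = fighter, -1 = bomber is inferred from the description above rather than stated in the posted code, and the axis limits and colours are arbitrary choices.

```matlab
% Columns 2 and 3 of X from the original post, in the same sample order.
mass  = [1.0 2.0 0.1 2.0 0.2 3.0 0.1 1.5 0.5 1.6];
speed = [0.1 0.2 0.3 0.3 0.4 0.4 0.5 0.5 0.6 0.7];
d     = [-1 -1 1 -1 1 -1 1 -1 1 1];   % assumption: +1 = fighter, -1 = bomber

isFighter = (d == 1);

figure;
hold on;
% Draw a letter at each data point instead of a marker.
text(mass(isFighter),  speed(isFighter),  'F', 'Color', 'b', 'HorizontalAlignment', 'center');
text(mass(~isFighter), speed(~isFighter), 'B', 'Color', 'r', 'HorizontalAlignment', 'center');
axis([0 3.5 0 0.8]);   % text() does not rescale the axes, so set limits by hand
xlabel('mass');
ylabel('speed');
hold off;
```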

Adding a two-node hidden layer makes this problem much more amenable to a backprop neural network.
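
As a rough illustration of that suggestion, here is a minimal sketch of a network with two hidden neurons trained by backpropagation on the same data. It is not code from this thread: the starting weights, the epoch count, the cyclic sample order, and the names W1, W2, deltaO and deltaH are all assumptions made for the example. The activation tanh(v/2), the -1 bias input and the learning rate 0.1 follow the original post.

```matlab
% Same data as the original post: rows of X are bias (-1), mass, speed.
X = [-1 1.0 0.1; -1 2.0 0.2; -1 0.1 0.3; -1 2.0 0.3; -1 0.2 0.4; ...
     -1 3.0 0.4; -1 0.1 0.5; -1 1.5 0.5; -1 0.5 0.6; -1 1.6 0.7]';   % 3 x 10
d = [-1 -1 1 -1 1 -1 1 -1 1 1];                                      % 1 x 10

eta = 0.1;                             % learning rate, as in the original post
% Assumption: small fixed starting weights so that the run is repeatable.
W1 = [0.1 -0.2 0.3; -0.3 0.2 0.1];     % hidden layer: 2 neurons x (bias, mass, speed)
W2 = [0.2 -0.1 0.3];                   % output layer: 1 neuron  x (bias, h1, h2)

for epoch = 1:5000                     % assumption: fixed number of sweeps
    for i = 1:size(X, 2)
        % Forward pass
        x  = X(:, i);
        h  = tanh(W1 * x / 2);         % 2 x 1 hidden outputs
        hb = [-1; h];                  % prepend the bias input for the output layer
        y  = tanh(W2 * hb / 2);

        % Backward pass; the derivative of tanh(v/2) is 0.5*(1+y)*(1-y)
        e      = d(i) - y;
        deltaO = e * 0.5 * (1 + y) * (1 - y);
        deltaH = (W2(2:3)' * deltaO) .* (0.5 * (1 + h) .* (1 - h));   % 2 x 1

        % Each weight moves by eta * (delta of its neuron) * (input on that weight)
        W2 = W2 + eta * deltaO * hb';
        W1 = W1 + eta * deltaH * x';
    end
end

% Network outputs for all ten samples after training
H = tanh(W1 * X / 2);                          % 2 x 10
Y = tanh(W2 * [-ones(1, size(X, 2)); H] / 2);  % 1 x 10
```

Comparing Y with d, for example with stem(Y) as in the original post, shows how close the trained outputs come to the targets.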

GProgramer · Dec 27, 2011

Thank you for your detailed reply. I realized my mistake was using the output Y1 instead of the input X in the weight update.
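
A minimal sketch of the loop with that correction applied. The learning rate and momentum values (both 0.1) are the ones from the original code. The zero starting weights, the fixed epoch count and the cyclic sample order are assumptions made so the example is repeatable; the original used rand, while(1) and randi.

```matlab
% Training data from the original post: rows of X are bias (-1), mass, speed.
X = [-1 1.0 0.1; -1 2.0 0.2; -1 0.1 0.3; -1 2.0 0.3; -1 0.2 0.4; ...
     -1 3.0 0.4; -1 0.1 0.5; -1 1.5 0.5; -1 0.5 0.6; -1 1.6 0.7]';   % 3 x 10
d = [-1; -1; 1; -1; 1; -1; 1; -1; 1; 1];

eta   = 0.1;           % learning rate (value from the original code)
alpha = 0.1;           % momentum coefficient (value from the original code)
W     = zeros(1, 3);   % assumption: deterministic start instead of rand
Wp    = W;             % previous weights, used by the momentum term

for epoch = 1:2000             % assumption: fixed number of sweeps
    for i = 1:size(X, 2)       % assumption: cyclic order instead of randi
        %---------------Feed forward---------------%
        x = X(:, i);
        y = tanh(W * x / 2);
        e = d(i) - y;

        %------------Backward propagation---------%
        delta = e * 0.5 * (1 + y) * (1 - y);   % derivative of tanh(v/2) times the error

        % The gradient term uses the input x, not the output y.
        Wn = W + alpha * (W - Wp) + eta * delta * x';
        Wp = W;
        W  = Wn;
    end
end

Y   = tanh(W * X / 2);         % outputs for all ten samples, 1 x 10
sse = sum((d' - Y) .^ 2);      % summed squared error over the training set
```

Compared with the original loop, the only change to the update is that eta*delta multiplies the input vector x' rather than the scalar output Y1(i), which also lets the three per-weight lines collapse into one vector expression.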

It's now more or less working, although a couple of the training samples still have an error of around 30%.

This particular data set isn't what I'm ultimately aiming for, which is why I didn't bother plotting it in detail.

I realize that a single layer perceptron isn't enough; I only built one to test whether the algorithm works. I am currently building a network with two hidden layers and an adjustable number of neurons in each, and I will test it on a sampled cosine function, plotting the network output and the desired function on the same graph to compare them.

May I ask what exactly is wrong with the algorithm? If you mean the hidden-layer deltas, I haven't added them yet.

Thank you again for the great reply!
