An Introduction to Deep Learning and Modifying Code

ver_mathstats · Dec 4, 2022

The unmodified code is at the very bottom. For part 1 (loading a .mat file into the code), I put the load call just above the last function, but the loaded data has to replace the data section of the original code, so I have to extract it from the file. (I would attach the file, but I am unsure how to add a MATLAB file on Physics Forums, or whether that is even possible.) The original data section is:


x1 = [0.1,0.3,0.1,0.6,0.4,0.6,0.5,0.9,0.4,0.7];
x2 = [0.1,0.4,0.5,0.9,0.2,0.3,0.6,0.2,0.4,0.6];
y = [ones(1,5) zeros(1,5); zeros(1,5) ones(1,5)];

So the above part then becomes:


x1 = [X(:,1)];
x2 = [X(:,2)];
y = [zeros(1,42) ones(1,42); ones(1,42) zeros(1,42)];

As for the accuracy, I am unsure how to adjust my code to calculate accuracy = (number of points classified correctly)/(total number of points). I know I must modify the last function, and the obvious part is changing the 10 to 84, since the total number of points is 84. My issue is that I don't know how to compute the numerator, and this is where I am stuck. Any help would be appreciated, thank you.


load('dataset.mat');

function [costval,accuracy] = cost(W2,W3,W4,b2,b3,b4)
     accuracy = ?/84; %Not sure what the numerator should be
     costvec = zeros(84,1);
     for i = 1:84
         x =[x1(i);x2(i)];
         a2 = activate(x,W2,b2);
         a3 = activate(a2,W3,b3);
         a4 = activate(a3,W4,b4);
         costvec(i) = norm(y(:,i) - a4,2);
     end
     costval = norm(costvec,2)^2;
   end % of nested function
 end
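One practical note on the `load('dataset.mat');` line above: MATLAB functions that contain nested functions have a static workspace, so calling `load` with no output argument inside `netbp_full` is expected to fail with an "Attempt to add ... to a static workspace" error. A minimal sketch of a workaround follows, loading into a struct and doing so in the data section at the top rather than near the nested function. The contents of `dataset.mat` are an assumption here (an 84-by-2 matrix named `X`, as the post suggests), and the temporary file only exists so the sketch runs on its own.

```matlab
% Stand-in for dataset.mat so this sketch is self-contained.
% Hypothetical contents: an 84-by-2 matrix X of point coordinates.
fname = [tempname() '.mat'];
X = rand(84,2);
save(fname, 'X');
clear X

% Load into a struct instead of calling load with no output argument.
S = load(fname);
x1 = S.X(:,1)';    % row vectors, matching the layout of the original x1, x2
x2 = S.X(:,2)';
N = numel(x1);     % use N wherever the original code hard-codes 10

% One-hot targets in the ordering given in the post (assumed to match the file:
% first half of the points in one category, second half in the other).
y = [zeros(1,N/2) ones(1,N/2); ones(1,N/2) zeros(1,N/2)];

delete(fname);     % remove the stand-in file
```

With `N` defined this way, `randi(10)` in the training loop and the `10` in the cost function would become `randi(N)` and `N`, and the plotting ranges `1:5` and `6:10` would need the same treatment.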

The original code to be modified:

Matlab:

function netbp_full
%NETBP_FULL
% Extended version of netbp, with more graphics
%
%   Set up data for neural net test
%   Use backpropagation to train
%   Visualize results
%
% C F Higham and D J Higham, Aug 2017
%
%%%%%%% DATA %%%%%%%%%%%
% xcoords, ycoords, targets
x1 = [0.1,0.3,0.1,0.6,0.4,0.6,0.5,0.9,0.4,0.7];
x2 = [0.1,0.4,0.5,0.9,0.2,0.3,0.6,0.2,0.4,0.6];
y = [ones(1,5) zeros(1,5); zeros(1,5) ones(1,5)];

figure(1)
clf
a1 = subplot(1,1,1);
plot(x1(1:5),x2(1:5),'ro','MarkerSize',12,'LineWidth',4)
hold on
plot(x1(6:10),x2(6:10),'bx','MarkerSize',12,'LineWidth',4)
a1.XTick = [0 1];
a1.YTick = [0 1];
a1.FontWeight = 'Bold';
a1.FontSize = 16;
xlim([0,1])
ylim([0,1])
%print -dpng pic_xy.png
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% Initialize weights and biases
rng(5000);
W2 = 0.5*randn(2,2);
W3 = 0.5*randn(3,2);
W4 = 0.5*randn(2,3);
b2 = 0.5*randn(2,1);
b3 = 0.5*randn(3,1);
b4 = 0.5*randn(2,1);
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% Forward and Back propagate
% Pick a training point at random
eta = 0.05;
Niter = 1e6;
savecost = zeros(Niter,1);
for counter = 1:Niter
    k = randi(10);
    x = [x1(k); x2(k)];
    % Forward pass
    a2 = activate(x,W2,b2);
    a3 = activate(a2,W3,b3);
    a4 = activate(a3,W4,b4);
    % Backward pass
    delta4 = a4.*(1-a4).*(a4-y(:,k));
    delta3 = a3.*(1-a3).*(W4'*delta4);
    delta2 = a2.*(1-a2).*(W3'*delta3);
    % Gradient step
    W2 = W2 - eta*delta2*x';
    W3 = W3 - eta*delta3*a2';
    W4 = W4 - eta*delta4*a3';
    b2 = b2 - eta*delta2;
    b3 = b3 - eta*delta3;
    b4 = b4 - eta*delta4;
    % Monitor progress
    newcost = cost(W2,W3,W4,b2,b3,b4)   % display cost to screen
    savecost(counter) = newcost;
end

figure(2)
clf
semilogy([1:1e4:Niter],savecost(1:1e4:Niter),'b-','LineWidth',2)
xlabel('Iteration Number')
ylabel('Value of cost function')
set(gca,'FontWeight','Bold','FontSize',18)
print -dpng pic_cost.png

%%%%%%%%%%% Display shaded and unshaded regions
N = 500;
Dx = 1/N;
Dy = 1/N;
xvals = [0:Dx:1];
yvals = [0:Dy:1];
for k1 = 1:N+1
    xk = xvals(k1);
    for k2 = 1:N+1
        yk = yvals(k2);
        xy = [xk;yk];
        a2 = activate(xy,W2,b2);
        a3 = activate(a2,W3,b3);
        a4 = activate(a3,W4,b4);
        Aval(k2,k1) = a4(1);
        Bval(k2,k1) = a4(2);
     end
end
[X,Y] = meshgrid(xvals,yvals);

figure(3)
clf
a2 = subplot(1,1,1);
Mval = Aval>Bval;
contourf(X,Y,Mval,[0.5 0.5])
hold on
colormap([1 1 1; 0.8 0.8 0.8])
plot(x1(1:5),x2(1:5),'ro','MarkerSize',12,'LineWidth',4)
plot(x1(6:10),x2(6:10),'bx','MarkerSize',12,'LineWidth',4)
a2.XTick = [0 1];
a2.YTick = [0 1];
a2.FontWeight = 'Bold';
a2.FontSize = 16;
xlim([0,1])
ylim([0,1])
print -dpng pic_bdy_bp.png
  function [costval] = cost(W2,W3,W4,b2,b3,b4)
     costvec = zeros(10,1);
     for i = 1:10
         x =[x1(i);x2(i)];
         a2 = activate(x,W2,b2);
         a3 = activate(a2,W3,b3);
         a4 = activate(a3,W4,b4);
         costvec(i) = norm(y(:,i) - a4,2);
     end
     costval = norm(costvec,2)^2;
   end % of nested function
 end
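The code above calls `activate(x,W,b)` but the post does not show it. Since the backward pass multiplies by `a.*(1-a)`, which is the derivative of the logistic sigmoid, a plausible definition is the sigmoid applied to `W*x + b`. The sketch below is written under that assumption (it is not taken from the post) and checks the derivative identity numerically at one arbitrary input.

```matlab
% Assumed definition: logistic sigmoid applied to the affine map W*x + b.
activate = @(x,W,b) 1./(1+exp(-(W*x+b)));

% Arbitrary illustrative weights, bias and input (not from the thread).
W = [0.3 -0.2; 0.1 0.4];
b = [0.05; -0.1];
x = [0.6; 0.9];
a = activate(x,W,b);

% Compare a.*(1-a) with a central finite difference of the sigmoid at z = W*x+b.
sigma = @(z) 1./(1+exp(-z));
z = W*x + b;
h = 1e-6;
fd = (sigma(z+h) - sigma(z-h)) / (2*h);
err = max(abs(fd - a.*(1-a)));
fprintf('max difference between finite difference and a.*(1-a): %g\n', err);
```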

pbuk · Dec 5, 2022

Can you explain what norm(y(:,i) - a4,2) does, or at least what the vectors y(:,i) and a4 represent?

ver_mathstats · Dec 5, 2022

pbuk said:

Can you explain what norm(y(:,i) - a4,2) does, or at least what the vectors y(:,i) and a4 represent?

It's the cost evaluation (I included a picture of the formula at the very bottom). a2 would be the first hidden layer, a3 the second hidden layer, and a4 the linear layer. Here are comments from another example.

Matlab:

% Forward and Back propagate
% Pick a training point at random
alpha = 0.05;                 % Learning rate for BP
N_iter = 4000;                % Number of iterations
savecost = zeros(N_iter,1);   % Cost function values across iterations
for counter = 1:N_iter
    %     x = [x1n(k); x2n(k)];  % One single element of the input data set is used
    x = [x1n; x2n];              % The whole input data set is used
    % Forward pass
    H1 = activate(x,W1,b1,0);    % First hidden layer
    H2 = activate(H1,W2,b2,0);   % Second hidden layer
    H3 = activate(H2,W3,b3,1);   % Linear layer
    % Backward pass
    %     delta4 = a4.*(1-a4).*(a4-y);
    %     delta3 = a3.*(1-a3).*(W4'*delta4);
    %     delta2 = a2.*(1-a2).*(W3'*delta3);
    PE3 = (H3-y);                    % PE of the linear layer
    PE2 = (1-H2.^2).*(W3'*PE3);      % PE of the second hidden layer
    PE1 = (1-H1.^2).*(W2'*PE2);      % PE of the first hidden layer
    % Gradient step
    W1 = W1 - alpha*PE1*x';          % Update for the weights in the first hidden layer
    W2 = W2 - alpha*PE2*H1';         % Update for the weights in the second hidden layer
    W3 = W3 - 0.1*alpha*PE3*H2';     % Update for the weights in the linear layer
    b1 = b1 - alpha*sum(PE1,2);      % Update for the biases in the first hidden layer
    b2 = b2 - alpha*sum(PE2,2);      % Update for the biases in the second hidden layer
    b3 = b3 - 0.1*alpha*sum(PE3,2);  % Update for the biases in the linear layer
    % Monitor progress
    newcost = cost(W1,W2,W3,b1,b2,b3,counter,N_iter);   % Cost function value
    savecost(counter) = newcost;
end
%%%%%%%%%%%%%%%%%%%%%%%%%%%

[Attached image: Screen Shot 2022-12-05 at 10.10.41 AM.png (the cost function formula)]
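For reference, the quantity that the nested `cost` function in `netbp_full` computes can be read directly off the code. Writing $x^{(i)}$ for the $i$-th input point, $y^{(i)}$ for the column `y(:,i)`, $a^{[4]}(x^{(i)})$ for the network output `a4`, and $N$ for the number of points (10 in the original code), the code evaluates the following. The attached formula is not reproduced in the text, so it may differ from this by a constant factor.

```latex
\[
\mathrm{costvec}_i = \bigl\lVert y^{(i)} - a^{[4]}\bigl(x^{(i)}\bigr) \bigr\rVert_2 ,
\qquad
\mathrm{costval} = \lVert \mathrm{costvec} \rVert_2^{2}
 = \sum_{i=1}^{N} \bigl\lVert y^{(i)} - a^{[4]}\bigl(x^{(i)}\bigr) \bigr\rVert_2^{2} .
\]
```

So `norm(y(:,i) - a4,2)` is the Euclidean distance between the target column and the network output for point $i$, and the total cost is the sum of the squares of those distances.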

pbuk · Dec 5, 2022

ver_mathstats said:

It's the cost evaluation (I included a picture of the formula at the very bottom). a2 would be the first hidden layer, a3 the second hidden layer, and a4 the linear layer. Here are comments from another example.

I think you have misunderstood: I know the answer and I am trying to help you work it out. I'll try again:

Can you explain what the elements of the vectors y(:,i) and a4 represent?

ver_mathstats · Dec 5, 2022

pbuk said:

I think you have misunderstood: I know the answer and I am trying to help you work it out. I'll try again:

Can you explain what the elements of the vectors y(:,i) and a4 represent?

Based on the notes I am reading, a4 would represent the output. As for y(:,i), I know it's the ith column of y; I'm having a bit of trouble with what it represents, but I think y represents the target output. So the accuracy would involve these two?

pbuk · Dec 5, 2022

ver_mathstats said:

So the accuracy would involve these two?

Yes. What do you think a4 would look like if the network predicted Category B?

ver_mathstats · Dec 6, 2022

pbuk said:

Yes. What do you think a4 would look like if the network predicted Category B?

This is where I struggle and I am not 100% sure what it would look like.

pbuk · Dec 6, 2022

ver_mathstats said:

This is where I struggle and I am not 100% sure what it would look like.

Perhaps you could print out y(:,i) and a4 and see if that helps you.
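Following that hint, one way to turn the comparison of `y(:,i)` and `a4` into a count is sketched below. Each column of `y` is one-hot (a 1 in the row of the point's category), and the original code's boundary plot already classifies a point by which output is larger (`Mval = Aval>Bval`), so a point can be counted as correct when the largest entry of `a4` sits in the same row as the 1 in `y(:,i)`. This is only a sketch under assumptions: `activate` is taken to be a sigmoid layer (the thread does not show it), and the weights are untrained random values, so the printed accuracy is not meaningful in itself.

```matlab
% Assumed sigmoid layer; the thread calls activate(x,W,b) but never shows it.
activate = @(x,W,b) 1./(1+exp(-(W*x+b)));

% Data and one-hot targets copied from the original netbp_full code.
x1 = [0.1,0.3,0.1,0.6,0.4,0.6,0.5,0.9,0.4,0.7];
x2 = [0.1,0.4,0.5,0.9,0.2,0.3,0.6,0.2,0.4,0.6];
y = [ones(1,5) zeros(1,5); zeros(1,5) ones(1,5)];

% Untrained random weights with the same shapes as in the original code.
W2 = 0.5*randn(2,2);  W3 = 0.5*randn(3,2);  W4 = 0.5*randn(2,3);
b2 = 0.5*randn(2,1);  b3 = 0.5*randn(3,1);  b4 = 0.5*randn(2,1);

N = numel(x1);
ncorrect = 0;
for i = 1:N
    x = [x1(i); x2(i)];
    a2 = activate(x,W2,b2);
    a3 = activate(a2,W3,b3);
    a4 = activate(a3,W4,b4);
    [~, predicted] = max(a4);      % row of the larger network output
    [~, target] = max(y(:,i));     % row holding the 1 in the target column
    ncorrect = ncorrect + (predicted == target);
end
accuracy = ncorrect / N;
fprintf('accuracy = %d/%d = %.2f\n', ncorrect, N, accuracy);
```

Inside the nested `cost` function, the same two `max` calls and the counter could sit next to the `costvec(i)` line, with `accuracy = ncorrect/84` assigned after the loop rather than before it. The caller would then request both outputs, for example `[newcost, acc] = cost(W2,W3,W4,b2,b3,b4)`.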
