Comp Sci An Introduction to Deep Learning and Modifying Code

  • Thread starter: ver_mathstats
  • Tags: Code Introduction
AI Thread Summary
The discussion focuses on modifying a deep learning code to load data from a MATLAB file and calculate the accuracy of a neural network model. The user is attempting to replace original data arrays with data extracted from a .mat file and is unsure how to implement the accuracy calculation in the modified code. The accuracy formula requires determining the number of correctly classified points, which is currently unclear to the user. Additionally, there is a conversation about understanding the output vectors from the neural network and their relation to the target outputs. The thread emphasizes the need for clarity in the implementation of these modifications and calculations.
ver_mathstats
Homework Statement
We are given code from the article "Deep Learning: An Introduction for Applied Mathematicians" (https://epubs.siam.org/doi/epdf/10.1137/18M1165748); the code is below, in its extended version. For homework, we are required to make the following modifications: 1. load a .mat file into the code; 2. modify the cost function so that it also returns accuracy = (number of points classified correctly)/(total number of points); 3. stop training if an accuracy of 0.97 is reached, and otherwise continue up to Niter iterations.
Relevant Equations
Deep Learning
The unmodified code is at the very bottom. For part 1 (loading a .mat file into the code), I just put the load call above the last function, but it has to replace the data part of the original code, so I have to extract the data from the file. (I would attach the file, but I am unsure how to add a MATLAB file on Physics Forums, or whether that is even possible.)

x1 = [0.1,0.3,0.1,0.6,0.4,0.6,0.5,0.9,0.4,0.7];
x2 = [0.1,0.4,0.5,0.9,0.2,0.3,0.6,0.2,0.4,0.6];
y = [ones(1,5) zeros(1,5); zeros(1,5) ones(1,5)];

So the above part then becomes:

x1 = [X(:,1)];
x2 = [X(:,2)];
y = [zeros(1,42) ones(1,42); ones(1,42) zeros(1,42)];
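A minimal sketch of the loading step follows. It assumes the file stores the points in an 84-by-2 matrix named X (the name used in the post). Because the real file is not attached to the thread, the sketch first writes a made-up stand-in file (the name dataset_demo.mat is hypothetical) so that it runs on its own. The labelling of y is copied from the post and may or may not match the real data.

```matlab
% Stand-in for dataset.mat so the sketch is self-contained.
% The values in X are random placeholders, not the real data.
X = rand(84,2);
fname = fullfile(tempdir,'dataset_demo.mat');
save(fname,'X');

% Load into a struct instead of calling load(...) bare. Inside a function
% that contains nested functions the workspace is static, so a bare load
% may fail when it tries to create new variables there.
S = load(fname);
x1 = S.X(:,1)';       % first coordinates as a row vector
x2 = S.X(:,2)';       % second coordinates as a row vector
npts = numel(x1);     % number of points, instead of a hard-coded 84

% Targets as written in the post: one column per point.
half = npts/2;
y = [zeros(1,half) ones(1,half); ones(1,half) zeros(1,half)];
```

Using npts rather than the literal 84 means the loop bounds and randi(10) in the original code can be replaced by one variable.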

As for the accuracy, I am unsure how to adjust my code to calculate accuracy = (number of points classified correctly)/(total number of points). I know I must modify the last function, and that I will have to change the 10 to 84, but those are the obvious parts. I also know the total number of points is 84. My issue is that I don't know how to work out the numerator, and this is where I do not know how to proceed. Any help would be appreciated, thank you.

load('dataset.mat');
function [costval,accuracy] = cost(W2,W3,W4,b2,b3,b4)
    accuracy = ?/84; %Not sure what the numerator should be
    costvec = zeros(84,1);
    for i = 1:84
        x =[x1(i);x2(i)];
        a2 = activate(x,W2,b2);
        a3 = activate(a2,W3,b3);
        a4 = activate(a3,W4,b4);
        costvec(i) = norm(y(:,i) - a4,2);
    end
    costval = norm(costvec,2)^2;
end % of nested function
end

The original code to be modified:

Matlab:
function netbp_full
%NETBP_FULL
% Extended version of netbp, with more graphics
%
%   Set up data for neural net test
%   Use backpropagation to train
%   Visualize results
%
% C F Higham and D J Higham, Aug 2017
%
%%%%%%% DATA %%%%%%%%%%%
% xcoords, ycoords, targets
x1 = [0.1,0.3,0.1,0.6,0.4,0.6,0.5,0.9,0.4,0.7];
x2 = [0.1,0.4,0.5,0.9,0.2,0.3,0.6,0.2,0.4,0.6];
y = [ones(1,5) zeros(1,5); zeros(1,5) ones(1,5)];

figure(1)
clf
a1 = subplot(1,1,1);
plot(x1(1:5),x2(1:5),'ro','MarkerSize',12,'LineWidth',4)
hold on
plot(x1(6:10),x2(6:10),'bx','MarkerSize',12,'LineWidth',4)
a1.XTick = [0 1];
a1.YTick = [0 1];
a1.FontWeight = 'Bold';
a1.FontSize = 16;
xlim([0,1])
ylim([0,1])
%print -dpng pic_xy.png
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% Initialize weights and biases
rng(5000);
W2 = 0.5*randn(2,2);
W3 = 0.5*randn(3,2);
W4 = 0.5*randn(2,3);
b2 = 0.5*randn(2,1);
b3 = 0.5*randn(3,1);
b4 = 0.5*randn(2,1);
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% Forward and Back propagate
% Pick a training point at random
eta = 0.05;
Niter = 1e6;
savecost = zeros(Niter,1);
for counter = 1:Niter
    k = randi(10);
    x = [x1(k); x2(k)];
    % Forward pass
    a2 = activate(x,W2,b2);
    a3 = activate(a2,W3,b3);
    a4 = activate(a3,W4,b4);
    % Backward pass
    delta4 = a4.*(1-a4).*(a4-y(:,k));
    delta3 = a3.*(1-a3).*(W4'*delta4);
    delta2 = a2.*(1-a2).*(W3'*delta3);
    % Gradient step
    W2 = W2 - eta*delta2*x';
    W3 = W3 - eta*delta3*a2';
    W4 = W4 - eta*delta4*a3';
    b2 = b2 - eta*delta2;
    b3 = b3 - eta*delta3;
    b4 = b4 - eta*delta4;
    % Monitor progress
    newcost = cost(W2,W3,W4,b2,b3,b4)   % display cost to screen
    savecost(counter) = newcost;
end

figure(2)
clf
semilogy([1:1e4:Niter],savecost(1:1e4:Niter),'b-','LineWidth',2)
xlabel('Iteration Number')
ylabel('Value of cost function')
set(gca,'FontWeight','Bold','FontSize',18)
print -dpng pic_cost.png

%%%%%%%%%%% Display shaded and unshaded regions
N = 500;
Dx = 1/N;
Dy = 1/N;
xvals = [0:Dx:1];
yvals = [0:Dy:1];
for k1 = 1:N+1
    xk = xvals(k1);
    for k2 = 1:N+1
        yk = yvals(k2);
        xy = [xk;yk];
        a2 = activate(xy,W2,b2);
        a3 = activate(a2,W3,b3);
        a4 = activate(a3,W4,b4);
        Aval(k2,k1) = a4(1);
        Bval(k2,k1) = a4(2);
     end
end
[X,Y] = meshgrid(xvals,yvals);

figure(3)
clf
a2 = subplot(1,1,1);
Mval = Aval>Bval;
contourf(X,Y,Mval,[0.5 0.5])
hold on
colormap([1 1 1; 0.8 0.8 0.8])
plot(x1(1:5),x2(1:5),'ro','MarkerSize',12,'LineWidth',4)
plot(x1(6:10),x2(6:10),'bx','MarkerSize',12,'LineWidth',4)
a2.XTick = [0 1];
a2.YTick = [0 1];
a2.FontWeight = 'Bold';
a2.FontSize = 16;
xlim([0,1])
ylim([0,1])
print -dpng pic_bdy_bp.png
    function [costval] = cost(W2,W3,W4,b2,b3,b4)
        costvec = zeros(10,1);
        for i = 1:10
            x = [x1(i);x2(i)];
            a2 = activate(x,W2,b2);
            a3 = activate(a2,W3,b3);
            a4 = activate(a3,W4,b4);
            costvec(i) = norm(y(:,i) - a4,2);
        end
        costval = norm(costvec,2)^2;
    end % of nested function
end
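For part 3 of the homework statement (stop at an accuracy of 0.97, otherwise run to Niter), one possible shape for the training loop is sketched below. It is not the assignment's reference solution. It assumes that activate is the sigmoid, which is consistent with the a.*(1-a) factors in the backward pass, and that a point counts as correctly classified when the larger component of a4 sits in the same row as the 1 in y(:,i), the same rule the plotting code uses with Aval>Bval. The name target_acc is introduced here for illustration, and Niter is shortened so the sketch runs quickly.

```matlab
% Assumed activation: sigmoid applied to W*x+b
activate = @(x,W,b) 1./(1+exp(-(W*x+b)));

% The ten original training points and targets
x1 = [0.1,0.3,0.1,0.6,0.4,0.6,0.5,0.9,0.4,0.7];
x2 = [0.1,0.4,0.5,0.9,0.2,0.3,0.6,0.2,0.4,0.6];
y  = [ones(1,5) zeros(1,5); zeros(1,5) ones(1,5)];
npts = numel(x1);

rng(5000);
W2 = 0.5*randn(2,2); W3 = 0.5*randn(3,2); W4 = 0.5*randn(2,3);
b2 = 0.5*randn(2,1); b3 = 0.5*randn(3,1); b4 = 0.5*randn(2,1);

eta = 0.05;
Niter = 1e5;          % shortened from 1e6 for this sketch
target_acc = 0.97;    % threshold from the homework statement
accuracy = 0;

for counter = 1:Niter
    k = randi(npts);                      % pick a training point at random
    x = [x1(k); x2(k)];
    a2 = activate(x,W2,b2);               % forward pass
    a3 = activate(a2,W3,b3);
    a4 = activate(a3,W4,b4);
    delta4 = a4.*(1-a4).*(a4-y(:,k));     % backward pass
    delta3 = a3.*(1-a3).*(W4'*delta4);
    delta2 = a2.*(1-a2).*(W3'*delta3);
    W2 = W2 - eta*delta2*x';              % gradient step
    W3 = W3 - eta*delta3*a2';
    W4 = W4 - eta*delta4*a3';
    b2 = b2 - eta*delta2;
    b3 = b3 - eta*delta3;
    b4 = b4 - eta*delta4;

    % Accuracy over all points, evaluated in one batch (one column per point)
    A4 = activate(activate(activate([x1;x2],W2,b2),W3,b3),W4,b4);
    [~,predicted] = max(A4,[],1);         % row index of the larger output
    [~,target]    = max(y,[],1);          % row index of the 1 in each target
    accuracy = mean(predicted == target);
    if accuracy >= target_acc
        break                             % stop early once the threshold is met
    end
end
fprintf('Stopped after %d iterations with accuracy %.2f\n', counter, accuracy);
```

With only ten points the accuracy moves in steps of 0.1, so the 0.97 threshold is met only when all ten are classified correctly; with 84 points it corresponds to at least 82 correct.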
 
Can you explain what norm(y(:,i) - a4,2) does, or at least what the vectors y(:,i) and a4 represent?
 
pbuk said:
Can you explain what norm(y(:,i) - a4,2) does, or at least what the vectors y(:,i) and a4 represent?
It's the cost evaluation (I included a picture of the formula at the very bottom). a2 would be the first hidden layer, a3 the second hidden layer, and a4 the linear layer. Here are comments from another example.
Matlab:
% Forward and Back propagate
% Pick a training point at random
alpha = 0.05; % Learning rate for BP
N_iter = 4000; % Number of iterations
savecost = zeros(N_iter,1); % Cost function values across iterations
for counter = 1:N_iter
    %     x = [x1n(k); x2n(k)]; % one single element of the input data set is used
    x = [x1n; x2n]; % The whole input data set is used
    % Forward pass
    H1 = activate(x,W1,b1,0); % First hidden layer
    H2 = activate(H1,W2,b2,0); % Second hidden layer
    H3 = activate(H2,W3,b3,1); % Linear layer
    % Backward pass
    %     delta4 = a4.*(1-a4).*(a4-y);
    %     delta3 = a3.*(1-a3).*(W4'*delta4);
    %     delta2 = a2.*(1-a2).*(W3'*delta3);
    PE3 = (H3-y); % PE of the linear layer
    PE2 = (1-H2.^2).*(W3'*PE3); % PE of the second hidden layer
    PE1 = (1-H1.^2).*(W2'*PE2); % PE of the first hidden layer
    % Gradient step
    W1 = W1 - alpha*PE1*x'; % update for the weights in the first hidden layer
    W2 = W2 - alpha*PE2*H1'; % update for the weights in the second hidden layer
    W3 = W3 - 0.1*alpha*PE3*H2'; % update for the weights in the linear layer
    b1 = b1 - alpha*sum(PE1,2); % update for the biases in the first hidden layer
    b2 = b2 - alpha*sum(PE2,2); % update for the biases in the second hidden layer
    b3 = b3 - 0.1*alpha*sum(PE3,2); % update for the biases in the linear layer
    % Monitor progress
    newcost = cost(W1,W2,W3,b1,b2,b3,counter,N_iter); % Cost function value
    savecost(counter) = newcost;
end

%%%%%%%%%%%%%%%%%%%%%%%%%%%
[Attached image: Screen Shot 2022-12-05 at 10.10.41 AM.png, the cost function formula]
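The attached image is not reproduced in this text. As a reference point, the following is read directly off the nested cost function in the original netbp_full code, not transcribed from the image: costvec(i) stores the Euclidean norm of y(:,i) - a4 for the i-th point, and costval is the squared 2-norm of costvec. The symbols y_i and a^{[4]}(x_i) are notation assumed here for y(:,i) and for a4 evaluated at the i-th input.

```latex
\[
\texttt{costval}
  \;=\; \Bigl\lVert \bigl(\lVert y_1 - a^{[4]}(x_1)\rVert_2,\;\dots,\;\lVert y_{10} - a^{[4]}(x_{10})\rVert_2\bigr) \Bigr\rVert_2^{2}
  \;=\; \sum_{i=1}^{10} \bigl\lVert y_i - a^{[4]}(x_i) \bigr\rVert_2^{2}
\]
```

So the code sums the squared distances between each target column and the corresponding network output; it applies no averaging or factor of one half.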
 
ver_mathstats said:
It's the cost evaluation (I included a picture of the formula at the very bottom). a2 would be the first hidden layer, a3 the second hidden layer, and a4 the linear layer. Here are comments from another example.
I think you have misunderstood: I know the answer and I am trying to help you work it out. I'll try again:

Can you explain what the elements of the vectors y(:,i) and a4 represent?
 
pbuk said:
I think you have misunderstood: I know the answer and I am trying to help you work it out. I'll try again:

Can you explain what the elements of the vectors y(:,i) and a4 represent?
a4 would represent the output, based on the notes I am reading. As for y(:,i), I know it's the ith column of y. I'm having a bit of trouble with what it represents, but I think y is the target output. So the accuracy would involve these two?
 
ver_mathstats said:
So the accuracy would involve these two?
Yes. What do you think a4 would look like if the network predicted Category B?
 
pbuk said:
Yes. What do you think a4 would look like if the network predicted Category B?
This is where I struggle and I am not 100% sure what it would look like.
 
ver_mathstats said:
This is where I struggle and I am not 100% sure what it would look like.
Perhaps you could print out y(:,i) and a4 and see if that helps you.
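The suggestion to print y(:,i) and a4 can be tried with a short script like the one below. It is a sketch under assumptions: activate is taken to be the sigmoid (consistent with the a.*(1-a) terms in the backward pass), the weights are the untrained random initial values from the original code, and the predicted category is taken to be the row holding the larger component of a4, mirroring the Aval>Bval comparison in the plotting section. The variable ncorrect is a name introduced here.

```matlab
% Assumed activation: sigmoid applied to W*x+b
activate = @(x,W,b) 1./(1+exp(-(W*x+b)));

% The ten original training points and targets
x1 = [0.1,0.3,0.1,0.6,0.4,0.6,0.5,0.9,0.4,0.7];
x2 = [0.1,0.4,0.5,0.9,0.2,0.3,0.6,0.2,0.4,0.6];
y  = [ones(1,5) zeros(1,5); zeros(1,5) ones(1,5)];

% Untrained weights, initialised as in the original code
rng(5000);
W2 = 0.5*randn(2,2); W3 = 0.5*randn(3,2); W4 = 0.5*randn(2,3);
b2 = 0.5*randn(2,1); b3 = 0.5*randn(3,1); b4 = 0.5*randn(2,1);

ncorrect = 0;
for i = 1:numel(x1)
    x  = [x1(i); x2(i)];
    a4 = activate(activate(activate(x,W2,b2),W3,b3),W4,b4);
    [~,predicted] = max(a4);       % row of the larger output component
    [~,target]    = max(y(:,i));   % row of the 1 in the target column
    fprintf('i=%2d  y=[%d %d]  a4=[%.3f %.3f]  predicted=%d  target=%d\n', ...
            i, y(1,i), y(2,i), a4(1), a4(2), predicted, target);
    ncorrect = ncorrect + (predicted == target);   % count the matches
end
accuracy = ncorrect/numel(x1);
fprintf('accuracy = %d/%d = %.2f\n', ncorrect, numel(x1), accuracy);
```

Each target column is either [1;0] or [0;1], while a4 holds two values between 0 and 1, so comparing which row is largest in each gives a yes/no match per point. The count of matches is one candidate for the numerator asked about in the first post.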
 