An Introduction to Deep Learning and Modifying Code

  • Context: Comp Sci
  • Thread starter: ver_mathstats
  • Tags: Code, Introduction

Discussion Overview

The discussion revolves around modifying deep learning code, specifically loading data from a MATLAB file and calculating accuracy in a neural-network context. Participants explore the implications of these modifications for the existing code structure and functionality.

Discussion Character

  • Technical explanation
  • Homework-related
  • Debate/contested

Main Points Raised

  • One participant discusses how to load a MATLAB file into the code and mentions the need to replace parts of the original data structure.
  • Another participant questions the calculation of accuracy, specifically how to determine the numerator for the accuracy formula.
  • Several participants seek clarification on the meaning of the vectors y(:,i) and a4, with some suggesting that y represents the target output.
  • There are repeated requests for explanations regarding the cost evaluation and the roles of different layers in the neural network.
  • One participant suggests printing out y(:,i) and a4 to better understand their values and implications for accuracy.

Areas of Agreement / Disagreement

Participants express uncertainty about the specifics of the accuracy calculation and the representation of the output vectors. There is no consensus on how to proceed with the modifications or the interpretation of certain elements within the code.

Contextual Notes

Participants have not reached a resolution on how to calculate the numerator for accuracy, and there are unresolved questions about the definitions and roles of the vectors involved in the cost evaluation.

ver_mathstats
Homework Statement
We are given code from the article "Deep Learning: An Introduction for Applied Mathematicians" https://epubs.siam.org/doi/epdf/10.1137/18M1165748 ; the code is reproduced below, in extended form. For homework, we are required to make three modifications: 1. load a .mat file into the code; 2. modify the cost function so that it also computes accuracy = (number of points classified correctly)/(total number of points); 3. stop training if an accuracy of 0.97 is reached, otherwise continue to Niter iterations.
Relevant Equations
Deep Learning
The code with no modifications is at the very bottom. For part 1, loading a .mat file into the code, I just put the load statement above the last function, but it is going to have to replace the data part of the original code, so I have to extract the data out of the file. (I would attach the file, but I am unsure how to add a MATLAB file on Physics Forums, or whether that is even possible.)

Matlab:
x1 = [0.1,0.3,0.1,0.6,0.4,0.6,0.5,0.9,0.4,0.7];
x2 = [0.1,0.4,0.5,0.9,0.2,0.3,0.6,0.2,0.4,0.6];
y = [ones(1,5) zeros(1,5); zeros(1,5) ones(1,5)];

So the above part then becomes:

Matlab:
x1 = X(:,1);
x2 = X(:,2);
y = [zeros(1,42) ones(1,42); ones(1,42) zeros(1,42)];
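One detail worth checking in this step (sketched here under the assumption that the .mat file defines an 84-by-2 matrix X, one point per row; both the name and shape are assumptions): X(:,1) is a column vector, while the original code indexes x1 and x2 as row vectors, so a transpose may be needed:

Matlab:
load('dataset.mat');  % assumed to define an 84-by-2 matrix X (hypothetical name/shape)
x1 = X(:,1)';         % transpose so x1 is a row vector, as in the original code
x2 = X(:,2)';         % likewise for the second coordinate
y  = [zeros(1,42) ones(1,42); ones(1,42) zeros(1,42)];  % one-hot targets, 42 points per class

Keeping x1 and x2 as row vectors means expressions like x1(i) behave exactly as in the original netbp_full code.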

As for the accuracy, I am unsure how to adjust my code to calculate accuracy = (number of points classified correctly)/(total number of points). I know I must modify the last function, and I know I will have to change the 10 to 84, but those are the obvious parts; the total number of points is 84. My issue is that I don't know how to determine the numerator, and this is where I am stuck. Any help would be appreciated, thank you.

Matlab:
load('dataset.mat');
function [costval,accuracy] = cost(W2,W3,W4,b2,b3,b4)
    accuracy = ?/84; % Not sure what the numerator should be
    costvec = zeros(84,1);
    for i = 1:84
        x = [x1(i);x2(i)];
        a2 = activate(x,W2,b2);
        a3 = activate(a2,W3,b3);
        a4 = activate(a3,W4,b4);
        costvec(i) = norm(y(:,i) - a4,2);
    end
    costval = norm(costvec,2)^2;
end % of nested function
end
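For reference, one common convention (an assumption here, not something settled in the thread) is to take the predicted class as the index of the largest entry of a4 and count a point as correct when that index matches the position of the 1 in the one-hot target y(:,i). A sketch of the numerator along those lines:

Matlab:
correct = 0;                      % numerator: points classified correctly
for i = 1:84
    x  = [x1(i); x2(i)];
    a2 = activate(x,W2,b2);
    a3 = activate(a2,W3,b3);
    a4 = activate(a3,W4,b4);
    [~,predicted] = max(a4);      % index of the largest network output
    [~,target]    = max(y(:,i));  % index of the 1 in the one-hot target
    if predicted == target
        correct = correct + 1;    % point i classified correctly
    end
end
accuracy = correct/84;

This loop could be folded into the existing cost function so that both costval and accuracy are computed in a single pass over the 84 points.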

The original code to be modified:

Matlab:
function netbp_full
%NETBP_FULL
% Extended version of netbp, with more graphics
%
%   Set up data for neural net test
%   Use backpropagation to train
%   Visualize results
%
% C F Higham and D J Higham, Aug 2017
%
%%%%%%% DATA %%%%%%%%%%%
% xcoords, ycoords, targets
x1 = [0.1,0.3,0.1,0.6,0.4,0.6,0.5,0.9,0.4,0.7];
x2 = [0.1,0.4,0.5,0.9,0.2,0.3,0.6,0.2,0.4,0.6];
y = [ones(1,5) zeros(1,5); zeros(1,5) ones(1,5)];

figure(1)
clf
a1 = subplot(1,1,1);
plot(x1(1:5),x2(1:5),'ro','MarkerSize',12,'LineWidth',4)
hold on
plot(x1(6:10),x2(6:10),'bx','MarkerSize',12,'LineWidth',4)
a1.XTick = [0 1];
a1.YTick = [0 1];
a1.FontWeight = 'Bold';
a1.FontSize = 16;
xlim([0,1])
ylim([0,1])
%print -dpng pic_xy.png
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% Initialize weights and biases
rng(5000);
W2 = 0.5*randn(2,2);
W3 = 0.5*randn(3,2);
W4 = 0.5*randn(2,3);
b2 = 0.5*randn(2,1);
b3 = 0.5*randn(3,1);
b4 = 0.5*randn(2,1);
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% Forward and Back propagate
% Pick a training point at random
eta = 0.05;
Niter = 1e6;
savecost = zeros(Niter,1);
for counter = 1:Niter
    k = randi(10);
    x = [x1(k); x2(k)];
    % Forward pass
    a2 = activate(x,W2,b2);
    a3 = activate(a2,W3,b3);
    a4 = activate(a3,W4,b4);
    % Backward pass
    delta4 = a4.*(1-a4).*(a4-y(:,k));
    delta3 = a3.*(1-a3).*(W4'*delta4);
    delta2 = a2.*(1-a2).*(W3'*delta3);
    % Gradient step
    W2 = W2 - eta*delta2*x';
    W3 = W3 - eta*delta3*a2';
    W4 = W4 - eta*delta4*a3';
    b2 = b2 - eta*delta2;
    b3 = b3 - eta*delta3;
    b4 = b4 - eta*delta4;
    % Monitor progress
    newcost = cost(W2,W3,W4,b2,b3,b4)   % display cost to screen
    savecost(counter) = newcost;
end

figure(2)
clf
semilogy([1:1e4:Niter],savecost(1:1e4:Niter),'b-','LineWidth',2)
xlabel('Iteration Number')
ylabel('Value of cost function')
set(gca,'FontWeight','Bold','FontSize',18)
print -dpng pic_cost.png

%%%%%%%%%%% Display shaded and unshaded regions
N = 500;
Dx = 1/N;
Dy = 1/N;
xvals = [0:Dx:1];
yvals = [0:Dy:1];
for k1 = 1:N+1
    xk = xvals(k1);
    for k2 = 1:N+1
        yk = yvals(k2);
        xy = [xk;yk];
        a2 = activate(xy,W2,b2);
        a3 = activate(a2,W3,b3);
        a4 = activate(a3,W4,b4);
        Aval(k2,k1) = a4(1);
        Bval(k2,k1) = a4(2);
     end
end
[X,Y] = meshgrid(xvals,yvals);

figure(3)
clf
a2 = subplot(1,1,1);
Mval = Aval>Bval;
contourf(X,Y,Mval,[0.5 0.5])
hold on
colormap([1 1 1; 0.8 0.8 0.8])
plot(x1(1:5),x2(1:5),'ro','MarkerSize',12,'LineWidth',4)
plot(x1(6:10),x2(6:10),'bx','MarkerSize',12,'LineWidth',4)
a2.XTick = [0 1];
a2.YTick = [0 1];
a2.FontWeight = 'Bold';
a2.FontSize = 16;
xlim([0,1])
ylim([0,1])
print -dpng pic_bdy_bp.png
  function [costval] = cost(W2,W3,W4,b2,b3,b4)
     costvec = zeros(10,1);
     for i = 1:10
         x =[x1(i);x2(i)];
         a2 = activate(x,W2,b2);
         a3 = activate(a2,W3,b3);
         a4 = activate(a3,W4,b4);
         costvec(i) = norm(y(:,i) - a4,2);
     end
     costval = norm(costvec,2)^2;
   end % of nested function
 end
 
Can you explain what norm(y(:,i) - a4,2) does, or at least what the vectors y(:,i) and a4 represent?
 
pbuk said:
Can you explain what norm(y(:,i) - a4,2) does, or at least what the vectors y(:,i) and a4 represent?
It's the cost evaluation (I included a picture of the formula at the very bottom): a2 would be the first hidden layer, a3 would be the second hidden layer, and a4 is the linear layer. Here are comments from another example.
Matlab:
% Forward and Back propagate

% Pick a training point at random

alpha = 0.05;%Learning rate for BP.

N_iter = 4000;%Number of iterations.

savecost = zeros(N_iter,1);%cost function values across iterations

for counter = 1:N_iter

    %     x = [x1n(k); x2n(k)];%one single element of the input data set is used

    x = [x1n; x2n];%The whole input data set is used

    % Forward pass

    H1 = activate(x,W1,b1,0);%First hidden layer

    H2 = activate(H1,W2,b2,0);%Second hidden layer

    H3 = activate(H2,W3,b3,1);%linear layer

    % Backward pass

    %     delta4 = a4.*(1-a4).*(a4-y);

    %     delta3 = a3.*(1-a3).*(W4'*delta4);

    %     delta2 = a2.*(1-a2).*(W3'*delta3);

    PE3 = (H3-y);%PE of the linear layer

    PE2 = (1-H2.^2).*(W3'*PE3);%PE of the second hidden layer

    PE1 = (1-H1.^2).*(W2'*PE2);%PE of the first hidden layer

    % Gradient step

    W1 = W1 - alpha*PE1*x';%update for the weights in the first hidden layer

    W2 = W2 - alpha*PE2*H1';%update for the weights in the second hidden layer

    W3 = W3 - 0.1*alpha*PE3*H2';%update for the weights in the linear layer

    b1 = b1 - alpha*sum(PE1,2);%update for the biases in the first hidden layer

    b2 = b2 - alpha*sum(PE2,2);%update for the biases in the second hidden layer

    b3 = b3 - 0.1*alpha*sum(PE3,2);%update for the biases in the linear layer

    % Monitor progress

    newcost = cost(W1,W2,W3,b1,b2,b3,counter,N_iter);   % Cost function value

    savecost(counter) = newcost;

end

%%%%%%%%%%%%%%%%%%%%%%%%%%%
[Attached screenshot: the cost formula referenced above]
 
ver_mathstats said:
It's the cost evaluation (I included a picture of the formula at the very bottom): a2 would be the first hidden layer, a3 would be the second hidden layer, and a4 is the linear layer. Here are comments from another example.
I think you have misunderstood: I know the answer and I am trying to help you work it out. I'll try again:

Can you explain what the elements of the vectors y(:,i) and a4 represent?
 
pbuk said:
I think you have misunderstood: I know the answer and I am trying to help you work it out. I'll try again:

Can you explain what the elements of the vectors y(:,i) and a4 represent?
a4 would represent the output, based on the notes I am reading. As for y(:,i), I know it's the ith column of y; I'm having a bit of trouble with what it represents, but I think y represents the target output. So the accuracy would involve these two?
 
ver_mathstats said:
So the accuracy would involve these two?
Yes. What do you think a4 would look like if the network predicted Category B?
 
pbuk said:
Yes. What do you think a4 would look like if the network predicted Category B?
This is where I struggle and I am not 100% sure what it would look like.
 
ver_mathstats said:
This is where I struggle and I am not 100% sure what it would look like.
Perhaps you could print out y(:,i) and a4 and see if that helps you.
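The thread leaves part 3 of the assignment (stopping once accuracy reaches 0.97) undiscussed. A minimal sketch of that stopping check, assuming the cost function has been extended to return accuracy as a second output, as in the earlier post:

Matlab:
for counter = 1:Niter
    % ... forward pass, backward pass and gradient step as in netbp_full ...
    [newcost,accuracy] = cost(W2,W3,W4,b2,b3,b4);  % assumes the two-output cost above
    savecost(counter) = newcost;
    if accuracy >= 0.97
        fprintf('Stopping early: accuracy %.3f at iteration %d\n', accuracy, counter);
        break  % stop training once the target accuracy is reached
    end
end

Checking the accuracy every iteration is expensive (it loops over all 84 points); in practice one might test it only every few hundred iterations.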
 
