An Introduction to Deep Learning and Modifying Code

In summary, the conversation discusses modifying neural network code to calculate classification accuracy. The code loads a mat file, initializes weights and biases, and performs forward and backward passes; the goal is to train the network and visualize the results. The original poster is unsure how to calculate accuracy and asks for help. The code to be modified is a nested function that calculates the cost. The replies ask what the norm call in that function computes and what the vectors y(:,i) and a4 represent.
  • #1
ver_mathstats
Homework Statement
We are given code from the article "Deep Learning: An Introduction for Applied Mathematicians" (https://epubs.siam.org/doi/epdf/10.1137/18M1165748); an extended version of that code is included below. For homework, we are required to make these modifications: 1. load a mat file into the code; 2. modify the cost function so that it also returns accuracy = (number of points classified correctly)/(total number of points); 3. stop training if an accuracy of 0.97 is reached, otherwise continue to Niter iterations.
Relevant Equations
Deep Learning
The unmodified code is at the very bottom. For part 1 (loading a mat file into the code), I put the load call just above the last function, but the loaded data has to replace the data section of the original code, so I have to extract it from the file. (I would attach the file, but I am unsure how to add a MATLAB file on Physics Forums, or whether that is even possible.) The original data section is:

x1 = [0.1,0.3,0.1,0.6,0.4,0.6,0.5,0.9,0.4,0.7];
x2 = [0.1,0.4,0.5,0.9,0.2,0.3,0.6,0.2,0.4,0.6];
y = [ones(1,5) zeros(1,5); zeros(1,5) ones(1,5)];

So the above part then becomes:

x1 = X(:,1);
x2 = X(:,2);
y = [zeros(1,42) ones(1,42); ones(1,42) zeros(1,42)];
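One caveat with part 1: netbp_full contains a nested function, and MATLAB gives such functions a static workspace, so a bare load('dataset.mat') that creates variables at run time is likely to raise an error. A minimal sketch of loading into a struct instead is below. The real dataset.mat was not posted, so the variable name X, its 84-by-2 layout, and the temporary stand-in file are all assumptions made for illustration.

```matlab
% Stand-in for the course file: an 84-by-2 matrix saved under the name X.
% (The real dataset.mat was not posted, so the name and layout are assumptions.)
X = rand(84,2);
demo_file = [tempname '.mat'];   % hypothetical temporary file
save(demo_file,'X');
clear X

% Load into a struct instead of straight into the workspace.
data = load(demo_file);
x1 = data.X(:,1)';      % first coordinate of each point, as a row vector
x2 = data.X(:,2)';      % second coordinate of each point
npoints = numel(x1);    % 84 here; avoids hard-coding the count
y = [zeros(1,42) ones(1,42); ones(1,42) zeros(1,42)];   % targets as in the post

delete(demo_file);      % remove the stand-in file
```

Inside netbp_full the same pattern would read data = load('dataset.mat'), placed where the original data section is, and npoints could then replace the hard-coded 10 or 84.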

As for the accuracy, I am unsure how to adjust my code to calculate accuracy = (number of points classified correctly)/(total number of points). I know I must modify the last function, and I will have to change the 10 to 84, but those are the obvious parts. The total number of points is 84. My issue is that I don't know how to compute the numerator, and this is where I am stuck. Any help would be appreciated, thank you.

load('dataset.mat');

function [costval,accuracy] = cost(W2,W3,W4,b2,b3,b4)
    accuracy = ?/84; %Not sure what the numerator should be
    costvec = zeros(84,1);
    for i = 1:84
        x =[x1(i);x2(i)];
        a2 = activate(x,W2,b2);
        a3 = activate(a2,W3,b3);
        a4 = activate(a3,W4,b4);
        costvec(i) = norm(y(:,i) - a4,2);
    end
    costval = norm(costvec,2)^2;
end % of nested function
end
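One way the numerator could be counted is sketched below, assuming a point counts as correctly classified when the larger entry of a4 sits in the same row as the 1 in y(:,i). The matrix A of outputs is made up so that the snippet runs on its own; in cost() each column would come from the forward pass instead.

```matlab
% Made-up network outputs a4 for six points, one column per point.
A = [0.9 0.8 0.3 0.2 0.6 0.1;
     0.1 0.3 0.7 0.9 0.4 0.8];
% One-hot targets: first three points belong to row 1, last three to row 2.
y = [ones(1,3) zeros(1,3); zeros(1,3) ones(1,3)];

npoints = size(y,2);
ncorrect = 0;
for i = 1:npoints
    a4 = A(:,i);                 % in cost(), this comes from the forward pass
    [~,predicted] = max(a4);     % row of the largest output
    [~,target] = max(y(:,i));    % row holding the 1
    ncorrect = ncorrect + (predicted == target);
end
accuracy = ncorrect/npoints;
```

Here points 3 and 5 are misclassified, so ncorrect is 4 and accuracy is 4/6. The same three lines inside the existing for loop of cost() would accumulate the count alongside costvec.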

The original code to be modified:

Matlab:
function netbp_full
%NETBP_FULL
% Extended version of netbp, with more graphics
%
%   Set up data for neural net test
%   Use backpropagation to train
%   Visualize results
%
% C F Higham and D J Higham, Aug 2017
%
%%%%%%% DATA %%%%%%%%%%%
% xcoords, ycoords, targets
x1 = [0.1,0.3,0.1,0.6,0.4,0.6,0.5,0.9,0.4,0.7];
x2 = [0.1,0.4,0.5,0.9,0.2,0.3,0.6,0.2,0.4,0.6];
y = [ones(1,5) zeros(1,5); zeros(1,5) ones(1,5)];

figure(1)
clf
a1 = subplot(1,1,1);
plot(x1(1:5),x2(1:5),'ro','MarkerSize',12,'LineWidth',4)
hold on
plot(x1(6:10),x2(6:10),'bx','MarkerSize',12,'LineWidth',4)
a1.XTick = [0 1];
a1.YTick = [0 1];
a1.FontWeight = 'Bold';
a1.FontSize = 16;
xlim([0,1])
ylim([0,1])
%print -dpng pic_xy.png
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% Initialize weights and biases
rng(5000);
W2 = 0.5*randn(2,2);
W3 = 0.5*randn(3,2);
W4 = 0.5*randn(2,3);
b2 = 0.5*randn(2,1);
b3 = 0.5*randn(3,1);
b4 = 0.5*randn(2,1);
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% Forward and Back propagate
% Pick a training point at random
eta = 0.05;
Niter = 1e6;
savecost = zeros(Niter,1);
for counter = 1:Niter
    k = randi(10);
    x = [x1(k); x2(k)];
    % Forward pass
    a2 = activate(x,W2,b2);
    a3 = activate(a2,W3,b3);
    a4 = activate(a3,W4,b4);
    % Backward pass
    delta4 = a4.*(1-a4).*(a4-y(:,k));
    delta3 = a3.*(1-a3).*(W4'*delta4);
    delta2 = a2.*(1-a2).*(W3'*delta3);
    % Gradient step
    W2 = W2 - eta*delta2*x';
    W3 = W3 - eta*delta3*a2';
    W4 = W4 - eta*delta4*a3';
    b2 = b2 - eta*delta2;
    b3 = b3 - eta*delta3;
    b4 = b4 - eta*delta4;
    % Monitor progress
    newcost = cost(W2,W3,W4,b2,b3,b4)   % display cost to screen
    savecost(counter) = newcost;
end

figure(2)
clf
semilogy([1:1e4:Niter],savecost(1:1e4:Niter),'b-','LineWidth',2)
xlabel('Iteration Number')
ylabel('Value of cost function')
set(gca,'FontWeight','Bold','FontSize',18)
print -dpng pic_cost.png

%%%%%%%%%%% Display shaded and unshaded regions
N = 500;
Dx = 1/N;
Dy = 1/N;
xvals = [0:Dx:1];
yvals = [0:Dy:1];
for k1 = 1:N+1
    xk = xvals(k1);
    for k2 = 1:N+1
        yk = yvals(k2);
        xy = [xk;yk];
        a2 = activate(xy,W2,b2);
        a3 = activate(a2,W3,b3);
        a4 = activate(a3,W4,b4);
        Aval(k2,k1) = a4(1);
        Bval(k2,k1) = a4(2);
     end
end
[X,Y] = meshgrid(xvals,yvals);

figure(3)
clf
a2 = subplot(1,1,1);
Mval = Aval>Bval;
contourf(X,Y,Mval,[0.5 0.5])
hold on
colormap([1 1 1; 0.8 0.8 0.8])
plot(x1(1:5),x2(1:5),'ro','MarkerSize',12,'LineWidth',4)
plot(x1(6:10),x2(6:10),'bx','MarkerSize',12,'LineWidth',4)
a2.XTick = [0 1];
a2.YTick = [0 1];
a2.FontWeight = 'Bold';
a2.FontSize = 16;
xlim([0,1])
ylim([0,1])
print -dpng pic_bdy_bp.png
  function [costval] = cost(W2,W3,W4,b2,b3,b4)
     costvec = zeros(10,1);
     for i = 1:10
         x =[x1(i);x2(i)];
         a2 = activate(x,W2,b2);
         a3 = activate(a2,W3,b3);
         a4 = activate(a3,W4,b4);
         costvec(i) = norm(y(:,i) - a4,2);
     end
     costval = norm(costvec,2)^2;
   end % of nested function
 end
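For requirement 3 (stop once the accuracy reaches 0.97), a minimal sketch of an early exit from the training loop follows. It reuses the 10-point data and the update rule from the listing, and assumes activate is the sigmoid layer 1./(1+exp(-(W*x+b))), which is consistent with the a.*(1-a) factors in the backward pass but is not shown in the thread. The names target_accuracy, truth and niters_used are introduced only for illustration, and Niter is reduced so the sketch finishes quickly.

```matlab
% Sigmoid layer; assumed to match the activate function used in the thread.
activate = @(x,W,b) 1./(1+exp(-(W*x+b)));

% Data and targets from the original listing (10 points, 2 classes).
x1 = [0.1,0.3,0.1,0.6,0.4,0.6,0.5,0.9,0.4,0.7];
x2 = [0.1,0.4,0.5,0.9,0.2,0.3,0.6,0.2,0.4,0.6];
y = [ones(1,5) zeros(1,5); zeros(1,5) ones(1,5)];
X = [x1; x2];
npoints = size(X,2);

rng(5000);
W2 = 0.5*randn(2,2); W3 = 0.5*randn(3,2); W4 = 0.5*randn(2,3);
b2 = 0.5*randn(2,1); b3 = 0.5*randn(3,1); b4 = 0.5*randn(2,1);

eta = 0.05;
Niter = 5000;               % kept small for the sketch; the listing uses 1e6
target_accuracy = 0.97;     % threshold from the homework statement
accuracy = 0;
[~,truth] = max(y,[],1);    % true class (row index) of every point

for counter = 1:Niter
    k = randi(npoints);
    x = X(:,k);
    % Forward pass
    a2 = activate(x,W2,b2);
    a3 = activate(a2,W3,b3);
    a4 = activate(a3,W4,b4);
    % Backward pass
    delta4 = a4.*(1-a4).*(a4-y(:,k));
    delta3 = a3.*(1-a3).*(W4'*delta4);
    delta2 = a2.*(1-a2).*(W3'*delta3);
    % Gradient step
    W2 = W2 - eta*delta2*x';
    W3 = W3 - eta*delta3*a2';
    W4 = W4 - eta*delta4*a3';
    b2 = b2 - eta*delta2;
    b3 = b3 - eta*delta3;
    b4 = b4 - eta*delta4;
    % Accuracy over the whole data set, all columns at once
    % (W*X+b relies on implicit expansion, MATLAB R2016b or later)
    A4 = activate(activate(activate(X,W2,b2),W3,b3),W4,b4);
    [~,predicted] = max(A4,[],1);
    accuracy = mean(predicted == truth);
    if accuracy >= target_accuracy
        break               % stop early once the target is reached
    end
end
niters_used = counter;
fprintf('Stopped after %d iterations, accuracy %.2f\n', niters_used, accuracy);
```

With only 10 points the accuracy moves in steps of 0.1, so the threshold is met only when every point is classified correctly. Whether that happens within the reduced Niter depends on the random draws; if it does not, the loop simply runs to Niter, as the homework statement requires.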
 
  • #2
Can you explain what norm(y(:,i) - a4,2) does, or at least what the vectors y(:,i) and a4 represent?
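As a small numerical illustration of the call in question: norm(v,2) is the Euclidean length of v, so norm(y(:,i) - a4,2) measures how far the output a4 is from the target column. The vectors below are hypothetical values chosen only to make the arithmetic easy to follow.

```matlab
y_i = [1; 0];        % hypothetical target column y(:,i)
a4  = [0.9; 0.2];    % hypothetical network output for the same point

d = norm(y_i - a4, 2);                  % 2-norm of the residual
d_by_hand = sqrt(sum((y_i - a4).^2));   % the same quantity written out

% Squaring the 2-norm of the vector of per-point distances sums their squares,
% which is how costval is assembled in the listing.
costvec = [d; 0.5];                     % 0.5 is a made-up second distance
costval = norm(costvec, 2)^2;           % equals d^2 + 0.5^2
```

Here d is sqrt(0.05), and costval is 0.05 + 0.25 = 0.30.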
 
  • #3
pbuk said:
Can you explain what norm(y(:,i) - a4,2) does, or at least what the vectors y(:,i) and a4 represent?
It's the cost evaluation (I included a picture of the formula at the very bottom), a2 would be the first hidden layer, a3 would be the second hidden layer and a4 is the linear layer. Here are comments from another example.
Matlab:
% Forward and Back propagate
% Pick a training point at random
alpha = 0.05;   % Learning rate for BP
N_iter = 4000;  % Number of iterations
savecost = zeros(N_iter,1);   % Cost function values across iterations
for counter = 1:N_iter
    %     x = [x1n(k); x2n(k)];   % One single element of the input data set is used
    x = [x1n; x2n];   % The whole input data set is used
    % Forward pass
    H1 = activate(x,W1,b1,0);    % First hidden layer
    H2 = activate(H1,W2,b2,0);   % Second hidden layer
    H3 = activate(H2,W3,b3,1);   % Linear layer
    % Backward pass
    %     delta4 = a4.*(1-a4).*(a4-y);
    %     delta3 = a3.*(1-a3).*(W4'*delta4);
    %     delta2 = a2.*(1-a2).*(W3'*delta3);
    PE3 = (H3-y);                  % PE of the linear layer
    PE2 = (1-H2.^2).*(W3'*PE3);    % PE of the second hidden layer
    PE1 = (1-H1.^2).*(W2'*PE2);    % PE of the first hidden layer
    % Gradient step
    W1 = W1 - alpha*PE1*x';        % Update for the weights in the first hidden layer
    W2 = W2 - alpha*PE2*H1';       % Update for the weights in the second hidden layer
    W3 = W3 - 0.1*alpha*PE3*H2';   % Update for the weights in the linear layer
    b1 = b1 - alpha*sum(PE1,2);    % Update for the biases in the first hidden layer
    b2 = b2 - alpha*sum(PE2,2);    % Update for the biases in the second hidden layer
    b3 = b3 - 0.1*alpha*sum(PE3,2);   % Update for the biases in the linear layer
    % Monitor progress
    newcost = cost(W1,W2,W3,b1,b2,b3,counter,N_iter);   % Cost function value
    savecost(counter) = newcost;
end

%%%%%%%%%%%%%%%%%%%%%%%%%%%
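The backward pass in this second example multiplies by (1-H.^2), whereas the first listing uses a.*(1-a). That would be consistent with tanh hidden units here and sigmoid units in the first listing, although the four-argument activate is not shown in the thread, so this reading is an assumption. The sketch below only checks the two derivative identities numerically with central differences.

```matlab
z = linspace(-2, 2, 9);     % sample pre-activations
h = 1e-6;                   % step for central differences

% Sigmoid: derivative a.*(1-a), the factor in delta2..delta4 of the first listing.
sigmoid = @(t) 1./(1+exp(-t));
a = sigmoid(z);
dsig_fd = (sigmoid(z+h) - sigmoid(z-h))/(2*h);
err_sigmoid = max(abs(dsig_fd - a.*(1-a)));

% tanh: derivative 1-H.^2, the factor in PE1 and PE2 of the second listing.
H = tanh(z);
dtanh_fd = (tanh(z+h) - tanh(z-h))/(2*h);
err_tanh = max(abs(dtanh_fd - (1-H.^2)));

fprintf('max error: sigmoid %.2e, tanh %.2e\n', err_sigmoid, err_tanh);
```

Both errors should come out far below the size of the derivatives themselves, which indicates that each factor matches its activation function.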
[Attached image: Screen Shot 2022-12-05 at 10.10.41 AM.png, the cost function formula]
 
  • #4
ver_mathstats said:
It's the cost evaluation (I included a picture of the formula at the very bottom), a2 would be the first hidden layer, a3 would be the second hidden layer and a4 is the linear layer. Here are comments from another example.
I think you have misunderstood: I know the answer and I am trying to help you work it out. I'll try again:

Can you explain what the elements of the vectors y(:,i) and a4 represent?
 
  • #5
pbuk said:
I think you have misunderstood: I know the answer and I am trying to help you work it out. I'll try again:

Can you explain what the elements of the vectors y(:,i) and a4 represent?
a4 would represent the output, based on the notes I am reading. As for y(:,i), I know it's the ith column of y; I'm having a bit of trouble with what it represents, but I think y represents the target output. So the accuracy would involve these two?
 
  • #6
ver_mathstats said:
So the accuracy would involve these two?
Yes. What do you think a4 would look like if the network predicted Category B?
 
  • #7
pbuk said:
Yes. What do you think a4 would look like if the network predicted Category B?
This is where I struggle and I am not 100% sure what it would look like.
 
  • #8
ver_mathstats said:
This is where I struggle and I am not 100% sure what it would look like.
Perhaps you could print out y(:,i) and a4 and see if that helps you.
 

1. What is deep learning?

Deep learning is a subset of machine learning that involves training artificial neural networks to learn from large amounts of data. It is inspired by the structure and function of the human brain, and it allows computers to learn and make decisions without being explicitly programmed.

2. How is deep learning different from traditional machine learning?

Traditional machine learning algorithms often rely on manually engineered features, whereas deep learning models learn features directly from raw data. Deep learning also typically involves training on larger datasets and can handle more complex, unstructured data.

3. What is code modification in the context of deep learning?

Code modification in deep learning refers to making changes to the code of a deep learning model in order to improve its performance or adapt it to a specific task or dataset. This can involve changing the architecture of the model, adjusting hyperparameters, or adding new layers or techniques.

4. How can I get started with deep learning and modifying code?

To get started with deep learning and modifying code, it is important to have a strong understanding of programming fundamentals and some experience with machine learning. There are many online resources, tutorials, and courses available to help you learn the basics of deep learning and how to modify code for different purposes.

5. What are some common applications of deep learning?

Some common applications of deep learning include image and speech recognition, natural language processing, recommender systems, and autonomous vehicles. It is also being used in various industries such as healthcare, finance, and marketing for tasks such as fraud detection, predictive maintenance, and customer segmentation.
