Neural Network training BUT testing fails

In summary: the thread discusses training a basic neural network on triplets where the third bit carries no information. The poster trains on five labelled triplets, but the trained network gives unexpected results on the test inputs; changing the label of the last training point changes the test results, and the question is why the network fails on inputs such as [000]. The reply points out that the training set under-constrains the problem: several Boolean rules fit the training data equally well, so a larger training set would be needed to single out the intended rule.
  • #1
ChrisVer
Gold Member
Hi, I was trying to work on a basic NN training problem I found online. The training data are the triplets:
[101]->1 , [011]->1 , [001]->0 , [111]->0 , [100]->1
That is, the third bit carries no information: the output should be 0 when the first two bits are identical and 1 otherwise. The last point is there so the network also sees a case where the third bit is 0.

I train on these data and then try to apply what the NN learned to the test data:
[000], [010], [110]
Unfortunately I get 1 back for all of them... I assume the problem is that, from the last training point, it somehow learns that a third bit of 0 should map to 1: when I change that last label from 1 to 0, the test data also evaluate to 0.
So I moved one more data point, [110]->0, into the training set
and tested on [000] and [010]. Both still evaluate to 1. I don't understand why it fails so miserably at 000... any idea?

The code is given below (it's based on the basic NN code given in many online tutorials, but written as a class)...
Python:
import numpy as np
import matplotlib.pyplot as plt
class NN:
    def __init__(self, input_data , output_data=None, testing=None ):
        self.input_data = input_data
        self.testing = testing

        if not testing:
            self.output_data = output_data
        # randomly initialize weights
        np.random.seed(1)
        if testing:
            self.weights0 = testing[0]
            self.weights1 = testing[1]
        else:
            self.weights0 = 2*np.random.random([self.input_data.shape[1], self.input_data.shape[0]]) - 1.
            self.weights1 = 2*np.random.random([self.input_data.shape[0], 1]) - 1.

        self.errors = None
        self.xerrors = []

        self.results = {
                        "OutputTrain": None,
                        "Matrix0": None,
                        "Matrix1": None
                        }

    def sigmoid(self, x, derivative=False):
        if derivative: return x*(1-x)
        return 1/(1+np.exp(-x))

    def train(self, Nsteps):
        self.errors = np.zeros( (self.input_data.shape[0], Nsteps) )
        for step in range(Nsteps):
            l0 = self.input_data
            l1 = self.sigmoid( np.dot(l0, self.weights0) )  # (N x 3) . (3 x N) -> N x N hidden layer
            l2 = self.sigmoid( np.dot(l1, self.weights1) )  # (N x N) . (N x 1) -> N x 1 output

            if not self.testing:
                l2_err = self.output_data - l2
                delta_l2 = l2_err * self.sigmoid( l2 , derivative=True )
                l1_err = delta_l2.dot(self.weights1.T)
                delta_l1 = l1_err * self.sigmoid( l1 , derivative=True )

                for i in range(self.output_data.shape[0]):
                    self.errors[i][step] = l2_err[i, 0]
                self.xerrors.append(step)

                self.weights1 += l1.T.dot(delta_l2)
                self.weights0 += l0.T.dot(delta_l1)
        self.results["OutputTrain"]=l2
        self.results["Matrix1"] = self.weights1
        self.results["Matrix0"] = self.weights0    def summary(self):
        print("Training Results : ")
        print("\t Output data (trained) : ")
        print(self.results["OutputTrain"])
        print("\t Matrix 1 : ")
        print(self.results["Matrix0"])
        print("\t Matrix 2 : ")
        print(self.results["Matrix1"])

    def plot(self):
        x = np.array(self.xerrors)
        cols = {0: 'black', 1: 'r', 2: 'g', 3: 'b', 4: 'm', 5: 'y'}
        for i in range(self.output_data.shape[0]):
            plt.plot(x, np.array(self.errors[i]), cols[i], label="Entry %s" % i)
        plt.legend(loc='upper right', shadow=True, fontsize='x-large', framealpha=0.05)
        plt.gca().set_ylim([-2., 2.])
        plt.title("Error vs Ntrial")
        plt.show()

if __name__=="__main__":
    x = np.array([[0, 0, 1],
                  [0, 1, 1],
                  [1, 0, 0],
                  [1, 1, 0],
                  [1, 0, 1],
                  [1, 1, 1]])

    # output
    y = np.array([[0],
                  [1],
                  [1],
                  [0],
                  [1],
                  [0]])

    NeuralNetwork = NN(x,y)
    NeuralNetwork.train(25000)
    #NeuralNetwork.plot()
    #NeuralNetwork.summary()

    print("Test Data")
    data_test = np.array([[0,0,0],[0,1,0]])#,[1,1,0]])
    #isOne = (data_test.dot(NeuralNetwork.results["Matrix0"])).dot(NeuralNetwork.results["Matrix1"])
    #isOne = data_test.dot(NeuralNetwork.results["Matrix0"])
    #isOne = isOne.dot(NeuralNetwork.results["Matrix1"])
    #print(isOne)
    tester = NN(data_test, testing = [NeuralNetwork.results["Matrix0"], NeuralNetwork.results["Matrix1"]])
    tester.train(25000)
    #tester.summary()
    print(tester.results["OutputTrain"])
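
Side note on the testing step: since the weight matrices are fixed during testing, the whole evaluation reduces to a single forward pass. A minimal standalone version of that pass (the helper name predict is mine, not part of the class above):
Python:
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def predict(data, weights0, weights1):
    """One forward pass through the trained two-layer network."""
    hidden = sigmoid(data.dot(weights0))  # hidden-layer activations
    return sigmoid(hidden.dot(weights1))  # output-layer activations

# usage with the matrices produced by the script above:
# print(predict(data_test, NeuralNetwork.results["Matrix0"],
#               NeuralNetwork.results["Matrix1"]))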
 
  • #2
Your training dataset is compatible with the hypotheses "(A XOR B) OR NOT C" in the original case and "(A XOR B) AND NOT C" in the modified case, and probably something similar in the third case.
How is the NN supposed to know that you didn't want these, but "A XOR B"?

In general, training a neural net on such a small input space, with a single fixed label per training input, rarely works as hoped, because multiple hypotheses always remain consistent with the data.
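
To make the ambiguity concrete, here is a quick enumeration (a sketch of my own, not from the thread) showing that the intended rule and the competing hypothesis agree on all five training triplets and only disagree on untrained inputs:
Python:
# All eight 3-bit inputs, the intended rule, and one competing
# hypothesis that also fits the five training points exactly.
inputs = [(a, b, c) for a in (0, 1) for b in (0, 1) for c in (0, 1)]
xor_ab = lambda a, b, c: a ^ b                 # intended rule: A XOR B
or_notc = lambda a, b, c: (a ^ b) | (1 - c)    # (A XOR B) OR NOT C

train = {(1, 0, 1), (0, 1, 1), (0, 0, 1), (1, 1, 1), (1, 0, 0)}
for a, b, c in inputs:
    tag = "train" if (a, b, c) in train else "test "
    print(tag, (a, b, c), "A XOR B:", xor_ab(a, b, c),
          "  (A XOR B) OR NOT C:", or_notc(a, b, c))
The two rules differ exactly at [000] and [110], which are the inputs where the network was observed to fail.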
 

1. What is a neural network?

A neural network is a type of machine learning model inspired by the structure and function of the human brain. It consists of interconnected nodes or neurons that work together to process and analyze data, making it useful for tasks such as pattern recognition and prediction.

2. How is a neural network trained?

A neural network is trained by adjusting the weights and biases of its neurons based on a given dataset. This process, known as backpropagation, involves repeatedly feeding the data through the network, comparing the output to the desired output, and propagating the resulting error backwards to update the weights and biases by gradient descent.
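
As a concrete illustration (a self-contained toy example of mine), here is a single sigmoid neuron learning AND through this forward-compare-update loop:
Python:
import numpy as np

# Minimal backpropagation sketch: one sigmoid neuron learning AND
# by repeated forward pass, error comparison, and gradient updates.
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([[0], [0], [0], [1]], dtype=float)

rng = np.random.default_rng(0)
w = rng.standard_normal((2, 1))
b = 0.0
for _ in range(5000):
    out = 1.0 / (1.0 + np.exp(-(X.dot(w) + b)))  # forward pass
    err = y - out                                # compare to desired output
    delta = err * out * (1.0 - out)              # gradient through the sigmoid
    w += 0.5 * X.T.dot(delta)                    # update weights (learning rate 0.5)
    b += 0.5 * delta.sum()                       # update bias

print(np.round(out, 2))  # converges towards [[0], [0], [0], [1]]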

3. What does it mean when neural network training succeeds but testing fails?

This usually means the neural network has overfit the training data: it has learned the specific patterns and noise of the training set too well and cannot generalize to new data. This can happen when the network is too complex or when there is not enough diverse data to train on.

4. How can overfitting in neural networks be prevented?

Overfitting in neural networks can be prevented by techniques such as regularization, early stopping, and dropout. Regularization adds a penalty term to the loss function to discourage overly complex models; early stopping halts training once performance on held-out data stops improving; and dropout randomly disables a fraction of neurons during training so they cannot rely too heavily on one another.
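
For instance, a self-contained sketch of my own showing an L2 penalty in the gradient step and early stopping on a held-out validation split, for a small linear model:
Python:
import numpy as np

# Toy example: L2 regularization plus early stopping on a small
# linear-regression problem with a held-out validation split.
rng = np.random.default_rng(1)
X = rng.standard_normal((60, 5))
y = X[:, :1] + 0.1 * rng.standard_normal((60, 1))  # signal lives in column 0
Xtr, ytr, Xval, yval = X[:40], y[:40], X[40:], y[40:]

w = np.zeros((5, 1))
lr, lam = 0.05, 1e-2
best, stale = np.inf, 0
for step in range(2000):
    grad = Xtr.T.dot(Xtr.dot(w) - ytr) / len(Xtr)  # MSE gradient
    w -= lr * (grad + lam * w)                     # L2 penalty term: lam * w
    val = float(np.mean((Xval.dot(w) - yval) ** 2))
    if val < best - 1e-6:
        best, stale = val, 0
    else:
        stale += 1
        if stale > 25:                             # early stopping: validation stalled
            break
print("stopped at step", step, "validation MSE", round(val, 4))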

5. Are there other potential reasons for neural network testing failure?

Yes, there are other potential reasons for neural network testing failure, including underfitting, which occurs when the network is not complex enough to capture the patterns in the data, and vanishing or exploding gradients, which can happen when the gradient values become too small or too large during training, making it difficult for the network to learn.
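
As a rough illustration of the vanishing-gradient point (toy numbers of my own): the sigmoid derivative never exceeds 0.25, so the error signal can shrink geometrically with depth:
Python:
import numpy as np

# Upper bound on the gradient factor after n sigmoid layers:
# each layer multiplies the signal by at most 0.25.
layers = np.arange(1, 11)
print(0.25 ** layers)  # shrinks geometrically with depth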
