Neural network trains, but testing fails

SUMMARY

The forum discussion centers on a neural network (NN) training issue where the model fails to accurately predict outputs for specific test data. The training dataset consists of triplets with defined outputs, but the NN incorrectly predicts outputs for test cases like [000] and [010]. The user identifies that the NN learns incorrect associations due to the limited training data, leading to erroneous predictions. The discussion highlights the challenges of training NNs on small datasets with fixed results, emphasizing the need for diverse training examples to avoid multiple conflicting hypotheses.

PREREQUISITES
  • Understanding of neural network architecture and training processes
  • Familiarity with Python programming and libraries such as NumPy and Matplotlib
  • Knowledge of activation functions, specifically the sigmoid function
  • Basic concepts of overfitting and generalization in machine learning
NEXT STEPS
  • Explore techniques to augment training datasets for neural networks
  • Learn about regularization methods to prevent overfitting in neural networks
  • Investigate alternative activation functions and their impact on training outcomes
  • Study the implications of hypothesis space in machine learning and how to manage it
USEFUL FOR

Data scientists, machine learning engineers, and developers working with neural networks who are troubleshooting training and testing discrepancies in model predictions.

ChrisVer (Science Advisor)
Hi, I was trying to work on a basic NN training problem I found online. The training data are the triplets:
[101]->1 , [011]->1 , [001]->0 , [111]->0 , [100]->1
Meaning that the 3rd bit carries no information, while the output should be 0 if the first two bits are identical and 1 otherwise. The last data point is there to also let the network learn what happens when the 3rd argument is 0.

I train on these data and then apply what the NN learned to the test data:
[000], [010], [110]
Unfortunately I get 1 back for all of them... I assume the problem is that the network somehow learns from that last training point that if the 3rd argument is 0, the output should be 1. Indeed, when I changed that last ->1 to ->0, the test data also evaluated to 0.
So I moved one more data point into the training set, [110]->0,
and tested on [000] and [010]... In that case both evaluate to 1. I don't understand why it fails so miserably on [000]... any idea?

The code is given below (it's based on the basic NN code given in many online tutorials, but written as a class)...
Python:
import numpy as np
import matplotlib.pyplot as plt
class NN:
    def __init__(self, input_data , output_data=None, testing=None ):
        self.input_data = input_data
        self.testing = testing

        if not testing:
            self.output_data = output_data
        # randomly initialize weights
        np.random.seed(1)
        if testing:
            self.weights0 = testing[0]
            self.weights1 = testing[1]
        else:
            # weights initialized in [-1, 1)
            self.weights0 = 2*np.random.random([self.input_data.shape[1], self.input_data.shape[0]]) - 1.
            self.weights1 = 2*np.random.random([self.input_data.shape[0], 1]) - 1.

        self.errors = None
        self.xerros = []

        self.results = {
                        "OutputTrain": None,
                        "Matrix0": None,
                        "Matrix1": None
                        }

    def sigmoid(self, x, derivative=False):
        # when derivative=True, x is assumed to already be a sigmoid output
        if derivative: return x*(1-x)
        return 1/(1+np.exp(-x))

    def train(self, Nsteps , test= True):
        self.errors = np.zeros( (self.input_data.shape[0], Nsteps) )
        for step in range(Nsteps):
            l0 = self.input_data
            l1 = self.sigmoid( np.dot(l0, self.weights0 ) ) # (n x 3) x (3 x n) = n x n matrix
            l2 = self.sigmoid( np.dot(l1, self.weights1 ) ) # (n x n) x (n x 1) = n x 1 matrix (output)

            if not self.testing:
                l2_err = self.output_data - l2
                delta_l2 = l2_err * self.sigmoid( l2 , derivative=True )
                l1_err = delta_l2.dot(self.weights1.T)
                delta_l1 = l1_err * self.sigmoid( l1 , derivative=True )

                for i in range(self.output_data.shape[0]):
                    self.errors[i][step] = l2_err[i]
                self.xerros.append(step)

                self.weights1 += l1.T.dot(delta_l2)
                self.weights0 += l0.T.dot(delta_l1)
        self.results["OutputTrain"] = l2
        self.results["Matrix1"] = self.weights1
        self.results["Matrix0"] = self.weights0

    def summary(self):
        print("Training Results : ")
        print("\t Output data (trained) : ")
        print(self.results["OutputTrain"])
        print("\t Matrix 1 : ")
        print(self.results["Matrix0"])
        print("\t Matrix 2 : ")
        print(self.results["Matrix1"])

    def plot(self):
        x = np.array(self.xerros)
        cols = {0:'black',1:'r',2:'g',3:'b',4:'m',5:'y'}
        for i in range(self.output_data.shape[0]):

            plt.plot(x, np.array(self.errors[i]), cols[i], label="Entry %s"%i)
            legend = plt.legend(loc='upper right', shadow=True, fontsize='x-large', framealpha=0.05)
            axes = plt.gca()
            axes.set_ylim([-2., 2.])
            #legend.get_frame().set_facecolor('#00FFCC')
        plt.title("Error vs Ntrial")
        plt.show()

if __name__=="__main__":
    x = np.array([[0, 0, 1],
                  [0, 1, 1],
                  [1, 0, 0],
                  [1, 1, 0],
                  [1, 0, 1],
                  [1, 1, 1]])

    # output
    y = np.array([[0],
                  [1],
                  [1],
                  [0],
                  [1],
                  [0]])

    NeuralNetwork = NN(x,y)
    NeuralNetwork.train(25000)
    #NeuralNetwork.plot()
    #NeuralNetwork.summary()

    print("Test Data")
    data_test = np.array([[0,0,0],[0,1,0]])#,[1,1,0]])
    #isOne = (data_test.dot(NeuralNetwork.results["Matrix0"])).dot(NeuralNetwork.results["Matrix1"])
    #isOne = data_test.dot(NeuralNetwork.results["Matrix0"])
    #isOne = isOne.dot(NeuralNetwork.results["Matrix1"])
    #print(isOne)
    tester = NN(data_test, testing = [NeuralNetwork.results["Matrix0"], NeuralNetwork.results["Matrix1"]])
    tester.train(25000)
    #tester.summary()
    print(tester.results["OutputTrain"])
 
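Incidentally, the testing path above calls `train(25000)`, which with fixed weights just repeats the same forward computation 25000 times. A single forward pass gives the same predictions in one step. This is a minimal sketch; `predict` is a hypothetical helper of mine, not part of the posted class:

```python
import numpy as np

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

def predict(X, W0, W1):
    """One forward pass through the trained two-layer network."""
    return sigmoid(sigmoid(X @ W0) @ W1)
```

Used as, e.g., `predict(data_test, NeuralNetwork.results["Matrix0"], NeuralNetwork.results["Matrix1"])`, it replaces the second `NN` instance and the 25000-iteration loop entirely.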
Your training dataset is compatible with the hypotheses "(A XOR B) OR NOT C" in the original case and "(A XOR B) AND NOT C" in the modified case, and probably something similar in the third case.
How is the NN supposed to know that you didn't want these, but "A XOR B"?

In general, training a neural net on such a small input space, with a fixed result for each possible input, rarely works: there are always multiple hypotheses consistent with the training data.
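The ambiguity is easy to verify directly: "(A XOR B) OR NOT C" reproduces every pair in the original five-point training set, yet outputs 1 on all three test triplets, which is exactly what was reported. For the six-point set in the posted code, the second hypothesis below is my own illustrative example of "something similar in the third case":

```python
# original five training pairs: (A, B, C) -> target
train5 = {(1,0,1): 1, (0,1,1): 1, (0,0,1): 0, (1,1,1): 0, (1,0,0): 1}

def h5(a, b, c):
    # (A XOR B) OR NOT C -- consistent with all five pairs
    return (a ^ b) | (1 - c)

assert all(h5(*x) == y for x, y in train5.items())
print([h5(*x) for x in [(0,0,0), (0,1,0), (1,1,0)]])  # [1, 1, 1]: every test triplet

# six-point set from the posted code, plus one conflicting hypothesis
# (illustrative: it agrees with A XOR B on the training set but not on [000])
train6 = {(0,0,1): 0, (0,1,1): 1, (1,0,0): 1, (1,1,0): 0, (1,0,1): 1, (1,1,1): 0}

def h6(a, b, c):
    # (A XOR B) OR (NOT A AND NOT B AND NOT C)
    return (a ^ b) | ((1 - a) & (1 - b) & (1 - c))

assert all(h6(*x) == y for x, y in train6.items())
print(h6(0, 0, 0))  # 1, while A XOR B would give 0
```

Both hypotheses fit their training sets perfectly, so nothing in the data tells the network to prefer "A XOR B" over them.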
 
