Loaded model in TensorFlow gives different results than the original one

  • Thread starter: Gaussian97
  • Tags: Load Model
AI Thread Summary
The discussion revolves around an issue encountered while using TensorFlow 2.3.0, specifically with model evaluation after saving and loading a model. A convolutional neural network is created and trained using the ModelCheckpoint callback to save the model's weights. However, when evaluating the original model and the loaded model, significant discrepancies in accuracy are observed, with the loaded model performing poorly despite having identical weights and loss values. The distinction between TensorFlow checkpoints and the SavedModel format is highlighted, noting that checkpoints only save parameter values without the model's computation description, making them less suitable for independent deployment. The user expresses confusion over why two models with the same architecture and weights yield different evaluation results, suggesting a potential misunderstanding of TensorFlow's saving mechanisms or a possible bug. Despite attempts to adjust the checkpointing process, the issue persists, indicating a need for further investigation into the model saving and loading procedures.
Gaussian97 (Homework Helper)
TL;DR Summary
I'm using the TensorFlow library in Python. After creating a model and saving it, if I load the entire model, I get inconsistent results.
First of all, I'm using TensorFlow version 2.3.0. The code I'm using is the following:
Python:
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Flatten, Conv2D, MaxPooling2D
from tensorflow.keras.callbacks import ModelCheckpoint

def get_new_model():
    model = Sequential([
        Conv2D(filters=16, input_shape=(32, 32, 3), kernel_size=(3, 3), activation='relu', name='conv_1'),
        Conv2D(filters=8, kernel_size=(3, 3), activation='relu', name='conv_2'),
        MaxPooling2D(pool_size=(4, 4), name='pool_1'),
        Flatten(name='flatten'),
        Dense(units=32, activation='relu', name='dense_1'),
        Dense(units=10, activation='softmax', name='dense_2')
    ])
    model.compile(optimizer='adam', loss='sparse_categorical_crossentropy', metrics=['accuracy'])
    return model

# Save the full model (not just the weights) at the end of every epoch
checkpoint_path = 'model_checkpoints'
checkpoint = ModelCheckpoint(filepath=checkpoint_path, save_weights_only=False, save_freq='epoch', verbose=1)

model = get_new_model()
model.fit(x_train, y_train, epochs=3, callbacks=[checkpoint])
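(Here x_train and y_train are my training data, loaded elsewhere. For a self-contained reproduction, any dataset with 32×32×3 inputs and 10 classes fits this architecture; CIFAR-10 is one assumption that matches:)
Python:
# Assumption for reproduction only: CIFAR-10 matches the (32, 32, 3) input and 10 classes
from tensorflow.keras.datasets import cifar10

(x_train, y_train), (x_test, y_test) = cifar10.load_data()
x_train, x_test = x_train / 255.0, x_test / 255.0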
Up to here there's no problem: I create the model, compile it, and train it on some data, using ModelCheckpoint to save the model along the way.
The problem comes when I try the following:
Python:
from tensorflow.keras.models import load_model

# Load the complete model written by the ModelCheckpoint callback
model2 = load_model(checkpoint_path)

# Evaluate the original and the reloaded model on the same test data
model.evaluate(x_test, y_test)
model2.evaluate(x_test, y_test)
Then the first evaluation returns an accuracy of 0.477, while the second returns an accuracy of 0.128, which is essentially random guessing over the 10 classes.
Where's the error? The two models are supposed to be identical, and in fact they give the same value for the loss function to 16 decimal places.
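For reference, here is a minimal sketch of how I compare the weights (it assumes model and model2 as defined above):
Python:
import numpy as np

# Compare the two models tensor by tensor; every pair should match
for w1, w2 in zip(model.get_weights(), model2.get_weights()):
    print(w1.shape, np.allclose(w1, w2))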
 
I don't have any experience with checkpoints in TF, but maybe you could try to save a complete model using this guide:
https://www.tensorflow.org/guide/saved_model

In the checkpoint guide, they state:
The phrase "Saving a TensorFlow model" typically means one of two things:
  • Checkpoints, OR
  • SavedModel.

Checkpoints capture the exact value of all parameters (tf.Variable objects) used by a model. Checkpoints do not contain any description of the computation defined by the model and thus are typically only useful when source code that will use the saved parameter values is available.

The SavedModel format on the other hand includes a serialized description of the computation defined by the model in addition to the parameter values (checkpoint). Models in this format are independent of the source code that created the model. They are thus suitable for deployment via TensorFlow Serving, TensorFlow Lite, TensorFlow.js, or programs in other programming languages (the C, C++, Java, Go, Rust, C# etc. TensorFlow APIs).
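For example, a minimal sketch along the lines of that guide (the directory name 'full_model' is just illustrative) would be:
Python:
import tensorflow as tf

# Save architecture, weights, and optimizer state in the SavedModel format
model.save('full_model')

# Restore a complete, compiled model from disk and evaluate it
restored = tf.keras.models.load_model('full_model')
restored.evaluate(x_test, y_test)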
 
Yes, my problem is that this code should work in theory. So either I'm doing something wrong, which is mainly what I'm interested in finding out, or there's some bug in TensorFlow. I checked, and the weights of the two models are the same. I don't know what else affects the evaluate function, but as far as I know, two models with the same architecture and the same weights should perform equally on the same data.
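A direct way to test that claim, sketched here with get_new_model from my first snippet, is to transplant the trained weights into a freshly built model and evaluate it:
Python:
# Build an untrained model with the same architecture
model3 = get_new_model()

# Copy the trained weights over
model3.set_weights(model.get_weights())

# If architecture + weights determine the result, this should match model.evaluate
model3.evaluate(x_test, y_test)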
 
Have you tried creating the checkpoint file after you have trained the model?
 
I have tried defining the checkpoint right after creating the model; it doesn't help. I can't define it after the fit method (which is the one that trains the model), because the checkpoint is an argument to that method.
But that is not the problem: even if I ignore the callback and save the model manually using the 'save' method after training, I still have the same problem.
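Concretely, the manual variant looks like this (the directory name is illustrative):
Python:
# Save the trained model manually, without any callback
model.save('manual_save')

# Load it back and evaluate, same as before
from tensorflow.keras.models import load_model
model2 = load_model('manual_save')
model2.evaluate(x_test, y_test)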
 