I am trying to save a model and then load it later to make some predictions. The problem is that after training the model reaches 95%+ accuracy, but when I save it and then load it, the accuracy drops to nearly 10% on the same dataset.
To reproduce this erroneous result, you can run this really small notebook.
The model is defined as follows:
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import models
from tensorflow.keras.layers import Flatten, Dense, BatchNormalization

model_scratch_auto = models.Sequential()
model_scratch_auto.add(Flatten(input_shape=(28, 28)))
model_scratch_auto.add(Dense(80, activation='relu'))
model_scratch_auto.add(Dense(100, activation='relu'))
model_scratch_auto.add(Dense(120, activation='relu'))
model_scratch_auto.add(Dense(100, activation='relu'))
auto_srelu = AutoSRELU()
model_scratch_auto.add(auto_srelu)
model_scratch_auto.add(Dense(120, activation='relu'))
model_scratch_auto.add(auto_srelu)  # the same AutoSRELU instance is added a second time
model_scratch_auto.add(BatchNormalization())
model_scratch_auto.add(Dense(10, activation='softmax'))

# f1_m, precision_m and recall_m are custom metric functions defined elsewhere in the notebook
model_scratch_auto.compile(optimizer=tf.optimizers.Adam(),
                           loss='categorical_crossentropy',
                           metrics=['acc', f1_m, precision_m, recall_m])
model_scratch_auto.fit(X_train, y_train, batch_size=64, epochs=5,
                       validation_data=(X_test, y_test), verbose=1)
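For context, X_train / y_train are 28×28 images with one-hot encoded labels (matching the Flatten(input_shape=(28,28)) layer, the 10-unit softmax and categorical_crossentropy). The preparation is roughly along the lines of this sketch (MNIST is used here only as a stand-in):

# Sketch of the assumed data preparation (MNIST only as a stand-in):
# 28x28 grayscale inputs scaled to [0, 1], labels one-hot encoded to 10 classes.
from tensorflow.keras.datasets import mnist
from tensorflow.keras.utils import to_categorical

(X_train, y_train), (X_test, y_test) = mnist.load_data()
X_train = X_train.astype('float32') / 255.0
X_test = X_test.astype('float32') / 255.0
y_train = to_categorical(y_train, 10)
y_test = to_categorical(y_test, 10)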
Where the custom layer AutoSRELU is defined as follows:
initializer0 = keras.initializers.RandomUniform(minval=-1, maxval=1)
initializer1 = keras.initializers.RandomUniform(minval=0.5, maxval=3)

class MinMaxConstraint(keras.constraints.Constraint):
    def __init__(self, minval, maxval):
        self.minval = tf.constant(minval, dtype='float32')
        self.maxval = tf.constant(maxval, dtype='float32')
    def __call__(self, w):
        # intended to clamp w into [minval, maxval]
        return tf.cond(tf.greater(self.minval, w)
                       , lambda: w + (self.minval - w)
                       , lambda: tf.cond(tf.greater(w, self.maxval)
                                         , lambda: w - (w - self.maxval)
                                         , lambda: w))
    def get_config(self):
        return {'Lower Bound': self.minval, 'Upper Bound': self.maxval}

def srelu(inputs, k1, k2):
    # Piecewise-linear activation: slope 0.3 for negative inputs, slope k2 for
    # non-negative inputs, both branches offset by k1.
    cond1 = tf.cast(tf.math.less(inputs, 0.0), tf.float32)
    cond2 = tf.cast(tf.math.greater_equal(inputs, 0.0), tf.float32)
    a = tf.math.multiply(cond1, tf.add(k1, tf.multiply(0.3, inputs)))
    b = tf.math.multiply(cond2, tf.add(k1, tf.multiply(k2, inputs)))
    outputs = a + b
    return outputs

class AutoSRELU(keras.layers.Layer):
    def __init__(self, trainable=True, **kwargs):
        super(AutoSRELU, self).__init__()
        # Two trainable scalars: k1 (offset) and k2 (positive-side slope).
        self.k1 = self.add_weight(name='k', shape=(), initializer=initializer0, trainable=trainable)  # , constraint=keras.constraints.NonNeg())
        self.k2 = self.add_weight(name='n', shape=(), initializer=initializer1, trainable=trainable)  # , constraint=MinMaxConstraint(1, 10))
    def call(self, inputs):
        return srelu(inputs, self.k1, self.k2)
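As a quick sanity check of the activation itself, the layer can be applied to a small tensor (this is just an illustration, not part of the notebook):

# Minimal sketch: apply AutoSRELU to a few values to see the piecewise behaviour.
# For x < 0 the output is k1 + 0.3 * x, for x >= 0 it is k1 + k2 * x.
layer = AutoSRELU()
x = tf.constant([-2.0, -0.5, 0.0, 0.5, 2.0])
print(layer(x).numpy())                     # piecewise-linear outputs
print(layer.k1.numpy(), layer.k2.numpy())   # the two trainable scalars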
Then I evaluate the model's performance using the evaluate() function and get the following result:
model_scratch_auto.evaluate(X_train, y_train)
Output:
1875/1875 [==============================] - 4s 2ms/step - loss: 0.0517 - acc: 0.9834 - f1_m: 0.9836 - precision_m: 0.9851 - recall_m: 0.9823
[0.05167238786816597,
0.9834166765213013,
0.983639121055603,
0.9850572943687439,
0.9822666645050049]
Then I save the model as:
model_scratch_auto.save('test_model.h5')
I then load the same model, setting the dependencies as follows:
dependencies = {
    'f1_m': f1_m,
    'precision_m': precision_m,
    'recall_m': recall_m,
    'AutoSRELU': AutoSRELU
}

test_model = models.load_model('test_model.h5', custom_objects=dependencies)
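One way to check whether the weights themselves survive the round trip is to compare the two models layer by layer, e.g. with a small sketch like this:

import numpy as np

# Compare each weight tensor of the freshly trained model with its reloaded counterpart.
for w_orig, w_loaded in zip(model_scratch_auto.get_weights(), test_model.get_weights()):
    print(w_orig.shape, np.allclose(w_orig, w_loaded))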
And when I evaluate this model on the same dataset, I get the following result:
test_model.evaluate(X_train, y_train)
Output:
1875/1875 [==============================] - 2s 1ms/step - loss: 8.5696 - acc: 0.1047 - f1_m: 0.1047 - precision_m: 0.1047 - recall_m: 0.1047
[8.569587707519531,
0.10468333214521408,
0.10468332469463348,
0.10468333214521408,
0.10468333214521408]
As you can see, saving the model and re-loading it significantly reduces its performance on the very same dataset. I tried many things to figure out why this happens and found that removing BatchNormalization() and AutoSRELU fixed the issue, but I can't understand why they cause it in the first place. To check whether the RandomUniform initializer was somehow the problem, I re-ran the loading part (together with the class definition) multiple times to see if there was any randomness in the loaded model, but it returned the identical, much worse result every time. I then saw that removing only the batch normalization layer gave almost identical results before and after loading.
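To compare just those two suspect layers between the trained and the reloaded model, something like the following sketch could be used (illustrative only, not part of my notebook); for BatchNormalization the weights are gamma, beta and the moving mean/variance, for AutoSRELU they are the scalars k1 and k2:

# Sketch: dump the BatchNormalization and AutoSRELU parameters of both models
# so they can be compared before and after loading.
for model in (model_scratch_auto, test_model):
    for layer in model.layers:
        if isinstance(layer, (BatchNormalization, AutoSRELU)):
            print(layer.name, [w.numpy() for w in layer.weights])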
So I was able to narrow the problem down to BatchNormalization and AutoSRELU, but I can't understand how to correct it. How do I save and load the model correctly so that it gives the same results?