9

I have a large custom model built with the new TensorFlow 2.0, mixing Keras and TensorFlow. I want to save it (architecture and weights). Exact code to reproduce:

import tensorflow as tf


OUTPUT_CHANNELS = 3

def downsample(filters, size, apply_batchnorm=True):
  initializer = tf.random_normal_initializer(0., 0.02)

  result = tf.keras.Sequential()
  result.add(
      tf.keras.layers.Conv2D(filters, size, strides=2, padding='same',
                             kernel_initializer=initializer, use_bias=False))

  if apply_batchnorm:
    result.add(tf.keras.layers.BatchNormalization())

  result.add(tf.keras.layers.LeakyReLU())

  return result

def upsample(filters, size, apply_dropout=False):
  initializer = tf.random_normal_initializer(0., 0.02)

  result = tf.keras.Sequential()
  result.add(
    tf.keras.layers.Conv2DTranspose(filters, size, strides=2,
                                    padding='same',
                                    kernel_initializer=initializer,
                                    use_bias=False))

  result.add(tf.keras.layers.BatchNormalization())

  if apply_dropout:
      result.add(tf.keras.layers.Dropout(0.5))

  result.add(tf.keras.layers.ReLU())

  return result


def Generator():
  down_stack = [
    downsample(64, 4, apply_batchnorm=False), # (bs, 128, 128, 64)
    downsample(128, 4), # (bs, 64, 64, 128)
    downsample(256, 4), # (bs, 32, 32, 256)
    downsample(512, 4), # (bs, 16, 16, 512)
    downsample(512, 4), # (bs, 8, 8, 512)
    downsample(512, 4), # (bs, 4, 4, 512)
    downsample(512, 4), # (bs, 2, 2, 512)
    downsample(512, 4), # (bs, 1, 1, 512)
  ]

  up_stack = [
    upsample(512, 4, apply_dropout=True), # (bs, 2, 2, 1024)
    upsample(512, 4, apply_dropout=True), # (bs, 4, 4, 1024)
    upsample(512, 4, apply_dropout=True), # (bs, 8, 8, 1024)
    upsample(512, 4), # (bs, 16, 16, 1024)
    upsample(256, 4), # (bs, 32, 32, 512)
    upsample(128, 4), # (bs, 64, 64, 256)
    upsample(64, 4), # (bs, 128, 128, 128)
  ]

  initializer = tf.random_normal_initializer(0., 0.02)
  last = tf.keras.layers.Conv2DTranspose(OUTPUT_CHANNELS, 4,
                                         strides=2,
                                         padding='same',
                                         kernel_initializer=initializer,
                                         activation='tanh') # (bs, 256, 256, 3)

  concat = tf.keras.layers.Concatenate()

  inputs = tf.keras.layers.Input(shape=[None, None, 3])
  x = inputs

  # Downsampling through the model
  skips = []
  for down in down_stack:
    x = down(x)
    skips.append(x)

  skips = reversed(skips[:-1])

  # Upsampling and establishing the skip connections
  for up, skip in zip(up_stack, skips):
    x = up(x)
    x = concat([x, skip])

  x = last(x)

  return tf.keras.Model(inputs=inputs, outputs=x)

generator = Generator()
generator.summary()

generator.save('generator.h5')
generator_loaded = tf.keras.models.load_model('generator.h5')

I can save the model with:

generator.save('generator.h5')

But when I try to load it with:

generator_loaded = tf.keras.models.load_model('generator.h5')

Loading never finishes (and there is no error message). Maybe the model is too large? I tried saving as JSON with model.to_json() as well as the full tf.keras.models.save_model() API, but I hit the same problem: it is impossible to load the model (or at least it takes far too long).
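For reference, the JSON route mentioned above looks roughly like this (to_json() serializes the architecture only, so the weights go through save_weights()):

json_config = generator.to_json()
with open('generator.json', 'w') as f:
    f.write(json_config)
generator.save_weights('generator_weights.h5')

with open('generator.json') as f:
    generator_loaded = tf.keras.models.model_from_json(f.read())
generator_loaded.load_weights('generator_weights.h5')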

The same problem occurs on Windows and Linux, with and without a GPU.

Saving and restoring work fine with standalone Keras and a simple model.

3
  • How long do you wait before stopping it?
    – user7857211
    Apr 30, 2019 at 11:25
  • TensorFlow 2.0 is currently still an alpha release; it has bugs, and you shouldn't be using it for normal development. Maybe report this bug and move to a stable TF version.
    – Dr. Snoopy
    Apr 30, 2019 at 11:27
  • A few minutes. Yes, I know it's just an alpha release, but it may be a mistake on my side.
    – Ridane
    Apr 30, 2019 at 11:30

5 Answers

2

As of TensorFlow release 2.0.0 there is now a Keras/TF-agnostic way of saving models using tf.saved_model:

....

model.fit(images, labels, epochs=30, validation_data=(images_val, labels_val), verbose=1)

tf.saved_model.save(model, "path/to/model_dir")

You can then load it with:

loaded_model = tf.saved_model.load("path/to/model_dir")
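
Note that tf.saved_model.load returns a generic SavedModel object rather than a tf.keras.Model, so you typically call it through its signatures. A minimal sketch, assuming the model was exported with a default serving signature:

infer = loaded_model.signatures["serving_default"]
# output = infer(tf.constant(input_batch))  # key names and shapes depend on the model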
0

Instead, try saving the model as:

model.save('model_name.model')

Then Load it with:

model = tf.keras.models.load_model('model_name.model')
2
  • Thanks for the answer. Unfortunately, same problem: impossible to load it (or at least it takes too long; I stopped it before it finished).
    – Ridane
    Apr 30, 2019 at 11:11
  • @Ridane Why not pickle it instead?
    – bg2094
    Apr 30, 2019 at 11:36
0

I found a temporary solution. It seems the issue occurs with the Sequential API (tf.keras.Sequential); when the model is built with the functional API, tf.keras.models.load_model manages to load the saved model. I hope they will fix this issue in the final release; have a look at the issue I raised on GitHub: https://github.com/tensorflow/tensorflow/issues/28281.
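
For illustration, here is what the question's downsample block looks like rebuilt with the functional API (a sketch; the extra input_channels argument is an assumption, needed because Input requires a defined channel dimension):

import tensorflow as tf

def downsample_functional(filters, size, input_channels, apply_batchnorm=True):
  # Same block as downsample() above, built as a functional tf.keras.Model
  # instead of tf.keras.Sequential.
  initializer = tf.random_normal_initializer(0., 0.02)
  inputs = tf.keras.layers.Input(shape=[None, None, input_channels])
  x = tf.keras.layers.Conv2D(filters, size, strides=2, padding='same',
                             kernel_initializer=initializer,
                             use_bias=False)(inputs)
  if apply_batchnorm:
    x = tf.keras.layers.BatchNormalization()(x)
  x = tf.keras.layers.LeakyReLU()(x)
  return tf.keras.Model(inputs=inputs, outputs=x)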

Cheers,

0

I managed to save and load custom models by implementing functions similar to those of the Sequential model in Keras.

The key functions are CustomModel.get_config() and CustomModel.from_config(), which should also exist on any of your custom layers (similar to the functions below, but see the Keras layer implementations if you want a better understanding):

import copy  # get_config uses copy.deepcopy below

# In the CustomModel class
def get_config(self):
    layer_configs = []
    for layer in self.layers:
        layer_configs.append({
            'class_name': layer.__class__.__name__,
            'config': layer.get_config()
        })
    config = {
        'name': self.name,
        'layers': copy.deepcopy(layer_configs),
        "arg1": self.arg1,
        ...
    }
    if self._build_input_shape:
        config['build_input_shape'] = self._build_input_shape
    return config

@classmethod
def from_config(cls, config, custom_objects=None):
    from tensorflow.python.keras import layers as layer_module
    from tqdm import tqdm  # progress bar used in the loading loop below
    if custom_objects is None:
        custom_objects = {'CustomLayer1Class': CustomLayer1Class, ...}
    else:
        custom_objects = dict(custom_objects, **{'CustomLayer1Class': CustomLayer1Class, ...})

    if 'name' in config:
        name = config['name']
        build_input_shape = config.get('build_input_shape')
        layer_configs = config['layers']
    else:
        name = None
        build_input_shape = None
        layer_configs = config
    model = cls(name=name,
                arg1=config['arg1'],
                should_build_graph=False,
                ...)
    for layer_config in tqdm(layer_configs, 'Loading Layers'):
        layer = layer_module.deserialize(layer_config,
                                         custom_objects=custom_objects)
        model.add(layer) # This function looks at the name of the layers to place them in the right order
    if not model.inputs and build_input_shape:
        model.build(build_input_shape)
    if not model._is_graph_network:
        # Still needs to be built when passed input data.
        model.built = False
    return model

I also added a CustomModel.add() function that adds layers one by one from their configs, as well as a should_build_graph=False parameter that makes sure you do not build the graph in __init__() when calling cls().
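
For context, here is a hypothetical sketch of that add() function (the self.custom_layers attribute and the name-based ordering are assumptions, mirroring the comment in from_config() above):

# In the CustomModel class (hypothetical sketch)
def add(self, layer):
    # Append the deserialized layer, then keep the list ordered by layer
    # name so layers added out of order end up in the right place.
    self.custom_layers.append(layer)
    self.custom_layers.sort(key=lambda l: l.name)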

Then the CustomModel.save() function looks like this:

def save(self, filepath, overwrite=True, include_optimizer=True, **kwargs):
    from tensorflow.python.keras.models import save_model
    save_model(self, filepath, overwrite, include_optimizer)

After that you can save using:

model.save("model.h5")
new_model = keras.models.load_model('model.h5',
                                        custom_objects={
                                        'CustomModel': CustomModel,                                                     
                                        'CustomLayer1Class': CustomLayer1Class,
                                        ...
                                        })

But somehow this approach seems to be quite slow... The approach below, on the other hand, is almost 30x faster. Not sure why:

model.save_weights("weights.h5")
config = model.get_config()
reinitialized_model = CustomModel.from_config(config)
reinitialized_model.load_weights("weights.h5")

It works, but it seems quite hacky. Maybe future versions of TF2 will make the process clearer.

-1

Another method of saving a trained model is to use the pickle module in Python.

import pickle
pickle.dump(model, open(filename, 'wb'))

In order to load the pickled model,

loaded_model = pickle.load(open(filename, 'rb'))

The extension of the pickle file is usually .sav.
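
Direct pickling often fails for Keras models because they hold unpicklable thread locks (see the first comment below). A hedged workaround is to pickle the config and the weights instead of the model object itself, assuming get_config() round-trips for your model:

import pickle
import tensorflow as tf

# Pickle the architecture config plus the weight arrays
state = {'config': model.get_config(), 'weights': model.get_weights()}
with open(filename, 'wb') as f:
    pickle.dump(state, f)

# Rebuild the model and restore its weights
with open(filename, 'rb') as f:
    state = pickle.load(f)
restored = tf.keras.Model.from_config(state['config'])
restored.set_weights(state['weights'])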

4
  • Does not work either: "TypeError: can't pickle _thread.RLock objects"
    – Ridane
    Apr 30, 2019 at 12:12
  • There is a workaround for that error in the following link. Why not give it a shot? stackoverflow.com/questions/44855603/…
    – bg2094
    May 2, 2019 at 5:59
  • Also how big was the h5 file? Couple of Gigabytes at the least i suppose ?
    – bg2094
    May 2, 2019 at 6:04
  • The *.h5 file for this example is 212,722 KB. Yes, I could give it a shot with the pickle module, but I'd rather use the TensorFlow API in a clean way, and I don't think size is the issue here; it seems to be deeper. In my opinion, a lot of people will use the keras.save API, so I opened an issue on the TensorFlow GitHub here: github.com/tensorflow/tensorflow/issues/28281 :)
    – Ridane
    May 2, 2019 at 13:29
