Using tensorflow.keras (2.0-alpha0 with GPU support), I see extremely long initialization times with tf.keras.Model.fit(), on both newly compiled models and models previously saved and reloaded.
I believe this happens after the tf.data.Dataset objects have already been loaded and preprocessed, so I don't understand what is taking so long, and there is no output from TF/Keras during the delay:
2019-04-19 23:29:18.109067: tensorflow/core/common_runtime/gpu/gpu_device.cc:1149] Created TensorFlow device
Resizing images and creating data sets with num_parallel_calls=8
Loading existing model to continue training.
Starting model.fit()
Epoch 1/100
2019-04-19 23:32:22.934394: tensorflow/core/kernels/data/shuffle_dataset_op.cc:150] Shuffle buffer filled.
2019-04-19 23:38:52.374924: tensorflow/core/common_runtime/bfc_allocator.cc:230] Allocator (GPU_0_bfc) ran out of memory trying to allocate 2.62GiB. The caller indicates that this is not a failure, but may mean that there could be performance gains if more memory were available.
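For context, the input pipeline is built roughly like this before fit() is called (a minimal sketch; `load_and_resize`, `file_paths`, the target size, and the shuffle buffer size are placeholders for what the real script does):

import tensorflow as tf

# Hypothetical preprocessing step; the real script resizes images here.
def load_and_resize(path):
    image = tf.io.read_file(path)
    image = tf.image.decode_jpeg(image, channels=3)
    image = tf.image.resize(image, (224, 224))  # placeholder target size
    return image

# file_paths stands in for the real list of image files on disk.
dataset = (tf.data.Dataset.from_tensor_slices(file_paths)
           .map(load_and_resize, num_parallel_calls=8)
           .shuffle(buffer_size=10_000)  # placeholder buffer size
           .batch(32)
           .prefetch(tf.data.experimental.AUTOTUNE))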
That's 3 minutes to load the model and fill the shuffle buffer (per the timestamps above), then 6 more minutes for ... what? And how can this mysterious work be optimized? (Hardware: 5 GHz 8700K, 32 GB RAM, NVMe SSD, 1080 Ti with 11 GB GDDR5X. Task Manager shows 100% single-thread CPU use, moderate disk access, RAM usage slowly expanding to ~28 GB max, and zero GPU usage during this period.)
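For reference, the save/reload cycle is the standard Keras one, roughly like this (a sketch; the filename is a placeholder, and the epoch count matches the run above; the long stall happens between load_model() and the first batch of epoch 1):

# After a training run, save the whole model.
model.save('model_checkpoint.h5')

# Later, reload it to continue training.
model = tf.keras.models.load_model('model_checkpoint.h5')
model.fit(dataset, epochs=100)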
Is there a way to serialize or store the models more efficiently, so that training can be stopped and resumed regularly without the ~10 minutes of overhead?
Is TF/Keras somehow lazy-loading the datasets and running the preprocessing during this period?