While training a model I got this warning: "UserWarning: An input could not be retrieved. It could be because a worker has died. We do not have any information on the lost sample." After showing this warning, the model starts training. What does this warning mean? Is it something that will affect my training, and do I need to worry about it?
8 Answers
This is just a user warning that is usually thrown when fetching the inputs and targets during training. It appears because a timeout is set for the queuing mechanism, which is specified inside data_utils.py.
For more details you can refer to the data_utils.py file inside the keras/utils folder:
https://github.com/keras-team/keras/blob/master/keras/utils/data_utils.py
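For illustration, the mechanism looks roughly like this (a simplified sketch of the idea, not the actual Keras source):

    import queue
    import warnings

    # Simplified sketch of the queuing mechanism: worker processes put
    # prepared batches on a queue, and the main process fetches them
    # with a timeout (see keras/utils/data_utils.py for the real code).
    def get_next_batch(batch_queue, timeout=30):
        try:
            return batch_queue.get(block=True, timeout=timeout)
        except queue.Empty:
            # On a timed-out fetch, Keras warns and keeps training
            # instead of crashing.
            warnings.warn(
                'An input could not be retrieved. It could be because a '
                'worker has died. We do not have any information on the '
                'lost sample.', UserWarning)
            return None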
I got the same warning when training a model in Google Colab. The problem was that I tried to fetch the data from my Google Drive, which I had mounted to the Colab session. The solution was to move the data into Colab's working directory and use it from there. This can be done simply via !cp -r path/to/google_drive_data_dir/ path/to/colab_data_dir in the notebook; a Python sketch of the same copy is shown below. Note that you will have to do this each time a new Colab session is created.
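If you prefer plain Python over the shell magic, the same copy can be done with shutil (the paths here are placeholders for your own layout):

    import shutil

    # Hypothetical paths; adjust to your own Drive/Colab layout.
    src = '/content/drive/My Drive/dataset'
    dst = '/content/dataset'
    shutil.copytree(src, dst)  # copy the whole directory tree to local disk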
This may or may not be the problem Rahul was asking about, but I think this might be helpful to others who face the issue.
- I am using my Google Drive as storage. Where else would I put this? Colab uses Google Drive as a hard disk, right? – Apr 22, 2020 at 10:00
- Sorry, I thought that I had answered the first question already. AFAIK, opening a Google Colab session spins up a virtual machine to which you can mount your Google Drive. However, the mount is not a physical one (fast); the files need to be transferred over the internet (slow). It is this file transfer that causes the bottleneck. To avoid it, it's best to copy the files from Drive to the Colab session's disk (any folder you prefer), after which you can access them faster. – mjkvaak May 25, 2020 at 13:17
If you are running the training on a GPU, the warning can occur. You have to know that there are two processes running in parallel during fit_generator:
- the GPU trains on the image batches, step by step in each epoch;
- the CPU prepares the image batches, one batch at a time.
Since these are parallel tasks, if the CPU's throughput is lower than the GPU's, the warning occurs.
Solution:
Just set your batch_size smaller or upgrade your CPU configuration.
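As a rough sketch (the paths, sizes, and model here are placeholders, not a prescription), the batch size is set where the generator is created:

    from keras.models import Sequential
    from keras.layers import Conv2D, Flatten, Dense
    from keras.preprocessing.image import ImageDataGenerator

    # Placeholder pipeline: a smaller batch_size gives the CPU-side
    # workers less data to prepare per step, so the GPU starves less often.
    datagen = ImageDataGenerator(rescale=1. / 255)
    train_generator = datagen.flow_from_directory(
        'data/train',             # placeholder path
        target_size=(224, 224),
        batch_size=16)            # reduce this if the warning keeps appearing

    model = Sequential([
        Conv2D(8, 3, activation='relu', input_shape=(224, 224, 3)),
        Flatten(),
        Dense(train_generator.num_classes, activation='softmax'),
    ])
    model.compile(optimizer='adam', loss='categorical_crossentropy')
    model.fit_generator(train_generator, epochs=10)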
Make sure the path of the dataset you have given is correct; this definitely helps. Example: train_data_dir = "/content/drive/My Drive/Colab Notebooks/dataset"
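A quick, hypothetical check that the path actually exists before training:

    import os

    train_data_dir = "/content/drive/My Drive/Colab Notebooks/dataset"
    assert os.path.isdir(train_data_dir), f"Dataset path not found: {train_data_dir}"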
I faced the same issue while training a deep neural network on my machine using Keras, and it took me a while to figure it out. The images I was loading via keras.preprocessing's ImageDataGenerator with target_size=(256, 256) were of a lower resolution, say 100*100, and I was trying to convert them into 256*256; apparently there is no inbuilt support provided for this. As soon as I fixed the output shape of the images returned by the ImageDataGenerator, the warning vanished.
Note: the figures 100*100 and 256*256 are just for explanation.
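For reference, in the Keras API the target_size argument belongs to flow_from_directory rather than the ImageDataGenerator constructor; pulling one batch and checking its shape (a sketch with a placeholder path) confirms the generator output matches the model input:

    from keras.preprocessing.image import ImageDataGenerator

    datagen = ImageDataGenerator(rescale=1. / 255)
    generator = datagen.flow_from_directory(
        'data/train',            # placeholder path
        target_size=(256, 256),  # files on disk may be any resolution
        batch_size=32)

    # Pull one batch and verify its shape against the model's input layer.
    x_batch, y_batch = next(generator)
    print(x_batch.shape)  # expected: (32, 256, 256, 3)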
You can reduce the number of workers and the max_queue_size to solve the problem.
- May we know why reducing the number of workers and max_queue_size will solve the problem? – Fernand Mar 23, 2020 at 4:18
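Both are arguments of fit_generator (the Keras defaults are workers=1 and max_queue_size=10); a sketch, assuming a model and train_generator like those in the earlier example:

    model.fit_generator(
        train_generator,
        epochs=10,
        workers=1,           # background workers preparing the data
        max_queue_size=4,    # max number of prepared batches held in memory
        use_multiprocessing=False)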
I got this warning when I was training on a number of data samples that was smaller than the batch size.
(The training actually seemed to start, but then got stuck before even showing the progress bar for the first epoch.)
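A minimal sanity check for this case (the names and numbers are placeholders):

    n_samples = 20   # e.g. a small debugging subset
    batch_size = 32  # larger than the dataset: the generator can't fill a batch

    # Clamp the batch size so at least one full step per epoch is possible.
    batch_size = min(batch_size, n_samples)
    steps_per_epoch = max(1, n_samples // batch_size)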
I faced the same issue. When I changed my Keras version from 2.3.1 to 2.2.4, the warning disappeared, and my CUDA and cuDNN also worked normally again.
If the issue is still not resolved, see this additional reference: https://github.com/keras-team/keras/issues/13878
My system information:
OS : Win10
TensorFlow version: 1.15.0
Keras version: 2.2.4
Python version: 3.6
CUDA version: 10.0.130
cuDNN version: 7.6.5