I'm using Keras layers on TensorFlow 2.0 to build a simple LSTM-based Seq2Seq model for text generation.

Versions I'm using: Python 3.6.9, TensorFlow 2.0.0, CUDA 10.0, cuDNN 7.6.1, NVIDIA driver version 410.78.

I'm aware of the criteria TF requires in order to delegate to CUDNNLstm when a GPU is present (I do have a GPU, and my model and data meet all of these criteria).
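
For reference, those criteria can be written down as a small check. This is only a sketch: `meets_cudnn_criteria` and `LSTM_DEFAULTS` are hypothetical names, not TensorFlow APIs, and the conditions are the ones listed in the TF 2.0 `tf.keras.layers.LSTM` documentation as I understand them (tanh activation, sigmoid recurrent activation, no recurrent dropout, no unrolling, bias enabled, and inputs either unmasked or strictly right-padded).

```python
# Hypothetical helper (not part of TensorFlow): mirrors the documented
# conditions under which the Keras LSTM layer can use the cuDNN kernel.
LSTM_DEFAULTS = {
    "activation": "tanh",
    "recurrent_activation": "sigmoid",
    "recurrent_dropout": 0.0,
    "unroll": False,
    "use_bias": True,
}


def meets_cudnn_criteria(lstm_kwargs, inputs_masked=False, right_padded=True):
    """Return True if the given LSTM arguments satisfy the documented criteria."""
    # Arguments not passed explicitly fall back to the layer defaults.
    cfg = {**LSTM_DEFAULTS, **lstm_kwargs}
    layer_ok = (
        cfg["activation"] == "tanh"
        and cfg["recurrent_activation"] == "sigmoid"
        and cfg["recurrent_dropout"] == 0
        and cfg["unroll"] is False
        and cfg["use_bias"] is True
    )
    # Masked inputs are only acceptable when they are strictly right-padded.
    mask_ok = (not inputs_masked) or right_padded
    return layer_ok and mask_ok
```

With the encoder arguments used below (`return_state=True`, masked and right-padded inputs), this check passes, which matches what I see during training.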

Training goes smoothly (with a warning message, see the end of this post) and I can verify that CUDNNLstm is being used.

However, when I try to call encoder_model.predict(input_sequence) at inference time, I get the following error message:

UnknownError:  [_Derived_]  CUDNN_STATUS_BAD_PARAM
in tensorflow/stream_executor/cuda/cuda_dnn.cc(1424): 'cudnnSetRNNDataDescriptor( data_desc.get(), data_type, layout, max_seq_length, batch_size, data_size, seq_lengths_array, (void*)&padding_fill)'
     [[{{node cond/then/_0/CudnnRNNV3}}]]
     [[lstm/StatefulPartitionedCall]] [Op:__inference_keras_scratch_graph_91878]

Function call stack:
keras_scratch_graph -> keras_scratch_graph -> keras_scratch_graph

Here is the training code (both source_sequences and target_sequences are right-padded sequences, and the embedding matrices are pretrained GloVe embeddings):

import tensorflow as tf
from tensorflow.keras import initializers

# Define an input sequence and process it.
encoder_inputs = tf.keras.layers.Input(shape=(24,))
encoder_embedding_layer = tf.keras.layers.Embedding(
  VOCABULARY_SIZE_1,
  EMBEDDING_DIMS,
  embeddings_initializer=initializers.Constant(encoder_embedding_matrix),
  mask_zero=True)
encoder_embedding = encoder_embedding_layer(encoder_inputs)

_, state_h, state_c = tf.keras.layers.LSTM(
  EMBEDDING_DIMS,
  implementation=1,
  return_state=True)(encoder_embedding)

encoder_states = [state_h, state_c]

decoder_inputs = tf.keras.layers.Input(shape=(24,))
decoder_embedding_layer = tf.keras.layers.Embedding(
  VOCABULARY_SIZE_2,
  EMBEDDING_DIMS,
  embeddings_initializer=initializers.Constant(decoder_embedding_matrix),
  mask_zero=True)
decoder_embedding = decoder_embedding_layer(decoder_inputs)

decoder_lstm = tf.keras.layers.LSTM(
    EMBEDDING_DIMS, 
    return_sequences=True, 
    return_state=True,
    implementation=1)

decoder_outputs, _, _ = decoder_lstm(decoder_embedding, initial_state=encoder_states)

decoder_dense = tf.keras.layers.Dense(VOCABULARY_SIZE_TITLE, activation='softmax')

output = decoder_dense(decoder_outputs)

model = tf.keras.models.Model([encoder_inputs, decoder_inputs], output)

model.compile(optimizer='rmsprop', loss='sparse_categorical_crossentropy')
model.summary()

model.fit([source_sequences, target_sequences], decoder_target_data,
    batch_size=32,
    epochs=10,
    validation_split=0.0,
    verbose=2)

These are the inference models:

encoder_model = tf.keras.models.Model(encoder_inputs, encoder_states)

# input_dimension must equal the number of LSTM units (EMBEDDING_DIMS above)
decoder_state_input_h = tf.keras.layers.Input(shape=(input_dimension,))
decoder_state_input_c = tf.keras.layers.Input(shape=(input_dimension,))

decoder_states_inputs = [decoder_state_input_h, decoder_state_input_c]

# Reuse the trained decoder layers (decoder_lstm, decoder_dense) and the
# embedded decoder input tensor from the training model
decoder_outputs, state_h, state_c = decoder_lstm(
        decoder_embedding, initial_state=decoder_states_inputs)

decoder_states = [state_h, state_c]

decoder_outputs = decoder_dense(decoder_outputs)
decoder_model = tf.keras.models.Model(
        [decoder_inputs] + decoder_states_inputs,
        [decoder_outputs] + decoder_states)

When I call predict() on the encoder_model, I get CUDNN_STATUS_BAD_PARAM.

Inference code (where the error gets triggered):

# build the initial state with a right-padded input sequence
# CUDNN_STATUS_BAD_PARAM is triggered on the next line
state = encoder_model.predict(masked_input_sequence)

empty_target_sequence = np.zeros((1,1))
# this signals the Start of sequence
empty_target_sequence[0,0] = titles_word_index[sos_token]

decoder_outputs, h, c = decoder_model.predict([empty_target_sequence] + state)
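
One thing that may be worth checking on the array passed to predict(): the cuDNN path assumes strictly right-padded masks, and rows that consist only of padding (sequence length 0) are reportedly a trigger for CUDNN_STATUS_BAD_PARAM. The sketch below is a plain NumPy diagnostic under those assumptions; `check_padding` is a hypothetical helper, and `pad_id=0` assumes the padding index implied by mask_zero=True.

```python
import numpy as np


def check_padding(batch, pad_id=0):
    """Return (is_right_padded, has_empty_row) for a 2-D batch of token ids."""
    batch = np.asarray(batch)
    mask = batch != pad_id          # True where a real token sits
    lengths = mask.sum(axis=1)      # number of real tokens in each row
    # In a right-padded row, the real tokens occupy exactly the first
    # `length` positions and everything after them is padding.
    expected = np.arange(batch.shape[1])[None, :] < lengths[:, None]
    is_right_padded = bool(np.array_equal(mask, expected))
    has_empty_row = bool((lengths == 0).any())
    return is_right_padded, has_empty_row
```

Calling `check_padding(masked_input_sequence)` before `encoder_model.predict(...)` would show whether the inference input differs from the training data in either respect. It expects a 2-D array of shape (batch, time steps).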

Things I have tried

  • Creating the masks explicitly (encoder_embedding_layer.compute_mask()) and passing them as an argument every time I call an LSTM layer, for example:

    encoder_embedding = encoder_embedding_layer(encoder_inputs)
    
    encoder_mask = encoder_embedding_layer.compute_mask(encoder_inputs)
    
    _, state_h, state_c = tf.keras.layers.LSTM(
      EMBEDDING_DIMS,
      return_state=True)(encoder_embedding,mask=encoder_mask)
    
  • Not using initializers for the embedding layers, to see whether the problem was there


P.S.: Forcing the training to take place on the CPU makes the error go away, but I need to train on the GPU; otherwise it would take ages to complete.

P.S.: This seems to be the very same error I have: Masking LSTM: OP_REQUIRES failed at cudnn_rnn_ops.cc:1498 : Unknown: CUDNN_STATUS_BAD_PARAM

P.S.: When I check supports_masking on model, encoder_model and decoder_model, all of them return False for some reason.

P.S.: Like I said, training is done with no (apparent) errors but if I look at the Jupyter output log on the command line, I can see the following warning message during training:

2019-11-16 19:48:20.144265: W tensorflow/core/grappler/optimizers/implementation_selector.cc:310] Skipping optimization due to error while loading function libraries: Invalid argument: Functions '__inference___backward_cudnn_lstm_with_fallback_47598_49057' and '__inference___backward_cudnn_lstm_with_fallback_47598_49057_specialized_for_StatefulPartitionedCall_1_at___inference_distributed_function_52868' both implement 'lstm_d41d5ccb-14be-4a74-b5e8-cc4f63c5bb02' but their signatures do not match.
  • So your Input layers are of shape (None, None) (when you add the batch dimension). Can you explain why that is? Isn't there a way for you to define the number of time steps?
    – thushv89
    Nov 19, 2019 at 0:51
  • And can you provide some sample data to test the model?
    – thushv89
    Nov 19, 2019 at 2:43
  • As to the last P.S. (Skipping optimization): it seems this warning message can be ignored, as stated here: github.com/tensorflow/tensorflow/issues/…
    Nov 19, 2019 at 10:36
  • @thushv89 Sorry, I've set the time steps now.
    – Felipe
    Nov 21, 2019 at 1:23

1 Answer

According to this page, you should use cuDNN 7.4.
