I'm using Keras layers on TensorFlow 2.0 to build a simple LSTM-based Seq2Seq model for text generation.

Versions I'm using: Python 3.6.9, TensorFlow 2.0.0, CUDA 10.0, cuDNN 7.6.1, Nvidia driver version 410.78.

I'm aware of the criteria TensorFlow needs in order to delegate to the cuDNN LSTM kernel when a GPU is present (I do have a GPU, and my model/data meet all of those criteria).

Training goes smoothly (with a warning message, see the end of this post) and I can verify that the cuDNN LSTM is being used.
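For reference, this is the kind of LSTM configuration that, as far as I understand, satisfies those criteria (a minimal sketch, not my actual model; the unit count is made up):

import tensorflow as tf

# Sketch: an LSTM that should be eligible for the cuDNN kernel in TF 2.0:
# tanh activation, sigmoid recurrent activation, no recurrent dropout,
# use_bias=True and unroll=False.
cudnn_eligible_lstm = tf.keras.layers.LSTM(
    128,                             # example size only
    activation='tanh',
    recurrent_activation='sigmoid',
    recurrent_dropout=0,
    use_bias=True,
    unroll=False,
    return_state=True)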
However, when I try to call encoder_model.predict(input_sequence)
at inference time, I get the following error message:
UnknownError: [_Derived_] CUDNN_STATUS_BAD_PARAM
in tensorflow/stream_executor/cuda/cuda_dnn.cc(1424): 'cudnnSetRNNDataDescriptor( data_desc.get(), data_type, layout, max_seq_length, batch_size, data_size, seq_lengths_array, (void*)&padding_fill)'
[[{{node cond/then/_0/CudnnRNNV3}}]]
[[lstm/StatefulPartitionedCall]] [Op:__inference_keras_scratch_graph_91878]
Function call stack:
keras_scratch_graph -> keras_scratch_graph -> keras_scratch_graph
Here is the training code (both source_sequences and target_sequences are right-padded sequences, and the embedding matrices are pretrained GloVe embeddings):
import numpy as np
import tensorflow as tf
from tensorflow.keras import initializers

# Define an input sequence and process it.
encoder_inputs = tf.keras.layers.Input(shape=(24,))
encoder_embedding_layer = tf.keras.layers.Embedding(
    VOCABULARY_SIZE_1,
    EMBEDDING_DIMS,
    embeddings_initializer=initializers.Constant(encoder_embedding_matrix),
    mask_zero=True)
encoder_embedding = encoder_embedding_layer(encoder_inputs)

# Discard the encoder output sequence and keep only the final states.
_, state_h, state_c = tf.keras.layers.LSTM(
    EMBEDDING_DIMS,
    implementation=1,
    return_state=True)(encoder_embedding)
encoder_states = [state_h, state_c]

decoder_inputs = tf.keras.layers.Input(shape=(24,))
decoder_embedding_layer = tf.keras.layers.Embedding(
    VOCABULARY_SIZE_2,
    EMBEDDING_DIMS,
    embeddings_initializer=initializers.Constant(decoder_embedding_matrix),
    mask_zero=True)
decoder_embedding = decoder_embedding_layer(decoder_inputs)

decoder_lstm = tf.keras.layers.LSTM(
    EMBEDDING_DIMS,
    return_sequences=True,
    return_state=True,
    implementation=1)
decoder_outputs, _, _ = decoder_lstm(decoder_embedding, initial_state=encoder_states)

decoder_dense = tf.keras.layers.Dense(VOCABULARY_SIZE_TITLE, activation='softmax')
output = decoder_dense(decoder_outputs)

model = tf.keras.models.Model([encoder_inputs, decoder_inputs], output)
model.compile(optimizer='rmsprop', loss='sparse_categorical_crossentropy')
model.summary()

model.fit([source_sequences, target_sequences], decoder_target_data,
          batch_size=32,
          epochs=10,
          validation_split=0.0,
          verbose=2)
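For context, source_sequences and target_sequences are right-padded roughly like this (a sketch, not my exact preprocessing; tokenized_source and tokenized_target are assumed names):

from tensorflow.keras.preprocessing.sequence import pad_sequences

# Right-pad ('post') the tokenized sequences to a fixed length of 24,
# so zeros appear only at the end and mask_zero=True masks the tail.
source_sequences = pad_sequences(tokenized_source, maxlen=24, padding='post')
target_sequences = pad_sequences(tokenized_target, maxlen=24, padding='post')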
These are the inference models:
encoder_model = tf.keras.models.Model(encoder_inputs, encoder_states)

decoder_state_input_h = tf.keras.layers.Input(shape=(EMBEDDING_DIMS,))
decoder_state_input_c = tf.keras.layers.Input(shape=(EMBEDDING_DIMS,))
decoder_states_inputs = [decoder_state_input_h, decoder_state_input_c]

decoder_outputs, state_h, state_c = decoder_lstm(
    decoder_embedding, initial_state=decoder_states_inputs)
decoder_states = [state_h, state_c]
decoder_outputs = decoder_dense(decoder_outputs)

decoder_model = tf.keras.models.Model(
    [decoder_inputs] + decoder_states_inputs,
    [decoder_outputs] + decoder_states)
When I call predict() on the encoder_model, I get CUDNN_STATUS_BAD_PARAM.
Inference code (this is where the error gets triggered):
# build the initial state with a right-padded input sequence
#### CUDNN_STATUS_BAD_PARAM is TRIGGERED ON THIS LINE!!! ######## <<<<<<<<<
state = encoder_model.predict(masked_input_sequence)
empty_target_sequence = np.zeros((1,1))
# this signals the Start of sequence
empty_target_sequence[0,0] = titles_word_index[sos_token]
decoder_outputs, h, c = decoder_model.predict([empty_target_sequence] + state)
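For completeness, this single decoder step is meant to become a greedy decoding loop, roughly like the sketch below (eos_token, max_decoded_length and titles_index_word are assumed names, not my exact code):

state = encoder_model.predict(masked_input_sequence)

target_sequence = np.zeros((1, 1))
target_sequence[0, 0] = titles_word_index[sos_token]

decoded_words = []
for _ in range(max_decoded_length):
    output_tokens, h, c = decoder_model.predict([target_sequence] + state)
    sampled_id = int(np.argmax(output_tokens[0, -1, :]))
    if titles_index_word[sampled_id] == eos_token:
        break
    decoded_words.append(titles_index_word[sampled_id])
    # feed the sampled token and the updated states back into the decoder
    target_sequence = np.zeros((1, 1))
    target_sequence[0, 0] = sampled_id
    state = [h, c]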
Things I have tried:

- create masks explicitly (encoder_embedding_layer.compute_mask()) and add them as parameters every time I call an LSTM layer, for example:

encoder_embedding = encoder_embedding_layer(encoder_inputs)
encoder_mask = encoder_embedding_layer.compute_mask(encoder_inputs)

_, state_h, state_c = tf.keras.layers.LSTM(
    EMBEDDING_DIMS,
    return_state=True)(encoder_embedding, mask=encoder_mask)

- not use initializers for the embedding layers, to see if the problem was there
P.S.: forcing the training to take place on the CPU makes the error go away, but I need to train on the GPU, otherwise it would take ages to complete.
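(For reference, I force the CPU run roughly like this, by hiding the GPU before TensorFlow initializes; a sketch, there are other ways to do it:)

import os

# Hide the GPU from TensorFlow *before* it is imported/initialized,
# so training falls back to the (non-cuDNN) CPU kernels.
os.environ['CUDA_VISIBLE_DEVICES'] = '-1'

import tensorflow as tf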
P.S.: This seems to be the very same error I have: Masking LSTM: OP_REQUIRES failed at cudnn_rnn_ops.cc:1498 : Unknown: CUDNN_STATUS_BAD_PARAM
P.S.: when I check supports_masking on model, encoder_model and decoder_model, all of them return False for some reason.
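For reference, this is how I check it, plus a per-layer view of which layers claim to support masking (a small sketch):

# The models themselves report supports_masking == False:
print(model.supports_masking)          # False
print(encoder_model.supports_masking)  # False
print(decoder_model.supports_masking)  # False

# Per-layer view of mask support in the training model:
for layer in model.layers:
    print(layer.name, layer.supports_masking)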
P.S.: Like I said, training completes with no (apparent) errors, but if I look at the Jupyter output log on the command line, I can see the following warning message during training:
2019-11-16 19:48:20.144265: W
tensorflow/core/grappler/optimizers/implementation_selector.cc:310] Skipping optimization due to error while loading function libraries:
Invalid argument: Functions '__inference___backward_cudnn_lstm_with_fallback_47598_49057' and
'__inference___backward_cudnn_lstm_with_fallback_47598_49057_specialized_for_StatefulPartitionedCall_1_at___inference_distributed_function_52868'
both implement 'lstm_d41d5ccb-14be-4a74-b5e8-cc4f63c5bb02' but their signatures do not match.
P.S.: regarding the input shape (None, None) (when you add the batch dimension): can you explain why that is? Isn't there a way to define the number of time steps?
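For clarity, this is what I understand that shape to mean (a minimal sketch, not from my actual code):

# An Input with an unspecified number of time steps:
variable_length_inputs = tf.keras.layers.Input(shape=(None,))
print(variable_length_inputs.shape)   # (None, None): batch dimension + time dimension

# versus the fixed-length Input I currently use:
fixed_length_inputs = tf.keras.layers.Input(shape=(24,))
print(fixed_length_inputs.shape)      # (None, 24)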