In TensorFlow 2.0 with eager-execution, how to compute the gradients of a network output wrt a specific layer?

Question

I have a network made with InceptionNet, and for an input sample bx, I want to compute the gradients of the model output w.r.t. the hidden layer. I have the following code:

bx = tf.reshape(x_batch[0, :, :, :], (1, 299, 299, 3))


with tf.GradientTape() as gtape:
    #gtape.watch(x)
    preds = model(bx)
    print(preds.shape, end='  ')

    class_idx = np.argmax(preds[0])
    print(class_idx, end='   ')

    class_output = model.output[:, class_idx]
    print(class_output, end='   ')

    last_conv_layer = model.get_layer('inception_v3').get_layer('mixed10')
    #gtape.watch(last_conv_layer)
    print(last_conv_layer)


grads = gtape.gradient(class_output, last_conv_layer.output)#[0]
print(grads)

But this gives None. I tried gtape.watch(bx) as well, but it still gives None.

Before trying GradientTape, I tried using tf.keras.backend.gradients, but that gave the following error:

RuntimeError: tf.gradients is not supported when eager execution is enabled. Use tf.GradientTape instead.

My model is as follows:

model.summary()

Model: "sequential_4"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
inception_v3 (Model)         (None, 1000)              23851784  
_________________________________________________________________
dense_5 (Dense)              (None, 2)                 2002      
=================================================================
Total params: 23,853,786
Trainable params: 23,819,354
Non-trainable params: 34,432
_________________________________________________________________

Any solution is appreciated. It doesn't have to be GradientTape, if there is any other way to compute these gradients.

Possible duplicate of stackoverflow.com/questions/52340645/… — giser_yugang, Jun 6, 2019 at 13:30
Thanks, but the problem can't be solved that way. As you can see in the code above, I had also tried gtape.watch(bx), but it gives None in the end. I will edit my question and mention that as well. — Vahid Mirjalili, Jun 6, 2019 at 14:29

Fantasty · Accepted Answer · 2019-06-12 17:15:32Z

I had the same problem as you. I'm not sure if this is the cleanest way to solve the problem, but here's my solution.

I think the problem is that you need to pass along the actual return value of last_conv_layer.call(...) as an argument to tape.watch(). Since all layers are called sequentially within the scope of the model(bx) call, you'll have to somehow inject some code into this inner scope. I did this using the following decorator:

def watch_layer(layer, tape):
    """
    Make an intermediate hidden `layer` watchable by the `tape`.
    After calling this function, you can obtain the gradient with
    respect to the output of the `layer` by calling:

        grads = tape.gradient(..., layer.result)

    """
    def decorator(func):
        def wrapper(*args, **kwargs):
            # Store the result of `layer.call` internally.
            layer.result = func(*args, **kwargs)
            # From this point onwards, watch this tensor.
            tape.watch(layer.result)
            # Return the result to continue with the forward pass.
            return layer.result
        return wrapper
    layer.call = decorator(layer.call)
    return layer

In your example, I believe the following should then work for you:

bx = tf.reshape(x_batch[0, :, :, :], (1, 299, 299, 3))
last_conv_layer = model.get_layer('inception_v3').get_layer('mixed10')
with tf.GradientTape() as gtape:
    # Make the `last_conv_layer` watchable
    watch_layer(last_conv_layer, gtape)
    preds = model(bx)
    class_idx = np.argmax(preds[0])
    # Slice the eager result `preds`; `model.output` is a symbolic tensor
    # that is not connected to this forward pass.
    class_output = preds[:, class_idx]
# Get the gradient w.r.t. the output of `last_conv_layer`
grads = gtape.gradient(class_output, last_conv_layer.result)
print(grads)
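
The underlying idea, watching the actual tensor that a layer returns during the forward pass, can also be shown without patching `layer.call`. The following is a minimal sketch on a hypothetical two-layer network (the layer sizes and the names `fc1`/`fc2` are made up for illustration and are not the asker's InceptionV3 model). The layers are called by hand inside the tape so that the hidden activation is available as a tensor.

```python
import tensorflow as tf

# Hypothetical toy layers: fc1 stands in for "everything up to the hidden
# layer", fc2 for "everything after it".
fc1 = tf.keras.layers.Dense(4, activation="relu", name="fc1")
fc2 = tf.keras.layers.Dense(2, name="fc2")

x = tf.ones((1, 3))

with tf.GradientTape() as tape:
    hidden = fc1(x)       # the tensor actually produced in this forward pass
    tape.watch(hidden)    # watch that tensor explicitly
    preds = fc2(hidden)
    class_idx = int(tf.argmax(preds[0]))
    # Slice the eager result, not a symbolic `model.output`.
    class_output = preds[:, class_idx]

# Gradient of the chosen class score w.r.t. the hidden activation.
grads = tape.gradient(class_output, hidden)
print(grads.shape)
```

For this toy network the result equals column `class_idx` of the `fc2` kernel, since `fc2` has no activation. Note that the forward pass uses direct calls: `model.predict()` returns NumPy arrays, so the tape cannot differentiate through its result.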

I tried your solution, however, when I call model.predict() within the with tf.GradientTape() as gtape block, then I get the following error: "LookupError: No gradient defined for operation 'IteratorGetNext' (op type: IteratorGetNext)". Any ideas what might cause this? — Matthias, Oct 9, 2019 at 9:22
@Matthias Hey, did you find a solution for this? I am getting the same error — Rao208, Nov 27, 2019 at 23:38

nessuno · Accepted Answer · 2019-06-07 07:21:24Z

You can use the tape to compute the gradient of an output node with respect to a set of watchable objects. By default, the tape watches trainable variables, and you can get the trainable variables of a specific layer by retrieving the layer by name and reading its trainable_variables property.

E.g. in the code below, I compute the gradients of the prediction only with respect to the variables of the first FC layer (name "fc1"), treating every other variable as a constant.

import tensorflow as tf

model = tf.keras.models.Sequential(
    [
        tf.keras.layers.Dense(10, input_shape=(3,), name="fc1", activation="relu"),
        tf.keras.layers.Dense(3, name="fc2"),
    ]
)

# The input must match input_shape=(3,): a batch of one 3-feature sample.
inputs = tf.ones((1, 3))

with tf.GradientTape() as tape:
    preds = model(inputs)

grads = tape.gradient(preds, model.get_layer("fc1").trainable_variables)
print(grads)

Thank you for your reply. However, I want the gradients w.r.t. the hidden layer itself, not the trainable variables of that layer. How would you change your code to calculate the gradients of the output w.r.t. the layer "fc1"? — Vahid Mirjalili, Jun 7, 2019 at 13:12
I don't understand your requirements. With respect to which part of the layer fc1 do you want to compute the gradient? — nessuno, Jun 7, 2019 at 13:23
The output of layer fc1. In the older version of TF, I could do this as follows: layer = model.get_layer("fc1"); grads = K.gradients(class_output, layer.output)[0] — Vahid Mirjalili, Jun 7, 2019 at 17:14
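
To make the distinction drawn in the comments concrete, here is a small framework-free sketch of the quantity being asked for: the derivative of one output score with respect to a hidden layer's activations, as opposed to that layer's weights. The tiny network, its weights, and the helper names (`forward_from_hidden`, `grad_wrt_hidden`, `numeric_grad_wrt_hidden`) are all made up for illustration. The analytic gradient is checked against central finite differences.

```python
# Hypothetical 3 -> 2 -> 2 network with fixed weights (illustration only).
W1 = [[0.5, -1.0], [1.5, 0.25], [-0.5, 2.0]]  # shape (3, 2): "fc1" kernel
W2 = [[2.0, -1.0], [0.5, 3.0]]                # shape (2, 2): "fc2" kernel


def fc1(x):
    """Hidden activations: relu(x @ W1)."""
    return [max(0.0, sum(x[i] * W1[i][j] for i in range(3))) for j in range(2)]


def forward_from_hidden(hidden):
    """Output scores computed from the hidden activations: hidden @ W2."""
    return [sum(hidden[j] * W2[j][k] for j in range(2)) for k in range(2)]


def grad_wrt_hidden(class_idx):
    """Analytic d(score[class_idx]) / d(hidden): column class_idx of W2."""
    return [W2[j][class_idx] for j in range(2)]


def numeric_grad_wrt_hidden(hidden, class_idx, eps=1e-6):
    """Central finite differences, perturbing the hidden activations only."""
    grads = []
    for j in range(len(hidden)):
        up = list(hidden)
        up[j] += eps
        down = list(hidden)
        down[j] -= eps
        diff = forward_from_hidden(up)[class_idx] - forward_from_hidden(down)[class_idx]
        grads.append(diff / (2 * eps))
    return grads


x = [1.0, 1.0, 1.0]
hidden = fc1(x)
scores = forward_from_hidden(hidden)
class_idx = scores.index(max(scores))

analytic = grad_wrt_hidden(class_idx)
numeric = numeric_grad_wrt_hidden(hidden, class_idx)
print(analytic, numeric)
```

The gradient has the same shape as the hidden activation, not the shape of any weight matrix, which is what `K.gradients(class_output, layer.output)[0]` returned in graph mode.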

Ali Salehi · Accepted Answer · 2020-02-21 00:35:27Z

If you need the gradients of the predictions with respect to the outputs of all layers, you can do the following:

(Building on @nessuno's answer)

import tensorflow as tf

model = tf.keras.models.Sequential(
    [
        tf.keras.layers.Dense(10, input_shape=(3,), name="fc1", activation="relu"),
        tf.keras.layers.Dense(3, name="fc2"),
    ]
)

# build a new model that returns the output of every layer
all_layers = [layer.output for layer in model.layers]
grad_model = tf.keras.Model(inputs=model.inputs, outputs=all_layers)

inputs = tf.ones((1, 3))  # must match input_shape=(3,)
with tf.GradientTape() as tape:
    output_of_all_layers = grad_model(inputs)
    # last layer is the output layer; use the eager result, not the symbolic model.outputs
    preds = output_of_all_layers[-1]
# take gradients of the last layer with respect to all layers in the model
grads = tape.gradient(preds, output_of_all_layers)
# note: grads[-1] should be all 1, since it is d(output)/d(output)
print(grads)

Arnab Das · Accepted Answer · 2020-08-09 14:11:33Z

An example that computes the gradient of a network output with respect to a specific layer:

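import cv2
import numpy as np
import tensorflow as tf
from tensorflow.keras.models import Model


def grad_cam(input_model, image, category_index, layer_name):
    # Model that returns both the chosen layer's activations and the predictions.
    gradModel = Model(
        inputs=input_model.inputs,
        outputs=[input_model.get_layer(layer_name).output,
                 input_model.output])

    with tf.GradientTape() as tape:
        inputs = tf.cast(image, tf.float32)
        (convOutputs, predictions) = gradModel(inputs)
        loss = predictions[:, category_index]

    # Gradient of the class score w.r.t. the layer's output.
    grads = tape.gradient(loss, convOutputs)

    # Guided gradients: keep only positive activations and positive gradients.
    castConvOutputs = tf.cast(convOutputs > 0, "float32")
    castGrads = tf.cast(grads > 0, "float32")
    guidedGrads = castConvOutputs * castGrads * grads

    # Drop the batch dimension.
    convOutputs = convOutputs[0]
    guidedGrads = guidedGrads[0]

    # Average the gradients over the spatial axes: one weight per channel.
    weights = tf.reduce_mean(guidedGrads, axis=(0, 1))
    cam = tf.reduce_sum(tf.multiply(weights, convOutputs), axis=-1)

    H, W = image.shape[1], image.shape[2]
    cam = np.maximum(cam.numpy(), 0)  # ReLU so we only get positive importance
    # `interpolation` must be a keyword: the third positional argument is `dst`.
    cam = cv2.resize(cam, (W, H), interpolation=cv2.INTER_NEAREST)
    cam = cam / cam.max()

    return cam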



im = load_image_normalize(im_path, mean, std)

print(im.shape)
cam = grad_cam(model, im, 5, 'conv5_block16_concat') # Mass is class 5

# Loads reference CAM to compare our implementation with.
reference = np.load("reference_cam.npy")
error = np.mean((cam-reference)**2)

print(f"Error from reference: {error:.4f}, should be less than 0.05")




plt.imshow(load_image(im_path, df, preprocess=False), cmap='gray')
plt.title("Original")
plt.axis('off')

plt.show()

plt.imshow(load_image(im_path, df, preprocess=False), cmap='gray')
plt.imshow(cam, cmap='magma', alpha=0.5)
plt.title("GradCAM")
plt.axis('off')
plt.show()
