Merge pull request #18 from keisen/features/implement-gradcam-plus-plus
Implement GradCAM++
keisen authored Jun 26, 2020
2 parents 68ef380 + 4c89b0a commit 03cce2a
Showing 16 changed files with 404 additions and 73 deletions.
1 change: 1 addition & 0 deletions .gitignore
@@ -111,3 +111,4 @@ examples/workspace.ipynb
/*.ipynb
.node-version
*.nbconvert.ipynb
/temp/
60 changes: 42 additions & 18 deletions README.md
@@ -5,13 +5,23 @@
[![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT)

tf-keras-vis is a visualization toolkit for debugging `tf.keras` models in Tensorflow2.0+.
Currently supported algorithms for visualization include:

These features are based on those of [keras-vis](https://github.com/raghakot/keras-vis), but the tf-keras-vis APIs are not compatible with keras-vis, because we prioritized the following features for our experiments.
* Activation Maximization
* Class Activation Maps
- [GradCAM](https://arxiv.org/pdf/1610.02391v1.pdf)
- [GradCAM++](https://arxiv.org/pdf/1710.11063.pdf)
* Saliency Maps
- [Vanilla Saliency](https://arxiv.org/pdf/1312.6034.pdf)
- [SmoothGrad](https://arxiv.org/pdf/1706.03825.pdf)

- Support processing multiple images at a time as a batch
- Support tf.keras.Model instances that have multiple inputs (and, of course, multiple outputs too)
- Allow use of the optimizers embedded in tf.keras
- Achieve faster processing through optimized calculation
tf-keras-vis is designed to be easy to use, lightweight, and flexible.
All visualizations share the following features (a short sketch follows the list):

* Support N-dim image inputs; that is, not only 2D pictures but also inputs such as 3D images.
* Support batch-wise processing, so multiple inputs can be processed efficiently.
* Support models that have multiple inputs, multiple outputs, or both.
* Support the optimizers embedded in tf.keras for Activation Maximization.

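For example, the multi-input support can be exercised as below. This is a minimal sketch built around the toy two-input model used in this PR's tests; the model itself is illustrative only:

```python
import numpy as np
import tensorflow.keras.backend as K
from tensorflow.keras.layers import Conv2D, Dense, Flatten, Input
from tensorflow.keras.models import Model

from tf_keras_vis.gradcam import GradcamPlusPlus
from tf_keras_vis.utils.losses import SmoothedLoss

# A toy model with two image inputs of different sizes (illustrative only).
input_a, input_b = Input((8, 8, 3)), Input((10, 10, 3))
x_a = Flatten()(Conv2D(2, 5, activation='relu')(input_a))
x_b = Flatten()(Conv2D(2, 5, activation='relu')(input_b))
x = Dense(2, activation='softmax')(K.concatenate([x_a, x_b], axis=-1))
model = Model(inputs=[input_a, input_b], outputs=x)

# One CAM per input, each zoomed back to that input's spatial size.
gradcam = GradcamPlusPlus(model)
cams = gradcam(SmoothedLoss(1),
               [np.random.sample((1, 8, 8, 3)), np.random.sample((1, 10, 10, 3))])
assert cams[0].shape == (1, 8, 8)
assert cams[1].shape == (1, 10, 10)
```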

## Visualizations
@@ -26,12 +36,15 @@

### GradCAM

<img src='https://github.com/keisen/tf-keras-vis/raw/master/examples/images/gradcam.png' width='600px' />
<img src='https://github.com/keisen/tf-keras-vis/raw/master/examples/images/gradcam_plus_plus.png' width='600px' />

The images above are generated by `GradCAM++`.

### Saliency Map (SmoothGrad)
### Saliency Map

<img src='https://github.com/keisen/tf-keras-vis/raw/master/examples/images/smoothgrad.png' width='600px' />

The images above are generated by `SmoothGrad`.
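As a rough sketch of how such a map might be produced — hypothetical: the `smooth_samples` argument of `Saliency` is assumed from this version of the library, and VGG16 with random inputs is only an example:

```python
import numpy as np
from tensorflow.keras.applications.vgg16 import VGG16, preprocess_input

from tf_keras_vis.saliency import Saliency
from tf_keras_vis.utils.losses import SmoothedLoss

# Assumption: Saliency exposes SmoothGrad via a `smooth_samples` argument
# (the number of noisy copies of the input averaged into one map).
model = VGG16(weights='imagenet', include_top=True)
images = preprocess_input(np.random.uniform(0, 255, (1, 224, 224, 3)))

saliency = Saliency(model)
saliency_map = saliency(SmoothedLoss(20), images, smooth_samples=20)
print(saliency_map.shape)  # expected: (1, 224, 224)
```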


## Requirements
@@ -51,35 +64,46 @@ $ pip install tf-keras-vis tensorflow
* Docker (container that run Jupyter Notebook)

```bash
$ docker run -itd -p 8888:8888 keisen/tf-keras-vis:0.2.4
$ docker run -itd -p 8888:8888 keisen/tf-keras-vis:0.4.0
```

If you have GPU processors,

```bash
$ docker run -itd --runtime=nvidia -p 8888:8888 keisen/tf-keras-vis:0.2.4-gpu
$ docker run -itd --runtime=nvidia -p 8888:8888 keisen/tf-keras-vis:0.4.0-gpu
```

> You can find other images at [Docker Hub](https://hub.docker.com/repository/docker/keisen/tf-keras-vis/tags).

## Usage

Please see [examples/attentions.ipynb](https://github.com/keisen/tf-keras-vis/blob/master/examples/attentions.ipynb), [examples/visualize_dense_layer.ipynb](https://github.com/keisen/tf-keras-vis/blob/master/examples/visualize_dense_layer.ipynb) and [examples/visualize_conv_filters.ipynb](https://github.com/keisen/tf-keras-vis/blob/master/examples/visualize_conv_filters.ipynb) for details.
Please see below for details:

* [examples/attentions.ipynb](https://github.com/keisen/tf-keras-vis/blob/master/examples/attentions.ipynb)
* [examples/visualize_dense_layer.ipynb](https://github.com/keisen/tf-keras-vis/blob/master/examples/visualize_dense_layer.ipynb)
* [examples/visualize_conv_filters.ipynb](https://github.com/keisen/tf-keras-vis/blob/master/examples/visualize_conv_filters.ipynb)
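Below is a minimal sketch of the GradCAM++ API added in this PR. The class and import paths are taken from the tests in this diff; VGG16, the preprocessing, and the interpretation of `SmoothedLoss(20)` as a class index are assumptions made for illustration:

```python
import numpy as np
from tensorflow.keras.applications.vgg16 import VGG16, preprocess_input

from tf_keras_vis.gradcam import GradcamPlusPlus
from tf_keras_vis.utils.losses import SmoothedLoss

# Any tf.keras image model should do; VGG16 is used here for illustration.
model = VGG16(weights='imagenet', include_top=True)

# A batch of preprocessed inputs with shape (N, 224, 224, 3).
images = preprocess_input(np.random.uniform(0, 255, (2, 224, 224, 3)))

# penultimate_layer=-1 (the default) seeks the last Conv layer automatically.
gradcam = GradcamPlusPlus(model)
cam = gradcam(SmoothedLoss(20), images)
print(cam.shape)  # expected: (2, 224, 224), one heatmap per input image
```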

## Known Issues

* With InceptionV3, ActivationMaximization doesn't work well; it may generate meaninglessly blurred images.
* With cascading models, Gradcam and Gradcam++ don't work well; errors may occur.
* `channels-first` models and data are unsupported.
**[NOTE]**
If you have used [keras-vis](https://github.com/raghakot/keras-vis), tf-keras-vis may feel familiar.
Indeed, tf-keras-vis is derived from keras-vis, and was then redesigned to support the features described in this README, such as multiple inputs/outputs and batch-wise processing.
Therefore, although both provide almost the same visualization algorithms, their software architectures differ.
Please note that the tf-keras-vis APIs are not compatible with keras-vis.


## ToDo
* API documentations
* We're going to add some algorithms, such as those below.
- [SmoothGrad: removing noise by adding noise](https://arxiv.org/pdf/1706.03825.pdf) (DONE)
- [GradCAM++](https://arxiv.org/abs/1710.11063)
- [ScoreCAM](https://arxiv.org/pdf/1910.01279.pdf)
- [ScoreCAM: Score-Weighted Visual Explanations for Convolutional Neural Networks](https://arxiv.org/pdf/1910.01279.pdf)
- Deep Dream
- Style transfer


## Known Issues

* With InceptionV3, ActivationMaximization doesn't work well; it may generate meaninglessly blurred images.
* With cascading models, Gradcam and Gradcam++ don't work well; errors may occur.
* `channels-first` models and data are unsupported.


2 changes: 1 addition & 1 deletion dockerfiles/cpu.Dockerfile
@@ -1,7 +1,7 @@
FROM tensorflow/tensorflow:2.2.0

# Default ENV Settings
ARG TF_KERAS_VIS_VERSION=0.2.4
ARG TF_KERAS_VIS_VERSION=0.4.0
ARG JUPYTER_ALLOW_IP="0.0.0.0"
ARG JUPYTER_TOKEN=""

2 changes: 1 addition & 1 deletion dockerfiles/gpu.Dockerfile
@@ -1,7 +1,7 @@
FROM tensorflow/tensorflow:2.2.0-gpu

# Default ENV Settings
ARG TF_KERAS_VIS_VERSION=0.2.4
ARG TF_KERAS_VIS_VERSION=0.4.0
ARG JUPYTER_ALLOW_IP="0.0.0.0"
ARG JUPYTER_TOKEN=""

143 changes: 114 additions & 29 deletions examples/attentions.ipynb

Large diffs are not rendered by default.

Binary file added examples/images/gradcam_plus_plus.png
Binary file modified examples/images/smoothgrad.png
2 changes: 1 addition & 1 deletion setup.py
@@ -5,7 +5,7 @@

setup(
name="tf-keras-vis",
version="0.3.3",
version="0.4.0",
author="keisen",
author_email="[email protected]",
description="Neural network visualization toolkit for tf.keras",
127 changes: 127 additions & 0 deletions tests/tf-keras-vis/test_gradcam_plus_plus.py
@@ -0,0 +1,127 @@
import numpy as np
import pytest
from tensorflow.keras import backend as K
from tensorflow.keras.layers import Conv2D, Input, Dense, Flatten
from tensorflow.keras.models import Sequential, Model

from tf_keras_vis.gradcam import GradcamPlusPlus
from tf_keras_vis.utils.losses import SmoothedLoss


@pytest.fixture(scope="function", autouse=True)
def dense_model():
return Sequential(
[Dense(5, input_shape=(3, ), activation='relu'),
Dense(2, activation='softmax')])


@pytest.fixture(scope="function", autouse=True)
def cnn_model():
return Sequential([
Conv2D(5, 3, input_shape=(8, 8, 3), activation='relu'),
Flatten(),
Dense(2, activation='softmax')
])


@pytest.fixture(scope="function", autouse=True)
def multiple_inputs_cnn_model():
input_a = Input((8, 8, 3))
input_b = Input((10, 10, 3))
x_a = Conv2D(2, 5, activation='relu')(input_a)
x_b = Conv2D(2, 5, activation='relu')(input_b)
x = K.concatenate([Flatten()(x_a), Flatten()(x_b)], axis=-1)
x = Dense(2, activation='softmax')(x)
return Model(inputs=[input_a, input_b], outputs=x)


def test__call__if_loss_is_None(cnn_model):
gradcam = GradcamPlusPlus(cnn_model)
try:
gradcam(None, None)
assert False
except ValueError:
assert True


def test__call__if_seed_input_is_None(cnn_model):
gradcam = GradcamPlusPlus(cnn_model)
try:
gradcam(SmoothedLoss(1), None)
assert False
except ValueError:
assert True


def test__call__if_seed_input_has_not_batch_dim(cnn_model):
gradcam = GradcamPlusPlus(cnn_model)
result = gradcam(SmoothedLoss(1), np.random.sample((8, 8, 3)))
assert result.shape == (1, 8, 8)


def test__call__(cnn_model):
gradcam = GradcamPlusPlus(cnn_model)
result = gradcam(SmoothedLoss(1), np.random.sample((1, 8, 8, 3)))
assert result.shape == (1, 8, 8)


def test__call__if_penultimate_layer_is_None(cnn_model):
gradcam = GradcamPlusPlus(cnn_model)
result = gradcam(SmoothedLoss(1), np.random.sample((1, 8, 8, 3)), penultimate_layer=None)
assert result.shape == (1, 8, 8)


def test__call__if_penultimate_layer_is_noexist_index(cnn_model):
gradcam = GradcamPlusPlus(cnn_model)
try:
gradcam(SmoothedLoss(1), np.random.sample((1, 8, 8, 3)), penultimate_layer=100000)
assert False
except ValueError:
assert True


def test__call__if_penultimate_layer_is_noexist_name(cnn_model):
gradcam = GradcamPlusPlus(cnn_model)
try:
gradcam(SmoothedLoss(1), np.random.sample((1, 8, 8, 3)), penultimate_layer='hoge')
assert False
except ValueError:
assert True


def test__call__if_model_has_only_dense_layer(dense_model):
gradcam = GradcamPlusPlus(dense_model)
result = gradcam(SmoothedLoss(1),
np.random.sample((1, 8, 8, 3)),
seek_penultimate_conv_layer=False)
assert result.shape == (1, 8, 8)
try:
gradcam(SmoothedLoss(1), np.random.sample((1, 8, 8, 3)))
assert False
except ValueError:
assert True


def test__call__if_model_has_multiple_inputs(multiple_inputs_cnn_model):
gradcam = GradcamPlusPlus(multiple_inputs_cnn_model)
result = gradcam(
SmoothedLoss(1), [np.random.sample(
(1, 8, 8, 3)), np.random.sample((1, 10, 10, 3))])
assert len(result) == 2
assert result[0].shape == (1, 8, 8)
assert result[1].shape == (1, 10, 10)


def test__call__if_expand_cam_is_False(cnn_model):
gradcam = GradcamPlusPlus(cnn_model)
result = gradcam(SmoothedLoss(1), np.random.sample((1, 8, 8, 3)), expand_cam=False)
assert result.shape == (1, 6, 6)


def test__call__if_expand_cam_is_False_and_model_has_multiple_inputs(multiple_inputs_cnn_model):
gradcam = GradcamPlusPlus(multiple_inputs_cnn_model)
result = gradcam(
SmoothedLoss(1), [np.random.sample(
(1, 8, 8, 3)), np.random.sample((1, 10, 10, 3))],
expand_cam=False)
assert result.shape == (1, 6, 6)
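To run just this new test module locally (assuming pytest is installed in the environment):

```bash
$ pip install pytest
$ pytest tests/tf-keras-vis/test_gradcam_plus_plus.py -v
```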
4 changes: 2 additions & 2 deletions tf_keras_vis/activation_maximization.py
@@ -1,7 +1,7 @@
import numpy as np
from collections import defaultdict
import tensorflow as tf
from tensorflow.keras import backend as K
import tensorflow.keras.backend as K

from tf_keras_vis import ModelVisualization
from tf_keras_vis.utils import check_steps, listify
@@ -93,7 +93,7 @@ def __call__(self,
# Calculate regularization values
regularization_values = [(regularizer.name, regularizer(seed_inputs))
for regularizer in regularizers]
ys = [(-1. * loss_value) + sum([rv for (_, rv) in regularization_values])
ys = [(-1. * loss_value) + sum([val for _, val in regularization_values])
for loss_value in loss_values]
grads = tape.gradient(ys, seed_inputs)
grads = listify(grads)
98 changes: 97 additions & 1 deletion tf_keras_vis/gradcam.py
@@ -1,7 +1,7 @@
import numpy as np
import tensorflow as tf
import tensorflow.keras.backend as K
from scipy.ndimage.interpolation import zoom
from tensorflow.keras import backend as K
from tensorflow.python.keras.layers.convolutional import Conv

from tf_keras_vis import ModelVisualization
@@ -100,3 +100,99 @@ def _zoom_for_visualizing(self, seed_inputs, cam):
for input_dims in input_dims_list)
cam = [np.asarray([zoom(v, factor) for v in cam]) for factor in zoom_factors]
return cam


class GradcamPlusPlus(Gradcam):
def __call__(self,
loss,
seed_input,
penultimate_layer=-1,
seek_penultimate_conv_layer=True,
activation_modifier=lambda cam: K.relu(cam),
expand_cam=True):
"""Generate a gradient based class activation map (CAM) by using positive gradient of
penultimate_layer with respect to loss.
For details on GradCAM++, see the paper:
[GradCAM++: Improved Visual Explanations for Deep Convolutional Networks]
(https://arxiv.org/pdf/1710.11063.pdf).
# Arguments
loss: A loss function. If the model has multiple outputs, you can use a different
loss on each output by passing a list of losses.
seed_input: An N-dim Numpy array. If the model has multiple inputs,
you have to pass a list of N-dim Numpy arrays.
penultimate_layer: An integer index or a tf.keras.layers.Layer object.
seek_penultimate_conv_layer: True to seek the penultimate layer that is a subtype of
the `keras.layers.convolutional.Conv` class.
If False, the layer selected via `penultimate_layer` is used as is.
activation_modifier: A function to modify gradients.
expand_cam: True to expand the cam to the same size as the input image.
[Note] When `expand_cam` is True and the model has multiple inputs, one cam
image is generated per input; when False, a single unexpanded cam is returned
even for a multi-input model.
# Returns
The heatmap image, or a list of such images, indicating the `seed_input` regions
whose changes would most contribute to the loss value.
# Raises
ValueError: In case of invalid arguments for `loss`, or `penultimate_layer`.
"""
# Preparing
losses = self._get_losses_for_multiple_outputs(loss)
seed_inputs = self._get_seed_inputs_for_multiple_inputs(seed_input)
penultimate_output_tensor = self._find_penultimate_output(penultimate_layer,
seek_penultimate_conv_layer)
# Processing gradcam
model = tf.keras.Model(inputs=self.model.inputs,
outputs=self.model.outputs + [penultimate_output_tensor])

with tf.GradientTape() as tape:
tape.watch(seed_inputs)
outputs = model(seed_inputs)
outputs, penultimate_output = outputs[:-1], outputs[-1]
loss_values = [loss(y) for y, loss in zip(outputs, losses)]
grads = tape.gradient(loss_values, penultimate_output)

score = sum([K.exp(v) for v in loss_values])
score = tf.reshape(score, (-1, ) + ((1, ) * (len(grads.shape) - 1)))

first_derivative = score * grads
second_derivative = first_derivative * grads
third_derivative = second_derivative * grads

global_sum = K.sum(penultimate_output,
axis=tuple(np.arange(len(penultimate_output.shape))[1:-1]),
keepdims=True)

alpha_denom = second_derivative * 2.0 + third_derivative * global_sum
alpha_denom = alpha_denom + tf.cast((second_derivative == 0.0), second_derivative.dtype)
alphas = second_derivative / alpha_denom

alpha_normalization_constant = K.sum(alphas,
axis=tuple(np.arange(len(alphas.shape))[1:-1]),
keepdims=True)
alpha_normalization_constant = alpha_normalization_constant + tf.cast(
(alpha_normalization_constant == 0.0), alpha_normalization_constant.dtype)
alphas = alphas / alpha_normalization_constant

if activation_modifier is None:
weights = first_derivative
else:
weights = activation_modifier(first_derivative)
deep_linearization_weights = weights * alphas
deep_linearization_weights = K.sum(
deep_linearization_weights,
axis=tuple(np.arange(len(deep_linearization_weights.shape))[1:-1]),
keepdims=True)

cam = K.sum(deep_linearization_weights * penultimate_output, axis=-1)
if activation_modifier is not None:
cam = activation_modifier(cam)

if not expand_cam:
return cam

cam = self._zoom_for_visualizing(seed_inputs, cam)
if len(self.model.inputs) == 1 and not isinstance(seed_input, list):
cam = cam[0]
return cam
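For reference, the weighting computed above matches the GradCAM++ formulation from the paper, with $Y^c = \exp(\text{loss value})$ (the `score` line) and $A^k$ the penultimate-layer feature maps:

```latex
\alpha_{ij}^{kc} =
  \frac{\partial^2 Y^c / (\partial A_{ij}^k)^2}
       {2\,\partial^2 Y^c / (\partial A_{ij}^k)^2
        + \sum_{a,b} A_{ab}^k \,\partial^3 Y^c / (\partial A_{ij}^k)^3},
\qquad
w_k^c = \sum_{i,j} \alpha_{ij}^{kc}\,
        \mathrm{ReLU}\!\left(\frac{\partial Y^c}{\partial A_{ij}^k}\right),
\qquad
L_{ij}^{c} = \mathrm{ReLU}\!\Bigl(\sum_k w_k^c\, A_{ij}^k\Bigr)
```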
