Ungraded Lab: Class Activation Maps with Fashion MNIST
In this lab, you will see how to implement a simple class activation map (CAM) of a model trained on the Fashion MNIST dataset. This will show what parts of the image the model was paying attention to when deciding the class of the image. Let's begin!
Imports
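The original import cell isn't shown here; a minimal set of imports that covers the rest of this lab (a sketch, assuming TensorFlow 2.x with Keras, NumPy, SciPy, and Matplotlib) might look like this:

```python
# Minimal imports for this lab (a sketch; assumes TensorFlow 2.x)
import numpy as np
import matplotlib.pyplot as plt
from scipy.ndimage import zoom

import tensorflow as tf
from tensorflow import keras
from tensorflow.keras.models import Sequential, Model
from tensorflow.keras.layers import Conv2D, MaxPooling2D, GlobalAveragePooling2D, Dense
```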
Download and Prepare the Data
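The data-loading cell isn't reproduced either; a sketch using the Fashion MNIST dataset bundled with Keras (the variable names X_train, y_train, X_test, and y_test are assumptions carried through the snippets below) could be:

```python
# Load Fashion MNIST: 10 classes of 28x28 grayscale clothing images
(X_train, y_train), (X_test, y_test) = keras.datasets.fashion_mnist.load_data()

# Normalize pixel values to [0, 1] and add a channel axis -> (num_images, 28, 28, 1)
X_train = (X_train / 255.0).reshape(-1, 28, 28, 1)
X_test = (X_test / 255.0).reshape(-1, 28, 28, 1)
```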
Build the Classifier
Let's quickly recap how we can build a simple classifier with this dataset.
Define the Model
You can build the classifier with the model below. The image will go through 4 convolutions followed by pooling layers. The final Dense layer will output the probabilities for each class.
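The exact filter counts aren't stated in the text, so the ones below are assumptions; the constraints described later in this lab are that the last Conv2D layer outputs a 3 x 3 x 128 feature map, which global average pooling collapses before the final Dense layer. Note that only three pooling stages fit a 28 x 28 input down to 3 x 3, so in this sketch the last convolution feeds directly into global average pooling:

```python
# Sketch of the classifier; filter counts are assumptions.
model = Sequential([
    Conv2D(16, (3, 3), activation='relu', padding='same', input_shape=(28, 28, 1)),
    MaxPooling2D((2, 2)),                  # 28x28 -> 14x14
    Conv2D(32, (3, 3), activation='relu', padding='same'),
    MaxPooling2D((2, 2)),                  # 14x14 -> 7x7
    Conv2D(64, (3, 3), activation='relu', padding='same'),
    MaxPooling2D((2, 2)),                  # 7x7 -> 3x3
    Conv2D(128, (3, 3), activation='relu', padding='same'),  # last conv: 3x3x128
    GlobalAveragePooling2D(),               # 3x3x128 -> 128
    Dense(10, activation='softmax')         # probabilities for the 10 classes
])
model.summary()
```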
Train the Model
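The training cell isn't shown above; a typical setup uses sparse categorical cross-entropy on the integer labels (the optimizer and epoch count below are assumptions; a few epochs are enough to produce usable activation maps):

```python
model.compile(loss='sparse_categorical_crossentropy',
              optimizer='adam',
              metrics=['accuracy'])

model.fit(X_train, y_train,
          validation_data=(X_test, y_test),
          epochs=5,
          batch_size=32)
```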
Generate the Class Activation Map
To generate the class activation map, we want to get the features detected in the last convolution layer and see which ones are most active when generating the output probabilities. In our model above, we are interested in the layers shown below.
You can now create your CAM model as shown below.
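A sketch of such a model: it takes the same input as the classifier and returns both the last convolutional feature map and the softmax output (the layer index assumes the architecture sketched earlier, where the last Conv2D sits three layers from the end):

```python
# Two outputs: the last Conv2D feature maps and the class probabilities.
cam_model = Model(inputs=model.input,
                  outputs=[model.layers[-3].output, model.layers[-1].output])
cam_model.summary()
```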
Use the CAM model to predict on the test set, so that it generates the features and the predicted probability for each class (results).
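Continuing the sketch above, the prediction step might look like this, where features holds the 3 x 3 x 128 maps and results the 10-way probabilities:

```python
# features: (num_images, 3, 3, 128), results: (num_images, 10)
features, results = cam_model.predict(X_test)
print(features.shape, results.shape)
```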
You can generate the CAM by getting the dot product of the class activation features and the class activation weights.
You will need the weights from the Global Average Pooling layer (GAP) to calculate the activations of each feature given a particular class.
Note that you'll get the weights from the dense layer that follows the global average pooling layer.
The last conv2D layer has (h,w,depth) of (3 x 3 x 128), so there are 128 features.
The global average pooling layer collapses the h,w,f (3 x 3 x 128) into a dense layer of 128 neurons (1 neuron per feature).
The activations from the global average pooling layer get passed to the last dense layer.
The last dense layer assigns weights to each of those 128 features (for each of the 10 classes).
So the weights of the last dense layer (which immediately follows the global average pooling layer) are referred to in this context as the "weights of the global average pooling layer".
For each of the 10 classes, there are 128 features, so there are 128 feature weights, one weight per feature.
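In Keras, those weights can be read directly off that dense layer; a sketch using the model above:

```python
# Weights of the dense layer that follows global average pooling.
# Shape (128, 10): one column of 128 feature weights per class.
gap_weights = model.layers[-1].get_weights()[0]
print(gap_weights.shape)  # (128, 10)
```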
Now, get the features for a specific image, indexed between 0 and 999.
The features have height and width of 3 by 3. Scale them up to the original image height and width, which is 28 by 28.
For a particular class (0...9), get the 128 weights.
Take the dot product of the scaled features for this selected image with the weights.
The shapes are: scaled features (h, w, depth) of (28 x 28 x 128); weights for one class: 128.
The dot product produces the class activation map, with the shape equal to the height and width of the image: 28 x 28.
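Putting these steps together for a single test image (a sketch; the image index and variable names are illustrative):

```python
idx = 0  # hypothetical example index into the test set

# Features detected in the last conv layer for this image: (3, 3, 128)
features_for_img = features[idx]

# Scale 3x3 up to 28x28, leaving the 128 feature channels untouched
scaled_features = zoom(features_for_img, (28 / 3, 28 / 3, 1), order=2)

# Use the predicted class for this image and grab its 128 weights
predicted_class = np.argmax(results[idx])
class_weights = gap_weights[:, predicted_class]

# Dot product over the feature axis: (28, 28, 128) . (128,) -> (28, 28)
cam = np.dot(scaled_features, class_weights)
print(cam.shape)  # (28, 28)
```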
Conceptual interpretation
To think conceptually about what you're doing and why:
In the 28 x 28 x 128 feature map, each of the 128 feature filters is tailored to look for a specific set of features (for example, a shoelace).
The actual features are learned, not selected by you directly.
Each of the 128 weights for a particular class decides how much weight to give to the corresponding feature, for that class.
For instance, the "shoe" class may have higher weights for the feature filters that look for shoelaces.
At each of the 28 by 28 pixels, you can take the vector of 128 features and compare them with the vector of 128 weights.
You can do this comparison with a dot product.
The dot product results in a scalar value at each pixel.
Apply this dot product across all of the 28 x 28 pixels.
The scalar result of the dot product will be larger when the image has the particular feature (e.g. a shoelace) and that feature is also weighted more heavily for the particular class (e.g. shoe).
So you've created a matrix with the same number of pixels as the image, where the value at each pixel is higher when that pixel is relevant to the prediction of a particular class.
Here is the function that implements the Class activation map calculations that you just saw.
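The lab's actual helper isn't reproduced here; a sketch that wraps the steps above into a function and overlays the resulting map on the original image (the function name and plotting details are assumptions) could look like this:

```python
def show_cam(image_index):
    """Compute and display the class activation map for one test image (sketch)."""
    features_for_img = features[image_index]
    predicted_class = np.argmax(results[image_index])

    # Upsample the 3x3 feature maps to 28x28 and weight them by the class weights
    scaled_features = zoom(features_for_img, (28 / 3, 28 / 3, 1), order=2)
    cam = np.dot(scaled_features, gap_weights[:, predicted_class])

    print(f'Predicted class = {predicted_class}, '
          f'probability = {results[image_index][predicted_class]:.3f}')

    # Overlay the activation map on the original image
    plt.figure(figsize=(6, 6))
    plt.imshow(cam, cmap='jet', alpha=0.5)
    plt.imshow(X_test[image_index].squeeze(), alpha=0.5)
    plt.show()
```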
You can now test generating class activation maps. Let's use the utility function below.
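The lab's utility may differ; one possible sketch simply calls show_cam on the first few test images of a chosen class:

```python
def show_maps(desired_class, num_maps=8):
    """Show CAMs for the first num_maps test images labeled desired_class (sketch)."""
    shown = 0
    for i in range(len(y_test)):
        if shown == num_maps:
            break
        if y_test[i] == desired_class:
            show_cam(i)
            shown += 1

# For example, look at class 8
show_maps(desired_class=8)
```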
For class 8 (handbag), you'll notice that most of the images have dark spots in the middle and right side.
This means that these areas were given less importance when categorizing the image.
The other parts such as the outline or handle contribute more when deciding if an image is a handbag or not.
Observe the other classes and see if there are also other common areas that the model uses more in determining the class of the image.