🐶 End-to-end Multi-class Dog Breed Classification
This notebook builds an end-to-end multi-class image classifier using TensorFlow 2.x and TensorFlow Hub.
1. Problem
Identifying the breed of a dog given an image of a dog.
When I'm sitting at the cafe and I take a photo of a dog, I want to know what breed of dog it is.
2. Data
The data we're using is from Kaggle's dog breed identification competition.
https://www.kaggle.com/c/dog-breed-identification/data
3. Evaluation
The evaluation is a file with prediction probabilities for each dog breed of each test image.
https://www.kaggle.com/c/dog-breed-identification/overview/evaluation
4. Features
Some information about the data:
We're dealing with images (unstructured data) so it's probably best we use deep learning/transfer learning.
There are 120 breeds of dogs (this means there are 120 different classes).
There are 10,000+ images in the training set (these images have labels).
There are 10,000+ images in the test set (these images have no labels, because we'll want to predict them).
Get our workspace ready
Import TensorFlow 2.x ✅
Import TensorFlow Hub ✅
Make sure we're using a GPU ✅
Getting our data ready (turning into Tensors)
With all machine learning models, our data has to be in numerical format, so that's what we'll do first: turn our images into Tensors (numerical representations).
Let's start by accessing our data and checking out the labels.
Getting images and their labels
Let's get a list of all of our image file pathnames.
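One way to build that list, sketched with only the standard library: read the image IDs from the competition's labels CSV and join each one onto the training folder. The folder path is the one this notebook uses on Google Drive. The id and breed column names, the .jpg extension and the two sample rows are assumptions made for illustration.

```python
import csv
import io

# Assumed layout of the labels CSV: one row per image with "id" and "breed" columns.
# The two rows below are made-up sample data, not real competition entries.
sample_labels_csv = io.StringIO(
    "id,breed\n"
    "abc123,boston_bull\n"
    "def456,dingo\n"
)

TRAIN_DIR = "drive/My Drive/Dog Vision/train/"

rows = list(csv.DictReader(sample_labels_csv))

# Build one filepath per image ID (assumes every image is stored as <id>.jpg)
filenames = [TRAIN_DIR + row["id"] + ".jpg" for row in rows]
labels = [row["breed"] for row in rows]

print(filenames[:2])
```

With the real CSV, you'd open the file on disk instead of the in-memory sample.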
---------------------------------------------------------------------------
OSError Traceback (most recent call last)
<ipython-input-12-302a5f437b95> in <module>()
1 import os
----> 2 if len(os.listdir("drive/My Drive/Dog Vision/train/")) == len(filenames):
3 print("Filenames match actual amount of files!!! Proceed.")
4 else:
5 print("Filenames do no match actual amount of files, check the target directory.")
OSError: [Errno 5] Input/output error: 'drive/My Drive/Dog Vision/train/'
Since we've now got our training image filepaths in a list, let's prepare our labels.
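A common way to prepare labels for a multi-class model is to turn each breed name into a boolean array with a single True at the position of that breed. The sketch below assumes the labels are breed-name strings; the four sample labels are made up.

```python
import numpy as np

# Made-up sample labels; in the notebook these would come from the labels CSV
labels = np.array(["dingo", "beagle", "dingo", "pug"])

# Sorted array of the distinct breeds (120 of them in the real dataset)
unique_breeds = np.unique(labels)

# One boolean array per label: True only at the index of the matching breed
boolean_labels = [label == unique_breeds for label in labels]

print(unique_breeds)
print(boolean_labels[0])
```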
Creating our own validation set
Since the dataset from Kaggle doesn't come with a validation set, we're going to create our own.
We're going to start off experimenting with ~1000 images and increase as needed.
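Here's a minimal sketch of carving out a validation set using only the standard library. NUM_IMAGES, the 80/20 split, the fixed seed and the stand-in data are assumptions for illustration; a library helper such as scikit-learn's train_test_split would do the same job.

```python
import random

NUM_IMAGES = 1000  # size of the experimental subset (assumed; increase as needed)

# Stand-in data: in the notebook, X would be image filepaths and y boolean labels
X = [f"image_{i}.jpg" for i in range(10_000)]
y = [i % 120 for i in range(10_000)]

def split_train_val(X, y, num_images, val_fraction=0.2, seed=42):
    """Shuffle the first `num_images` samples and split them into train/validation."""
    indices = list(range(num_images))
    random.Random(seed).shuffle(indices)  # fixed seed keeps the split reproducible
    num_val = round(num_images * val_fraction)
    val_idx, train_idx = indices[:num_val], indices[num_val:]
    X_train = [X[i] for i in train_idx]
    y_train = [y[i] for i in train_idx]
    X_val = [X[i] for i in val_idx]
    y_val = [y[i] for i in val_idx]
    return X_train, X_val, y_train, y_val

X_train, X_val, y_train, y_val = split_train_val(X, y, NUM_IMAGES)
print(len(X_train), len(X_val))
```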
Preprocessing Images (turning images into Tensors)
To preprocess our images into Tensors we're going to write a function which does a few things:
Take an image filepath as input.
Use TensorFlow to read the file and save it to a variable, image.
Turn our image (a jpg) into Tensors.
Normalize our image (convert color channel values from 0-255 to 0-1).
Resize the image to be a shape of (224, 224).
Return the modified image.
Before we do, let's see what importing an image looks like.
Now we've seen what an image looks like as a Tensor, let's make a function to preprocess them.
We'll create a function to:
Take an image filepath as input.
Use TensorFlow to read the file and save it to a variable, image.
Turn our image (a jpg) into Tensors.
Normalize our image (convert color channel values from 0-255 to 0-1).
Resize the image to be a shape of (224, 224).
Return the modified image.
More information on loading images in TensorFlow can be seen here: https://www.tensorflow.org/tutorials/load_data/images
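The steps above can be sketched roughly as follows. The function name process_image and the IMG_SIZE constant are assumptions, and the throwaway noise JPEG is only there so the sketch runs without the dataset.

```python
import os
import tempfile

import tensorflow as tf

IMG_SIZE = 224  # target height and width

def process_image(image_path, img_size=IMG_SIZE):
    """Read a JPEG file and return a float32 Tensor of shape (img_size, img_size, 3)."""
    image = tf.io.read_file(image_path)                        # raw file bytes
    image = tf.image.decode_jpeg(image, channels=3)            # uint8 Tensor, values 0-255
    image = tf.image.convert_image_dtype(image, tf.float32)    # float32 Tensor, values 0-1
    image = tf.image.resize(image, size=[img_size, img_size])  # resize to (224, 224)
    return image

# Write a throwaway JPEG of random noise to a temporary folder
noise = tf.cast(tf.random.uniform([300, 400, 3], maxval=256, dtype=tf.int32), tf.uint8)
demo_path = os.path.join(tempfile.mkdtemp(), "demo.jpg")
tf.io.write_file(demo_path, tf.io.encode_jpeg(noise))

demo_image = process_image(demo_path)
print(demo_image.shape, demo_image.dtype)
```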
Turning our data into batches
Why turn our data into batches?
Let's say you're trying to process 10,000+ images in one go... they all might not fit into memory.
So that's why we do about 32 (this is the batch size) images at a time (you can manually adjust the batch size if need be).
In order to use TensorFlow effectively, we need our data in the form of Tensor tuples which look like this: (image, label).
Now we've got a way to turn our data into tuples of Tensors in the form (image, label), let's make a function to turn all of our data (X & y) into batches!
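Here's a simplified sketch of what such a batching function could look like with tf.data. The create_data_batches() name and its test_data parameter come from this notebook; the helper names process_image and get_image_label, the exact signature and the blank demo images are assumptions, and shuffling of the training data is left out to keep things short.

```python
import os
import tempfile

import tensorflow as tf

BATCH_SIZE = 32
IMG_SIZE = 224

def process_image(image_path):
    """Read a JPEG and return a (224, 224, 3) float32 Tensor with values 0-1."""
    image = tf.io.read_file(image_path)
    image = tf.image.decode_jpeg(image, channels=3)
    image = tf.image.convert_image_dtype(image, tf.float32)
    return tf.image.resize(image, [IMG_SIZE, IMG_SIZE])

def get_image_label(image_path, label):
    """Return an (image, label) tuple for one filepath/label pair."""
    return process_image(image_path), label

def create_data_batches(X, y=None, batch_size=BATCH_SIZE, test_data=False):
    """Turn filepaths (and labels, unless test_data=True) into a batched dataset."""
    if test_data:
        # Test data has no labels, so each element is just an image
        data = tf.data.Dataset.from_tensor_slices(tf.constant(X))
        return data.map(process_image).batch(batch_size)
    data = tf.data.Dataset.from_tensor_slices((tf.constant(X), tf.constant(y)))
    return data.map(get_image_label).batch(batch_size)

# Demo on 5 blank throwaway JPEGs with made-up two-class boolean labels
tmp_dir = tempfile.mkdtemp()
paths = []
for i in range(5):
    path = os.path.join(tmp_dir, f"img_{i}.jpg")
    tf.io.write_file(path, tf.io.encode_jpeg(tf.zeros([64, 64, 3], dtype=tf.uint8)))
    paths.append(path)
labels = [[True, False], [False, True], [True, False], [False, True], [True, False]]

train_batches = create_data_batches(paths, labels, batch_size=2)
images, batch_labels = next(iter(train_batches))
print(images.shape, batch_labels.shape)
```

With a batch size of 2, the 5 demo images end up in 3 batches (2, 2 and 1 images).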
Visualizing Data Batches
Our data is now in batches. However, these can be a little hard to understand, so let's visualize them!
Building a model
Before we build a model, there are a few things we need to define:
The input shape (our images' shape, in the form of Tensors) to our model.
The output shape (image labels, in the form of Tensors) of our model.
The URL of the model we want to use from TensorFlow Hub - https://tfhub.dev/google/imagenet/mobilenet_v2_130_224/classification/4
Now we've got our inputs, outputs and model ready to go. Let's put them together into a Keras deep learning model!
Knowing this, let's create a function which:
Takes the input shape, output shape and the model we've chosen as parameters.
Defines the layers in a Keras model in sequential fashion (do this first, then this, then that).
Compiles the model (says how it should be evaluated and improved).
Builds the model (tells the model the input shape it'll be getting).
Returns the model.
All of these steps can be found here: https://www.tensorflow.org/guide/keras/overview
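The steps above might look something like the sketch below. To keep it runnable offline, the pretrained layer is passed in as an argument and a small pooling layer stands in for it here; in the notebook, that slot would be filled by a TensorFlow Hub layer wrapping the MobileNetV2 URL. The loss, optimizer and metric choices, the parameter names and the stand-in layer are all assumptions.

```python
import tensorflow as tf

INPUT_SHAPE = (None, 224, 224, 3)  # batch, height, width, color channels
OUTPUT_SHAPE = 120                 # one output per dog breed

def create_model(base_layers, input_shape=INPUT_SHAPE, output_shape=OUTPUT_SHAPE):
    """Stack `base_layers` and a softmax output layer, then compile and build."""
    # Define the layers in sequential fashion: base layers first, output layer last
    model = tf.keras.Sequential(
        list(base_layers)
        + [tf.keras.layers.Dense(units=output_shape, activation="softmax")]
    )
    # Compile: say how the model is evaluated (loss, metrics) and improved (optimizer)
    model.compile(
        loss=tf.keras.losses.CategoricalCrossentropy(),
        optimizer=tf.keras.optimizers.Adam(),
        metrics=["accuracy"],
    )
    # Build: tell the model the input shape it'll be getting
    model.build(input_shape)
    return model

# Stand-in for the pretrained layer (assumption, used only so this runs offline)
stand_in_layers = [tf.keras.layers.GlobalAveragePooling2D()]

model = create_model(stand_in_layers)
print(model.output_shape)
```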
Creating callbacks
Callbacks are helper functions a model can use during training to do such things as save its progress, check its progress or stop training early if a model stops improving.
We'll create two callbacks, one for TensorBoard which helps track our model's progress and another for early stopping which prevents our model from training for too long.
TensorBoard Callback
To set up a TensorBoard callback, we need to do 3 things:
Load the TensorBoard notebook extension ✅
Create a TensorBoard callback which is able to save logs to a directory and pass it to our model's fit() function. ✅
Visualize our model's training logs with the %tensorboard magic function (we'll do this after model training).
https://www.tensorflow.org/api_docs/python/tf/keras/callbacks/TensorBoard
Early Stopping Callback
Early stopping helps stop our model from overfitting by stopping training if a certain evaluation metric stops improving.
https://www.tensorflow.org/api_docs/python/tf/keras/callbacks/EarlyStopping
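Both callbacks could be created roughly like this. The create_tensorboard_callback() name appears later in this notebook; the log folder name, the timestamp format, the monitored metric and the patience value are assumptions.

```python
import datetime
import os

import tensorflow as tf

def create_tensorboard_callback(log_root="logs"):
    """Create a TensorBoard callback that logs to a fresh, timestamped directory."""
    log_dir = os.path.join(log_root, datetime.datetime.now().strftime("%Y%m%d-%H%M%S"))
    return tf.keras.callbacks.TensorBoard(log_dir)

# Stop training once validation accuracy has gone 3 epochs in a row without improving
early_stopping = tf.keras.callbacks.EarlyStopping(monitor="val_accuracy", patience=3)

tensorboard = create_tensorboard_callback()

# This list is what gets passed to fit() through its callbacks argument
callbacks = [tensorboard, early_stopping]
```

Creating a new timestamped folder per run keeps the logs of different experiments separate in TensorBoard.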
Training a model (on subset of data)
Our first model is only going to train on 1000 images, to make sure everything is working.
Let's create a function which trains a model.
Create a model using create_model()
Set up a TensorBoard callback using create_tensorboard_callback()
Call the fit() function on our model passing it the training data, validation data, number of epochs to train for (NUM_EPOCHS) and the callbacks we'd like to use
Return the model
Question: It looks like our model is overfitting because it's performing far better on the training dataset than on the validation dataset. What are some ways to prevent overfitting in deep learning neural networks?
Note: Overfitting to begin with is a good thing! It means our model is learning!!!
Checking the TensorBoard logs
The TensorBoard magic function (%tensorboard) will access the logs directory we created earlier and visualize its contents.
Making and evaluating predictions using a trained model
Having the above functionality is great but we want to be able to do it at scale.
And it would be even better if we could see the image the prediction is being made on!
Note: Prediction probabilities are also known as confidence levels.
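To make that concrete, here's a small sketch of turning one array of prediction probabilities into a label and a confidence level. The get_pred_label() name is used in this notebook; passing the breeds in as a second argument and the three made-up breeds are assumptions for illustration.

```python
import numpy as np

# Made-up breeds; the real array would hold all 120 breed names
unique_breeds = np.array(["beagle", "dingo", "pug"])

def get_pred_label(prediction_probabilities, breeds=unique_breeds):
    """Return the breed whose prediction probability is the highest."""
    return breeds[np.argmax(prediction_probabilities)]

probs = np.array([0.1, 0.7, 0.2])   # one prediction: a probability per breed
pred_label = get_pred_label(probs)
confidence = np.max(probs)          # the confidence level of that prediction
print(pred_label, confidence)
```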
---------------------------------------------------------------------------
NameError Traceback (most recent call last)
<ipython-input-55-29272b678cf2> in <module>()
6
7 # Get a predicted label based on an array of prediction probabilities
----> 8 pred_label = get_pred_label(predictions[81])
9 pred_label
NameError: name 'predictions' is not defined
Now since our validation data is still in a batch dataset, we'll have to unbatchify it to make predictions on the validation images and then compare those predictions to the validation labels (truth labels).
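Unbatching could be sketched like this, using tf.data's unbatch() to walk through the samples one at a time. The function name unbatchify and the tiny stand-in dataset are assumptions; this version returns the raw label arrays rather than breed names.

```python
import numpy as np
import tensorflow as tf

def unbatchify(batched_data):
    """Split a batched dataset of (image, label) tuples into two lists of NumPy arrays."""
    images, labels = [], []
    for image, label in batched_data.unbatch().as_numpy_iterator():
        images.append(image)
        labels.append(label)
    return images, labels

# Tiny stand-in for the validation batches: 5 blank "images" and 3 classes
fake_images = np.zeros((5, 4, 4, 3), dtype=np.float32)
fake_labels = np.eye(3, dtype=bool)[[0, 2, 1, 1, 0]]
val_data = tf.data.Dataset.from_tensor_slices((fake_images, fake_labels)).batch(2)

val_images, val_labels = unbatchify(val_data)
print(len(val_images), val_images[0].shape, val_labels[1])
```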
Now we've got ways to get:
Prediction labels
Validation labels (truth labels)
Validation images
Let's make some functions to make these all a bit more visual.
We'll create a function which:
Takes an array of prediction probabilities, an array of truth labels, an array of images and an integer. ✅
Converts the prediction probabilities to a predicted label. ✅
Plots the predicted label, its predicted probability, the truth label and the target image on a single plot. ✅
Now we've got one function to visualize our model's top prediction, let's make another to view our model's top 10 predictions.
This function will:
Take a prediction probabilities array, a ground truth array and an integer as input ✅
Find the prediction using get_pred_label() ✅
Find the top 10:
Prediction probabilities indexes ✅
Prediction probabilities values ✅
Prediction labels ✅
Plot the top 10 prediction probability values and labels, coloring the true label green ✅
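The "find the top 10" part can be sketched with NumPy's argsort. The function name top_k_predictions, the 12 made-up classes and the color names are assumptions, and the plotting itself is left out.

```python
import numpy as np

def top_k_predictions(prediction_probabilities, breeds, k=10):
    """Return the indexes, values and labels of the k highest prediction probabilities."""
    # argsort sorts ascending, so take the last k indexes and reverse them
    top_indexes = np.argsort(prediction_probabilities)[-k:][::-1]
    top_values = prediction_probabilities[top_indexes]
    top_labels = breeds[top_indexes]
    return top_indexes, top_values, top_labels

# Made-up example with 12 classes instead of 120
breeds = np.array([f"breed_{i}" for i in range(12)])
probs = np.array([0.05, 0.30, 0.01, 0.12, 0.02, 0.20,
                  0.005, 0.10, 0.04, 0.08, 0.015, 0.06])
true_label = "breed_5"

top_indexes, top_values, top_labels = top_k_predictions(probs, breeds)

# Bar colors for a plot: green for the true label, grey for everything else
bar_colors = ["green" if label == true_label else "grey" for label in top_labels]
print(top_labels[:3], top_values[:3])
```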
Now we've got some functions to help us visualize our predictions and evaluate our model, let's check out a few.
Challenge: How would you create a confusion matrix with our model's predictions and true labels?
Saving and reloading a trained model
Now we've got functions to save and load a trained model, let's make sure they work!
Training a big dog model 🐶 (on the full data)
Note: Running the cell below will take a little while (maybe up to 30 minutes for the first epoch) because the GPU we're using in the runtime has to load all of the images into memory.
Making predictions on the test dataset
Since our model has been trained on images in the form of Tensor batches, to make predictions on the test data, we'll have to get it into the same format.
Luckily we created create_data_batches() earlier which can take a list of filenames as input and convert them into Tensor batches.
To make predictions on the test data, we'll:
Get the test image filenames. ✅
Convert the filenames into test data batches using create_data_batches() and setting the test_data parameter to True (since the test data doesn't have labels). ✅
Make a predictions array by passing the test batches to the predict() method called on our model.
Note: Calling predict() on our full model and passing it the test data batch will take a long time to run (about 1 hour). This is because we have to process 10,000+ images and get our model to find patterns in those images and generate predictions based on what it's learned from the training dataset.
---------------------------------------------------------------------------
KeyboardInterrupt Traceback (most recent call last)
<ipython-input-82-9f5a139687b9> in <module>()
1 test_predictions = loaded_full_model.predict(test_data,
----> 2 verbose=1)
/tensorflow-2.1.0/python3.6/tensorflow_core/python/keras/engine/training.py in predict(self, x, batch_size, verbose, steps, callbacks, max_queue_size, workers, use_multiprocessing)
1011 max_queue_size=max_queue_size,
1012 workers=workers,
-> 1013 use_multiprocessing=use_multiprocessing)
1014
1015 def reset_metrics(self):
/tensorflow-2.1.0/python3.6/tensorflow_core/python/keras/engine/training_v2.py in predict(self, model, x, batch_size, verbose, steps, callbacks, max_queue_size, workers, use_multiprocessing, **kwargs)
496 model, ModeKeys.PREDICT, x=x, batch_size=batch_size, verbose=verbose,
497 steps=steps, callbacks=callbacks, max_queue_size=max_queue_size,
--> 498 workers=workers, use_multiprocessing=use_multiprocessing, **kwargs)
499
500
/tensorflow-2.1.0/python3.6/tensorflow_core/python/keras/engine/training_v2.py in _model_iteration(self, model, mode, x, y, batch_size, verbose, sample_weight, steps, callbacks, max_queue_size, workers, use_multiprocessing, **kwargs)
473 mode=mode,
474 training_context=training_context,
--> 475 total_epochs=1)
476 cbks.make_logs(model, epoch_logs, result, mode)
477
/tensorflow-2.1.0/python3.6/tensorflow_core/python/keras/engine/training_v2.py in run_one_epoch(model, iterator, execution_function, dataset_size, batch_size, strategy, steps_per_epoch, num_samples, mode, training_context, total_epochs)
126 step=step, mode=mode, size=current_batch_size) as batch_logs:
127 try:
--> 128 batch_outs = execution_function(iterator)
129 except (StopIteration, errors.OutOfRangeError):
130 # TODO(kaftan): File bug about tf function and errors.OutOfRangeError?
/tensorflow-2.1.0/python3.6/tensorflow_core/python/keras/engine/training_v2_utils.py in execution_function(input_fn)
96 # `numpy` translates Tensors to values in Eager mode.
97 return nest.map_structure(_non_none_constant_value,
---> 98 distributed_function(input_fn))
99
100 return execution_function
/tensorflow-2.1.0/python3.6/tensorflow_core/python/eager/def_function.py in __call__(self, *args, **kwds)
566 xla_context.Exit()
567 else:
--> 568 result = self._call(*args, **kwds)
569
570 if tracing_count == self._get_tracing_count():
/tensorflow-2.1.0/python3.6/tensorflow_core/python/eager/def_function.py in _call(self, *args, **kwds)
604 # In this case we have not created variables on the first call. So we can
605 # run the first trace but we should fail if variables are created.
--> 606 results = self._stateful_fn(*args, **kwds)
607 if self._created_variables:
608 raise ValueError("Creating variables on a non-first call to a function"
/tensorflow-2.1.0/python3.6/tensorflow_core/python/eager/function.py in __call__(self, *args, **kwargs)
2361 with self._lock:
2362 graph_function, args, kwargs = self._maybe_define_function(args, kwargs)
-> 2363 return graph_function._filtered_call(args, kwargs) # pylint: disable=protected-access
2364
2365 @property
/tensorflow-2.1.0/python3.6/tensorflow_core/python/eager/function.py in _filtered_call(self, args, kwargs)
1609 if isinstance(t, (ops.Tensor,
1610 resource_variable_ops.BaseResourceVariable))),
-> 1611 self.captured_inputs)
1612
1613 def _call_flat(self, args, captured_inputs, cancellation_manager=None):
/tensorflow-2.1.0/python3.6/tensorflow_core/python/eager/function.py in _call_flat(self, args, captured_inputs, cancellation_manager)
1690 # No tape is watching; skip to running the function.
1691 return self._build_call_outputs(self._inference_function.call(
-> 1692 ctx, args, cancellation_manager=cancellation_manager))
1693 forward_backward = self._select_forward_and_backward_functions(
1694 args,
/tensorflow-2.1.0/python3.6/tensorflow_core/python/eager/function.py in call(self, ctx, args, cancellation_manager)
543 inputs=args,
544 attrs=("executor_type", executor_type, "config_proto", config),
--> 545 ctx=ctx)
546 else:
547 outputs = execute.execute_with_cancellation(
/tensorflow-2.1.0/python3.6/tensorflow_core/python/eager/execute.py in quick_execute(op_name, num_outputs, inputs, attrs, ctx, name)
59 tensors = pywrap_tensorflow.TFE_Py_Execute(ctx._handle, device_name,
60 op_name, inputs, attrs,
---> 61 num_outputs)
62 except core._NotOkStatusException as e:
63 if name is not None:
KeyboardInterrupt:
Preparing test dataset predictions for Kaggle
Looking at the Kaggle sample submission, we find that it wants our model's prediction probability outputs in a DataFrame with an ID and a column for each different dog breed. https://www.kaggle.com/c/dog-breed-identification/overview/evaluation
To get the data in this format, we'll:
Create a pandas DataFrame with an ID column as well as a column for each dog breed. ✅
Add data to the ID column by extracting the test image IDs from their filepaths.
Add data (the prediction probabilities) to each of the dog breed columns.
Export the DataFrame as a CSV to submit it to Kaggle.
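These steps could look roughly like the sketch below. The "id" column name, the three made-up breeds, the test filepaths, the probabilities and the output filename are all assumptions for illustration; the real DataFrame would have one column per breed (120) and one row per test image.

```python
import os
import tempfile

import numpy as np
import pandas as pd

# Made-up stand-ins for the real breeds, test filepaths and model predictions
unique_breeds = ["beagle", "dingo", "pug"]
test_filenames = ["test/aaa111.jpg", "test/bbb222.jpg"]
test_predictions = np.array([[0.1, 0.7, 0.2],
                             [0.6, 0.3, 0.1]])

# Extract each image ID from its filepath (the filename without its extension)
test_ids = [os.path.splitext(os.path.basename(path))[0] for path in test_filenames]

# One column per breed holding the prediction probabilities, with the ID column first
preds_df = pd.DataFrame(test_predictions, columns=unique_breeds)
preds_df.insert(0, "id", test_ids)

# Export as a CSV without the index column, ready to upload
csv_path = os.path.join(tempfile.mkdtemp(), "submission.csv")
preds_df.to_csv(csv_path, index=False)
print(preds_df)
```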
Making predictions on custom images
To make predictions on custom images, we'll:
Get the filepaths of our own images.
Turn the filepaths into data batches using create_data_batches(). And since our custom images won't have labels, we set the test_data parameter to True.
Pass the custom image data batch to our model's predict() method.
Convert the prediction output probabilities to prediction labels.
Compare the predicted labels to the custom images.