Week 4: Handling Complex Images - Happy or Sad Dataset
In this assignment you will be using the Happy or Sad dataset, which contains 80 images of emoji-like faces: 40 happy and 40 sad.
Your task is to create a convolutional neural network that trains to 99.9% accuracy on these images, and that cancels training upon hitting this accuracy threshold.
TIPS FOR SUCCESSFUL GRADING OF YOUR ASSIGNMENT:
All cells are frozen except for the ones where you need to submit your solutions or where it is explicitly mentioned that you can interact with them.
You can add new cells to experiment, but these will be omitted by the grader, so don't rely on newly created cells to host your solution code; use the provided places for this.
You can add the comment # grade-up-to-here in any graded cell to signal the grader that it should only evaluate up to that point. This is helpful if you want to check whether you are on the right track even if you have not finished the whole assignment. Be sure to delete the comment afterwards!
Avoid using global variables unless you absolutely have to. The grader tests your code in an isolated environment without running all cells from the top. As a result, global variables may be unavailable when scoring your submission. Global variables that are meant to be used will be defined in UPPERCASE.
To submit your notebook, save it and then click on the blue submit button at the beginning of the page.
Load and explore the data
Begin by taking a look at some images of the dataset.
All the images are contained within the ./data/ directory; notice that in this context the dot (.) means "the current directory".
The data/ directory contains two subdirectories, happy/ and sad/, and each image is saved under the subdirectory corresponding to the class it belongs to. Take a look at the following tree for a more detailed view:
It is cool to be able to see examples of the images to better understand the problem space you are dealing with.
However, there is still some relevant information missing, such as the resolution of the images (although matplotlib renders them in a grid, providing a good idea of these values) and the maximum pixel value (this is important for normalizing these values). For this you can use some tf.keras utility functions, as shown in the next cell:
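For instance, a minimal sketch of such an inspection looks like the following (the specific filename below is hypothetical; pick any image under ./data/):

```python
import tensorflow as tf

# Load one sample image and convert it to an array to inspect its
# resolution and pixel range. The filename here is hypothetical.
sample_image = tf.keras.utils.load_img("./data/happy/example-image.png")
sample_array = tf.keras.utils.img_to_array(sample_image)

print(f"Shape of the image: {sample_array.shape}")  # expect (150, 150, 3)
print(f"Max pixel value: {sample_array.max()}")     # expect 255.0
```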
Looks like the images have a resolution of 150x150. This is very important because it will determine the input size of the first layer in your network.
The last dimension refers to each of the 3 RGB (Red, Green, Blue) channels used to represent color images. So far, in the previous assignments, you used black-and-white images, so it is time to introduce some color!
Defining the callback
Since you have already coded the callback responsible for stopping training (once a desired level of accuracy is reached) in the previous two assignments, this time it is already provided so you can focus on the other steps:
So far you have implemented an EarlyStoppingCallback by customizing the on_epoch_end method, but there is a version of this callback already available within tf.keras. You might want to check out the EarlyStopping callback, which has some extra functionality such as allowing you to save the best weights of your model; while you are at it, take a look at all the other cool callbacks in the docs.
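As a reference, a custom callback of this kind typically looks like the sketch below (the assignment's provided code may differ in its details), alongside the built-in alternative mentioned above:

```python
import tensorflow as tf

# Sketch of a custom early-stopping callback: halt training once
# training accuracy reaches 99.9% (the provided version may differ).
class EarlyStoppingCallback(tf.keras.callbacks.Callback):
    def on_epoch_end(self, epoch, logs=None):
        if logs is not None and logs.get("accuracy", 0.0) >= 0.999:
            self.model.stop_training = True
            print("\nReached 99.9% accuracy so cancelling training!")

# The built-in alternative: tf.keras.callbacks.EarlyStopping can also
# restore the best weights seen during training.
early_stopping = tf.keras.callbacks.EarlyStopping(
    monitor="accuracy", patience=3, restore_best_weights=True
)
```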
Exercise 1: training_dataset
Up until now, in the previous 3 assignments, you have used numpy arrays to hold your training data, which is a valid input for TensorFlow models. However, it is often better practice to use tf.data.Dataset, since it provides extra functionality. You can even create these out of numpy arrays and many other data sources. Be sure to check the docs to learn more about this, as you will use it extensively in the next courses of the specialization.
You have covered some ground already and it is now time for your first task!
You will now use the images of happy and sad faces to create your training dataset. Previously you used some tf.keras utility functions to work with image data. Now you will use one of the most powerful ones: image_dataset_from_directory. Be sure to check out the docs to see how this function is used and how its behaviour can be tweaked with different arguments. Remember to scale the images using a Rescaling layer and to apply it to the dataset via the map method, as you saw in the ungraded labs! A sketch of this pattern follows below.
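The following is a minimal sketch of that pattern, not the graded solution; the exact argument values (image_size, batch_size, label_mode) are assumptions you should verify against the docs and the expected output:

```python
import tensorflow as tf

# Sketch: build a dataset from the directory structure shown earlier.
# Argument values are assumptions; verify them against the docs.
train_data = tf.keras.utils.image_dataset_from_directory(
    directory="./data/",
    image_size=(150, 150),  # matches the resolution observed above
    batch_size=10,          # the batch size the later checks expect
    label_mode="binary",    # a single 0/1 label per image (assumption)
)

# Normalize pixel values from [0, 255] to [0, 1] with a Rescaling layer,
# applied to every batch via the dataset's map method.
rescale_layer = tf.keras.layers.Rescaling(1.0 / 255)
train_data = train_data.map(lambda images, labels: (rescale_layer(images), labels))
```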
Expected Output:
Exercise 2: create_and_compile_model
Now that you have the training data ready it is time to define the model you will use to classify the happy and sad faces.
Your model should achieve an accuracy of 99.9% or more in fewer than 15 epochs to pass this assignment.
Hints:
The Input of your model should account for the shape of the data, which in this case is the size of each image plus the color dimension.
The last layer of your network should take into account the number of classes you are trying to predict and be compatible with the label_mode you defined in the previous exercise.
The selection of the loss function should take into consideration the label_mode you defined in the previous exercise and the last layer of your network. For a list of available loss functions click here.
Remember to set the accuracy metric, as the callback expects it.
You can try any architecture for the network, but keep in mind that the model will work best with 3 convolutional layers. See the sketch below for one architecture consistent with these hints.
In case you need extra help you can check out some tips at the end of this notebook.
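Below is a minimal sketch of such an architecture. It is not the graded solution: the layer sizes, the pairing of a single sigmoid unit with a binary label_mode, and the optimizer choice are all assumptions:

```python
import tensorflow as tf

# Sketch of a compatible architecture (layer sizes are assumptions).
def create_and_compile_model():
    model = tf.keras.Sequential([
        tf.keras.Input(shape=(150, 150, 3)),  # image size + RGB channels
        tf.keras.layers.Conv2D(16, (3, 3), activation="relu"),
        tf.keras.layers.MaxPooling2D(2, 2),
        tf.keras.layers.Conv2D(32, (3, 3), activation="relu"),
        tf.keras.layers.MaxPooling2D(2, 2),
        tf.keras.layers.Conv2D(64, (3, 3), activation="relu"),
        tf.keras.layers.MaxPooling2D(2, 2),
        tf.keras.layers.Flatten(),
        tf.keras.layers.Dense(128, activation="relu"),
        # One unit + sigmoid pairs with label_mode="binary" (assumption)
        tf.keras.layers.Dense(1, activation="sigmoid"),
    ])
    model.compile(
        loss="binary_crossentropy",  # matches binary labels + sigmoid output
        optimizer="adam",
        metrics=["accuracy"],        # the callback monitors this metric
    )
    return model
```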
The next cell allows you to check the number of total and trainable parameters of your model and prompts a warning in case these exceed those of a reference solution. This serves the following 3 purposes, listed in order of priority:
Helps you prevent crashing the kernel during training.
Helps you avoid longer-than-necessary training times.
Provides a reasonable estimate of the size of your model. In general you will prefer smaller models, given that they still accomplish their goal successfully.
Notice that this is just informative and may well be far below the actual model size necessary to crash the kernel. So even if you exceed this reference you are probably fine. However, if the kernel crashes during training, or training is taking a very long time and your model is larger than the reference, come back here and try to get the number of parameters closer to the reference.
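If you want to inspect these counts yourself, a sketch along these lines works (the notebook's provided checking cell may differ; create_and_compile_model assumes the function from Exercise 2):

```python
import tensorflow as tf

# Sketch: count total and trainable parameters of a Keras model.
model = create_and_compile_model()
total_params = model.count_params()
trainable_params = sum(int(tf.size(w)) for w in model.trainable_variables)
print(f"Total params: {total_params:,}")
print(f"Trainable params: {trainable_params:,}")
```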
Check that the architecture you used is compatible with the dataset:
Expected Output:
Where batch_size is the one you defined in the previous exercise (should be 10) and n_units is the number of units in the last layer of your model.
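A sketch of what such a compatibility check can look like (variable names assume the dataset and model defined in the earlier sketches):

```python
# Sketch: run one batch through the model to confirm shape compatibility.
example_batch, example_labels = next(iter(train_data))
predictions = model(example_batch)
print(f"Input batch shape: {example_batch.shape}")  # (batch_size, 150, 150, 3)
print(f"Predictions shape: {predictions.shape}")    # (batch_size, n_units)
```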
Notice that when using the fit method to train the model, you can pass in the whole train_data without explicitly separating features from labels. This is because train_data is a tf.data.Dataset, and this operation is supported for objects of this class. For more info click here.
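Concretely, the training call looks something like this sketch (names assume the dataset and callback from the earlier sketches; the 15-epoch cap matches the assignment's limit):

```python
# Sketch: train on the whole tf.data.Dataset, features and labels together.
history = model.fit(
    train_data,
    epochs=15,
    callbacks=[EarlyStoppingCallback()],
)
```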
Expected Output:
Reached 99.9% accuracy so cancelling training!
printed out before reaching 15 epochs.
Need more help?
Run the following cell to see some extra tips for the model's architecture and compilation parameters:
Congratulations on finishing the last assignment of this course!
You have successfully implemented a CNN to assist you in the classification task for complex images. Nice job!
Keep it up!