Path: blob/main/C4/W4/assignment/C4W4_Assignment.ipynb
Week 4: Using real world data
Welcome! So far you have worked exclusively with generated data. This time you will be using the Daily Minimum Temperatures in Melbourne dataset, which contains the daily minimum temperatures recorded in Melbourne from 1981 to 1990. In addition to using TensorFlow's layers for processing sequence data, such as Recurrent layers or LSTMs, you will also use Convolutional layers to improve the model's performance.
All cells are frozen except for the ones where you need to submit your solutions or where it is explicitly mentioned that you can interact with them.
You can add new cells to experiment, but these will be omitted by the grader, so don't rely on newly created cells to host your solution code; use the places provided for this.
You can add the comment # grade-up-to-here in any graded cell to signal the grader that it must only evaluate up to that point. This is helpful if you want to check if you are on the right track even if you are not done with the whole assignment. Be sure to remember to delete the comment afterwards!
Avoid using global variables unless you absolutely have to. The grader tests your code in an isolated environment without running all cells from the top. As a result, global variables may be unavailable when scoring your submission. Global variables that are meant to be used will be defined in UPPERCASE.
To submit your notebook, save it and then click on the blue submit button at the beginning of the page.
Let's get started!
Begin by looking at the structure of the csv that contains the data:
As you can see, each data point is composed of the date and the recorded minimum temperature for that date.
In the first exercise you will code a function to read the data from the csv but for now run the next cell to load a helper function to plot the time series.
Parsing the raw data
Exercise 1: parse_data_from_file
Now you need to read the data from the csv file. To do so, complete the parse_data_from_file function.
A couple of things to note:
- You are encouraged to use the function np.loadtxt to load the data. Make sure to check out the documentation to learn about useful parameters.
- The times list should contain every timestep (starting at zero), which is just a sequence of ordered numbers with the same length as the temperatures list.
- The values of temperatures should be of float type. Make sure to select the correct column to read with np.loadtxt.
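For reference, a minimal sketch of one possible implementation is shown below. It assumes the csv has a header row, the date in the first column, and the temperature in the second column, consistent with the preview above; adjust the parameters if the file differs.

```python
import numpy as np

def parse_data_from_file(filename):
    # Read only the temperature column as floats, skipping the header row.
    # Delimiter, header and column index are assumptions based on the csv preview.
    temperatures = np.loadtxt(filename, delimiter=",", skiprows=1, usecols=1, dtype=float)
    # Timesteps are just an ordered sequence starting at zero, one per temperature.
    times = np.arange(len(temperatures))
    return times, temperatures
```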
Now, use this function to create the timestamps, TIME, and the time series, SERIES.
Expected Output:

Defining some useful global variables
Next, you will define some global variables that will be used throughout the assignment. Feel free to reference them in the upcoming exercises:
- SPLIT_TIME: time index to split between train and validation sets
- WINDOW_SIZE: length of the window to use for smoothing the series
- BATCH_SIZE: batch size for training the model
- SHUFFLE_BUFFER_SIZE: number of elements of the dataset to sample from when shuffling. For more information about the use of this variable, take a look at the docs.
A note about grading:
When you submit this assignment for grading, these same values for the globals will be used, so make sure that all your code works well with these values. After submitting and passing this assignment, you are encouraged to come back here and play with these parameters to see the impact they have on the forecasting process. Since this next cell is frozen, you will need to copy its contents into a new cell and run it to overwrite the values of these globals.
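Purely for illustration, the frozen cell defining these globals has roughly the following shape. The values shown here are placeholders, not the actual values used by the grader.

```python
# Placeholder values for illustration only; the frozen cell in the notebook
# defines the actual values used for grading.
SPLIT_TIME = 2500           # hypothetical index splitting train and validation
WINDOW_SIZE = 64            # hypothetical window length
BATCH_SIZE = 32             # hypothetical training batch size
SHUFFLE_BUFFER_SIZE = 1000  # hypothetical shuffle buffer size
```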
The next cell will use your function to compute the times and temperatures and will save these as numpy arrays within the G dataclass. This cell will also plot the time series:
Processing the data
Since you already coded the train_val_split and windowed_dataset functions during previous weeks' assignments, this time they are provided for you. Notice that, like in week 3, the windowed_dataset function has an extra step, which expands the series to have an extra dimension. This is done because you will be working with Conv layers, which expect their inputs to be 3-dimensional (including the batch dimension).
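As a rough sketch of what the provided windowed_dataset function looks like (the actual provided code may differ in details), note the expand_dims step that adds the extra feature dimension:

```python
import tensorflow as tf

def windowed_dataset(series, window_size, batch_size, shuffle_buffer):
    # Add a feature dimension so Conv1D layers receive 3D inputs
    # of shape (batch, timesteps, features).
    series = tf.expand_dims(series, axis=-1)
    dataset = tf.data.Dataset.from_tensor_slices(series)
    dataset = dataset.window(window_size + 1, shift=1, drop_remainder=True)
    dataset = dataset.flat_map(lambda window: window.batch(window_size + 1))
    dataset = dataset.shuffle(shuffle_buffer)
    # Split each window into features (all but the last value) and label (last value).
    dataset = dataset.map(lambda window: (window[:-1], window[-1]))
    dataset = dataset.batch(batch_size).prefetch(1)
    return dataset
```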
Defining the model architecture
Exercise 2: create_uncompiled_model
Now that you have a function that will process the data before it is fed into your neural network for training, it is time to define your model architecture. Just as in last week's assignment, you will do the layer definition and compilation in two separate steps. Begin by completing the create_uncompiled_model function below.
This is done so you can reuse your model's layers for the learning rate adjusting and the actual training.
Hints:
- Remember that the original dataset was expanded, so account for this when setting the shape of the tf.keras.Input.
- No Lambda layers are required.
- Use a combination of Conv1D and LSTM layers, followed by Dense.
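One architecture that satisfies these hints is sketched below; the layer sizes and filter counts are illustrative choices, not the required ones.

```python
import tensorflow as tf

def create_uncompiled_model():
    # WINDOW_SIZE is the global defined earlier; the trailing 1 is the feature
    # dimension added by windowed_dataset.
    model = tf.keras.models.Sequential([
        tf.keras.Input(shape=(WINDOW_SIZE, 1)),
        tf.keras.layers.Conv1D(filters=32, kernel_size=3,
                               padding="causal", activation="relu"),
        tf.keras.layers.LSTM(32, return_sequences=True),
        tf.keras.layers.LSTM(32),
        tf.keras.layers.Dense(1),
    ])
    return model
```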
The next cell allows you to check the number of total and trainable parameters of your model and prompts a warning in case these exceed those of a reference solution. This serves the following 3 purposes, listed in order of priority:
Helps you prevent crashing the kernel during training.
Helps you avoid longer-than-necessary training times.
Provides a reasonable estimate of the size of your model. In general, smaller models are preferable, provided they accomplish their goal successfully.
Notice that this check is just informative, and the reference may well be below the model size actually needed to crash the kernel. So even if you exceed this reference, you are probably fine. However, if the kernel crashes during training, or training is taking a very long time and your model is larger than the reference, come back here and try to get the number of parameters closer to the reference.
Expected output:
Where NUM_BATCHES is the number of batches in your dataset.
You can also print a summary of your model to see what the architecture looks like. This can be useful to get a sense of how big your model is.
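For example:

```python
# Build the uncompiled model and print its layer-by-layer summary.
model = create_uncompiled_model()
model.summary()
```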
Adjusting the learning rate - (Optional Exercise)
As you saw in the lectures, you can leverage TensorFlow's callbacks to dynamically vary the learning rate during training. This can be helpful to get a better sense of which learning rate is better suited to the problem at hand. This is the same function you had in the Week 3 assignment, so feel free to reuse it.
Notice that this only changes the learning rate during the training process to give you an idea of what a reasonable learning rate is; it should not be confused with selecting the best learning rate, which is known as hyperparameter optimization and is outside the scope of this course.
For the optimizers you can try out:
- tf.keras.optimizers.Adam
- tf.keras.optimizers.SGD with a momentum of 0.9
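A sketch of such a learning-rate sweep, mirroring the Week 3 function, could look like the following; the loss choice and the learning-rate range are assumptions:

```python
import tensorflow as tf

def adjust_learning_rate(dataset):
    model = create_uncompiled_model()
    # Increase the learning rate exponentially each epoch to scan a range of values.
    lr_schedule = tf.keras.callbacks.LearningRateScheduler(
        lambda epoch: 1e-4 * 10 ** (epoch / 20))
    # SGD with momentum 0.9 is one option; Adam is another.
    optimizer = tf.keras.optimizers.SGD(momentum=0.9)
    model.compile(loss=tf.keras.losses.Huber(), optimizer=optimizer)
    history = model.fit(dataset, epochs=100, callbacks=[lr_schedule])
    return history
```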
Compiling the model
Exercise 3: create_model
Now, it is time to do the actual training that will be used to forecast the time series. For this, complete the create_model function below.
Notice that you are reusing the architecture you defined in the create_uncompiled_model function earlier. Now you only need to compile this model using an appropriate loss, optimizer (and learning rate). If you completed the optional exercise, you should have a better idea of what a good learning rate would be.
Hints:
The training should be really quick so if you notice that each epoch is taking more than a few seconds, consider trying a different architecture.
If after the first epoch you get an output like this: loss: nan - mae: nan, it is very likely that your network is suffering from exploding gradients. This is a common problem if you used SGD as the optimizer and set a learning rate that is too high. If you encounter this problem, consider lowering the learning rate or using Adam with the default learning rate.
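A minimal sketch of create_model, assuming a Huber loss and Adam with a learning rate of 1e-3 (both assumptions; adjust them to what worked for you):

```python
import tensorflow as tf

def create_model():
    model = create_uncompiled_model()
    # Loss and learning rate are illustrative; tune them based on the optional
    # learning-rate exercise above.
    model.compile(
        loss=tf.keras.losses.Huber(),
        optimizer=tf.keras.optimizers.Adam(learning_rate=1e-3),
        metrics=["mae"],
    )
    return model
```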
If you passed the unittests, go ahead and train your model by running the cell below:
Now plot the training loss so you can monitor the learning process.
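A minimal plotting sketch, assuming the object returned by model.fit was stored in a variable named history:

```python
import matplotlib.pyplot as plt

# Plot the per-epoch training loss recorded by model.fit.
plt.plot(history.history["loss"])
plt.xlabel("Epoch")
plt.ylabel("Loss")
plt.title("Training loss")
plt.show()
```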
Evaluating the forecast
Now it is time to evaluate the performance of the forecast. For this you can use the compute_metrics function that you coded in a previous assignment:
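If you need a reference, an equivalent numpy-based version of compute_metrics is sketched below; your earlier implementation may have used tf.keras.metrics instead.

```python
import numpy as np

def compute_metrics(true_series, forecast):
    # Mean squared error and mean absolute error between the ground truth
    # and the forecast over the validation period.
    mse = np.mean((true_series - forecast) ** 2)
    mae = np.mean(np.abs(true_series - forecast))
    return mse, mae
```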
At this point, only the model that will perform the forecast is ready, but you still need to compute the forecast itself.
Faster model forecasts
In the previous weeks you used a for loop to compute the forecasts for every point in the sequence. This approach is valid, but there is a more efficient way of doing the same thing by using batches of data. The code to implement this is provided in the model_forecast function below. Notice that the code is very similar to the one in the windowed_dataset function, with the differences that:
- The dataset is windowed using window_size rather than window_size + 1
- No shuffle should be used
- No need to split the data into features and labels
- A model is used to predict batches of the dataset
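A sketch of what the provided model_forecast function roughly looks like (the actual provided code may differ; this version assumes the series already includes the feature dimension added earlier):

```python
import tensorflow as tf

def model_forecast(model, series, window_size, batch_size):
    # Window with window_size (not window_size + 1), keep the original order
    # (no shuffling), and do not split into features and labels.
    dataset = tf.data.Dataset.from_tensor_slices(series)
    dataset = dataset.window(window_size, shift=1, drop_remainder=True)
    dataset = dataset.flat_map(lambda window: window.batch(window_size))
    dataset = dataset.batch(batch_size).prefetch(1)
    # The model predicts entire batches of windows at once.
    forecast = model.predict(dataset)
    return forecast
```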
To pass this assignment, your forecast should achieve an MSE of 6 or less and an MAE of 2 or less.
If your forecast didn't achieve these thresholds, try re-training your model with a different architecture (you will need to re-run both the create_uncompiled_model and create_model functions) or tweaking the optimizer's parameters.
If your forecast did achieve these thresholds, run the following cell to save the metrics in a binary file, which will be used for grading. After doing so, submit your assignment for grading.
Congratulations on finishing this week's assignment!
You have successfully implemented a neural network capable of forecasting time series by leveraging a combination of TensorFlow's layers, such as Convolutional and LSTM layers! This resulted in a forecast that surpasses all the ones you did previously.
By finishing this assignment you have finished the specialization! Give yourself a pat on the back!!!