Introduction to TensorFlow
Welcome to this week's programming assignment! Up until now, you've always used Numpy to build neural networks, but this week you'll explore a deep learning framework that allows you to build neural networks more easily. Machine learning frameworks like TensorFlow, PaddlePaddle, Torch, Caffe, Keras, and many others can speed up your machine learning development significantly. TensorFlow 2.3 has made significant improvements over its predecessor, some of which you'll encounter and implement here!
By the end of this assignment, you'll be able to do the following in TensorFlow 2.3:
- Use `tf.Variable` to modify the state of a variable
- Explain the difference between a variable and a constant
- Train a Neural Network on a TensorFlow dataset
Programming frameworks like TensorFlow not only cut down on time spent coding, but can also perform optimizations that speed up the code itself.
2 - Basic Optimization with GradientTape
The beauty of TensorFlow 2 is in its simplicity. Basically, all you need to do is implement forward propagation through a computational graph. TensorFlow will compute the derivatives for you, by moving backwards through the graph recorded with `GradientTape`. All that's left for you to do then is specify the cost function and optimizer you want to use!
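For instance, here's a tiny sketch (separate from the assignment's code) of how `GradientTape` records a forward pass and then hands you the gradient; the cost function and learning rate here are made up purely for illustration:

```python
import tensorflow as tf

# A variable to optimize and a toy cost: J(w) = (w - 3)^2
w = tf.Variable(0.0, dtype=tf.float32)
optimizer = tf.keras.optimizers.Adam(learning_rate=0.1)

for _ in range(200):
    with tf.GradientTape() as tape:
        cost = (w - 3.0) ** 2            # forward pass, recorded on the tape
    grad = tape.gradient(cost, w)        # backward pass: dJ/dw
    optimizer.apply_gradients([(grad, w)])

print(w.numpy())  # approaches 3.0
```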
When writing a TensorFlow program, the main object to get used and transformed is the `tf.Tensor`. These tensors are the TensorFlow equivalent of Numpy arrays, i.e. multidimensional arrays of a given data type that also contain information about the computational graph.

Below, you'll use `tf.Variable` to store the state of your variables. A variable can only be created once, since its initial value defines the variable's shape and type. Additionally, the `dtype` arg in `tf.Variable` can be set to allow data to be converted to that type. If none is specified, the datatype is either kept (if the initial value is a Tensor) or decided by `convert_to_tensor`. It's generally best for you to specify it directly, so nothing breaks!
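As a small illustration of the `dtype` argument (the values here are arbitrary):

```python
import tensorflow as tf

# Explicit dtype: the integer initial values are converted to float32
v = tf.Variable([1, 2, 3], dtype=tf.float32)
print(v.dtype)   # <dtype: 'float32'>
```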
Here you'll use a TensorFlow dataset created from an HDF5 file, which you can use in place of a Numpy array to store your datasets. You can think of this as a TensorFlow data generator!

You will use the Hand Sign dataset, which is composed of images with shape 64x64x3.
Since TensorFlow Datasets are generators, you can't access their contents directly unless you iterate over them in a for loop, or explicitly create a Python iterator using `iter` and consume its elements using `next`. You can also inspect the `shape` and `dtype` of each element using the `element_spec` attribute.
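For example, a small stand-in dataset (not the assignment's HDF5-backed one) can be inspected like this:

```python
import tensorflow as tf

# Hypothetical dataset of four 64x64x3 "images", standing in for the real one
dataset = tf.data.Dataset.from_tensor_slices(tf.zeros((4, 64, 64, 3)))

print(dataset.element_spec)      # TensorSpec(shape=(64, 64, 3), dtype=tf.float32, name=None)

first = next(iter(dataset))      # explicit iterator + next
print(first.shape, first.dtype)

for element in dataset:          # or simply iterate in a for loop
    pass
```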
The dataset that you'll be using during this assignment is a subset of the sign language digits. It contains six different classes representing the digits from 0 to 5.
You can see some of the images in the dataset by running the following cell.
There's one more additional difference between TensorFlow datasets and Numpy arrays: if you need to transform one, you invoke the `map` method to apply the function passed as an argument to each of the elements.
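For instance, a hypothetical normalization applied to the stand-in dataset from the sketch above:

```python
# Illustrative transformation: scale pixel values to [0, 1]
def normalize(image):
    return tf.cast(image, tf.float32) / 255.0

new_dataset = dataset.map(normalize)
```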
2.1 - Linear Function
Let's begin this programming exercise by computing the following equation: $Y = WX + b$, where $W$ and $X$ are random matrices and $b$ is a random vector.
Exercise 1 - linear_function
Compute $WX + b$, where $W$, $X$, and $b$ are drawn from a random normal distribution. $W$ is of shape (4, 3), $X$ is (3, 1) and $b$ is (4, 1). As an example, this is how to define a constant X with the shape (3,1):
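One minimal way to do that (the exact variable name is up to you):

```python
import numpy as np
import tensorflow as tf

# A constant of shape (3, 1) with random normal entries
X = tf.constant(np.random.randn(3, 1))
```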
Note that the difference between `tf.constant` and `tf.Variable` is that you can modify the state of a `tf.Variable` but cannot change the state of a `tf.constant`.
You might find the following functions helpful (a sketch of one possible implementation follows the list):

- `tf.matmul(..., ...)` to do a matrix multiplication
- `tf.add(..., ...)` to do an addition
- `np.random.randn(...)` to initialize randomly
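Here's a sketch of one possible implementation under those hints. Whether your numbers match the expected output below also depends on the random seed and the order in which X, W, and b are drawn; both are assumptions here:

```python
import numpy as np
import tensorflow as tf

def linear_function():
    """Computes Y = WX + b with randomly initialized W (4,3), X (3,1), b (4,1)."""
    np.random.seed(1)                                   # seed assumed for reproducibility
    X = tf.constant(np.random.randn(3, 1), name="X")
    W = tf.constant(np.random.randn(4, 3), name="W")
    b = tf.constant(np.random.randn(4, 1), name="b")
    Y = tf.add(tf.matmul(W, X), b)                      # WX + b
    return Y
```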
tf.Tensor(
[[-2.15657382]
[ 2.95891446]
[-1.08926781]
[-0.84538042]], shape=(4, 1), dtype=float64)
All test passed
Expected Output:
2.2 - Computing the Sigmoid
Amazing! You just implemented a linear function. TensorFlow offers a variety of commonly used neural network functions like `tf.sigmoid` and `tf.softmax`.
For this exercise, compute the sigmoid of z.

In this exercise, you will: cast your tensor to type `float32` using `tf.cast`, then compute the sigmoid using `tf.keras.activations.sigmoid`.
Exercise 2 - sigmoid
Implement the sigmoid function below. You should use the following (a short sketch follows the list):

- `tf.cast("...", tf.float32)`
- `tf.keras.activations.sigmoid("...")`
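A minimal sketch along those lines:

```python
import tensorflow as tf

def sigmoid(z):
    """Casts z to float32 and applies the sigmoid activation."""
    z = tf.cast(z, tf.float32)
    return tf.keras.activations.sigmoid(z)

print(sigmoid(-1))   # tf.Tensor(0.26894143, shape=(), dtype=float32)
```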
type: <class 'tensorflow.python.framework.ops.EagerTensor'>
dtype: <dtype: 'float32'>
sigmoid(-1) = tf.Tensor(0.26894143, shape=(), dtype=float32)
sigmoid(0) = tf.Tensor(0.5, shape=(), dtype=float32)
sigmoid(12) = tf.Tensor(0.9999939, shape=(), dtype=float32)
All test passed
Expected Output:
| type | class 'tensorflow.python.framework.ops.EagerTensor' |
| dtype | dtype: 'float32' |
| Sigmoid(-1) | 0.2689414 |
| Sigmoid(0) | 0.5 |
| Sigmoid(12) | 0.999994 |
2.3 - Using One Hot Encodings
Many times in deep learning you will have a $y$ vector with numbers ranging from $0$ to $C-1$, where $C$ is the number of classes. If $C$ is for example 4, then you might have the following y vector which you will need to convert like this:
This is called "one hot" encoding, because in the converted representation, exactly one element of each column is "hot" (meaning set to 1). To do this conversion in numpy, you might have to write a few lines of code. In TensorFlow, you can use one line of code: `tf.one_hot(labels, depth, axis=0)`, where `axis=0` indicates the new axis is created at dimension 0.
Exercise 3 - one_hot_matrix
Implement the function below to take one label and the total number of classes $C$, and return the one hot encoding in a column-wise matrix. Use `tf.one_hot()` to do this, and `tf.reshape()` to reshape your one hot tensor (a possible sketch follows):

- `tf.reshape(tensor, shape)`
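One possible sketch, assuming the tests expect a flat vector of length `C` as in the test output below (the parameter names here are illustrative, not the graded signature):

```python
import tensorflow as tf

def one_hot_matrix(label, C=6):
    """Returns the one-hot encoding of a single label as a tensor of shape (C,)."""
    one_hot = tf.one_hot(label, C, axis=0)
    return tf.reshape(one_hot, shape=[-1])

print(one_hot_matrix(1, C=4))   # tf.Tensor([0. 1. 0. 0.], shape=(4,), dtype=float32)
```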
Test 1: tf.Tensor([0. 1. 0. 0.], shape=(4,), dtype=float32)
Test 2: tf.Tensor([0. 0. 1. 0.], shape=(4,), dtype=float32)
All test passed
Expected output
2.4 - Initialize the Parameters
Now you'll initialize a vector of numbers with the Glorot initializer. The function you'll be calling is `tf.keras.initializers.GlorotNormal`, which draws samples from a truncated normal distribution centered on 0, with `stddev = sqrt(2 / (fan_in + fan_out))`, where `fan_in` is the number of input units and `fan_out` is the number of output units, both in the weight tensor.

To initialize with zeros or ones you could use `tf.zeros()` or `tf.ones()` instead.
Exercise 4 - initialize_parameters
Implement the function below to take in a shape and to return an array of numbers using the GlorotNormal initializer (a sketch follows the hints):

- `tf.keras.initializers.GlorotNormal(seed=1)`
- `tf.Variable(initializer(shape=()))`
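A sketch using those two hints, with the layer shapes taken from the expected output below:

```python
import tensorflow as tf

def initialize_parameters():
    """Initializes W1..b3 with GlorotNormal, using the shapes shown in the expected output."""
    initializer = tf.keras.initializers.GlorotNormal(seed=1)
    parameters = {
        "W1": tf.Variable(initializer(shape=(25, 12288))),
        "b1": tf.Variable(initializer(shape=(25, 1))),
        "W2": tf.Variable(initializer(shape=(12, 25))),
        "b2": tf.Variable(initializer(shape=(12, 1))),
        "W3": tf.Variable(initializer(shape=(6, 12))),
        "b3": tf.Variable(initializer(shape=(6, 1))),
    }
    return parameters
```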
W1 shape: (25, 12288)
b1 shape: (25, 1)
W2 shape: (12, 25)
b2 shape: (12, 1)
W3 shape: (6, 12)
b3 shape: (6, 1)
All test passed
Expected output
3.1 - Implement Forward Propagation
One of TensorFlow's great strengths lies in the fact that you only need to implement the forward propagation function; it will automatically keep track of the operations you performed, so it can compute the back propagation for you.
Exercise 5 - forward_propagation
Implement the `forward_propagation` function.

Note: use only the TF API (a sketch follows the list).

- `tf.math.add`
- `tf.linalg.matmul`
- `tf.keras.activations.relu`
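A sketch of the three-layer LINEAR -> RELU -> LINEAR -> RELU -> LINEAR pattern implied by the parameter shapes above, using only those TF calls:

```python
import tensorflow as tf

def forward_propagation(X, parameters):
    """Forward pass; returns the final linear output Z3 (no activation on the last layer)."""
    W1, b1 = parameters["W1"], parameters["b1"]
    W2, b2 = parameters["W2"], parameters["b2"]
    W3, b3 = parameters["W3"], parameters["b3"]

    Z1 = tf.math.add(tf.linalg.matmul(W1, X), b1)
    A1 = tf.keras.activations.relu(Z1)
    Z2 = tf.math.add(tf.linalg.matmul(W2, A1), b2)
    A2 = tf.keras.activations.relu(Z2)
    Z3 = tf.math.add(tf.linalg.matmul(W3, A2), b3)
    return Z3
```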
tf.Tensor(
[[-0.13430887 0.14086473]
[ 0.21588647 -0.02582335]
[ 0.7059658 0.6484556 ]
[-1.1260961 -0.9329492 ]
[-0.20181894 -0.3382722 ]
[ 0.9558965 0.94167566]], shape=(6, 2), dtype=float32)
All test passed
Expected output
3.2 - Compute the Cost
All you have to do now is define the loss function that you're going to use. For this case, since we have a classification problem with 6 labels, a categorical cross entropy will work!
Exercise 6 - compute_cost
Implement the cost function below.
It's important to note that the `y_pred` and `y_true` inputs of `tf.keras.losses.categorical_crossentropy` are expected to be of shape (number of examples, num_classes). `tf.reduce_mean` basically does the summation over the examples.
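A sketch under the assumption that `logits` and `labels` arrive with shape (num_classes, number of examples), as in the forward propagation output above, so both are transposed first; `from_logits=True` is assumed because Z3 has not been passed through a softmax:

```python
import tensorflow as tf

def compute_cost(logits, labels):
    """Mean categorical cross-entropy over the examples in the minibatch."""
    cost = tf.reduce_mean(
        tf.keras.losses.categorical_crossentropy(
            tf.transpose(labels), tf.transpose(logits), from_logits=True
        )
    )
    return cost
```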
tf.Tensor(0.4051435, shape=(), dtype=float32)
All test passed
Expected output
3.3 - Train the Model
Let's talk optimizers. You'll specify the type of optimizer in one line, in this case `tf.keras.optimizers.Adam` (though you can use others such as SGD), and then call it within the training loop.

Notice the `tape.gradient` function: this allows you to retrieve the operations recorded for automatic differentiation inside the `GradientTape` block. Then, calling the optimizer method `apply_gradients` will apply the optimizer's update rules to each trainable parameter. At the end of this assignment, you'll find some documentation that explains this more in detail, but for now, a simple explanation will do. 😉
Here you should take note of an important extra step that's been added to the batch training process:
`dataset = dataset.prefetch(8)` (a `tf.data.Dataset` method)
What this does is prevent a memory bottleneck that can occur when reading from disk. `prefetch()` sets aside some data and keeps it ready for when it's needed. It does this by creating a source dataset from your input data, applying a transformation to preprocess the data, then iterating over the dataset the specified number of elements at a time. This works because the iteration is streaming, so the data doesn't need to fit into memory.
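Putting those pieces together, a stripped-down training loop might look like the following; names like `train_dataset`, `num_epochs`, and `parameters` are assumed to come from the earlier steps, and the minibatches are assumed to be already flattened and one-hot encoded:

```python
import tensorflow as tf

optimizer = tf.keras.optimizers.Adam(learning_rate=0.0001)
num_epochs = 100
train_dataset = train_dataset.prefetch(8)   # keep the next batches ready while training

for epoch in range(num_epochs):
    epoch_cost = 0.0
    for (minibatch_X, minibatch_Y) in train_dataset:
        with tf.GradientTape() as tape:
            Z3 = forward_propagation(minibatch_X, parameters)   # forward pass recorded on the tape
            cost = compute_cost(Z3, minibatch_Y)
        trainable_variables = [parameters[key] for key in parameters]
        grads = tape.gradient(cost, trainable_variables)        # retrieve gradients from the tape
        optimizer.apply_gradients(zip(grads, trainable_variables))
        epoch_cost += cost
```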
Expected output
The numbers you get can be different; just check that your loss is going down and your accuracy is going up!
Congratulations! You've made it to the end of this assignment, and to the end of this week's material. Amazing work building a neural network in TensorFlow 2.3!
Here's a quick recap of all you just achieved:
- Used `tf.Variable` to modify your variables
- Trained a Neural Network on a TensorFlow dataset
You are now able to harness the power of TensorFlow to create cool things, faster. Nice!
4 - Bibliography
In this assignment, you were introduced to `tf.GradientTape`, which records operations for differentiation. Here are a couple of resources for diving deeper into what it does and why:
Introduction to Gradients and Automatic Differentiation: https://www.tensorflow.org/guide/autodiff
GradientTape documentation: https://www.tensorflow.org/api_docs/python/tf/GradientTape