Deep Learning with TensorFlow
Credits: Forked from TensorFlow by Google
Setup
Refer to the setup instructions.
Exercise 6
After training a skip-gram model in 5_word2vec.ipynb, the goal of this exercise is to train an LSTM character model over Text8 data.
Create a small validation set.
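One minimal way to do this, sketched below, is to hold out the first thousand characters; it assumes the Text8 corpus has already been extracted to a plain-text file named text8 (the file and variable names here are illustrative, not fixed by the notebook):

```python
# Hold out the first characters of the corpus for validation.
with open('text8') as f:
    text = f.read()

valid_size = 1000                 # characters reserved for validation
valid_text = text[:valid_size]
train_text = text[valid_size:]
print(len(train_text), train_text[:64])
print(valid_size, valid_text[:64])
```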
Utility functions to map characters to vocabulary IDs and back.
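For Text8 the vocabulary is just the 26 lowercase letters plus space, so the mapping can be as simple as the sketch below, where ID 0 is reserved for space (and any unexpected character) and IDs 1-26 cover 'a'-'z'; this is a reconstruction of the usual setup, not necessarily the notebook's exact code:

```python
import string

vocabulary_size = len(string.ascii_lowercase) + 1  # 26 letters + space = 27
first_letter = ord('a')

def char2id(char):
    if char in string.ascii_lowercase:
        return ord(char) - first_letter + 1
    if char == ' ':
        return 0
    print('Unexpected character:', char)
    return 0

def id2char(dictid):
    if dictid > 0:
        return chr(dictid + first_letter - 1)
    return ' '

print(char2id('a'), char2id('z'), char2id(' '))  # 1 26 0
print(id2char(1), id2char(26), id2char(0))       # a z (space)
```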
Function to generate a training batch for the LSTM model.
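One common scheme, sketched below assuming `vocabulary_size` and `char2id` from the previous sketch are in scope: keep `batch_size` cursors evenly spaced through the text and, on each call, emit `num_unrollings + 1` consecutive one-hot batches, where the extra batch supplies the next-character labels.

```python
import numpy as np

batch_size = 64
num_unrollings = 10  # characters unrolled per training step

class BatchGenerator(object):
    """Cycles batch_size evenly spaced cursors through the text and yields
    num_unrollings + 1 consecutive one-hot character batches."""
    def __init__(self, text, batch_size, num_unrollings):
        self._text = text
        self._batch_size = batch_size
        self._num_unrollings = num_unrollings
        segment = len(text) // batch_size
        self._cursor = [offset * segment for offset in range(batch_size)]
        self._last_batch = self._next_batch()

    def _next_batch(self):
        # One one-hot row per cursor position, then advance every cursor.
        batch = np.zeros((self._batch_size, vocabulary_size), dtype=np.float32)
        for b in range(self._batch_size):
            batch[b, char2id(self._text[self._cursor[b]])] = 1.0
            self._cursor[b] = (self._cursor[b] + 1) % len(self._text)
        return batch

    def next(self):
        # Carry the last batch over so consecutive calls overlap by one step.
        batches = [self._last_batch]
        for _ in range(self._num_unrollings):
            batches.append(self._next_batch())
        self._last_batch = batches[-1]
        return batches

train_batches = BatchGenerator(train_text, batch_size, num_unrollings)
```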
Simple LSTM Model.
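The cell itself can be written directly from the gate equations. Below is a sketch in the TensorFlow 1.x graph style this notebook uses (`num_nodes` and the weight names are illustrative): each of the input, forget, update, and output paths does one matrix multiply with the input and one with the previous output, which is exactly the redundancy Problem 1 asks you to remove.

```python
import tensorflow as tf  # written against the TensorFlow 1.x graph API

num_nodes = 64  # number of LSTM units (illustrative)

def gate_params(input_dim):
    """Input weights, recurrent weights, and bias for one gate."""
    return (tf.Variable(tf.truncated_normal([input_dim, num_nodes], stddev=0.1)),
            tf.Variable(tf.truncated_normal([num_nodes, num_nodes], stddev=0.1)),
            tf.Variable(tf.zeros([1, num_nodes])))

ix, im, ib = gate_params(vocabulary_size)  # input gate
fx, fm, fb = gate_params(vocabulary_size)  # forget gate
cx, cm, cb = gate_params(vocabulary_size)  # cell update
ox, om, ob = gate_params(vocabulary_size)  # output gate

def lstm_cell(i, o, state):
    """One LSTM step: i is the current input batch, o the previous output,
    state the previous cell state."""
    input_gate  = tf.sigmoid(tf.matmul(i, ix) + tf.matmul(o, im) + ib)
    forget_gate = tf.sigmoid(tf.matmul(i, fx) + tf.matmul(o, fm) + fb)
    update      = tf.matmul(i, cx) + tf.matmul(o, cm) + cb
    state = forget_gate * state + input_gate * tf.tanh(update)
    output_gate = tf.sigmoid(tf.matmul(i, ox) + tf.matmul(o, om) + ob)
    return output_gate * tf.tanh(state), state
```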
Problem 1
You might have noticed that the definition of the LSTM cell involves 4 matrix multiplications with the input, and 4 matrix multiplications with the output. Simplify the expression by using a single matrix multiply for each, and variables that are 4 times larger.
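A sketch of the idea, reusing the names from the cell above: concatenate the four per-gate weight matrices column-wise so that each path needs a single matmul, then slice the product back into the four gate pre-activations. Note that `tf.split(value, num, axis=...)` is the TensorFlow 1.x signature.

```python
# Fused gate parameters: one [input_dim, 4 * num_nodes] matrix per path.
x_weights = tf.Variable(tf.truncated_normal([vocabulary_size, 4 * num_nodes], stddev=0.1))
o_weights = tf.Variable(tf.truncated_normal([num_nodes, 4 * num_nodes], stddev=0.1))
gate_biases = tf.Variable(tf.zeros([1, 4 * num_nodes]))

def lstm_cell_fused(i, o, state):
    # One matmul per path instead of four, then slice out the gates.
    gates = tf.matmul(i, x_weights) + tf.matmul(o, o_weights) + gate_biases
    input_gate, forget_gate, update, output_gate = tf.split(gates, 4, axis=1)
    state = tf.sigmoid(forget_gate) * state + tf.sigmoid(input_gate) * tf.tanh(update)
    return tf.sigmoid(output_gate) * tf.tanh(state), state
```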
Problem 2
We want to train an LSTM over bigrams, that is, pairs of consecutive characters like 'ab' instead of single characters like 'a'. Since the number of possible bigrams is large, feeding them directly to the LSTM as 1-hot encodings would yield a very sparse representation that is computationally wasteful.
a- Introduce an embedding lookup on the inputs, and feed the embeddings to the LSTM cell instead of the inputs themselves (see the sketch after this list).
b- Write a bigram-based LSTM, modeled on the character LSTM above.
c- Introduce Dropout. For best practices on how to use Dropout in LSTMs, refer to Recurrent Neural Network Regularization (Zaremba et al., http://arxiv.org/abs/1409.2329), which applies dropout only to the non-recurrent connections (also illustrated in the sketch below).
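A combined sketch for parts a and c, using fused gate weights as in Problem 1 but sized for embedded inputs, and following Zaremba et al. in applying dropout only to the non-recurrent connections; all names here are illustrative, not from the notebook:

```python
embedding_size = 128                                       # illustrative
bigram_vocabulary_size = vocabulary_size * vocabulary_size  # 27 * 27 bigram IDs
keep_prob = tf.placeholder(tf.float32)  # feed e.g. 0.5 for training, 1.0 for eval

embeddings = tf.Variable(
    tf.random_uniform([bigram_vocabulary_size, embedding_size], -1.0, 1.0))

# Fused gate weights as in Problem 1, resized for embedded inputs.
ex_weights = tf.Variable(tf.truncated_normal([embedding_size, 4 * num_nodes], stddev=0.1))
eo_weights = tf.Variable(tf.truncated_normal([num_nodes, 4 * num_nodes], stddev=0.1))
e_biases = tf.Variable(tf.zeros([1, 4 * num_nodes]))

def embedded_lstm_step(bigram_ids, o, state):
    # a- Dense embeddings instead of huge one-hot bigram vectors.
    i = tf.nn.embedding_lookup(embeddings, bigram_ids)
    # c- Dropout only on the non-recurrent connections (the input here and
    #    the output below); the recurrent state flows through untouched.
    i = tf.nn.dropout(i, keep_prob)
    gates = tf.matmul(i, ex_weights) + tf.matmul(o, eo_weights) + e_biases
    input_gate, forget_gate, update, output_gate = tf.split(gates, 4, axis=1)
    state = tf.sigmoid(forget_gate) * state + tf.sigmoid(input_gate) * tf.tanh(update)
    output = tf.sigmoid(output_gate) * tf.tanh(state)
    return tf.nn.dropout(output, keep_prob), state
```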
Problem 3
(difficult!)
Write a sequence-to-sequence LSTM which mirrors all the words in a sentence, reversing each word's characters while preserving word order. For example, if your input is:

the quick brown fox

the model should attempt to output:

eht kciuq nworb xof
Reference: Sutskever et al., Sequence to Sequence Learning with Neural Networks. http://arxiv.org/abs/1409.3215
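As a sanity check for your data pipeline, the target transformation itself is a one-liner; this helper is illustrative, not from the notebook:

```python
def mirror_words(sentence):
    """Reverse the characters of each word while keeping word order."""
    return ' '.join(word[::-1] for word in sentence.split(' '))

print(mirror_words('the quick brown fox'))  # -> eht kciuq nworb xof
```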