Understanding Recurrent Neural Networks (RNNs) and an Example in Sequence Generation
Recurrent Neural Networks (RNNs) are a type of neural network architecture that is particularly well-suited for tasks involving sequential data. Unlike feedforward neural networks, which require fixed-size inputs, RNNs can handle input sequences of arbitrary length.
Key features of RNNs:
Recurrent Connections: RNNs have recurrent connections that allow information to persist across different time steps in a sequence. This means that information from previous inputs is considered when processing the current input.
Shared Parameters: The same set of weights and biases are applied at each time step. This allows the network to use the same computation for different elements of the sequence.
Time Dependency: RNNs are well-suited for tasks where the order or temporal dependency of data matters, such as time series prediction, language modeling, and speech recognition.
Applications of RNNs:
Language Modeling and Text Generation: RNNs can be used to model the probability distribution of sequences of words. This enables tasks like auto-completion, machine translation, and text generation.
Time Series Prediction: RNNs are effective for tasks like stock price prediction, weather forecasting, and any scenario where the current state depends on previous states.
Speech Recognition: RNNs can be used to convert spoken language into written text. This is crucial for applications like voice assistants (e.g., Siri, Alexa).
Handwriting Recognition: RNNs can recognize handwritten text, enabling applications like digit recognition and signature verification.
Image Captioning: RNNs can be combined with Convolutional Neural Networks (CNNs) to generate captions for images.
Video Analysis: RNNs can process sequences of images or video frames, making them useful for tasks like action recognition, video captioning, and video prediction.
Anomaly Detection: RNNs can be used to detect anomalies in sequences of data, making them valuable for tasks like fraud detection in finance or detecting defects in manufacturing.
Sentiment Analysis: RNNs can analyze sequences of text to determine the sentiment expressed.
Mathematical Implementation:
Terms:
xt: Input at time step t
ht: Hidden state at time step t
Whx: Weight matrix for input-to-hidden connections
Whh: Weight matrix for hidden-to-hidden connections
bh: Bias term for the hidden layer
Wyh: Weight matrix for hidden-to-output connections
by: Bias term for the output layer
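Putting these terms together, the forward pass of a simple RNN computes a new hidden state and an output at each time step t (where yt denotes the output at time step t):
ht = tanh(Whx · xt + Whh · h(t-1) + bh)
yt = Wyh · ht + by
The hidden state ht summarizes everything the network has seen up to step t, which is what gives the RNN its memory of earlier inputs.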
Training:
During training, you would use backpropagation through time (BPTT) to compute gradients and update the weights and biases to minimize the loss function.
Prediction:
Once the network is trained, you can make predictions by passing a sequence of inputs through the network.
This is a basic mathematical interpretation of a simple RNN. In practice, more sophisticated variants such as LSTM (Long Short-Term Memory) and GRU (Gated Recurrent Unit) are often used to address issues like vanishing gradients and to better capture long-term dependencies.
Below is a basic implementation of a simple RNN using only the NumPy library. This code demonstrates how you can manually perform the forward and backward passes through time.
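A minimal sketch of such a SimpleRNN class follows; the class and method names match the explanation below, while the weight shapes, the learning rate, and the squared-error loss are assumptions of this sketch.

import numpy as np

# Activation functions and their derivatives.
def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def sigmoid_derivative(x):
    s = sigmoid(x)
    return s * (1.0 - s)

def tanh(x):
    return np.tanh(x)

def tanh_derivative(x):
    return 1.0 - np.tanh(x) ** 2

class SimpleRNN:
    def __init__(self, input_size, hidden_size, output_size, lr=0.01):
        # Small random weights; names and shapes follow the terms above.
        self.Whx = np.random.randn(hidden_size, input_size) * 0.01
        self.Whh = np.random.randn(hidden_size, hidden_size) * 0.01
        self.Wyh = np.random.randn(output_size, hidden_size) * 0.01
        self.bh = np.zeros((hidden_size, 1))
        self.by = np.zeros((output_size, 1))
        self.lr = lr  # learning rate (an assumption of this sketch)

    def forward(self, inputs):
        # inputs: list of (input_size, 1) column vectors, one per time step.
        # Intermediate hidden states are stored for backpropagation.
        self.inputs = inputs
        self.hs = {-1: np.zeros_like(self.bh)}
        self.ys = []
        for t, x in enumerate(inputs):
            self.hs[t] = tanh(self.Whx @ x + self.Whh @ self.hs[t - 1] + self.bh)
            self.ys.append(self.Wyh @ self.hs[t] + self.by)
        return self.ys

    def backward(self, targets):
        # BPTT with a squared-error loss at every time step (an assumption).
        dWhx, dWhh = np.zeros_like(self.Whx), np.zeros_like(self.Whh)
        dWyh = np.zeros_like(self.Wyh)
        dbh, dby = np.zeros_like(self.bh), np.zeros_like(self.by)
        dh_next = np.zeros_like(self.bh)
        for t in reversed(range(len(self.inputs))):
            dy = self.ys[t] - targets[t]        # gradient of the loss w.r.t. yt
            dWyh += dy @ self.hs[t].T
            dby += dy
            dh = self.Wyh.T @ dy + dh_next      # gradient flowing into ht
            dz = (1.0 - self.hs[t] ** 2) * dh   # back through tanh
            dbh += dz
            dWhx += dz @ self.inputs[t].T
            dWhh += dz @ self.hs[t - 1].T
            dh_next = self.Whh.T @ dz           # pass gradient to step t-1
        # Gradient-descent update on every parameter.
        for param, grad in ((self.Whx, dWhx), (self.Whh, dWhh),
                            (self.Wyh, dWyh), (self.bh, dbh), (self.by, dby)):
            param -= self.lr * grad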
Explanation:
The code defines a basic RNN class (SimpleRNN) with methods for forward pass (forward) and backward pass (backward).
The activation functions (sigmoid and tanh) and their derivatives are defined.
The forward method performs a forward pass through the RNN, storing intermediate values for backpropagation.
The backward method computes gradients and updates the weights and biases using backpropagation through time (BPTT).
Let us now use the Keras library to create and train a basic RNN on a toy sequence-prediction example. We'll use a very simple sequence of numbers (1, 2, 3, 4, 5, ...) and train the model to predict the next number in the sequence.
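A sketch along these lines produces training output like the log below; the sequence length, the 3-step window, the 32-unit layer, and the [0, 1] scaling are assumptions of this sketch rather than the notebook's exact settings.

import numpy as np
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import SimpleRNN, Dense

# Build training windows: three consecutive (scaled) numbers in, the next one out.
seq = np.arange(1, 151, dtype=np.float32) / 150.0
window = 3
X = np.array([seq[i:i + window] for i in range(len(seq) - window)])
y = seq[window:]
X = X[..., np.newaxis]  # Keras expects (samples, time steps, features)

model = Sequential([
    SimpleRNN(32, activation="tanh", input_shape=(window, 1)),
    Dense(1),
])
model.compile(optimizer="adam", loss="mse")
model.fit(X, y, epochs=80, verbose=1)

# Predict what follows 3, 4, 5 (scaled the same way); expect roughly 6.
probe = (np.array([3.0, 4.0, 5.0], dtype=np.float32) / 150.0).reshape(1, window, 1)
print(model.predict(probe) * 150.0)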
Epoch 1/80
5/5 ━━━━━━━━━━━━━━━━━━━━ 3s 11ms/step - loss: 0.7850
Epoch 2/80
5/5 ━━━━━━━━━━━━━━━━━━━━ 0s 4ms/step - loss: 0.3034
Epoch 3/80
5/5 ━━━━━━━━━━━━━━━━━━━━ 0s 7ms/step - loss: 0.1377
Epoch 4/80
5/5 ━━━━━━━━━━━━━━━━━━━━ 0s 4ms/step - loss: 0.1295
Epoch 5/80
5/5 ━━━━━━━━━━━━━━━━━━━━ 0s 3ms/step - loss: 0.1034
Epoch 6/80
5/5 ━━━━━━━━━━━━━━━━━━━━ 0s 3ms/step - loss: 0.0655
Epoch 7/80
5/5 ━━━━━━━━━━━━━━━━━━━━ 0s 4ms/step - loss: 0.0565
Epoch 8/80
5/5 ━━━━━━━━━━━━━━━━━━━━ 0s 3ms/step - loss: 0.0475
Epoch 9/80
5/5 ━━━━━━━━━━━━━━━━━━━━ 0s 5ms/step - loss: 0.0395
Epoch 10/80
5/5 ━━━━━━━━━━━━━━━━━━━━ 0s 3ms/step - loss: 0.0312
Epoch 11/80
5/5 ━━━━━━━━━━━━━━━━━━━━ 0s 3ms/step - loss: 0.0269
Epoch 12/80
5/5 ━━━━━━━━━━━━━━━━━━━━ 0s 3ms/step - loss: 0.0288
Epoch 13/80
5/5 ━━━━━━━━━━━━━━━━━━━━ 0s 3ms/step - loss: 0.0274
Epoch 14/80
5/5 ━━━━━━━━━━━━━━━━━━━━ 0s 4ms/step - loss: 0.0304
Epoch 15/80
5/5 ━━━━━━━━━━━━━━━━━━━━ 0s 3ms/step - loss: 0.0232
Epoch 16/80
5/5 ━━━━━━━━━━━━━━━━━━━━ 0s 3ms/step - loss: 0.0184
Epoch 17/80
5/5 ━━━━━━━━━━━━━━━━━━━━ 0s 7ms/step - loss: 0.0181
Epoch 18/80
5/5 ━━━━━━━━━━━━━━━━━━━━ 0s 3ms/step - loss: 0.0131
Epoch 19/80
5/5 ━━━━━━━━━━━━━━━━━━━━ 0s 4ms/step - loss: 0.0151
Epoch 20/80
5/5 ━━━━━━━━━━━━━━━━━━━━ 0s 3ms/step - loss: 0.0140
Epoch 21/80
5/5 ━━━━━━━━━━━━━━━━━━━━ 0s 4ms/step - loss: 0.0136
Epoch 22/80
5/5 ━━━━━━━━━━━━━━━━━━━━ 0s 3ms/step - loss: 0.0124
Epoch 23/80
5/5 ━━━━━━━━━━━━━━━━━━━━ 0s 3ms/step - loss: 0.0118
Epoch 24/80
5/5 ━━━━━━━━━━━━━━━━━━━━ 0s 3ms/step - loss: 0.0102
Epoch 25/80
5/5 ━━━━━━━━━━━━━━━━━━━━ 0s 4ms/step - loss: 0.0107
Epoch 26/80
5/5 ━━━━━━━━━━━━━━━━━━━━ 0s 4ms/step - loss: 0.0100
Epoch 27/80
5/5 ━━━━━━━━━━━━━━━━━━━━ 0s 6ms/step - loss: 0.0084
Epoch 28/80
5/5 ━━━━━━━━━━━━━━━━━━━━ 0s 4ms/step - loss: 0.0089
Epoch 29/80
5/5 ━━━━━━━━━━━━━━━━━━━━ 0s 4ms/step - loss: 0.0083
Epoch 30/80
5/5 ━━━━━━━━━━━━━━━━━━━━ 0s 4ms/step - loss: 0.0071
Epoch 31/80
5/5 ━━━━━━━━━━━━━━━━━━━━ 0s 5ms/step - loss: 0.0075
Epoch 32/80
5/5 ━━━━━━━━━━━━━━━━━━━━ 0s 4ms/step - loss: 0.0069
Epoch 33/80
5/5 ━━━━━━━━━━━━━━━━━━━━ 0s 3ms/step - loss: 0.0066
Epoch 34/80
5/5 ━━━━━━━━━━━━━━━━━━━━ 0s 3ms/step - loss: 0.0062
Epoch 35/80
5/5 ━━━━━━━━━━━━━━━━━━━━ 0s 4ms/step - loss: 0.0060
Epoch 36/80
5/5 ━━━━━━━━━━━━━━━━━━━━ 0s 4ms/step - loss: 0.0066
Epoch 37/80
5/5 ━━━━━━━━━━━━━━━━━━━━ 0s 3ms/step - loss: 0.0061
Epoch 38/80
5/5 ━━━━━━━━━━━━━━━━━━━━ 0s 4ms/step - loss: 0.0060
Epoch 39/80
5/5 ━━━━━━━━━━━━━━━━━━━━ 0s 3ms/step - loss: 0.0042
Epoch 40/80
5/5 ━━━━━━━━━━━━━━━━━━━━ 0s 4ms/step - loss: 0.0048
Epoch 41/80
5/5 ━━━━━━━━━━━━━━━━━━━━ 0s 4ms/step - loss: 0.0053
Epoch 42/80
5/5 ━━━━━━━━━━━━━━━━━━━━ 0s 3ms/step - loss: 0.0050
Epoch 43/80
5/5 ━━━━━━━━━━━━━━━━━━━━ 0s 3ms/step - loss: 0.0070
Epoch 44/80
5/5 ━━━━━━━━━━━━━━━━━━━━ 0s 4ms/step - loss: 0.0050
Epoch 45/80
5/5 ━━━━━━━━━━━━━━━━━━━━ 0s 3ms/step - loss: 0.0042
Epoch 46/80
5/5 ━━━━━━━━━━━━━━━━━━━━ 0s 3ms/step - loss: 0.0063
Epoch 47/80
5/5 ━━━━━━━━━━━━━━━━━━━━ 0s 4ms/step - loss: 0.0047
Epoch 48/80
5/5 ━━━━━━━━━━━━━━━━━━━━ 0s 3ms/step - loss: 0.0041
Epoch 49/80
5/5 ━━━━━━━━━━━━━━━━━━━━ 0s 4ms/step - loss: 0.0040
Epoch 50/80
5/5 ━━━━━━━━━━━━━━━━━━━━ 0s 5ms/step - loss: 0.0031
Epoch 51/80
5/5 ━━━━━━━━━━━━━━━━━━━━ 0s 3ms/step - loss: 0.0039
Epoch 52/80
5/5 ━━━━━━━━━━━━━━━━━━━━ 0s 4ms/step - loss: 0.0033
Epoch 53/80
5/5 ━━━━━━━━━━━━━━━━━━━━ 0s 4ms/step - loss: 0.0030
Epoch 54/80
5/5 ━━━━━━━━━━━━━━━━━━━━ 0s 4ms/step - loss: 0.0025
Epoch 55/80
5/5 ━━━━━━━━━━━━━━━━━━━━ 0s 4ms/step - loss: 0.0026
Epoch 56/80
5/5 ━━━━━━━━━━━━━━━━━━━━ 0s 3ms/step - loss: 0.0023
Epoch 57/80
5/5 ━━━━━━━━━━━━━━━━━━━━ 0s 3ms/step - loss: 0.0021
Epoch 58/80
5/5 ━━━━━━━━━━━━━━━━━━━━ 0s 2ms/step - loss: 0.0024
Epoch 59/80
5/5 ━━━━━━━━━━━━━━━━━━━━ 0s 3ms/step - loss: 0.0029
Epoch 60/80
5/5 ━━━━━━━━━━━━━━━━━━━━ 0s 3ms/step - loss: 0.0025
Epoch 61/80
5/5 ━━━━━━━━━━━━━━━━━━━━ 0s 4ms/step - loss: 0.0024
Epoch 62/80
5/5 ━━━━━━━━━━━━━━━━━━━━ 0s 3ms/step - loss: 0.0024
Epoch 63/80
5/5 ━━━━━━━━━━━━━━━━━━━━ 0s 2ms/step - loss: 0.0022
Epoch 64/80
5/5 ━━━━━━━━━━━━━━━━━━━━ 0s 3ms/step - loss: 0.0019
Epoch 65/80
5/5 ━━━━━━━━━━━━━━━━━━━━ 0s 3ms/step - loss: 0.0020
Epoch 66/80
5/5 ━━━━━━━━━━━━━━━━━━━━ 0s 4ms/step - loss: 0.0022
Epoch 67/80
5/5 ━━━━━━━━━━━━━━━━━━━━ 0s 4ms/step - loss: 0.0017
Epoch 68/80
5/5 ━━━━━━━━━━━━━━━━━━━━ 0s 3ms/step - loss: 0.0016
Epoch 69/80
5/5 ━━━━━━━━━━━━━━━━━━━━ 0s 3ms/step - loss: 0.0014
Epoch 70/80
5/5 ━━━━━━━━━━━━━━━━━━━━ 0s 3ms/step - loss: 0.0017
Epoch 71/80
5/5 ━━━━━━━━━━━━━━━━━━━━ 0s 3ms/step - loss: 0.0016
Epoch 72/80
5/5 ━━━━━━━━━━━━━━━━━━━━ 0s 3ms/step - loss: 0.0019
Epoch 73/80
5/5 ━━━━━━━━━━━━━━━━━━━━ 0s 4ms/step - loss: 0.0016
Epoch 74/80
5/5 ━━━━━━━━━━━━━━━━━━━━ 0s 4ms/step - loss: 0.0012
Epoch 75/80
5/5 ━━━━━━━━━━━━━━━━━━━━ 0s 4ms/step - loss: 0.0014
Epoch 76/80
5/5 ━━━━━━━━━━━━━━━━━━━━ 0s 3ms/step - loss: 0.0014
Epoch 77/80
5/5 ━━━━━━━━━━━━━━━━━━━━ 0s 3ms/step - loss: 0.0013
Epoch 78/80
5/5 ━━━━━━━━━━━━━━━━━━━━ 0s 3ms/step - loss: 0.0012
Epoch 79/80
5/5 ━━━━━━━━━━━━━━━━━━━━ 0s 3ms/step - loss: 0.0012
Epoch 80/80
5/5 ━━━━━━━━━━━━━━━━━━━━ 0s 4ms/step - loss: 0.0012
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 309ms/step