suyashi29
GitHub Repository: suyashi29/python-su
Path: blob/master/Gen AI for Intelligent Data Handling/Day 6 GAN Fundamentals and Unsupervised Training.ipynb
Kernel: Python 3 (ipykernel)

GAN stands for Generative Adversarial Network.

GANs are a class of machine learning models used in unsupervised learning. They are implemented as a system of two neural networks that compete with each other in a zero-sum game framework. The following points summarize how a GAN works:

    1. Two Networks: GANs consist of two neural networks: a generator and a discriminator.

    2. Competitive Training: The generator creates synthetic data samples, while the discriminator distinguishes between real and fake data.

    3. Adversarial Process: The generator aims to generate data that is indistinguishable from real data, while the discriminator aims to correctly classify real and fake samples.

    4. Improvement Iteration: Through iterative training, the generator improves its ability to produce realistic samples, while the discriminator becomes better at distinguishing real from fake.

    5. Wide Applications: GANs have been used successfully in generating realistic images, audio, text, and more, with applications in art generation, data augmentation, and anomaly detection, among others.

"Zero-sum" is a concept from game theory and economics. It describes a situation in which one participant's gains are exactly balanced by the other participants' losses. In other words, the gains and losses of all participants add up to zero.

Zero-sum rule:

  • Total Gain = Total Loss: In a zero-sum game, the total gains made by all participants equal the total losses suffered by all participants. Any gain by one player is directly offset by a loss to another player.

  • No Net Benefit: No value is created or destroyed across the participants as a whole. Any advantage gained by one player comes at the expense of others, so the game redistributes wealth rather than creating it.

image.png

Examples

  • In competitive sports matches, one team's victory corresponds to another team's defeat, so the outcome often follows a zero-sum dynamic.

  • Economic transactions can sometimes be modeled as zero-sum games. For instance, a simple barter exchange can be treated as zero-sum under the assumption that both parties value the goods identically, so the value gained by one party equals the value lost by the other.

  • Contrast with Non-Zero Sum Games: In non-zero-sum games, all participants can benefit at the same time. These games often involve cooperative strategies, where collaboration leads to outcomes that are better for everyone involved.
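To make the zero-sum property concrete, the sketch below checks two small payoff tables. The games, the `MATCHING_PENNIES` and `COOPERATIVE` tables, and the `is_zero_sum` helper are illustrative assumptions and do not come from the notebook.

```python
# Each entry maps a pair of moves to (payoff for player A, payoff for player B).
MATCHING_PENNIES = {
    ("heads", "heads"): (1, -1),
    ("heads", "tails"): (-1, 1),
    ("tails", "heads"): (-1, 1),
    ("tails", "tails"): (1, -1),
}

# A cooperative game for contrast: both players can gain at the same time.
COOPERATIVE = {
    ("share", "share"): (2, 2),
    ("share", "hoard"): (0, 1),
    ("hoard", "share"): (1, 0),
    ("hoard", "hoard"): (0, 0),
}


def is_zero_sum(payoffs):
    """Return True if the payoffs of every outcome add up to zero."""
    return all(a + b == 0 for a, b in payoffs.values())


print(is_zero_sum(MATCHING_PENNIES))  # True
print(is_zero_sum(COOPERATIVE))       # False
```

In a GAN the same idea applies to the value function shared by the two networks: whatever the discriminator gains on it, the generator loses.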

GAN Components

image-2.png

1. Generator

  • In a Generative Adversarial Network (GAN), the generator is the component responsible for creating new data samples that resemble the training data distribution.

  • The generator takes random noise as input and produces synthetic data that ideally becomes indistinguishable from real data to an observer, such as the discriminator in the GAN setup.

The components of a typical generator function in a GAN:

  • Input Layer: The generator typically starts with an input layer that takes random noise as input. This noise is usually drawn from a simple probability distribution, such as a Gaussian distribution.

  • Dense Layer: Following the input layer, there is often a dense (fully connected) layer that maps the input noise to a higher-dimensional space. This layer helps to transform the random noise into a format that can be further processed by subsequent layers.

  • Batch Normalization: Batch normalization layers are commonly used in the generator to stabilize and speed up the training process. They normalize the activations of the previous layer across the mini-batch.

  • Activation Function: Activation functions, such as ReLU (Rectified Linear Unit) or Leaky ReLU, are applied after each layer to introduce non-linearity into the network. This allows the generator to learn complex patterns and generate diverse outputs.

  • Reshaping Layer: After several dense layers, there is often a reshaping layer that transforms the output into a 3D tensor, typically representing an image-like structure. This prepares the data for the subsequent convolutional layers.

  • Convolutional Transpose Layers: Transposed convolution layers, sometimes loosely called deconvolutional layers, are used to upsample the data, gradually increasing its spatial dimensions. These layers learn to generate full-resolution images from the low-dimensional input noise.

  • Output Layer: Finally, the output layer typically consists of a convolutional transpose layer followed by an activation function, such as Tanh. This layer generates the final synthetic data samples, which ideally resemble the real data samples from the training set.

Note: The generator in a GAN is trained adversarially with the discriminator. The goal of the generator is to produce synthetic data that is realistic enough to fool the discriminator into classifying it as real. Through this adversarial training process, the generator learns to generate increasingly realistic data samples that capture the underlying structure of the training data distribution.

  • Overall, the generator function plays a central role in the GAN framework, driving the generation of new data samples and enabling the model to learn to produce realistic outputs.
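The upsampling performed by the transposed convolution layers can be traced with simple arithmetic. The sketch below assumes `'same'` padding with no explicit output padding, in which case a Keras `Conv2DTranspose` layer multiplies each spatial dimension by its stride. The helper names and the 7x7 starting size are illustrative assumptions.

```python
def conv_transpose_same_size(size, stride):
    """Spatial output size of a transposed convolution with 'same' padding."""
    return size * stride


def trace_generator_shapes(start_size, strides):
    """Return the spatial size before and after each upsampling layer."""
    sizes = [start_size]
    for stride in strides:
        sizes.append(conv_transpose_same_size(sizes[-1], stride))
    return sizes


# Assumed example: a 7x7 feature map passed through layers with strides 1, 2, 2.
print(trace_generator_shapes(7, [1, 2, 2]))  # [7, 7, 14, 28]
```

A stride of 1 leaves the size unchanged, while each stride-2 layer doubles it, so two such layers are enough to grow a 7x7 map into a 28x28 image.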

2. Discriminator

The discriminator function in a Generative Adversarial Network (GAN) is responsible for distinguishing between real and fake data samples. It learns to classify input data as either coming from the real training data distribution or generated by the generator network.

Components of a discriminator function in a GAN:

  • Input Layer: The discriminator starts with an input layer that receives data samples, which could be either real or generated by the generator. These data samples are typically images, text, or any other type of data that the GAN is designed to generate.

  • Convolutional Layers: Convolutional layers are commonly used in the discriminator to extract features from the input data. These layers consist of multiple filters that slide over the input data, detecting patterns and spatial relationships. Convolutional layers are effective for processing high-dimensional data such as images.

  • Activation Functions: Activation functions such as Leaky ReLU (Leaky Rectified Linear Unit) are often applied after each convolutional layer to introduce non-linearity into the network. Leaky ReLU allows a small, non-zero gradient when the input is negative, which helps avoid "dead" units that would otherwise stop learning.

  • Pooling or Strided Convolutions: Pooling layers or strided convolutions are used to downsample the feature maps produced by the convolutional layers. This reduces the spatial dimensions of the data while retaining important features. Pooling layers typically use max-pooling or average-pooling operations.

  • Dropout: Dropout layers may be included to prevent overfitting by randomly dropping a fraction of the neurons during training. This regularization technique helps improve the generalization ability of the discriminator.

  • Flattening Layer: After the convolutional layers, the feature maps are flattened into a one-dimensional vector. This prepares the data for input into the fully connected layers.

  • Fully Connected Layers: Fully connected (dense) layers are used to perform classification based on the extracted features. These layers receive the flattened feature vector as input and output a single value indicating the probability that the input data is real.

  • Output Layer: The output layer typically consists of a single neuron with a sigmoid activation function. This neuron produces a scalar output in the range [0, 1], representing the probability that the input data is real. A value close to 1 indicates a high probability of the input being real, while a value close to 0 indicates a high probability of the input being fake. Some implementations omit the sigmoid, output a raw score (logit), and apply the sigmoid inside the loss function instead.

The discriminator function is trained alongside the generator in an adversarial manner. It learns to differentiate between real and fake data samples by minimizing a suitable loss function, such as binary cross-entropy loss. As training progresses, the discriminator becomes better at distinguishing between real and fake data, while the generator learns to produce increasingly realistic data samples to fool the discriminator.
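Three of the building blocks listed above are simple enough to write out by hand. The following is a framework-free sketch, not the notebook's implementation. The function names are hypothetical, and the slope of 0.2 is an illustrative value that is not necessarily a framework default.

```python
import math


def leaky_relu(x, negative_slope=0.2):
    """Pass positive inputs through; scale negative inputs by a small slope."""
    return x if x > 0 else negative_slope * x


def sigmoid(logit):
    """Map a raw score (logit) to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-logit))


def conv_same_size(size, stride):
    """Spatial output size of a strided convolution with 'same' padding."""
    return math.ceil(size / stride)


print(leaky_relu(3.0))   # 3.0
print(leaky_relu(-2.0))  # -0.4
print(sigmoid(0.0))      # 0.5
print(conv_same_size(conv_same_size(28, 2), 2))  # 7
```

A logit of 0 corresponds to a probability of 0.5, meaning the discriminator is undecided. Two stride-2 convolutions shrink a 28x28 input to 7x7 before flattening.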

image-2.png

Step-by-Step explanation of how GANs function:

  1. Initialization: The generator and discriminator neural networks are initialized with random weights.

  2. Training Loop:

  • Generator Input: The generator takes random noise (usually sampled from a Gaussian distribution) as input and generates synthetic data.

  • Real Data: The discriminator is fed with real data samples from the training dataset along with the synthetic data generated by the generator.

  • Discriminator Training: The discriminator is trained to distinguish between real and fake data. It adjusts its weights to improve its ability to classify the input as either real or fake.

  • Generator Training: The generator is trained to generate data that is realistic enough to fool the discriminator. It generates synthetic data and passes it through the discriminator. The generator's weights are adjusted to minimize the discriminator's ability to distinguish between real and fake data. In other words, the generator aims to maximize the probability that the discriminator classifies its generated data as real.

  • Back and Forth Training: This process continues iteratively, with the generator and discriminator improving their performance through successive rounds of training. As the discriminator gets better at distinguishing real from fake data, the generator must also improve to produce more realistic data.

  3. Adversarial Loss Function: The training of GANs is driven by an adversarial loss function. The goal of the generator is to minimize this loss, while the goal of the discriminator is to maximize it. This adversarial setup creates a competitive dynamic where the generator and discriminator are constantly trying to outperform each other.

  4. Convergence: Ideally, with enough training, the generator becomes proficient at generating data that is indistinguishable from real data, and the discriminator can no longer differentiate between real and fake data (its output approaches 0.5 for every input). At this point, the GAN is said to have converged.

  5. Evaluation: The trained generator can be used to generate new data samples that are similar to the training data. These generated samples can be evaluated for quality and realism using various metrics and visual inspection.
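The adversarial loss described above is commonly written as a minimax value function. In the formula below, $D(x)$ is the discriminator's estimated probability that $x$ is real, $G(z)$ is the generator's output for a noise vector $z$, $p_{\text{data}}$ is the data distribution, and $p_z$ is the noise distribution:

```latex
\min_G \max_D V(D, G)
  = \mathbb{E}_{x \sim p_{\text{data}}}\big[\log D(x)\big]
  + \mathbb{E}_{z \sim p_z}\big[\log\big(1 - D(G(z))\big)\big]
```

At the ideal equilibrium the discriminator outputs $D(x) = 1/2$ everywhere, which gives $V = \log\tfrac{1}{2} + \log\tfrac{1}{2} = -\log 4$. In practice, many implementations train the generator to maximize $\log D(G(z))$ instead of minimizing $\log(1 - D(G(z)))$, because this variant gives stronger gradients early in training.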

GAN 4.jpg

Basic implementation of a GAN using TensorFlow.

  • It defines both the generator and discriminator networks, along with the training loop.

import tensorflow as tf
from tensorflow.keras import layers
import numpy as np
import matplotlib.pyplot as plt
import numpy as np
import matplotlib.pyplot as plt
from keras.models import Sequential
from keras.layers import SimpleRNN, Dense
from keras.datasets import mnist
from keras.utils import to_categorical

# Load the MNIST dataset
(train_images, train_labels), (test_images, test_labels) = mnist.load_data()

# Print a few examples from the dataset
plt.figure(figsize=(10, 5))
for i in range(5):
    plt.subplot(2, 5, i + 1)
    plt.imshow(train_images[i], cmap='gray')
    plt.title(f"Label: {train_labels[i]}")
    plt.axis('off')
plt.show()
[Figure: the first five MNIST training digits, each titled with its label]
  • In the convolutional generator, Leaky ReLU keeps a small non-zero slope for negative inputs, so gradients still flow during backpropagation through units whose input is negative. This prevents dead neurons and tends to improve training stability and convergence, which helps the generator capture subtle features in the generated images.

  • In the generator code, each `assert` checks `model.output_shape` after a layer is added, confirming that the tensor has the expected dimensions before the next layer is stacked on top. A mismatch raises an `AssertionError` as soon as the model is built instead of causing a harder-to-trace error later.

  • Setting `use_bias=False` means the layer does not use bias terms. When the layer is followed by batch normalization, a separate bias is largely redundant, because batch normalization subtracts the batch mean and adds its own learned offset. Dropping the bias also slightly reduces the number of parameters.

  • Stride: The stride is the step size with which the convolution filter moves across its input. In a `Conv2D` layer, a stride greater than 1 reduces the spatial dimensions of the output; in a `Conv2DTranspose` layer, it increases them.
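One common reason for dropping the bias is that a following batch normalization step subtracts the batch mean, which cancels any constant offset added by the previous layer. The sketch below demonstrates this on a small list of numbers. It is a simplified, framework-free version of batch normalization without the learned scale and offset, and the values, the `bias` constant, and the `batch_normalize` helper are hypothetical.

```python
from statistics import mean, pstdev


def batch_normalize(values, epsilon=1e-5):
    """Normalize a batch to zero mean and approximately unit variance."""
    mu = mean(values)
    variance = pstdev(values) ** 2
    return [(v - mu) / (variance + epsilon) ** 0.5 for v in values]


activations = [0.5, -1.0, 2.0, 3.5]
bias = 10.0  # hypothetical constant bias added by the previous layer

without_bias = batch_normalize(activations)
with_bias = batch_normalize([a + bias for a in activations])

# The constant bias disappears when the batch mean is subtracted.
print(all(abs(a - b) < 1e-9 for a, b in zip(without_bias, with_bias)))  # True
```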

(10+20+30)/3
20.0
# Define the Generator network
def make_generator_model():
    model = tf.keras.Sequential()
    model.add(layers.Dense(7*7*256, use_bias=False, input_shape=(100,)))
    model.add(layers.BatchNormalization())
    model.add(layers.LeakyReLU())

    model.add(layers.Reshape((7, 7, 256)))
    assert model.output_shape == (None, 7, 7, 256)  # None is the batch size

    model.add(layers.Conv2DTranspose(128, (5, 5), strides=(1, 1), padding='same', use_bias=False))
    assert model.output_shape == (None, 7, 7, 128)
    model.add(layers.BatchNormalization())
    model.add(layers.LeakyReLU())

    model.add(layers.Conv2DTranspose(64, (5, 5), strides=(2, 2), padding='same', use_bias=False))
    assert model.output_shape == (None, 14, 14, 64)
    model.add(layers.BatchNormalization())
    model.add(layers.LeakyReLU())

    model.add(layers.Conv2DTranspose(1, (5, 5), strides=(2, 2), padding='same', use_bias=False, activation='tanh'))
    assert model.output_shape == (None, 28, 28, 1)

    return model

# Create the generator
generator = make_generator_model()
C:\Users\Suyashi144893\AppData\Local\anaconda3\Lib\site-packages\keras\src\layers\core\dense.py:87: UserWarning: Do not pass an `input_shape`/`input_dim` argument to a layer. When using Sequential models, prefer using an `Input(shape)` object as the first layer in the model instead.
  super().__init__(activity_regularizer=activity_regularizer, **kwargs)
# Define the Discriminator network
def make_discriminator_model():
    model = tf.keras.Sequential()
    model.add(layers.Conv2D(64, (5, 5), strides=(2, 2), padding='same', input_shape=[28, 28, 1]))
    model.add(layers.LeakyReLU())
    model.add(layers.Dropout(0.2))  # Dropout reduces the risk of overfitting

    model.add(layers.Conv2D(128, (5, 5), strides=(2, 2), padding='same'))
    model.add(layers.LeakyReLU())
    model.add(layers.Dropout(0.2))

    model.add(layers.Flatten())
    model.add(layers.Dense(1))  # Single raw score (logit); no sigmoid here

    return model

# Create the discriminator
discriminator = make_discriminator_model()
C:\Users\Suyashi144893\AppData\Local\anaconda3\Lib\site-packages\keras\src\layers\convolutional\base_conv.py:107: UserWarning: Do not pass an `input_shape`/`input_dim` argument to a layer. When using Sequential models, prefer using an `Input(shape)` object as the first layer in the model instead.
  super().__init__(activity_regularizer=activity_regularizer, **kwargs)
# Define the loss functions
cross_entropy = tf.keras.losses.BinaryCrossentropy(from_logits=True)

# Define the discriminator loss
def discriminator_loss(real_output, fake_output):
    real_loss = cross_entropy(tf.ones_like(real_output), real_output)
    fake_loss = cross_entropy(tf.zeros_like(fake_output), fake_output)
    total_loss = real_loss + fake_loss
    return total_loss

# Define the generator loss
def generator_loss(fake_output):
    return cross_entropy(tf.ones_like(fake_output), fake_output)
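To see what these losses compute, the following framework-free sketch evaluates binary cross-entropy on a single logit. It mirrors the structure of the two loss functions above but is not the TensorFlow implementation. The names `bce_from_logit`, `scalar_discriminator_loss`, and `scalar_generator_loss` are hypothetical and chosen to avoid shadowing the notebook's functions.

```python
import math


def bce_from_logit(label, logit):
    """Binary cross-entropy for one example, given a raw logit."""
    # Numerically stable form: max(z, 0) - z * y + log(1 + exp(-|z|))
    return max(logit, 0.0) - logit * label + math.log1p(math.exp(-abs(logit)))


def scalar_discriminator_loss(real_logit, fake_logit):
    """Real samples are labeled 1 and generated samples are labeled 0."""
    return bce_from_logit(1.0, real_logit) + bce_from_logit(0.0, fake_logit)


def scalar_generator_loss(fake_logit):
    """The generator wants its samples to be labeled 1 (real)."""
    return bce_from_logit(1.0, fake_logit)


# A logit of 0 means the discriminator outputs probability 0.5 (a coin flip).
print(round(scalar_generator_loss(0.0), 4))               # 0.6931
print(round(scalar_discriminator_loss(0.0, 0.0), 4))      # 1.3863
```

These two values, ln 2 and 2 ln 2, are useful reference points when reading a training log: a generator loss well above 0.69 together with a discriminator loss well below 1.39 suggests the discriminator currently has the upper hand.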
# Define the optimizers
generator_optimizer = tf.keras.optimizers.Adam(1e-4)
discriminator_optimizer = tf.keras.optimizers.Adam(1e-4)

# Define the training step function
@tf.function
def train_step(images):
    noise = tf.random.normal([BATCH_SIZE, noise_dim])

    with tf.GradientTape() as gen_tape, tf.GradientTape() as disc_tape:
        generated_images = generator(noise, training=True)

        real_output = discriminator(images, training=True)
        fake_output = discriminator(generated_images, training=True)

        gen_loss = generator_loss(fake_output)
        disc_loss = discriminator_loss(real_output, fake_output)

    gradients_of_generator = gen_tape.gradient(gen_loss, generator.trainable_variables)
    gradients_of_discriminator = disc_tape.gradient(disc_loss, discriminator.trainable_variables)

    generator_optimizer.apply_gradients(zip(gradients_of_generator, generator.trainable_variables))
    discriminator_optimizer.apply_gradients(zip(gradients_of_discriminator, discriminator.trainable_variables))

    return gen_loss, disc_loss
# Define the training loop with visualizations
def train(dataset, epochs):
    for epoch in range(epochs):
        for image_batch in dataset:
            gen_loss, disc_loss = train_step(image_batch)

        print(f'Epoch {epoch+1}/{epochs}, Generator Loss: {gen_loss}, Discriminator Loss: {disc_loss}')

        # Generate and save images for visualization
        if epoch % 10 == 0:
            generate_and_save_images(generator, epoch + 1, seed)

# Function to generate and save images
def generate_and_save_images(model, epoch, test_input):
    predictions = model(test_input, training=False)
    fig = plt.figure(figsize=(4,4))
    for i in range(predictions.shape[0]):
        plt.subplot(4, 4, i+1)
        plt.imshow(predictions[i, :, :, 0] * 127.5 + 127.5, cmap='gray')
        plt.axis('off')
    plt.savefig('image_at_epoch_{:04d}.png'.format(epoch))
    plt.show()
# Load and prepare the dataset (MNIST)
mnist = tf.keras.datasets.mnist
(train_images, _), (_, _) = mnist.load_data()
train_images = train_images.reshape(train_images.shape[0], 28, 28, 1).astype('float32')
train_images = (train_images - 127.5) / 127.5  # Normalize the images to [-1, 1]

# Batch and shuffle the data
BUFFER_SIZE = 60000
BATCH_SIZE = 256
train_dataset = tf.data.Dataset.from_tensor_slices(train_images).shuffle(BUFFER_SIZE).batch(BATCH_SIZE)

# Define the dimensionality of the random noise vector
noise_dim = 100
num_examples_to_generate = 16
seed = tf.random.normal([num_examples_to_generate, noise_dim])

# Train the GAN
train(train_dataset, epochs=50)
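The normalization and batching arithmetic in this cell can be checked without TensorFlow. The sketch below uses hypothetical helper names. It shows the scaling to [-1, 1], which matches the range of the generator's tanh output, the inverse scaling used when plotting, and the number of batches per epoch.

```python
import math


def to_model_range(pixel):
    """Scale a pixel value from [0, 255] to [-1, 1]."""
    return (pixel - 127.5) / 127.5


def to_pixel_range(value):
    """Invert the scaling: map [-1, 1] back to [0, 255] for display."""
    return value * 127.5 + 127.5


print(to_model_range(0), to_model_range(255))          # -1.0 1.0
print(round(to_pixel_range(to_model_range(64)), 6))    # 64.0

# MNIST training set size and the batch size used above.
num_images, batch_size = 60000, 256
steps_per_epoch = math.ceil(num_images / batch_size)
last_batch = num_images - (steps_per_epoch - 1) * batch_size
print(steps_per_epoch, last_batch)                     # 235 96
```

Each epoch therefore runs 235 training steps, and the final step sees only 96 real images.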
Epoch 1/50, Generator Loss: 0.8065990209579468, Discriminator Loss: 1.183670997619629
[Figure: 4x4 grid of generated digits, saved as image_at_epoch_0001.png]
Epoch 2/50, Generator Loss: 0.6757266521453857, Discriminator Loss: 1.6348075866699219
Epoch 3/50, Generator Loss: 0.7654538154602051, Discriminator Loss: 1.3101272583007812
Epoch 4/50, Generator Loss: 0.7893068790435791, Discriminator Loss: 1.2660752534866333
Epoch 5/50, Generator Loss: 0.6752974987030029, Discriminator Loss: 1.4753272533416748
Epoch 6/50, Generator Loss: 0.9707562327384949, Discriminator Loss: 1.060701847076416
Epoch 7/50, Generator Loss: 0.795372486114502, Discriminator Loss: 1.3027684688568115
Epoch 8/50, Generator Loss: 1.201750636100769, Discriminator Loss: 1.0156729221343994
Epoch 9/50, Generator Loss: 1.1588877439498901, Discriminator Loss: 0.9216384887695312
Epoch 10/50, Generator Loss: 0.8584988713264465, Discriminator Loss: 1.177566409111023
Epoch 11/50, Generator Loss: 1.0871880054473877, Discriminator Loss: 1.1075160503387451
[Figure: 4x4 grid of generated digits, saved as image_at_epoch_0011.png]
Epoch 12/50, Generator Loss: 1.4047266244888306, Discriminator Loss: 0.7750818729400635
Epoch 13/50, Generator Loss: 1.1473274230957031, Discriminator Loss: 1.0402584075927734
Epoch 14/50, Generator Loss: 1.0529853105545044, Discriminator Loss: 1.08624267578125
Epoch 15/50, Generator Loss: 1.1805408000946045, Discriminator Loss: 0.9007998704910278
Epoch 16/50, Generator Loss: 1.2025052309036255, Discriminator Loss: 1.01371431350708
Epoch 17/50, Generator Loss: 1.0762717723846436, Discriminator Loss: 1.2584474086761475
Epoch 18/50, Generator Loss: 1.2147921323776245, Discriminator Loss: 1.0057075023651123
Epoch 19/50, Generator Loss: 1.1148513555526733, Discriminator Loss: 1.2771360874176025
Epoch 20/50, Generator Loss: 1.0679996013641357, Discriminator Loss: 1.2670615911483765
Epoch 21/50, Generator Loss: 1.3326334953308105, Discriminator Loss: 0.8241354823112488
[Figure: 4x4 grid of generated digits, saved as image_at_epoch_0021.png]
Epoch 22/50, Generator Loss: 1.281818151473999, Discriminator Loss: 1.0473520755767822
Epoch 23/50, Generator Loss: 1.4975351095199585, Discriminator Loss: 0.8451695442199707
Epoch 24/50, Generator Loss: 1.523517370223999, Discriminator Loss: 0.9822924733161926
Epoch 25/50, Generator Loss: 1.2055697441101074, Discriminator Loss: 1.150001883506775
Epoch 26/50, Generator Loss: 1.4875363111495972, Discriminator Loss: 0.9282823801040649
Epoch 27/50, Generator Loss: 1.3401894569396973, Discriminator Loss: 0.8787401914596558
Epoch 28/50, Generator Loss: 1.2346713542938232, Discriminator Loss: 1.1397464275360107
Epoch 29/50, Generator Loss: 1.352311611175537, Discriminator Loss: 1.0197715759277344
Epoch 30/50, Generator Loss: 1.228311538696289, Discriminator Loss: 1.0680782794952393
Epoch 31/50, Generator Loss: 1.1672059297561646, Discriminator Loss: 1.2383060455322266
[Figure: 4x4 grid of generated digits, saved as image_at_epoch_0031.png]
Epoch 32/50, Generator Loss: 1.1306791305541992, Discriminator Loss: 1.233073353767395
Epoch 33/50, Generator Loss: 1.2254234552383423, Discriminator Loss: 1.0237202644348145
Epoch 34/50, Generator Loss: 1.1999567747116089, Discriminator Loss: 1.1425186395645142
Epoch 35/50, Generator Loss: 0.9446189403533936, Discriminator Loss: 1.165036678314209
Epoch 36/50, Generator Loss: 0.9458374977111816, Discriminator Loss: 1.3482080698013306
Epoch 37/50, Generator Loss: 0.9849205017089844, Discriminator Loss: 1.4684627056121826
Epoch 38/50, Generator Loss: 0.9356980323791504, Discriminator Loss: 1.392003059387207
Epoch 39/50, Generator Loss: 0.9498552083969116, Discriminator Loss: 1.2292674779891968
Epoch 40/50, Generator Loss: 1.1001766920089722, Discriminator Loss: 1.091678500175476
Epoch 41/50, Generator Loss: 1.2011992931365967, Discriminator Loss: 0.9205547571182251
[Figure: 4x4 grid of generated digits, saved as image_at_epoch_0041.png]
Epoch 42/50, Generator Loss: 1.3927195072174072, Discriminator Loss: 0.8923341035842896
Epoch 43/50, Generator Loss: 1.0297582149505615, Discriminator Loss: 1.1231093406677246
Epoch 44/50, Generator Loss: 1.0156135559082031, Discriminator Loss: 1.1072301864624023
Epoch 45/50, Generator Loss: 0.9707326889038086, Discriminator Loss: 1.199002742767334
Epoch 46/50, Generator Loss: 0.9539605379104614, Discriminator Loss: 1.1693193912506104
Epoch 47/50, Generator Loss: 1.024275541305542, Discriminator Loss: 1.1642905473709106
Epoch 48/50, Generator Loss: 1.0258796215057373, Discriminator Loss: 1.2018778324127197
Epoch 49/50, Generator Loss: 0.9595581889152527, Discriminator Loss: 1.1855778694152832
Epoch 50/50, Generator Loss: 1.002139925956726, Discriminator Loss: 1.131898283958435

Explanation:

  • In Epoch 2, the generator loss is 0.676 and the discriminator loss is 1.635. For reference, a discriminator that outputs 0.5 for every input gives a generator loss of ln 2 ≈ 0.69 and a discriminator loss of 2 ln 2 ≈ 1.39. The values in this epoch are close to or worse than that baseline for the discriminator, so it is doing no better than guessing at this stage.

  • In Epoch 9, the generator loss rises to 1.159 while the discriminator loss drops to 0.922. The discriminator has become better at telling real from fake, which makes it harder for the generator to fool it.

  • In Epoch 10, the generator loss falls back to 0.858 and the discriminator loss rises to 1.178. This suggests the generator has adapted and its samples are again harder to distinguish from real ones. This back-and-forth is typical of adversarial training: neither loss is expected to decrease steadily, and each printed value reflects only the last batch of its epoch.