Deep Convolutional Generative Adversarial Network | Python
Generative Adversarial Networks (GANs) were introduced by Ian Goodfellow and his colleagues in 2014. They are deep learning models that capture the training data distribution, enabling the generation of new data from that distribution. In other words, they transform random noise into meaningful data, producing a wide variety of samples that resemble the training dataset. This makes them a strong candidate for unsupervised learning.
In a typical deep learning problem, we adjust the network weights to reduce a loss function. In a Generative Adversarial Network, two networks are trained against each other: the generator tries to spawn 'fake' images that resemble the training images, while the discriminator classifies the images produced by the generator and labels each one as real or fake. The generator is then updated via back-propagation based on how well its output fooled the discriminator, the discriminator is updated based on how well it told the two apart, and the cycle repeats.
The Deep Convolutional Generative Adversarial Network (DCGAN) is an extension of the GAN architecture in which deep convolutional networks are used for both the generator and the discriminator. The DCGAN's generator comprises convolutional-transpose layers, ReLU activations, and batch normalization layers. DCGANs are unsupervised algorithms trained with a supervised loss.
Some of the impressive applications of Generative Adversarial Networks are 3D object generation, face aging, realistic photograph synthesis, face frontal view generation, and semantic-image-to-photo translation.
In this tutorial, we will generate handwritten digits from the MNIST dataset using a Deep Convolutional GAN, whose convolutional architecture improves the stability of generator training.
To know more about the MNIST dataset, refer here.
Happy Reading!!!
IMPORTING LIBRARIES
The necessary Python libraries to run the project are imported first.
import matplotlib.pyplot as plt
from tensorflow.examples.tutorials.mnist import input_data
import tensorflow as tf
import numpy as np
IMPORTING THE DATASET
The MNIST dataset is loaded for the project.
mnist = input_data.read_data_sets("/tmp/data/", one_hot=True)
Extracting /tmp/data/train-images-idx3-ubyte.gz
Extracting /tmp/data/train-labels-idx1-ubyte.gz
Extracting /tmp/data/t10k-images-idx3-ubyte.gz
Extracting /tmp/data/t10k-labels-idx1-ubyte.gz
DECLARING PARAMETERS
The training and network parameters are declared in this section. Network parameters include image_dim and noise_dimension. image_dim represents the flattened 28*28 = 784 image pixels, and noise_dimension is the number of noise data points fed to the generator.
num_steps = 10000
batch_size = 128
lr_generator = 0.002
lr_discriminator = 0.002
image_dim = 784
noise_dimension = 100
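As an optional sanity check (an addition to this tutorial, not part of the original code), the relationship between these parameters can be verified in plain NumPy; the literals below simply mirror the values above:

import numpy as np

# image_dim is the flattened 28x28 image
assert 28 * 28 == 784

# A noise batch as sampled later during training: uniform values in [-1, 1]
z = np.random.uniform(-1., 1., size=[128, 100])   # (batch_size, noise_dimension)
print(z.shape)                                    # (128, 100)
print(z.min() >= -1. and z.max() <= 1.)           # True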
BUILDING NETWORKS
Processing of the network inputs. istraining is a boolean placeholder that tells batch normalization whether we are at training or inference time.
noise = tf.placeholder(tf.float32, shape=[None, noise_dimension])
real_image_input = tf.placeholder(tf.float32, shape=[None, 28, 28, 1])
istraining = tf.placeholder(tf.bool)
LEAKY RELU ACTIVATION
Leaky ReLU returns the input directly if it is positive; otherwise it returns the input scaled by a small slope alpha, instead of the zero that a plain ReLU would return. Two notable advantages of Leaky ReLU are:
- Easy to train
- Better Results
def leakyrelu(x, alpha=0.2):
    # Algebraic form of max(x, alpha * x): x for positive inputs, alpha * x otherwise
    return 0.5*x + 0.5*abs(x) - 0.5*alpha*abs(x) + 0.5*x*alpha
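To convince yourself that the algebraic form above really behaves like Leaky ReLU, here is a small optional NumPy check (an addition to the tutorial, not part of the original code):

import numpy as np

x = np.array([-2.0, -0.5, 0.0, 0.5, 2.0])
out = 0.5*x + 0.5*abs(x) - 0.5*0.2*abs(x) + 0.5*x*0.2
print(out)                     # [-0.4 -0.1  0.   0.5  2. ]
print(np.maximum(x, 0.2 * x))  # same values: max(x, alpha * x)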
GENERATOR NETWORK
Input: Noise; Output: Image
The steps involved in the Generator Network are:
- tf.layers automatically creates the variables and computes their shapes, based on the noise given as input.
- Reshape the result into a 4D image tensor of shape (batch, 7, 7, 128). Here, the first 7 is the height, the second 7 is the width, and 128 is the number of channels.
- Transposed convolution (deconvolution) is used to upsample the image. The shape becomes (batch, 14, 14, 64) after the first transposed convolution and (batch, 28, 28, 1) after the second.
- Tanh is applied for better stability; its outputs lie in [-1, 1] and will be rescaled to [0, 1] at a later stage.
def generator(x, reuse=False):
    with tf.variable_scope('Generator', reuse=reuse):
        x = tf.layers.dense(x, units=7 * 7 * 128)
        x = tf.layers.batch_normalization(x, training=istraining)
        x = tf.nn.relu(x)
        # Reshape to a 4D image tensor: (batch, 7, 7, 128)
        x = tf.reshape(x, shape=[-1, 7, 7, 128])
        # Upsample to (batch, 14, 14, 64); stride 2 doubles the spatial size
        x = tf.layers.conv2d_transpose(x, 64, 5, strides=2, padding='same')
        x = tf.layers.batch_normalization(x, training=istraining)
        x = tf.nn.relu(x)
        # Upsample to (batch, 28, 28, 1)
        x = tf.layers.conv2d_transpose(x, 1, 5, strides=2, padding='same')
        x = tf.nn.tanh(x)
        return x
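To double-check the shape progression described above: with padding='same', a transposed convolution of stride s maps a spatial size n to n * s. A small optional standalone check (an addition to the tutorial):

# 'same'-padding transposed convolution: output size = input size * stride
size = 7
for stride in (2, 2):          # the two conv2d_transpose layers above
    size = size * stride
    print(size)                # 14, then 28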
Note: The istraining placeholder tells each layer whether training is in progress, since batch normalization behaves differently at training and inference time.
DISCRIMINATOR NETWORK
Input: Image.
Output: Prediction of Original or Forged Image
The steps involved in the Discriminator Network are:
- CNN to classify the image data obtained.
- Flatten the image
- Two output classes: Real images or Fake images
def discriminator(x, reuse=False):
    with tf.variable_scope('Discriminator', reuse=reuse):
        # (batch, 28, 28, 1) -> (batch, 14, 14, 64)
        x = tf.layers.conv2d(x, 64, 5, strides=2, padding='same')
        x = tf.layers.batch_normalization(x, training=istraining)
        x = leakyrelu(x)
        # (batch, 14, 14, 64) -> (batch, 7, 7, 128)
        x = tf.layers.conv2d(x, 128, 5, strides=2, padding='same')
        x = tf.layers.batch_normalization(x, training=istraining)
        x = leakyrelu(x)
        # Flatten the feature maps
        x = tf.reshape(x, shape=[-1, 7 * 7 * 128])
        x = tf.layers.dense(x, 1024)
        x = tf.layers.batch_normalization(x, training=istraining)
        x = leakyrelu(x)
        # Two output classes: real or fake
        x = tf.layers.dense(x, 2)
        return x
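Going the other way, a 'same'-padding convolution of stride s maps a spatial size n to ceil(n / s), which is why the feature maps can be flattened to 7 * 7 * 128 units. A quick optional check (again, an addition to the tutorial):

import math

size = 28
for stride in (2, 2):                  # the two conv2d layers above
    size = math.ceil(size / stride)
    print(size)                        # 14, then 7

print(size * size * 128)               # 6272 units after flattening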
BUILDING THE NETWORK
The building of the Generator Network with generator(), using the noise placeholder as input:
gen_sample = generator(noise)
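As an optional sanity check (assuming the graph was built exactly as above), the generator's static output shape can be inspected:

# Expected: [None, 28, 28, 1] -- a batch of 28x28 grayscale images
print(gen_sample.get_shape().as_list())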
The building of two Discriminator Networks, one for each of the following inputs:
- The real image input
- The generated samples
disc_real = discriminator(real_image_input)
disc_fake = discriminator(gen_sample, reuse=True)
The building of the stacked generator/discriminator, which feeds the generator's samples into the discriminator:
stacked_gan = discriminator(gen_sample, reuse=True)
Building the loss functions. The Discriminator Loss differentiates between the real and the fake samples. Label 1 represents real images and label 0 represents fake images.
disc_loss_real = tf.reduce_mean(tf.nn.sparse_softmax_cross_entropy_with_logits(
    logits=disc_real, labels=tf.ones([batch_size], dtype=tf.int32)))
disc_loss_fake = tf.reduce_mean(tf.nn.sparse_softmax_cross_entropy_with_logits(
    logits=disc_fake, labels=tf.zeros([batch_size], dtype=tf.int32)))
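For intuition, sparse_softmax_cross_entropy_with_logits computes minus the log of the softmax probability assigned to the true label. A small optional NumPy illustration for a single 2-class logit vector (an addition to the tutorial, not part of its graph):

import numpy as np

def xent(logits, label):
    # -log(softmax(logits)[label])
    e = np.exp(logits - logits.max())
    return -np.log(e[label] / e.sum())

logits = np.array([0.1, 4.0])  # discriminator strongly favours class 1 ("real")
print(xent(logits, 1))         # small loss (~0.02) when the label agrees
print(xent(logits, 0))         # large loss (~3.9) when the label disagrees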
Summing up both losses, the real loss and the fake loss:
disc_loss = disc_loss_real + disc_loss_fake
The Generator Loss function. The chief role of the generator is to dupe the discriminator, so the generator is trained with label 1 ('real') for all the fake data it creates:
gen_loss = tf.reduce_mean(tf.nn.sparse_softmax_cross_entropy_with_logits(
    logits=stacked_gan, labels=tf.ones([batch_size], dtype=tf.int32)))
Build the optimizers with AdamOptimizer as the optimizing function. A few important parameters for the optimizer are:
- Learning Rate
- Beta1
- Beta2
optimizer_gen = tf.train.AdamOptimizer(learning_rate=lr_generator,
                                       beta1=0.5, beta2=0.999)
optimizer_disc = tf.train.AdamOptimizer(learning_rate=lr_discriminator,
                                        beta1=0.5, beta2=0.999)
Collect the trainable variables for each optimizer. Each network must update only its own variables, so an explicit variable list is passed to its optimizer (here AdamOptimizer). First, the Generator Network variables:
gen_vars = tf.get_collection(tf.GraphKeys.TRAINABLE_VARIABLES, scope='Generator')
The Discriminator Network Variables with get_collection().
disc_vars = tf.get_collection(tf.GraphKeys.TRAINABLE_VARIABLES, scope='Discriminator')
Creating training operations.
gen_update_ops = tf.get_collection(tf.GraphKeys.UPDATE_OPS, scope='Generator')
`gen_update_ops` should run before the `minimize` op (backpropagation), which will be ensured by the `control_dependencies`.
with tf.control_dependencies(gen_update_ops):
    train_gen = optimizer_gen.minimize(gen_loss, var_list=gen_vars)

disc_update_ops = tf.get_collection(tf.GraphKeys.UPDATE_OPS, scope='Discriminator')
with tf.control_dependencies(disc_update_ops):
    train_disc = optimizer_disc.minimize(disc_loss, var_list=disc_vars)
Initialize the variables with their default values.
initialize = tf.global_variables_initializer()
INITIALIZING TRAINING
Starting the new TF session and running the initializer.
sess = tf.Session()
sess.run(initialize)
TRAINING
Discriminator and generator training occurs in this section. The steps involved here are:
- Formulate Input Data
- Fetch the next batch for processing
- Discriminator Training
- Generator Training
In the log below, Loss A represents the Generator Loss and Loss B represents the Discriminator Loss, matching the order in the print statement.
for i in range(1, num_steps+1):
    # Prepare input data: rescale images from [0, 1] to [-1, 1] to match tanh output
    batch_x, _ = mnist.train.next_batch(batch_size)
    batch_x = np.reshape(batch_x, newshape=[-1, 28, 28, 1])
    batch_x = batch_x * 2. - 1.

    # Discriminator training
    z = np.random.uniform(-1., 1., size=[batch_size, noise_dimension])
    _, dl = sess.run([train_disc, disc_loss],
                     feed_dict={real_image_input: batch_x, noise: z, istraining: True})

    # Generator training
    z = np.random.uniform(-1., 1., size=[batch_size, noise_dimension])
    _, gl = sess.run([train_gen, gen_loss],
                     feed_dict={noise: z, istraining: True})

    if i % 500 == 0 or i == 1:
        print('Step %i: Loss A: %f, Loss B: %f' % (i, gl, dl))
Step 1: Loss A: 2.590860, Loss B: 0.907586
Step 500: Loss A: 2.154698, Loss B: 0.895236
Step 1000: Loss A: 1.430409, Loss B: 0.837684
Step 1500: Loss A: 1.962198, Loss B: 0.618827
Step 2000: Loss A: 2.767945, Loss B: 0.378071
Step 2500: Loss A: 2.370605, Loss B: 0.561247
Step 3000: Loss A: 3.427798, Loss B: 0.402951
Step 3500: Loss A: 4.904454, Loss B: 0.554856
Step 4000: Loss A: 4.045284, Loss B: 0.454970
Step 4500: Loss A: 4.577699, Loss B: 0.687195
Step 5000: Loss A: 3.476081, Loss B: 0.210492
Step 5500: Loss A: 3.898139, Loss B: 0.143352
Step 6000: Loss A: 4.089877, Loss B: 1.082561
Step 6500: Loss A: 5.911457, Loss B: 0.154059
Step 7000: Loss A: 3.594872, Loss B: 0.152970
Step 7500: Loss A: 6.067883, Loss B: 0.084864
Step 8000: Loss A: 6.737456, Loss B: 0.402566
Step 8500: Loss A: 6.630128, Loss B: 0.034838
Step 9000: Loss A: 6.480587, Loss B: 0.427419
Step 9500: Loss A: 7.200409, Loss B: 0.124268
Step 10000: Loss A: 5.479313, Loss B: 0.191389
TESTING
The steps involved in this section are:
- Feed noise data as input to the generator network.
- Produce images from the noise data.
- The images lie in the tanh range [-1, 1]; rescale them to [0, 1].
- Invert colors for improved visuals.
- Then, draw the newly created digits on the canvas.
number = 6
canvas = np.empty((28 * number, 28 * number))
for a in range(number):
    # Noise input
    z = np.random.uniform(-1., 1., size=[number, noise_dimension])
    # Generate images from the noise
    g = sess.run(gen_sample, feed_dict={noise: z, istraining: False})
    # Rescale from the tanh range [-1, 1] to [0, 1]
    g = (g + 1.) / 2.
    # Invert colors for better visuals
    g = -1 * (g - 1)
    for j in range(number):
        canvas[a * 28:(a + 1) * 28, j * 28:(j + 1) * 28] = g[j].reshape([28, 28])

plt.figure(figsize=(number, number))
plt.imshow(canvas, origin="upper", cmap="gray")
plt.show()
FINAL THOUGHTS
DCGANs are a very effective machine learning architecture, built as an extension of the original GAN. In this article, DCGANs have been explained step-by-step with the example of the MNIST dataset. In short, we built the generator and discriminator networks, declared the loss functions, optimizers, and training operations, and initialized the variables with their default values. Following that, we started the TF session, ran the initializer, and trained and tested the model. The convolutional features the discriminator learns along the way also make DCGANs useful for training image classifiers.
GANs produce sample images faster than autoregressive networks like WaveNet, NADE, and PixelRNN, since there is no need to generate the entries of a sample sequentially. On the other hand, GANs are comparatively harder to train, as they need a steady supply of data to make progress. They have plenty of real-world uses, such as enhancing image quality, colorizing images, generating faces, and other interesting tasks.
The code for this DCGAN project can be retrieved from here.
To learn more about DCGANs, refer to this research paper.
To learn from more of my blogs, click here.
Hope this article was helpful. Thank you!!!