Transfer Learning approach In Keras | Deep Learning | Python

Hello everyone. In this post, I am going to explain the transfer learning approach to deep learning problems. In our previous posts, we solved classification and regression problems using deep neural networks, and I explained the issues we ran into while building those models. But we did not build any model for an image dataset. Suppose you have an image dataset instead of a numeric one: how would you approach the problem?

You may be thinking of building an ANN model on that dataset. Yes, you can certainly build your model that way.

But I will show you another approach to building a model for images. Generally, when you are dealing with an image dataset, a CNN (Convolutional Neural Network) performs better than a plain ANN (Artificial Neural Network). Why?

The answer lies in the structure of a CNN. If you look it up, you will find that a CNN is hierarchical: early layers pick up simple patterns such as edges, and deeper layers combine them into more complex shapes. As a general rule of thumb in deep learning, deeper networks and more training data tend to give better results, although more layers alone do not guarantee higher accuracy.

A CNN adds convolutional and pooling layers in front of the fully connected layers of an ANN. The convolutional and pooling layers extract features from the image, the fully connected layers combine those features, and the output layer gives you the final prediction.
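
To make the two new layer types a little more concrete, here is a minimal pure-Python sketch of a convolution followed by max pooling on a toy image. It is only an illustration and not how Keras implements these layers; the toy image and the edge kernel are made-up values.

```python
def conv2d(image, kernel):
    """Slide the kernel over the image (no padding, stride 1).

    Like most deep learning libraries, this is technically a
    cross-correlation: the kernel is not flipped.
    """
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(image[i + a][j + b] * kernel[a][b]
                for a in range(kh) for b in range(kw))
            for j in range(out_w)
        ]
        for i in range(out_h)
    ]


def max_pool2d(feature_map, size=2):
    """Keep the largest value in each non-overlapping size x size window."""
    out_h = len(feature_map) // size
    out_w = len(feature_map[0]) // size
    return [
        [
            max(feature_map[i * size + a][j * size + b]
                for a in range(size) for b in range(size))
            for j in range(out_w)
        ]
        for i in range(out_h)
    ]


# Toy 5x5 "image": dark (0) on the left, bright (1) on the right.
image = [[0, 0, 1, 1, 1] for _ in range(5)]

# Hypothetical 2x2 kernel that responds to a dark-to-bright vertical edge.
edge_kernel = [[-1, 1],
               [-1, 1]]

features = conv2d(image, edge_kernel)  # 4x4 feature map, non-zero only at the edge
pooled = max_pool2d(features)          # 2x2 map that still records where the edge was
print(features)
print(pooled)
```

The convolution lights up only where the edge is, and the pooling step shrinks the map while keeping that response. A real CNN learns many such kernels instead of having them written by hand.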

Go and read about CNNs, then come back to this article, because here we are going to use transfer learning to train our model. Before moving forward, let me tell you what transfer learning is and why I want to explain it.

Transfer learning is the process of reusing a pre-trained model on a new problem. You may ask: why use a pre-trained model if I can build my own?

Yes, you can certainly build your own CNN model for your problem, but as I said, training a deep learning model well requires a large dataset. And even after you have collected the data, your next task is to choose the right architecture and hyperparameters, such as the learning rate, for the problem.

Transfer learning addresses both problems, because it lets you train a deep neural network with comparatively little data.

So now you have an idea of what transfer learning is. There are several ways to apply it, and you can choose from various pre-trained models such as VGG-16, ResNet-50, etc.

In this tutorial, I am going to use VGG-16, and I will also tell you how to use a different pre-trained model in the same code with a few modifications.

Transfer learning lets us reuse a pre-trained model in our own problem. For example, if you need to classify images, then instead of creating a new model from scratch you can use a model that was already trained on a huge dataset. Basically, you transfer the weights of the previously trained model to your problem.

Transfer Learning Approach in Computer Vision Applications

Now that I have introduced the concept of transfer learning, it is time for some practice. For this tutorial, I collected images from open data sources and placed them in a folder with two subfolders, train and test. Each of these contains one folder per image class. With this image dataset, we will use the transfer learning approach to build our first image classification model.

You can use any image dataset that follows this folder structure. I am only using this dataset to demonstrate the transfer learning approach.

Link to the dataset Dataset

NOTE: To follow this tutorial, you can use a Google Colab notebook, or you can install all the dependencies on your own system.

Now, let’s move forward to build our model to classify these images.

from keras.layers import Input, Lambda, Dense, Flatten
from keras.models import Model
from keras.applications.vgg16 import VGG16
from keras.applications.vgg16 import preprocess_input
from keras.preprocessing import image
from keras.preprocessing.image import ImageDataGenerator
from keras.models import Sequential

In the above block of code, I am loading the libraries our model needs. You are already familiar with Keras, the deep learning API.

If you look at the above block of code, you will find that I am importing VGG-16 from keras.applications. As I said at the beginning, I am going to use VGG-16 for our problem. I am also importing a class named ImageDataGenerator, which generates randomly modified versions of the images already present in our dataset (by rotating, flipping, shifting, and zooming them).

The advantage of the image generator is that it effectively adds more data to our training set. This technique is called image augmentation: it creates additional variants of the existing images by applying transformations such as flip, rotate, shift, and zoom.
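
To give a feel for what such a transformation does, here is a small pure-Python sketch that flips and shifts a tiny image stored as nested lists. This is only an illustration of the idea; ImageDataGenerator applies its own randomized versions of these transformations, and the helper names below are my own.

```python
def horizontal_flip(image):
    """Mirror every row, so the left side becomes the right side."""
    return [row[::-1] for row in image]


def shift_right(image, pixels, fill=0):
    """Move the image right by `pixels` columns, padding the left with `fill`."""
    return [[fill] * pixels + row[:len(row) - pixels] for row in image]


# A tiny 2x3 "image" with made-up pixel values.
image = [[1, 2, 3],
         [4, 5, 6]]

flipped = horizontal_flip(image)
shifted = shift_right(image, 1)

print(flipped)  # [[3, 2, 1], [6, 5, 4]]
print(shifted)  # [[0, 1, 2], [0, 4, 5]]
```

Each variant still shows the same object with the same label, which is why augmented copies can be used as extra training examples.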

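import numpy as np
from glob import glob
import matplotlib.pyplot as plt

# VGG-16 was trained on 224x224 images, so we resize ours to match
IMAGE_SIZE = [224, 224]

train_path = '/content/drive/My Drive/Dataset/train'
valid_path = '/content/drive/My Drive/Dataset/test'

# Load VGG-16 with ImageNet weights, without its top classification layers
vgg = VGG16(input_shape=IMAGE_SIZE + [3], weights='imagenet', include_top=False)

# Freeze the pre-trained layers so their weights are not updated
for layer in vgg.layers:
  layer.trainable = False

# One subfolder per class, so len(folder) is the number of classes
folder = glob('/content/drive/My Drive/Dataset/train/*')

# New classification head on top of the frozen VGG-16 layers
x = Flatten()(vgg.output)
prediction = Dense(len(folder), activation='softmax')(x)

model = Model(inputs=vgg.input, outputs=prediction)
model.summary()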

Let me explain, step by step, what the above code does. In the first block, I am importing some important libraries. You are already familiar with Matplotlib and NumPy, but there is one new module here, glob, which retrieves files and folders whose paths match a specific pattern.

In the second block, I am setting IMAGE_SIZE to [224, 224] so that all images are scaled to this size. The reason is that VGG-16 was trained on 224x224 images.

In the next block, I am storing the paths of the train and test folders.

Next, I am loading the VGG-16 model with the ImageNet weights into the vgg variable. We want to cut off the last layers of VGG-16 because the original model classifies images into 1,000 categories, whereas our problem has only four.

Notice that I am passing input_shape as IMAGE_SIZE + [3]. The reason is that the images have three channels (RGB), so we need to add that dimension as well. A grayscale image has only one channel, but the ImageNet weights expect three-channel input, so grayscale images have to be loaded as (or converted to) three channels before they go into VGG-16.
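
The shapes involved can be checked with a few lines of plain Python. This is a sketch that assumes the standard VGG-16 architecture (3x3 convolutions with 'same' padding that keep the spatial size, five 2x2 max-pooling layers that each halve it, and 512 filters in the last block); the function name is my own.

```python
IMAGE_SIZE = [224, 224]

# Adding two lists concatenates them, which appends the channel dimension.
input_shape = IMAGE_SIZE + [3]


def vgg16_feature_map_shape(height, width):
    """Shape of the VGG-16 convolutional output for a given input size.

    Assumes the standard architecture: the convolutions keep the spatial
    size and each of the five pooling layers halves it.
    """
    for _ in range(5):
        height //= 2
        width //= 2
    return (height, width, 512)


feature_shape = vgg16_feature_map_shape(*IMAGE_SIZE)
flattened_size = feature_shape[0] * feature_shape[1] * feature_shape[2]

print(input_shape)     # [224, 224, 3]
print(feature_shape)   # (7, 7, 512)
print(flattened_size)  # 25088
```

So a 224x224 RGB image comes out of the convolutional layers as a 7x7x512 block of features, which becomes a vector of 25,088 values once flattened.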

Also notice include_top=False. This tells Keras to leave out the fully connected classification layers at the top of VGG-16. If you set it to True, those layers are included.

Next, I loop over the layers of vgg and set trainable to False, because I do not want to update the existing weights: VGG-16 has already been trained. If you set it to True, the whole network is retrained, which is much slower and, with a small dataset, can easily overfit instead of improving accuracy.

Next, I use the glob module to collect the paths of all the subfolders of the train folder in a variable named folder. Since there is one subfolder per class, its length gives the number of classes.
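
One thing to watch out for: a pattern ending in /* matches files as well as folders, so a stray file inside train would be counted as a class. The sketch below builds a throwaway directory to show the difference and filters the matches with os.path.isdir. The class names and the helper list_class_folders are hypothetical, chosen only for this demonstration.

```python
import os
import tempfile
from glob import glob


def list_class_folders(train_dir):
    """Return the class subfolders of train_dir, ignoring stray files."""
    matches = glob(os.path.join(train_dir, '*'))
    return sorted(p for p in matches if os.path.isdir(p))


# Build a temporary directory tree just for this demonstration.
with tempfile.TemporaryDirectory() as root:
    train_dir = os.path.join(root, 'train')
    for name in ('class_a', 'class_b', 'class_c', 'class_d'):
        os.makedirs(os.path.join(train_dir, name))

    # A stray file that the plain '*' pattern also matches.
    open(os.path.join(train_dir, 'notes.txt'), 'w').close()

    num_matches = len(glob(os.path.join(train_dir, '*')))
    class_folders = list_class_folders(train_dir)
    num_classes = len(class_folders)
    class_names = [os.path.basename(p) for p in class_folders]

print(num_matches)  # 5: four folders plus the stray file
print(num_classes)  # 4
print(class_names)
```

If your train folder contains nothing but class subfolders, the plain glob call and the filtered version give the same count.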

Then I flatten the output of the last VGG-16 layer and append a Dense layer with one unit per class (len(folder)) and a softmax activation.
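
Softmax is what turns the raw scores of that Dense layer into class probabilities. Here is a minimal pure-Python sketch of the function; the four scores are made-up numbers standing in for the outputs of a four-class layer.

```python
import math


def softmax(scores):
    """Convert raw scores into probabilities that sum to 1."""
    largest = max(scores)  # subtracting the max avoids overflow in exp
    exps = [math.exp(s - largest) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical raw outputs for four classes.
scores = [2.0, 1.0, 0.1, -1.0]

probs = softmax(scores)
predicted_class = probs.index(max(probs))

print([round(p, 3) for p in probs])
print(predicted_class)  # index of the class with the highest probability
```

The class with the largest score always gets the largest probability, and the probabilities across all classes add up to 1.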

Finally, I combine everything into a model by passing vgg.input as the input and prediction as the output, and then print the model structure with model.summary().

The output of model.summary() shows that we have a total of 14,815,044 parameters, of which 100,356 are trainable and 14,714,688 are non-trainable.

The final layer has 4 outputs because we have four categories in our folders.
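
These numbers can be reproduced by hand. The sketch below assumes the standard VGG-16 layout of thirteen 3x3 convolutional layers and a 7x7x512 feature map feeding our four-unit Dense layer; the helper names are my own.

```python
# Output channels of the 13 convolutional layers in standard VGG-16.
VGG16_CONV_FILTERS = [64, 64, 128, 128, 256, 256, 256,
                      512, 512, 512, 512, 512, 512]


def conv_base_params(filters, in_channels=3, kernel=3):
    """Weights plus biases of a stack of kernel x kernel conv layers."""
    total = 0
    for out_channels in filters:
        total += (kernel * kernel * in_channels + 1) * out_channels
        in_channels = out_channels
    return total


def dense_params(inputs, units):
    """One weight per input-unit pair, plus one bias per unit."""
    return inputs * units + units


non_trainable = conv_base_params(VGG16_CONV_FILTERS)  # frozen VGG-16 layers
trainable = dense_params(7 * 7 * 512, 4)              # our new Dense layer
total = non_trainable + trainable

print(non_trainable)  # 14714688
print(trainable)      # 100356
print(total)          # 14815044
```

Only the new Dense layer is trained, which is why training stays cheap even though the model has almost 15 million parameters.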

 

 

model.compile(
  loss='categorical_crossentropy',
  optimizer='adam',
  metrics=['accuracy']
)

from keras.preprocessing.image import ImageDataGenerator

# Augment the training images; only rescale the test images
train_gen = ImageDataGenerator(rescale=1./255,
                               shear_range=0.2,
                               zoom_range=0.2,
                               horizontal_flip=True)
test_gen = ImageDataGenerator(rescale=1./255)

# Read the images from the class subfolders in batches of 32
train_set = train_gen.flow_from_directory(train_path,
                                          target_size=(224, 224),
                                          batch_size=32,
                                          class_mode='categorical')
test_set = test_gen.flow_from_directory(valid_path,
                                        target_size=(224, 224),
                                        batch_size=32,
                                        class_mode='categorical')

# fit_generator is deprecated in recent Keras versions, where model.fit accepts generators
p = model.fit_generator(train_set,
                        validation_data=test_set,
                        epochs=5,
                        steps_per_epoch=len(train_set),
                        validation_steps=len(test_set))

In the above block of code, I am compiling the model with categorical_crossentropy as the loss function, adam as the optimizer, and accuracy as the metric. The purpose of compile is to tell the model which optimizer and cost function to use during training.
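
To see what this loss measures, here is a minimal pure-Python sketch of categorical cross-entropy for a single image. It is a simplified illustration rather than the Keras implementation, and the label and predictions are made-up values for a four-class problem.

```python
import math


def categorical_crossentropy(y_true, y_pred, eps=1e-7):
    """Cross-entropy between a one-hot label and predicted probabilities."""
    # eps guards against log(0) when a predicted probability is zero
    return -sum(t * math.log(max(p, eps)) for t, p in zip(y_true, y_pred))


# Hypothetical one-hot label: the image belongs to the second class.
y_true = [0, 1, 0, 0]

confident = [0.05, 0.85, 0.05, 0.05]  # mostly right
unsure = [0.25, 0.25, 0.25, 0.25]     # no idea

loss_confident = categorical_crossentropy(y_true, confident)
loss_unsure = categorical_crossentropy(y_true, unsure)

print(round(loss_confident, 4))
print(round(loss_unsure, 4))
```

The loss depends only on the probability given to the correct class, and it shrinks toward zero as that probability approaches 1. The optimizer adjusts the trainable weights to push the loss down.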

I am also generating images with ImageDataGenerator. As I explained earlier, it augments the training data. Only the training generator applies augmentation; the test generator just rescales the pixel values.

flow_from_directory then reads the images from the train and test folders, resizes them to 224x224, and yields them in batches of 32, which I store in train_set and test_set. After that, I use fit_generator to fit the model.
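
The call above passes steps_per_epoch=len(train_set). As far as I know, the length of such an iterator is the number of batches needed to go through every image once, which can be sketched as follows. The image counts are hypothetical and the helper name is my own.

```python
import math


def steps_per_epoch(num_images, batch_size):
    """Number of batches needed to see every image once."""
    return math.ceil(num_images / batch_size)


# Hypothetical dataset sizes, for illustration only.
steps_uneven = steps_per_epoch(1000, 32)  # 31 full batches plus one batch of 8
steps_even = steps_per_epoch(64, 32)      # exactly 2 full batches

print(steps_uneven)  # 32
print(steps_even)    # 2
```

The last batch of an epoch is simply smaller when the number of images is not a multiple of the batch size.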

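Why am I using fit_generator() instead of fit()?

fit_generator() is used when the dataset is too large to fit in memory or when data augmentation is applied, because it consumes batches from a generator instead of one full in-memory array. Note that in recent versions of Keras, fit() also accepts generators and fit_generator() is deprecated.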

I am using 5 epochs. You can also try running for more epochs and see how the accuracy changes.

During training, you can see the loss and accuracy for each epoch.

Now let's move on to plotting the loss of this model.

plt.plot(p.history['loss'], label='train loss')
plt.plot(p.history['val_loss'], label='val loss')
plt.legend()
plt.show()

In the above block of code, I am plotting the training and validation loss for each epoch.
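
Besides looking at the plot, you can read the same history dictionary directly. Here is a small sketch that finds the epoch with the lowest validation loss. The numbers are made up and only shaped like p.history; they are not results from this model, and best_epoch is a helper of my own.

```python
def best_epoch(history):
    """Return the 1-based epoch with the lowest validation loss."""
    val_loss = history['val_loss']
    return val_loss.index(min(val_loss)) + 1


# Hypothetical values shaped like p.history, for illustration only.
example_history = {
    'loss':     [1.20, 0.80, 0.55, 0.40, 0.30],
    'val_loss': [1.10, 0.85, 0.70, 0.72, 0.78],
}

epoch = best_epoch(example_history)
print(epoch)  # 3
```

In this made-up example the training loss keeps falling after epoch 3 while the validation loss starts to rise, which is the usual sign that the model is beginning to overfit.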

We have now built our model with the help of transfer learning, using VGG-16. You can also use another pre-trained model such as ResNet50 without changing all the code: just use ResNet50 in place of VGG16.

That is, import ResNet50 from keras.applications instead of VGG16 and call it where the code calls VGG16.

Congratulations on reaching this point. You now have your new model. Take this example and apply it to your upcoming projects; it will really help you.

CONCLUSION:

We have now completed our tutorial on the transfer learning approach in deep learning. Along the way we met a new concept, image augmentation, which is very useful in computer vision applications. With it, you can generate many variations of your existing images and be ready to train your new transfer learning model.

I have also shown you how to switch between different pre-trained models without changing the whole code.

Thanks for your time.

code can be found at the link code (2)
