Malaria Disease Classification using Keras Pretrained Models | Python

Hola amigos!! I am back with another tutorial on Keras and neural networks for image classification. In my previous article, I discussed the transfer learning approach and how to use pre-trained models. So, why not take that knowledge further? In this article, we will use the same approach to detect whether a person is infected with malaria or not.

Here we will do the Malaria Disease Classification using Keras Pretrained Models in Python.

Okay! If you are new to transfer learning and pretrained models, refer to this article first. With the basics in place, let’s start building today’s classification model using the VGG-16 model from Keras.

What is the VGG-16 model?

VGG-16 is considered one of the best pre-trained models for image classification in terms of accuracy. It was developed by the Visual Geometry Group at the University of Oxford and rose to fame through the ImageNet (ILSVRC) 2014 competition.

The basic architecture of the VGG-16 model is:

[Figure: VGG-16 architecture]

The layers of the model are:

  • Convolutional layers: 13
  • Pooling layers: 5
  • Dense layers: 3

The convolutional layers are stacked in sequential blocks, and the number of filters increases from block to block (64, 128, 256, 512).
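If you want to verify these counts yourself, here is a small optional sketch (assuming TensorFlow 2.x with Keras is installed); passing weights=None builds the architecture without downloading anything:

from tensorflow.keras.applications.vgg16 import VGG16
from tensorflow.keras.layers import Conv2D, MaxPooling2D, Dense

# weights=None builds the architecture only, so no weights are downloaded
vgg = VGG16(weights=None)

conv_layers = [l for l in vgg.layers if isinstance(l, Conv2D)]
pool_layers = [l for l in vgg.layers if isinstance(l, MaxPooling2D)]
dense_layers = [l for l in vgg.layers if isinstance(l, Dense)]

print(len(conv_layers), len(pool_layers), len(dense_layers))  # expected: 13 5 3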

Now, let us start building our model. For this, we start with the importation of libraries.

Import packages/libraries

# importing the libraries

import numpy as np
import pandas as pd
import tensorflow as tf 
from tensorflow.keras.preprocessing.image import ImageDataGenerator 
from tensorflow.keras import layers
from tensorflow.keras import Model 
import matplotlib.pyplot as plt

Next, we do not have separate train and test sets yet, so let’s prepare them. But first, have a look at our dataset. Our aim is to identify whether a person is affected by malaria or not. For this, the dataset has 2 folders containing images of cell tissue: 1 – Parasitized (i.e. from an infected person) and 2 – Uninfected. Let me show you one image from each folder.

Parasitized and uninfected cell tissue images

from google.colab import drive
drive.mount('/content/drive')
import cv2

uninfected_image = '/content/drive/MyDrive/cell_images/Final_data/Uninfected/C1_thinF_IMG_20150604_104722_cell_73.png'
infected_image = '/content/drive/MyDrive/cell_images/Final_data/Parasitized/C33P1thinF_IMG_20150619_114756a_cell_179.png'

plt.figure(1, figsize=(15, 7))

plt.subplot(1, 2, 1)
# OpenCV loads images in BGR order, so convert to RGB before plotting
plt.imshow(cv2.cvtColor(cv2.imread(uninfected_image), cv2.COLOR_BGR2RGB))
plt.title('Uninfected Cell')
plt.xticks([]), plt.yticks([])

plt.subplot(1, 2, 2)
plt.imshow(cv2.cvtColor(cv2.imread(infected_image), cv2.COLOR_BGR2RGB))
plt.title('Infected Cell')
plt.xticks([]), plt.yticks([])

plt.show()

[Output plot: uninfected cell (left) and infected cell (right)]

So, the above plot shows the infected (right) and uninfected (left) cell tissue images. We will now split the data into training and validation sets to carry on with our analysis.

Split into train and validation data

Our dataset contains roughly 13,780 images in each of the 2 classes (about 27,560 in total), which we will now divide into training and validation subsets. For the dataset, click here.
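If you want to double-check those numbers, this optional sketch counts the files in each class folder (it assumes the dataset sits at the same Drive path used above):

import os

base_dir = '/content/drive/MyDrive/cell_images/Final_data/'
for label in ['Parasitized', 'Uninfected']:
    count = len(os.listdir(os.path.join(base_dir, label)))
    print(label, count)   # roughly 13,780 images per class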

datagen = ImageDataGenerator(rescale=1/255.0, validation_split=0.2)
trainDatagen = datagen.flow_from_directory(directory='/content/drive/My Drive/cell_images/Final_data/',
                                           target_size=(224, 224),
                                           class_mode = 'binary',
                                           batch_size = 16,
                                           subset='training')
Found 22063 images belonging to 2 classes.
validationDatagen = datagen.flow_from_directory(directory='/content/drive/My Drive/cell_images/Final_data/',
                                           target_size=(224, 224),
                                           class_mode = 'binary',
                                           batch_size = 16,
                                           subset='validation')
Found 5514 images belonging to 2 classes.
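One thing worth checking (optional, but it matters when reading the sigmoid output later) is which folder was mapped to which label; Keras assigns labels alphabetically by folder name:

# check the label mapping produced by flow_from_directory
print(trainDatagen.class_indices)
# expected (alphabetical order): {'Parasitized': 0, 'Uninfected': 1}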

Our next step will be to introduce our pretrained VGG model for the main task of identifying images.

Loading the base model

As we know, our target model is VGG-16, so we will import it from the Keras applications module.

from tensorflow.keras.applications.vgg16 import VGG16

model = VGG16(input_shape=(224, 224, 3),  # shape of our images
              include_top=False,          # leave out the fully connected classifier layers
              weights='imagenet')         # use ImageNet pre-trained weights
Downloading data from https://storage.googleapis.com/tensorflow/keras-applications/vgg16/vgg16_weights_tf_dim_ordering_tf_kernels_notop.h5
58892288/58889256 [==============================] - 1s 0us/step

There is no need to train the convolutional base, since we are using pre-trained ImageNet weights; we simply freeze all of its layers.

# no need to train the layers
for layer in model.layers:
    layer.trainable = False
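As an optional sanity check, we can confirm that the base is frozen and inspect the feature map it produces for 224x224 inputs; this 7x7x512 tensor is what we will flatten in the next step:

# all layers of the convolutional base should now be frozen
print(all(not layer.trainable for layer in model.layers))   # True

# output of the frozen base, which the new classification head will consume
print(model.output_shape)   # (None, 7, 7, 512)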

Compile the model and fit

Woohoo, we are finally at our last step: adding a small classification head on top of the pretrained base, compiling the model, and then fitting it on the data to evaluate its performance.

# Flatten the output layer to 1 dimension
x = layers.Flatten()(model.output)

# Add a fully connected layer with 512 hidden units and ReLU activation
x = layers.Dense(512, activation='relu')(x)

# Add a dropout rate of 0.5
x = layers.Dropout(0.5)(x)

# Add a final sigmoid layer for classification
x = layers.Dense(1, activation='sigmoid')(x)

my_model = tf.keras.models.Model(model.input, x)

# Compile the model
my_model.compile(optimizer=tf.keras.optimizers.RMSprop(learning_rate=0.0001),
                 loss='binary_crossentropy',
                 metrics=['acc'])

# summary of the frozen VGG-16 base (the new Flatten/Dense head sits on top of this)
model.summary()
Model: "vgg16" 
_________________________________________________________________ 
Layer (type)                 Output Shape             Param # 
================================================================= 
input_1 (InputLayer)        [(None, 224, 224, 3)]    0 
_________________________________________________________________ 
block1_conv1 (Conv2D)       (None, 224, 224, 64)     1792 
_________________________________________________________________ 
block1_conv2 (Conv2D)       (None, 224, 224, 64)     36928 
_________________________________________________________________ 
block1_pool (MaxPooling2D)  (None, 112, 112, 64)     0 
_________________________________________________________________ 
block2_conv1 (Conv2D)       (None, 112, 112, 128)    73856 
_________________________________________________________________ 
block2_conv2 (Conv2D)       (None, 112, 112, 128)    147584 
_________________________________________________________________ 
block2_pool (MaxPooling2D)  (None, 56, 56, 128)      0 
_________________________________________________________________  
block3_conv1 (Conv2D)       (None, 56, 56, 256)      295168 
_________________________________________________________________ 
block3_conv2 (Conv2D)       (None, 56, 56, 256)      590080 
_________________________________________________________________ 
block3_conv3 (Conv2D)       (None, 56, 56, 256)      590080 
_________________________________________________________________ 
block3_pool (MaxPooling2D)  (None, 28, 28, 256)      0 
_________________________________________________________________ 
block4_conv1 (Conv2D)       (None, 28, 28, 512)      1180160 
_________________________________________________________________ 
block4_conv2 (Conv2D)       (None, 28, 28, 512)      2359808 
_________________________________________________________________ 
block4_conv3 (Conv2D)       (None, 28, 28, 512)      2359808 
_________________________________________________________________ 
block4_pool (MaxPooling2D)  (None, 14, 14, 512)      0 
_________________________________________________________________ 
block5_conv1 (Conv2D)       (None, 14, 14, 512)      2359808 
_________________________________________________________________ 
block5_conv2 (Conv2D)       (None, 14, 14, 512)      2359808 
_________________________________________________________________ 
block5_conv3 (Conv2D)       (None, 14, 14, 512)      2359808 
_________________________________________________________________ 
block5_pool (MaxPooling2D)  (None, 7, 7, 512)        0 
================================================================= 
Total params: 14,714,688 
Trainable params: 0 
Non-trainable params: 14,714,688 
_________________________________________________________________
# Fitting the model
vgghist = my_model.fit(trainDatagen, validation_data = validationDatagen, steps_per_epoch = 100, epochs = 5)
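The History object returned by fit() keeps the per-epoch metrics, so we can plot the training and validation curves with the matplotlib import from earlier (the keys below match the metrics=['acc'] setting used when compiling):

# plot training history
plt.figure(figsize=(12, 4))

plt.subplot(1, 2, 1)
plt.plot(vgghist.history['acc'], label='train')
plt.plot(vgghist.history['val_acc'], label='validation')
plt.title('Accuracy')
plt.legend()

plt.subplot(1, 2, 2)
plt.plot(vgghist.history['loss'], label='train')
plt.plot(vgghist.history['val_loss'], label='validation')
plt.title('Loss')
plt.legend()

plt.show()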

Awesome! We can now detect whether a cell is parasitized or not. Do try it with a new image on your device to check whether the model predicts correctly; a small sketch of how you might do that follows below. Mention your results and accuracy in the comment section.
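Here is a minimal sketch of how you could run the trained model on a single image; the file path below is just one of the images from the dataset, so swap in your own. The preprocessing (224x224 resize and 1/255 rescaling) must match what the training generator did, and the label interpretation assumes the alphabetical class_indices mapping shown earlier:

from tensorflow.keras.preprocessing import image

def predict_cell(img_path):
    # load and preprocess the image exactly as the training generator did
    img = image.load_img(img_path, target_size=(224, 224))
    arr = image.img_to_array(img) / 255.0
    arr = np.expand_dims(arr, axis=0)      # add a batch dimension
    prob = my_model.predict(arr)[0][0]     # sigmoid output in [0, 1]
    # with the alphabetical mapping, 0 = Parasitized and 1 = Uninfected
    return 'Uninfected' if prob > 0.5 else 'Parasitized'

print(predict_cell('/content/drive/MyDrive/cell_images/Final_data/Uninfected/C1_thinF_IMG_20150604_104722_cell_73.png'))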

So we have successfully completed the Malaria Disease Classification task using Keras in Python.

This is the point where we realize how powerful transfer learning is and how useful pre-trained models for image classification can be. Keep in mind that VGG-16 takes a long time to train compared to lighter models, which can be a disadvantage when dealing with huge datasets, even though its architecture is simple and intuitive.

Keep Learning!!

Thank You for your time 🙂

 
