Malaria Disease Classification using Keras Pretrained Models | Python
Hola amigos!! I am back with another tutorial based on Keras and neural networks for image classification. In my previous article, I discussed the transfer learning approach and how to use pre-trained models. So, why not take that knowledge further? In this article, we will use the same approach to detect whether a person is infected with malaria or not.
Here we will do the Malaria Disease Classification using Keras Pretrained Models in Python.
Okay! If you are new to transfer learning and pretrained models, refer to this article first. Once you are comfortable with the basics, let’s start building today’s classification model using the VGG-16 model from Keras.
What is the VGG-16 model?
VGG-16 is considered one of the best-known pre-trained models for image classification in terms of accuracy. It was developed by the Visual Geometry Group (VGG) at the University of Oxford and became famous through the 2014 ImageNet challenge (ILSVRC 2014).
The basic architecture of the VGG-16 model is:
The layers of the model are:
- Convolutional layers: 13
- Pooling layers: 5
- Dense layers: 3
The Conv layers are stacked sequentially in five blocks, and the number of filters increases from block to block (64, 128, 256, 512, 512).
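To make the layer counts above concrete, here is a minimal pure-Python sketch that walks through the five convolutional blocks of VGG-16. The names `VGG16_BLOCKS` and `describe_vgg16` are made up for this illustration and are not part of Keras.

```python
# VGG-16 convolutional base as (number of conv layers, filters) per block.
# Each block ends with one 2x2 max-pooling layer that halves height and width.
VGG16_BLOCKS = [(2, 64), (2, 128), (3, 256), (3, 512), (3, 512)]


def describe_vgg16(input_size=224):
    """Return (conv layer count, pooling layer count, final feature-map shape)."""
    size = input_size
    conv_layers = 0
    pool_layers = 0
    filters = 0
    for n_convs, filters in VGG16_BLOCKS:
        conv_layers += n_convs  # 3x3 convs with 'same' padding keep the size
        pool_layers += 1
        size //= 2              # 2x2 max pooling with stride 2
    return conv_layers, pool_layers, (size, size, filters)


print(describe_vgg16())  # (13, 5, (7, 7, 512))
```

For a 224 x 224 input the spatial size goes 224, 112, 56, 28, 14, 7, so the convolutional part ends with a 7 x 7 x 512 feature map.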
Now, let us start building our model. For this, we start by importing the libraries.
# importing the libraries
import numpy as np
import pandas as pd
import tensorflow as tf
from tensorflow.keras.preprocessing.image import ImageDataGenerator
from tensorflow.keras import layers
from tensorflow.keras import Model
import matplotlib.pyplot as plt
Next, we do not have a separate train or test set, so let’s prepare that. First, have a look at our dataset. Our aim is to identify whether a person is affected by malaria or not. For this, our dataset has 2 folders containing cell images: 1 – Parasitized (i.e. infected person) and 2 – Uninfected. Let me show you 1 image from each folder.
Parasitized and Uninfected person’s cell tissues
from google.colab import drive
drive.mount('/content/drive')
import cv2

uninfected_image = '/content/drive/MyDrive/cell_images/Final_data/Uninfected/C1_thinF_IMG_20150604_104722_cell_73.png'
infected_image = '/content/drive/MyDrive/cell_images/Final_data/Parasitized/C33P1thinF_IMG_20150619_114756a_cell_179.png'

def read_rgb(path):
    # cv2.imread returns BGR, while plt.imshow expects RGB
    return cv2.cvtColor(cv2.imread(path), cv2.COLOR_BGR2RGB)

plt.figure(1, figsize=(15, 7))
plt.subplot(1, 2, 1)
plt.imshow(read_rgb(uninfected_image))
plt.title('Uninfected Cell')
plt.xticks([]), plt.yticks([])  # hide the axis ticks
plt.subplot(1, 2, 2)
plt.imshow(read_rgb(infected_image))
plt.title('Infected Cell')
plt.xticks([]), plt.yticks([])  # hide the axis ticks
plt.show()
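One detail worth knowing when mixing OpenCV and Matplotlib: `cv2.imread` returns pixels in BGR channel order, while `plt.imshow` interprets a 3-channel array as RGB. The small NumPy sketch below shows what the channel swap does on a made-up one-pixel image; `bgr_to_rgb` is a helper name invented for this example.

```python
import numpy as np


def bgr_to_rgb(image):
    """Reverse the channel axis of an (H, W, 3) array: BGR -> RGB."""
    return image[..., ::-1]


# Hypothetical 1x1 "image" holding one pure-blue pixel in BGR order.
blue_bgr = np.array([[[255, 0, 0]]], dtype=np.uint8)
blue_rgb = bgr_to_rgb(blue_bgr)

print(blue_rgb.tolist())  # [[[0, 0, 255]]]
```

For real images, `cv2.cvtColor(img, cv2.COLOR_BGR2RGB)` performs the same reordering.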
So, the above plot shows the uninfected (left) and infected (right) cell images. We will now split the data into training and testing (validation) sets for further analysis.
Split into train and test data
Our dataset contains about 13,780 images in each of the 2 classes, and we will divide it into 2 subsets: training (80%) and validation (20%). For dataset click here.
datagen = ImageDataGenerator(rescale=1/255.0, validation_split=0.2)
trainDatagen = datagen.flow_from_directory(directory='/content/drive/My Drive/cell_images/Final_data/',
                                           target_size=(224, 224),
                                           class_mode='binary',
                                           batch_size=16,
                                           subset='training')
Found 22063 images belonging to 2 classes.
validationDatagen = datagen.flow_from_directory(directory='/content/drive/My Drive/cell_images/Final_data/',
                                                target_size=(224, 224),
                                                class_mode='binary',
                                                batch_size=16,
                                                subset='validation')
Found 5514 images belonging to 2 classes.
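As a quick sanity check, the two "Found ... images" counts can be compared against the `validation_split=0.2` setting. This is only arithmetic on the numbers printed above; the variable names are chosen for this illustration.

```python
# Counts copied from the two "Found ... images" lines above.
train_images = 22063
validation_images = 5514

total_images = train_images + validation_images
validation_fraction = validation_images / total_images

print(total_images)                   # 27577
print(round(validation_fraction, 3))  # 0.2
```

So roughly one image in five is held out for validation, as requested.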
Our next step will be to introduce our pretrained VGG model for the main task of identifying images.
Loading the base model
As we know, our target model is VGG-16, so we will import it from the Keras applications module.
from tensorflow.keras.applications.vgg16 import VGG16

model = VGG16(input_shape=(224, 224, 3),  # Shape of our images
              include_top=False,          # Leave out the fully connected layers at the top
              weights='imagenet')         # Load the ImageNet pre-trained weights
Downloading data from
58892288/58889256 [==============================] - 1s 0us/step
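There is no need to train the VGG-16 layers themselves, as we are using the pre-trained ImageNet weights today. We freeze them so that only the new layers we add on top are trained.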
# no need to train the layers
for layer in model.layers:
    layer.trainable = False
Compile the model and fit
Wohooo, we are finally here at our final step: compiling the model and then fitting it on the data to evaluate its performance.
# Flatten the output layer to 1 dimension
x = layers.Flatten()(model.output)
# Add a fully connected layer with 512 hidden units and ReLU activation
x = layers.Dense(512, activation='relu')(x)
# Add a dropout rate of 0.5
x = layers.Dropout(0.5)(x)
# Add a final sigmoid layer for classification
x = layers.Dense(1, activation='sigmoid')(x)
my_model = tf.keras.models.Model(model.input, x)
# Compile the model ('learning_rate' replaces the deprecated 'lr' argument)
my_model.compile(optimizer=tf.keras.optimizers.RMSprop(learning_rate=0.0001),
                 loss='binary_crossentropy',
                 metrics=['acc'])
# model summary
my_model.summary()
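To see what the final sigmoid layer and the `binary_crossentropy` loss compute for a single example, here is a minimal pure-Python sketch. It is an illustration of the formulas, not the Keras implementation; the function names and the clipping constant `eps` are assumptions made for this example.

```python
import math


def sigmoid(z):
    """Squash a raw score into a probability between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-z))


def binary_crossentropy(y_true, p, eps=1e-7):
    """Loss for one example with label y_true (0 or 1) and predicted probability p."""
    p = min(max(p, eps), 1.0 - eps)  # clip to avoid log(0); eps is illustrative
    return -(y_true * math.log(p) + (1 - y_true) * math.log(1.0 - p))


print(sigmoid(0.0))  # 0.5
# A confident correct prediction is penalized less than a confident wrong one.
print(binary_crossentropy(1, 0.9) < binary_crossentropy(1, 0.1))  # True
```

The single sigmoid unit is what makes `class_mode='binary'` in the generators a natural fit: each image gets one probability and one 0/1 label.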
Note that the summary shown here is that of the frozen VGG-16 base (it stops at block5_pool and has no trainable parameters); my_model.summary() additionally lists the Flatten, Dense and Dropout layers added above.
Model: "vgg16"
_________________________________________________________________
Layer (type)                 Output Shape              Param #
=================================================================
input_1 (InputLayer)         [(None, 224, 224, 3)]     0
block1_conv1 (Conv2D)        (None, 224, 224, 64)      1792
block1_conv2 (Conv2D)        (None, 224, 224, 64)      36928
block1_pool (MaxPooling2D)   (None, 112, 112, 64)      0
block2_conv1 (Conv2D)        (None, 112, 112, 128)     73856
block2_conv2 (Conv2D)        (None, 112, 112, 128)     147584
block2_pool (MaxPooling2D)   (None, 56, 56, 128)       0
block3_conv1 (Conv2D)        (None, 56, 56, 256)       295168
block3_conv2 (Conv2D)        (None, 56, 56, 256)       590080
block3_conv3 (Conv2D)        (None, 56, 56, 256)       590080
block3_pool (MaxPooling2D)   (None, 28, 28, 256)       0
block4_conv1 (Conv2D)        (None, 28, 28, 512)       1180160
block4_conv2 (Conv2D)        (None, 28, 28, 512)       2359808
block4_conv3 (Conv2D)        (None, 28, 28, 512)       2359808
block4_pool (MaxPooling2D)   (None, 14, 14, 512)       0
block5_conv1 (Conv2D)        (None, 14, 14, 512)       2359808
block5_conv2 (Conv2D)        (None, 14, 14, 512)       2359808
block5_conv3 (Conv2D)        (None, 14, 14, 512)       2359808
block5_pool (MaxPooling2D)   (None, 7, 7, 512)         0
=================================================================
Total params: 14,714,688
Trainable params: 0
Non-trainable params: 14,714,688
_________________________________________________________________
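The parameter counts in the summary can be reproduced by hand. The sketch below recomputes the total for the convolutional base and, as an extra, works out how many parameters the new head (Flatten, Dense(512), Dropout, Dense(1)) would add. The helper names are invented for this example, and the head figure is derived arithmetic rather than output copied from a run.

```python
def conv_params(in_channels, out_channels, kernel=3):
    """Weights plus one bias per filter for a kernel x kernel convolution."""
    return (kernel * kernel * in_channels + 1) * out_channels


def dense_params(in_units, out_units):
    """Weights plus one bias per output unit for a fully connected layer."""
    return (in_units + 1) * out_units


# (number of conv layers, filters) for each of the five VGG-16 blocks.
blocks = [(2, 64), (2, 128), (3, 256), (3, 512), (3, 512)]

base_params = 0
channels = 3  # RGB input
for n_convs, filters in blocks:
    for _ in range(n_convs):
        base_params += conv_params(channels, filters)
        channels = filters

# Head: Flatten (7*7*512 values) -> Dense(512) -> Dropout (no params) -> Dense(1).
flattened = 7 * 7 * 512
head_params = dense_params(flattened, 512) + dense_params(512, 1)

print(base_params)  # 14714688
print(head_params)  # 12846081
```

Since the base is frozen, only the head parameters would be updated during training.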
# Fitting the model
vgghist = my_model.fit(trainDatagen,
                       validation_data=validationDatagen,
                       steps_per_epoch=100,
                       epochs=5)
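It helps to check how much data `steps_per_epoch = 100` actually covers. The following is plain arithmetic using the batch size and training count from earlier in the article; the variable names are chosen for this illustration.

```python
import math

train_images = 22063   # from the "Found 22063 images" line
batch_size = 16
steps_per_epoch = 100
epochs = 5

images_per_epoch = steps_per_epoch * batch_size
images_seen_in_total = images_per_epoch * epochs
steps_for_full_pass = math.ceil(train_images / batch_size)

print(images_per_epoch)      # 1600
print(images_seen_in_total)  # 8000
print(steps_for_full_pass)   # 1379
```

With these settings each "epoch" is a sample of 1,600 images rather than a full pass over the training set, which keeps the run short. Leaving out `steps_per_epoch` would typically make each epoch cover the whole training generator.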
Awesome! We have finally built a model that detects whether a cell is parasitized or not. Do try it with a new image on your device to check whether our model predicts correctly. Mention your results and accuracy in the comment section.
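When you try a new image, remember that the model outputs a single sigmoid probability, which still has to be mapped back to a folder name. `flow_from_directory` assigns class indices in alphanumeric order of the folder names, so the sketch below assumes Parasitized is 0 and Uninfected is 1; it is worth confirming this with `trainDatagen.class_indices`. The helper `label_from_probability` and the 0.5 threshold are assumptions made for this example.

```python
# Assumed mapping: class folders are indexed in alphanumeric order.
# Confirm with trainDatagen.class_indices before relying on it.
CLASS_INDICES = {"Parasitized": 0, "Uninfected": 1}


def label_from_probability(probability, class_indices=CLASS_INDICES, threshold=0.5):
    """Map a sigmoid output to a class name (hypothetical helper)."""
    index = 1 if probability >= threshold else 0
    index_to_label = {index_: label for label, index_ in class_indices.items()}
    return index_to_label[index]


print(label_from_probability(0.93))  # Uninfected
print(label_from_probability(0.08))  # Parasitized
```

Under this mapping, a probability close to 1 means the model considers the cell uninfected, not infected, which is easy to get backwards.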
So we have successfully completed the Malaria Disease Classification task using Keras in Python.
This is the point when we realize how powerful transfer learning is and how useful pre-trained models for image classification can be. Keep in mind that VGG16 takes a long time to train compared to other models, which can be a disadvantage when we are dealing with huge datasets. Still, models like this are simple and intuitive.
Thank You for your time 🙂