Ensemble deep learning model for pneumonia dataset
INTRODUCTION
Ensemble learning is one of the most effective techniques for improving the accuracy of deep learning models. In this tutorial, we will use ensemble learning to classify a pneumonia dataset. Please take a look at my previous tutorial to learn how to train on the pneumonia dataset using transfer learning. Link for the tutorial is here. We will use Google Colab to run the code. Let's dive in.
OBTAINING THE DATASET
As I explained in my previous tutorial, we will get the pneumonia dataset from Kaggle. Link to the dataset is here. Download and extract the dataset in Colab. The dataset has over 5000 training images and over 500 validation images. Each image is labelled as either normal or pneumonia, so this is a binary classification problem.
IMPORT THE LIBRARIES
import os
import pathlib

import cv2 as cv
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import seaborn as sns
import tensorflow as tf
from tensorflow import keras

sns.set()
Refer to my previous tutorial to learn how to use transfer learning for the pneumonia dataset. We will now move directly to the trained models. Once again, link to my previous tutorial is here.
PRE-TRAINED MODELS
In my previous tutorial, I used a DenseNet model to train on the dataset. Ensemble learning needs at least two models, and here we will use 5 (DenseNet, InceptionV3, ResNet, InceptionResNet, VGG19). We will train each of these models on our dataset and combine their results using ensemble learning.
I have already trained the 5 models and downloaded the weights of their final layers to reuse here. You can save a full model's weights using model.save_weights("file_name.h5"). However, since the pre-trained base already loads its ImageNet weights on import, saving the full model weights makes the download much larger than necessary. On top of each imported base model, I added a 512-unit dense layer and an output layer, so I am going to download the weights of just these two layers.
import pickle

weightsAndBiases_1 = DenseNet.layers[-1].get_weights()  # output layer
weightsAndBiases_2 = DenseNet.layers[-2].get_weights()  # 512-unit dense layer

with open("x2.txt", "wb") as fp:
    pickle.dump(weightsAndBiases_2, fp)
with open("pred.txt", "wb") as fp:
    pickle.dump(weightsAndBiases_1, fp)
Here I first read the weights of the 512-unit dense layer and the output layer. Then, using the pickle library, I saved them to files and downloaded those files.
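To make the save-and-restore step concrete, here is a minimal sketch of the same pickle round trip using plain NumPy arrays in place of real layer weights. The array shapes and the temporary directory are assumptions for illustration; in the tutorial the list comes from layer.get_weights().

```python
import os
import pickle
import tempfile

import numpy as np

# Stand-in for layer.get_weights(): a Dense layer returns [kernel, bias].
# The shapes are hypothetical (a 512-unit layer feeding 3 outputs).
rng = np.random.default_rng(0)
head_weights = [rng.normal(size=(512, 3)), np.zeros(3)]

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "pred.txt")

    # Pickle writes bytes, so the file is opened in binary mode
    # even though it carries a .txt extension.
    with open(path, "wb") as fp:
        pickle.dump(head_weights, fp)

    with open(path, "rb") as fp:
        restored = pickle.load(fp)

# The restored list has the same structure that layer.set_weights() expects.
round_trip_ok = all(
    np.array_equal(saved, loaded) for saved, loaded in zip(head_weights, restored)
)
print(round_trip_ok)
```

Because pickle stores the list exactly as it was, the order [kernel, bias] is preserved and the arrays can be handed back to a layer of the same shape.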
Densenet
import pickle
from tensorflow.keras.applications import DenseNet201

# Frozen ImageNet base with a small trainable head on top
base_model1 = DenseNet201(input_shape=[224, 224, 3], weights='imagenet', include_top=False)
x1 = base_model1.output
base_model1.trainable = False
x11 = keras.layers.GlobalAveragePooling2D()(x1)
x21 = keras.layers.Dense(512, activation='relu')(x11)
preds1 = keras.layers.Dense(3, activation='softmax')(x21)
DenseNet = keras.models.Model(inputs=[base_model1.input], outputs=[preds1])  # specify the inputs and outputs

# Load the previously saved weights of the last two layers
with open("/content/DenseNet/x2.txt", "rb") as fp:
    weightsAndBiases_21 = pickle.load(fp)
with open("/content/DenseNet/pred.txt", "rb") as fp:
    weightsAndBiases_11 = pickle.load(fp)
DenseNet.layers[-1].set_weights(weightsAndBiases_11)
DenseNet.layers[-2].set_weights(weightsAndBiases_21)

DenseNet.compile(loss='categorical_crossentropy',
                 optimizer=keras.optimizers.Adam(learning_rate=0.001),
                 metrics=['accuracy'])
As you can see, I built the model and loaded the previously downloaded weights into the dense and output layers. Now I don't need to train again and can evaluate the model's accuracy directly on the test set.
InceptionResnet
from tensorflow.keras.applications import InceptionResNetV2

base_model2 = InceptionResNetV2(input_shape=[224, 224, 3], weights='imagenet', include_top=False)
x2 = base_model2.output
base_model2.trainable = False
x12 = keras.layers.GlobalAveragePooling2D()(x2)
x22 = keras.layers.Dense(512, activation='relu')(x12)
preds2 = keras.layers.Dense(3, activation='softmax')(x22)
IRNet = keras.models.Model(inputs=[base_model2.input], outputs=[preds2])  # specify the inputs and outputs

# Load the previously saved weights of the last two layers
with open("/content/IRNet/x2.txt", "rb") as fp:
    weightsAndBiases_22 = pickle.load(fp)
with open("/content/IRNet/pred.txt", "rb") as fp:
    weightsAndBiases_12 = pickle.load(fp)
IRNet.layers[-1].set_weights(weightsAndBiases_12)
IRNet.layers[-2].set_weights(weightsAndBiases_22)

IRNet.compile(loss='categorical_crossentropy',
              optimizer=keras.optimizers.Adam(learning_rate=0.001),
              metrics=['accuracy'])
Resnet
from tensorflow.keras.applications import ResNet152V2

base_model3 = ResNet152V2(input_shape=[224, 224, 3], weights='imagenet', include_top=False)
x3 = base_model3.output
base_model3.trainable = False
x13 = keras.layers.GlobalAveragePooling2D()(x3)
x23 = keras.layers.Dense(512, activation='relu')(x13)
preds3 = keras.layers.Dense(3, activation='softmax')(x23)
ResNet = keras.models.Model(inputs=[base_model3.input], outputs=[preds3])  # specify the inputs and outputs

# Load the previously saved weights of the last two layers
with open("/content/ResNet/x2.txt", "rb") as fp:
    weightsAndBiases_23 = pickle.load(fp)
with open("/content/ResNet/pred.txt", "rb") as fp:
    weightsAndBiases_13 = pickle.load(fp)
ResNet.layers[-1].set_weights(weightsAndBiases_13)
# The dense-layer weights were loaded but never applied in the original cell
ResNet.layers[-2].set_weights(weightsAndBiases_23)

ResNet.compile(loss='categorical_crossentropy',
               optimizer=keras.optimizers.Adam(learning_rate=0.001),
               metrics=['accuracy'])
Inception
from tensorflow.keras.applications import InceptionV3

base_model4 = InceptionV3(input_shape=[224, 224, 3], weights='imagenet', include_top=False)
x4 = base_model4.output
base_model4.trainable = False
x14 = keras.layers.GlobalAveragePooling2D()(x4)
x24 = keras.layers.Dense(512, activation='relu')(x14)
preds4 = keras.layers.Dense(3, activation='softmax')(x24)
Inception = keras.models.Model(inputs=[base_model4.input], outputs=[preds4])  # specify the inputs and outputs

# Load the previously saved weights of the last two layers
with open("/content/Inception/x2.txt", "rb") as fp:
    weightsAndBiases_24 = pickle.load(fp)
with open("/content/Inception/pred.txt", "rb") as fp:
    weightsAndBiases_14 = pickle.load(fp)
Inception.layers[-1].set_weights(weightsAndBiases_14)
# The dense-layer weights were loaded but never applied in the original cell
Inception.layers[-2].set_weights(weightsAndBiases_24)

Inception.compile(loss='categorical_crossentropy',
                  optimizer=keras.optimizers.Adam(learning_rate=0.001),
                  metrics=['accuracy'])
VGG19
from tensorflow.keras.applications import VGG19

base_model5 = VGG19(input_shape=[224, 224, 3], weights='imagenet', include_top=False)
x5 = base_model5.output
base_model5.trainable = False
x15 = keras.layers.GlobalAveragePooling2D()(x5)
x25 = keras.layers.Dense(512, activation='relu')(x15)
preds5 = keras.layers.Dense(3, activation='softmax')(x25)
VGG = keras.models.Model(inputs=[base_model5.input], outputs=[preds5])  # specify the inputs and outputs

# Load the previously saved weights of the last two layers
with open("/content/VGG/x2.txt", "rb") as fp:
    weightsAndBiases_25 = pickle.load(fp)
with open("/content/VGG/pred.txt", "rb") as fp:
    weightsAndBiases_15 = pickle.load(fp)
VGG.layers[-1].set_weights(weightsAndBiases_15)
# The dense-layer weights were loaded but never applied in the original cell
VGG.layers[-2].set_weights(weightsAndBiases_25)

VGG.compile(loss='categorical_crossentropy',
            optimizer=keras.optimizers.Adam(learning_rate=0.001),
            metrics=['accuracy'])
INITIALIZING DATA GENERATOR FOR OUR MODEL
We will preprocess the images and convert them into tensors using the ImageDataGenerator class, and use its flow_from_directory method to read the data straight from a directory in batches of a given size. Each of our 5 pre-trained models gets its own generator, because each model has its own preprocessing function.
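Each pre-trained model expects its input pixels scaled in a particular way, which is why every generator is given a different preprocess_input function. The sketch below re-implements in plain NumPy the three scaling conventions that, to my understanding, these Keras functions follow. The function names are mine and are not part of Keras; treat this as an illustration rather than the library code.

```python
import numpy as np


def scale_tf_mode(rgb):
    """Scale pixels from [0, 255] to [-1, 1] (Inception-style models)."""
    return rgb / 127.5 - 1.0


def scale_torch_mode(rgb):
    """Scale to [0, 1], then normalise each channel with the ImageNet mean and std."""
    mean = np.array([0.485, 0.456, 0.406])
    std = np.array([0.229, 0.224, 0.225])
    return (rgb / 255.0 - mean) / std


def scale_caffe_mode(rgb):
    """Reorder RGB to BGR, then subtract the ImageNet channel means (no rescaling)."""
    bgr = rgb[..., ::-1]
    return bgr - np.array([103.939, 116.779, 123.68])


# A single white pixel ends up with very different values under each convention
white_pixel = np.full((1, 1, 3), 255.0)
tf_out = scale_tf_mode(white_pixel)
torch_out = scale_torch_mode(white_pixel)
caffe_out = scale_caffe_mode(white_pixel)

print(tf_out.ravel())
print(torch_out.ravel())
print(caffe_out.ravel())
```

Feeding a model images scaled with another model's convention would push its inputs far outside the range it was trained on, so the generators cannot be shared.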
Densenet
from tensorflow.keras.preprocessing.image import ImageDataGenerator
from tensorflow.keras.applications.densenet import preprocess_input as dn

# train_dir is expected to point at the extracted training folder
dn_train_datagen = ImageDataGenerator(rotation_range=20, width_shift_range=0.3, height_shift_range=0.3,
                                      shear_range=0.2, preprocessing_function=dn, validation_split=0.1,
                                      horizontal_flip=True, vertical_flip=True, zoom_range=0.2)
#train_generator=train_datagen.flow_from_directory(train_dir,target_size=(224,224),class_mode='categorical',subset='training',shuffle=True)
dn_val_generator = dn_train_datagen.flow_from_directory(train_dir, target_size=(224, 224), class_mode='categorical',
                                                        subset='validation', batch_size=3, shuffle=False)
The training generator is needed only once, because the same training data is used for all 5 models. Each model does get its own validation generator, since each one needs its own preprocessing function. All of them use shuffle=False so that the images are read in the same order and the 5 models' predictions line up row by row.
InceptionResnet
from tensorflow.keras.preprocessing.image import ImageDataGenerator
from tensorflow.keras.applications.inception_resnet_v2 import preprocess_input as ir

# Get your train and test data
ir_train_datagen = ImageDataGenerator(rotation_range=20, width_shift_range=0.3, height_shift_range=0.3,
                                      shear_range=0.2, preprocessing_function=ir, validation_split=0.1,
                                      horizontal_flip=True, vertical_flip=True, zoom_range=0.2)
ir_val_generator = ir_train_datagen.flow_from_directory(train_dir, target_size=(224, 224), class_mode='categorical',
                                                        subset='validation', batch_size=3, shuffle=False)
Inception
from tensorflow.keras.applications.inception_v3 import preprocess_input as i

# Get your train and test data
i_train_datagen = ImageDataGenerator(rotation_range=20, width_shift_range=0.3, height_shift_range=0.3,
                                     shear_range=0.2, preprocessing_function=i, validation_split=0.1,
                                     horizontal_flip=True, vertical_flip=True, zoom_range=0.2)
i_val_generator = i_train_datagen.flow_from_directory(train_dir, target_size=(224, 224), class_mode='categorical',
                                                      subset='validation', batch_size=3, shuffle=False)
Resnet
from tensorflow.keras.applications.resnet_v2 import preprocess_input as rn

# Get your train and test data
rn_train_datagen = ImageDataGenerator(rotation_range=20, width_shift_range=0.3, height_shift_range=0.3,
                                      shear_range=0.2, preprocessing_function=rn, validation_split=0.1,
                                      horizontal_flip=True, vertical_flip=True, zoom_range=0.2)
rn_val_generator = rn_train_datagen.flow_from_directory(train_dir, target_size=(224, 224), class_mode='categorical',
                                                        subset='validation', batch_size=3, shuffle=False)
VGG19
from tensorflow.keras.applications.vgg19 import preprocess_input

# Get your train and test data
vgg_train_datagen = ImageDataGenerator(rotation_range=20, width_shift_range=0.3, height_shift_range=0.3,
                                       shear_range=0.2, preprocessing_function=preprocess_input, validation_split=0.1,
                                       horizontal_flip=True, vertical_flip=True, zoom_range=0.2)
vgg_val_generator = vgg_train_datagen.flow_from_directory(train_dir, target_size=(224, 224), class_mode='categorical',
                                                          subset='validation', batch_size=3, shuffle=False)
ENSEMBLE LEARNING
As we are solving a classification problem, we will use a voting classifier as our ensemble learning technique. We will use a soft voting classifier and a stack ensemble, and test the accuracy of each ensembled model.
SOFT VOTING CLASSIFIER
In a soft voting classifier, the output class is chosen from the average probability that the models assign to each class. Suppose there are two classes, A and B, and the 1st model predicts (0.4, 0.6) while the 2nd model predicts (0.5, 0.5). The soft voting classifier averages the two predictions to get (0.45, 0.55). Class B has the highest average probability, so the soft voting classifier outputs class B.
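The worked example above can be reproduced in a few lines of NumPy. The class names A and B and the variable names are only for illustration.

```python
import numpy as np

class_names = ["A", "B"]

# Class probabilities predicted by two models for the same image
model_1 = np.array([0.4, 0.6])
model_2 = np.array([0.5, 0.5])

# Soft voting: average the probabilities, then pick the most likely class
avg_prob = np.mean([model_1, model_2], axis=0)
winner = class_names[int(np.argmax(avg_prob))]

print(avg_prob, winner)
```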
labels = np.array([0., 0., 0., 0., 0., 0., 0., 1., 1., 1., 1., 1., 1., 1., 0., 0., 1., 0., 0., 0., 0.])

step_size_validation = dn_val_generator.n // dn_val_generator.batch_size

# Class probabilities from each of the 5 models
dn_prob = DenseNet.predict(dn_val_generator, steps=step_size_validation)
ir_prob = IRNet.predict(ir_val_generator, steps=step_size_validation)
i_prob = Inception.predict(i_val_generator, steps=step_size_validation)
rn_prob = ResNet.predict(rn_val_generator, steps=step_size_validation)
vgg_prob = VGG.predict(vgg_val_generator, steps=step_size_validation)
Here we took some test data and labelled it. Next, we passed the test data to the 5 models and stored each model's class probabilities in a separate variable.
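One detail worth checking in the prediction step is the number of steps. Floor division (n // batch_size) drops the final partial batch whenever the number of images is not a multiple of the batch size, in which case the predictions would have fewer rows than the labels array. A small sketch with hypothetical image counts shows the difference between floor and ceiling division.

```python
import math


def prediction_steps(n_samples, batch_size):
    """Return (floor_steps, ceil_steps) for a generator with n_samples images."""
    floor_steps = n_samples // batch_size
    ceil_steps = math.ceil(n_samples / batch_size)
    return floor_steps, ceil_steps


# 21 images in batches of 3 divide evenly, so both give the same answer
even_floor, even_ceil = prediction_steps(21, 3)

# With 22 images, floor division would leave the last image unpredicted
odd_floor, odd_ceil = prediction_steps(22, 3)
images_covered_by_floor = odd_floor * 3

print(even_floor, even_ceil)
print(odd_floor, odd_ceil, images_covered_by_floor)
```

If the counts do not divide evenly, using the ceiling value (or simply omitting the steps argument) should keep the predictions and labels the same length.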
# Stack the 5 probability arrays along a new last axis: (n_images, n_classes, n_models)
stack = np.dstack((dn_prob, ir_prob, i_prob, rn_prob, vgg_prob))
avg_ensemble_prob = np.mean(stack, axis=-1)
avg_ensemble_pred = np.argmax(avg_ensemble_prob, axis=-1)
Now we average the class probabilities and store the predicted output class in the avg_ensemble_pred variable. Let's check how our ensemble model performed on all the test images using the ROC curve.
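To see what the stacking and averaging do to the array shapes, here is a small sketch with random stand-in probabilities. The sizes (6 images, 3 classes, 5 models) and the fake_probs helper are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(42)
n_images, n_classes, n_models = 6, 3, 5


def fake_probs():
    """Random rows that sum to 1, standing in for one model's predict() output."""
    raw = rng.random((n_images, n_classes))
    return raw / raw.sum(axis=1, keepdims=True)


model_probs = [fake_probs() for _ in range(n_models)]

stack = np.dstack(model_probs)           # (n_images, n_classes, n_models)
avg_prob = np.mean(stack, axis=-1)       # average over the models axis
avg_pred = np.argmax(avg_prob, axis=-1)  # one class index per image

print(stack.shape, avg_prob.shape, avg_pred.shape)
```

Averaging rows that each sum to 1 gives rows that also sum to 1, so the ensemble output can still be read as class probabilities.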
from sklearn import metrics

n_classes = 2
# One-hot encode the labels so that column i marks membership of class i.
# (label_binarize returns a single column for two classes, so y_true[:, 1] would fail.)
y_true = np.eye(n_classes)[labels.astype(int)]

fpr = dict()
tpr = dict()
roc_auc = dict()
for i in range(n_classes):
    fpr[i], tpr[i], _ = metrics.roc_curve(y_true[:, i], avg_ensemble_prob[:, i])
    roc_auc[i] = metrics.auc(fpr[i], tpr[i])

# Plot of a ROC curve for a specific class
for i in range(n_classes):
    plt.figure()
    plt.plot(fpr[i], tpr[i], label='ROC curve (area = %0.2f)' % roc_auc[i])
    plt.plot([0, 1], [0, 1], 'k--')
    plt.xlim([0.0, 1.0])
    plt.ylim([0.0, 1.05])
    plt.xlabel('False Positive Rate')
    plt.ylabel('True Positive Rate')
    plt.title('Receiver operating characteristic curve for class ' + str(i))
    plt.legend(loc="lower right")
    plt.show()
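The area under the ROC curve has a simple interpretation: it is the probability that a randomly chosen positive image receives a higher score than a randomly chosen negative one, with ties counted as half. The sketch below computes it directly from that definition as a sanity check; auc_by_pairs is a helper written for this illustration, and the four labels and scores are made up.

```python
import numpy as np


def auc_by_pairs(y_true, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly."""
    y_true = np.asarray(y_true)
    scores = np.asarray(scores, dtype=float)
    pos = scores[y_true == 1]
    neg = scores[y_true == 0]
    # Compare every positive score with every negative score
    diff = pos[:, None] - neg[None, :]
    wins = np.sum(diff > 0) + 0.5 * np.sum(diff == 0)
    return wins / (len(pos) * len(neg))


toy_labels = np.array([0, 0, 1, 1])
toy_scores = np.array([0.1, 0.4, 0.35, 0.8])

# Three of the four positive/negative pairs are ordered correctly
auc = auc_by_pairs(toy_labels, toy_scores)
print(auc)  # 0.75
```

The pairwise comparison is quadratic in the number of images, so it is only meant for small checks like this one.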
STACK ENSEMBLE
The problem with the soft voting classifier is that all models are treated equally, irrespective of their individual performance. A variant of the voting classifier called the stack ensemble computes a weighted average of the model probabilities, in which better-performing models are given higher weights and worse-performing models are given lower weights.
In stacking, the individual model predictions are given as input, and the algorithm learns how to combine the input prediction best to make a better output prediction.
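Before moving to a learned combiner, the idea of weighting models by performance can be illustrated with a much simpler scheme: weight each model by its accuracy. This is not the stacking method itself, only a hand-rolled stand-in; the probabilities and labels are made up, and the weights are estimated on the same four images purely to keep the example short.

```python
import numpy as np

# Hypothetical probabilities: 4 images, 2 classes, 3 models -> shape (4, 2, 3)
stack = np.array([
    [[0.9, 0.6, 0.4], [0.1, 0.4, 0.6]],
    [[0.2, 0.3, 0.6], [0.8, 0.7, 0.4]],
    [[0.7, 0.4, 0.3], [0.3, 0.6, 0.7]],
    [[0.1, 0.6, 0.7], [0.9, 0.4, 0.3]],
])
labels = np.array([0, 1, 0, 1])

# Accuracy of each model on its own
per_model_pred = np.argmax(stack, axis=1)                    # (4, 3)
accuracy = (per_model_pred == labels[:, None]).mean(axis=0)  # (3,)

# Turn accuracies into weights that sum to 1
weights = accuracy / accuracy.sum()

# Weighted average over the models axis
weighted_prob = np.tensordot(stack, weights, axes=([2], [0]))  # (4, 2)
weighted_pred = np.argmax(weighted_prob, axis=1)

# Plain soft voting for comparison
plain_pred = np.argmax(stack.mean(axis=-1), axis=1)

print(weights)
print((plain_pred == labels).mean(), (weighted_pred == labels).mean())
```

In this toy case the weakest model gets zero weight, which is enough to fix the one image that plain averaging gets wrong.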
from sklearn.linear_model import LogisticRegression

# Flatten (n_images, n_classes, n_models) into one feature row per image
stack = stack.reshape((stack.shape[0], stack.shape[1] * stack.shape[2]))

# multi_class is deprecated in recent scikit-learn releases; drop it if it raises a warning or error
softmax_reg = LogisticRegression(multi_class='multinomial', solver='lbfgs', C=10)
softmax_reg.fit(stack, labels)
stack_ensemble_prob = softmax_reg.predict_proba(stack)
stack_ensemble_pred = softmax_reg.predict(stack)
Here the stack variable holds the individual model predictions. It is given, together with the class labels, as input to a logistic regression model, which then learns how much weight to give each model's prediction. Let's check the model performance using the ROC curve.
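Since the logistic regression learns its weights from labelled predictions, it can be informative to score it on images it was not fitted on. The sketch below shows one way to do that with scikit-learn's train_test_split. The base-model probabilities are synthetic, and the sizes, noise level, and split ratio are all assumptions made for the example.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n_images, n_classes, n_models = 200, 2, 5
labels = rng.integers(0, 2, size=n_images)

# Synthetic base-model outputs: each model's class-1 probability
# is a noisy view of the true label
noise = rng.normal(scale=0.35, size=(n_images, n_models))
p1 = np.clip(labels[:, None] * 0.6 + 0.2 + noise, 0.01, 0.99)
stack = np.stack([1.0 - p1, p1], axis=1)  # (n_images, n_classes, n_models)
features = stack.reshape(n_images, n_classes * n_models)

# Hold out part of the data so the meta-learner is scored on unseen rows
X_fit, X_eval, y_fit, y_eval = train_test_split(
    features, labels, test_size=0.3, random_state=0, stratify=labels
)

meta = LogisticRegression(C=10, max_iter=1000)
meta.fit(X_fit, y_fit)

fit_acc = meta.score(X_fit, y_fit)
eval_acc = meta.score(X_eval, y_eval)
eval_prob = meta.predict_proba(X_eval)

print(fit_acc, eval_acc)
```

Comparing fit_acc with eval_acc gives a rough sense of how much of the stacker's score comes from having seen the labels during fitting.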
from sklearn import metrics

n_classes = 2
# One-hot encode the labels so that column i marks membership of class i.
# (label_binarize returns a single column for two classes, so y_true[:, 1] would fail.)
y_true = np.eye(n_classes)[labels.astype(int)]

fpr = dict()
tpr = dict()
roc_auc = dict()
for i in range(n_classes):
    fpr[i], tpr[i], _ = metrics.roc_curve(y_true[:, i], stack_ensemble_prob[:, i])
    roc_auc[i] = metrics.auc(fpr[i], tpr[i])

# Plot of a ROC curve for a specific class
for i in range(n_classes):
    plt.figure()
    plt.plot(fpr[i], tpr[i], label='ROC curve (area = %0.2f)' % roc_auc[i])
    plt.plot([0, 1], [0, 1], 'k--')
    plt.xlim([0.0, 1.0])
    plt.ylim([0.0, 1.05])
    plt.xlabel('False Positive Rate')
    plt.ylabel('True Positive Rate')
    plt.title('Receiver operating characteristic curve for class ' + str(i))
    plt.legend(loc="lower right")
    plt.show()
CONCLUSION
We have trained 5 different pre-trained models on our pneumonia dataset and used two ensemble learning techniques to improve accuracy. We conclude that the ensembled model is more accurate than each of the individual models. Comparing the two ensemble techniques above, the stack ensemble performs better than the soft voting classifier.