Squeeze-Excitation Residual Network using Keras

In general, as the depth of a convolutional neural network increases, its accuracy can degrade. AI researchers therefore introduced Residual Networks, which solve this issue and increase the accuracy of deep CNNs. Later, to enhance the performance of Residual Networks without much additional computation cost, Squeeze-and-Excitation networks (SE Networks) were proposed in 2018, giving rise to the SE-Residual Network.

The Squeeze-and-Excitation (SE) network recalibrates the weights of the channels in a feature map, enhancing the sensitivity to useful features and ultimately the efficiency of the network. It works in two phases. The "squeeze" phase uses a Global Average Pooling layer to compress each channel's spatial map into a single value, producing a channel descriptor vector. The "excitation" phase then learns a weight for each channel from that vector and rescales the feature map accordingly.
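
The mechanism can be summarized in a short helper, shown below as a minimal sketch that mirrors the SE blocks built later in this article (the function name se_block is ours, not from the original code):

import keras

def se_block(feature_map, channels):
    # Squeeze: collapse each H x W x C feature map into one value per
    # channel with global average pooling.
    sq = keras.layers.GlobalAveragePooling2D()(feature_map)
    sq = keras.layers.Reshape((1, 1, channels))(sq)
    # Excitation: learn a sigmoid gate in [0, 1] for every channel.
    sq = keras.layers.Dense(units=channels, activation="sigmoid")(sq)
    # Scale: re-weight the original channels by their learned gates.
    return keras.layers.multiply([feature_map, sq])

Note that the original paper places a bottleneck of two fully connected layers (ReLU, then sigmoid) in the excitation phase; the implementation in this article uses a single sigmoid-activated Dense layer, which keeps the block compact.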

Thus, when fused with a Residual Network, the SE block enhances the performance of the traditional Residual Network while adding less than 1% to the computational cost of the system. In this article, we will implement an SE-Residual Network for fruit multi-classification and observe its performance.

The flow of the algorithm is as follows:

  • Importing Dataset and libraries
  • Pre-processing and splitting of the dataset
  • Building the SE Residual Network
  • Compilation and Training of the network
  • Plotting and prediction

Let’s dive into the algorithm.

Happy Reading!!!

LIBRARIES

Import necessary libraries into the notebook.

import os

import numpy as np
import pandas as pd
import cv2
import matplotlib.pyplot as plt

import sklearn.datasets
import sklearn.model_selection
from sklearn.metrics import accuracy_score

import keras.preprocessing.image
import keras.utils
import keras.callbacks
from keras.preprocessing.image import ImageDataGenerator

from skimage import color

print(os.listdir("../input"))
['fruits-360_dataset']

We are using the Keras deep learning library for Python.

PRE-PROCESSING

Import the dataset and convert each image into a pixel array for easy processing. sklearn.datasets.load_files collects the file paths and class labels, and keras.preprocessing.image loads each image resized to 50×50.

train_dir = '../input/fruits-360_dataset/fruits-360/Training'
trainData = sklearn.datasets.load_files(train_dir, load_content=False)

test_dir = '../input/fruits-360_dataset/fruits-360/Test'
testData = sklearn.datasets.load_files(test_dir, load_content=False)

y_train = np.array(trainData['target'])
y_train_names = np.array(trainData['target_names'])

y_test = np.array(testData['target'])
y_test_names = np.array(testData['target_names'])

nclasses = len(np.unique(y_train))
target_size = 50

# Load every training image at 50x50 and convert it to a numpy array.
x_train = []
for filename in trainData['filenames']:
    img = keras.preprocessing.image.load_img(
        filename, target_size=(target_size, target_size))
    x_train.append(keras.preprocessing.image.img_to_array(img))

# Repeat for the test images.
x_test = []
for filename in testData['filenames']:
    img = keras.preprocessing.image.load_img(
        filename, target_size=(target_size, target_size))
    x_test.append(keras.preprocessing.image.img_to_array(img))
Using TensorFlow backend.

SPLIT

Normalize the pixel values, one-hot encode the labels, and split the training data into training and validation sets using scikit-learn's train_test_split.

# Scale pixel values to [0, 1] and one-hot encode the labels.
x_train = np.array(x_train)
x_train = x_train / 255
y_train = keras.utils.np_utils.to_categorical(y_train, nclasses)

x_test = np.array(x_test)
x_test = x_test / 255
y_test = keras.utils.np_utils.to_categorical(y_test, nclasses)

# Hold out 20% of the training data for validation.
x_train, x_val, y_train, y_val = sklearn.model_selection.train_test_split(
    x_train, y_train, test_size=0.2)
print(y_train.shape)
print(y_val.shape)
(48398, 120)
(12100, 120)

BUILD MODEL

Now we build the Squeeze-Excitation Residual Network whose concept was outlined in the introduction. The model stacks three residual blocks; each contains convolution, batch normalization, and ReLU activation layers plus an SE block, and the first two end with max pooling. A dropout layer near the output reduces overfitting, and a summary of the network is printed so readers can follow the architecture.

images = keras.layers.Input(x_train.shape[1:])

# Block 1
x = keras.layers.Conv2D(filters=16, kernel_size=[1, 1], padding='same')(images)
block = keras.layers.Conv2D(filters=16, kernel_size=[3, 3], padding="same")(x)
block = keras.layers.BatchNormalization()(block)
block = keras.layers.Activation("relu")(block)
block = keras.layers.Conv2D(filters=16, kernel_size=[3, 3], padding="same")(block)

# Squeeze-and-Excitation 1
sq = keras.layers.GlobalAveragePooling2D()(block)
sq = keras.layers.Reshape((1,1,16))(sq)
sq = keras.layers.Dense(units=16,activation="sigmoid")(sq)
block = keras.layers.multiply([block,sq])
# End Squeeze-and-Excitation 1

net = keras.layers.add([x,block])
net = keras.layers.BatchNormalization()(net)
net = keras.layers.Activation("relu")(net)
net = keras.layers.MaxPooling2D(pool_size=(2, 2), name="block_1")(net)
# End block 1

# Block 2
x = keras.layers.Conv2D(filters=32, kernel_size=[1, 1], padding='same')(net)
block = keras.layers.Conv2D(filters=32, kernel_size=[3, 3], padding="same")(x)
block = keras.layers.BatchNormalization()(block)
block = keras.layers.Activation("relu")(block)
block = keras.layers.Conv2D(filters=32, kernel_size=[3, 3], padding="same")(block)

# Squeeze-and-Excitation 2
sq = keras.layers.GlobalAveragePooling2D()(block)
sq = keras.layers.Reshape((1,1,32))(sq)
sq = keras.layers.Dense(units=32,activation="sigmoid")(sq)
block = keras.layers.multiply([block,sq])
# End Squeeze-and-Excitation 2

net = keras.layers.add([x, block])
net = keras.layers.BatchNormalization()(net)
net = keras.layers.Activation("relu")(net)
net = keras.layers.MaxPooling2D(pool_size=(2, 2),name="block_2")(net)
# End block 2

# Block 3
x = keras.layers.Conv2D(filters=64, kernel_size=[1, 1], padding='same')(net)
block = keras.layers.Conv2D(filters=64, kernel_size=[3, 3], padding="same")(x)
block = keras.layers.BatchNormalization()(block)
block = keras.layers.Activation("relu")(block)
block = keras.layers.Conv2D(filters=64, kernel_size=[3, 3], padding="same")(block)

# Squeeze-and-Excitation 3
sq = keras.layers.GlobalAveragePooling2D()(block)
sq = keras.layers.Reshape((1,1,64))(sq)
sq = keras.layers.Dense(units=64,activation="sigmoid")(sq)
block = keras.layers.multiply([block,sq])
# End Squeeze-and-Excitation 3

net = keras.layers.add([x,block])
net = keras.layers.Activation("relu", name="block_3")(net)
# End block 3

net = keras.layers.BatchNormalization()(net)
net = keras.layers.Dropout(0.25)(net)

net = keras.layers.GlobalAveragePooling2D()(net)
net = keras.layers.Dense(units=nclasses,activation="softmax")(net)

model = keras.models.Model(inputs=images, outputs=net)

model.summary()
Model: "model_1"
__________________________________________________________________________________________________
Layer (type)                    Output Shape         Param #     Connected to                     
==================================================================================================
input_1 (InputLayer)            (None, 50, 50, 3)    0                                            
__________________________________________________________________________________________________
conv2d_1 (Conv2D)               (None, 50, 50, 16)   64          input_1[0][0]                    
__________________________________________________________________________________________________
conv2d_2 (Conv2D)               (None, 50, 50, 16)   2320        conv2d_1[0][0]                   
__________________________________________________________________________________________________
batch_normalization_1 (BatchNor (None, 50, 50, 16)   64          conv2d_2[0][0]                   
__________________________________________________________________________________________________
activation_1 (Activation)       (None, 50, 50, 16)   0           batch_normalization_1[0][0]      
__________________________________________________________________________________________________
conv2d_3 (Conv2D)               (None, 50, 50, 16)   2320        activation_1[0][0]               
__________________________________________________________________________________________________
global_average_pooling2d_1 (Glo (None, 16)           0           conv2d_3[0][0]                   
__________________________________________________________________________________________________
reshape_1 (Reshape)             (None, 1, 1, 16)     0           global_average_pooling2d_1[0][0] 
__________________________________________________________________________________________________
dense_1 (Dense)                 (None, 1, 1, 16)     272         reshape_1[0][0]                  
__________________________________________________________________________________________________
multiply_1 (Multiply)           (None, 50, 50, 16)   0           conv2d_3[0][0]                   
                                                                 dense_1[0][0]                    
__________________________________________________________________________________________________
add_1 (Add)                     (None, 50, 50, 16)   0           conv2d_1[0][0]                   
                                                                 multiply_1[0][0]                 
__________________________________________________________________________________________________
batch_normalization_2 (BatchNor (None, 50, 50, 16)   64          add_1[0][0]                      
__________________________________________________________________________________________________
activation_2 (Activation)       (None, 50, 50, 16)   0           batch_normalization_2[0][0]      
__________________________________________________________________________________________________
block_1 (MaxPooling2D)          (None, 25, 25, 16)   0           activation_2[0][0]               
__________________________________________________________________________________________________
conv2d_4 (Conv2D)               (None, 25, 25, 32)   544         block_1[0][0]                    
__________________________________________________________________________________________________
conv2d_5 (Conv2D)               (None, 25, 25, 32)   9248        conv2d_4[0][0]                   
__________________________________________________________________________________________________
batch_normalization_3 (BatchNor (None, 25, 25, 32)   128         conv2d_5[0][0]                   
__________________________________________________________________________________________________
activation_3 (Activation)       (None, 25, 25, 32)   0           batch_normalization_3[0][0]      
__________________________________________________________________________________________________
conv2d_6 (Conv2D)               (None, 25, 25, 32)   9248        activation_3[0][0]               
__________________________________________________________________________________________________
global_average_pooling2d_2 (Glo (None, 32)           0           conv2d_6[0][0]                   
__________________________________________________________________________________________________
reshape_2 (Reshape)             (None, 1, 1, 32)     0           global_average_pooling2d_2[0][0] 
__________________________________________________________________________________________________
dense_2 (Dense)                 (None, 1, 1, 32)     1056        reshape_2[0][0]                  
__________________________________________________________________________________________________
multiply_2 (Multiply)           (None, 25, 25, 32)   0           conv2d_6[0][0]                   
                                                                 dense_2[0][0]                    
__________________________________________________________________________________________________
add_2 (Add)                     (None, 25, 25, 32)   0           conv2d_4[0][0]                   
                                                                 multiply_2[0][0]                 
__________________________________________________________________________________________________
batch_normalization_4 (BatchNor (None, 25, 25, 32)   128         add_2[0][0]                      
__________________________________________________________________________________________________
activation_4 (Activation)       (None, 25, 25, 32)   0           batch_normalization_4[0][0]      
__________________________________________________________________________________________________
block_2 (MaxPooling2D)          (None, 12, 12, 32)   0           activation_4[0][0]               
__________________________________________________________________________________________________
conv2d_7 (Conv2D)               (None, 12, 12, 64)   2112        block_2[0][0]                    
__________________________________________________________________________________________________
conv2d_8 (Conv2D)               (None, 12, 12, 64)   36928       conv2d_7[0][0]                   
__________________________________________________________________________________________________
batch_normalization_5 (BatchNor (None, 12, 12, 64)   256         conv2d_8[0][0]                   
__________________________________________________________________________________________________
activation_5 (Activation)       (None, 12, 12, 64)   0           batch_normalization_5[0][0]      
__________________________________________________________________________________________________
conv2d_9 (Conv2D)               (None, 12, 12, 64)   36928       activation_5[0][0]               
__________________________________________________________________________________________________
global_average_pooling2d_3 (Glo (None, 64)           0           conv2d_9[0][0]                   
__________________________________________________________________________________________________
reshape_3 (Reshape)             (None, 1, 1, 64)     0           global_average_pooling2d_3[0][0] 
__________________________________________________________________________________________________
dense_3 (Dense)                 (None, 1, 1, 64)     4160        reshape_3[0][0]                  
__________________________________________________________________________________________________
multiply_3 (Multiply)           (None, 12, 12, 64)   0           conv2d_9[0][0]                   
                                                                 dense_3[0][0]                    
__________________________________________________________________________________________________
add_3 (Add)                     (None, 12, 12, 64)   0           conv2d_7[0][0]                   
                                                                 multiply_3[0][0]                 
__________________________________________________________________________________________________
block_3 (Activation)            (None, 12, 12, 64)   0           add_3[0][0]                      
__________________________________________________________________________________________________
batch_normalization_6 (BatchNor (None, 12, 12, 64)   256         block_3[0][0]                    
__________________________________________________________________________________________________
dropout_1 (Dropout)             (None, 12, 12, 64)   0           batch_normalization_6[0][0]      
__________________________________________________________________________________________________
global_average_pooling2d_4 (Glo (None, 64)           0           dropout_1[0][0]                  
__________________________________________________________________________________________________
dense_4 (Dense)                 (None, 120)          7800        global_average_pooling2d_4[0][0] 
==================================================================================================
Total params: 113,896
Trainable params: 113,448
Non-trainable params: 448
__________________________________________________________________________________________________

COMPILING

We compile the model and declare two callback functions to save computation and time: a ModelCheckpoint that keeps only the best weights seen so far, and an EarlyStopping that halts training once the validation loss stops improving for five consecutive epochs. Categorical cross-entropy is the loss function and Adadelta the optimizer for this multi-class recognition task.

model.compile(loss='categorical_crossentropy',
              optimizer='adadelta',
              metrics=['accuracy'])
# Save the weights with the lowest validation loss seen so far.
checkpointer = keras.callbacks.ModelCheckpoint(
    filepath='cnn_from_scratch_fruits.hdf5', verbose=1, save_best_only=True)
# Stop training once val_loss has not improved for 5 consecutive epochs.
earlystopper = keras.callbacks.EarlyStopping(
    monitor='val_loss', min_delta=0, patience=5, verbose=0, mode='auto',
    baseline=None, restore_best_weights=False)

TRAINING

We train the network for up to 15 epochs with a batch size of 64; thanks to the callback functions, the best model so far is saved as training proceeds.

history = model.fit(x_train, y_train, batch_size=64, epochs=15,
                    validation_data=(x_val, y_val),
                    callbacks=[checkpointer, earlystopper], shuffle=True)
Train on 48398 samples, validate on 12100 samples
Epoch 1/15
48398/48398 [==============================] - 19s 391us/step - loss: 1.2028 - accuracy: 0.7227 - val_loss: 2.4619 - val_accuracy: 0.3789

Epoch 00001: val_loss improved from inf to 2.46188, saving model to cnn_from_scratch_fruits.hdf5
Epoch 2/15
48398/48398 [==============================] - 15s 305us/step - loss: 0.1818 - accuracy: 0.9555 - val_loss: 1.0984 - val_accuracy: 0.7123

Epoch 00002: val_loss improved from 2.46188 to 1.09844, saving model to cnn_from_scratch_fruits.hdf5
Epoch 3/15
48398/48398 [==============================] - 15s 304us/step - loss: 0.0852 - accuracy: 0.9776 - val_loss: 0.6442 - val_accuracy: 0.7993

Epoch 00003: val_loss improved from 1.09844 to 0.64418, saving model to cnn_from_scratch_fruits.hdf5
Epoch 4/15
48398/48398 [==============================] - 15s 316us/step - loss: 0.0492 - accuracy: 0.9871 - val_loss: 1.9674 - val_accuracy: 0.6103

Epoch 00004: val_loss did not improve from 0.64418
Epoch 5/15
48398/48398 [==============================] - 15s 305us/step - loss: 0.0393 - accuracy: 0.9898 - val_loss: 0.8358 - val_accuracy: 0.7798

Epoch 00005: val_loss did not improve from 0.64418
Epoch 6/15
48398/48398 [==============================] - 15s 307us/step - loss: 0.0307 - accuracy: 0.9918 - val_loss: 0.9295 - val_accuracy: 0.7955

Epoch 00006: val_loss did not improve from 0.64418
Epoch 7/15
48398/48398 [==============================] - 15s 306us/step - loss: 0.0223 - accuracy: 0.9943 - val_loss: 0.7090 - val_accuracy: 0.8130

Epoch 00007: val_loss did not improve from 0.64418
Epoch 8/15
48398/48398 [==============================] - 15s 305us/step - loss: 0.0171 - accuracy: 0.9959 - val_loss: 0.0398 - val_accuracy: 0.9845

Epoch 00008: val_loss improved from 0.64418 to 0.03983, saving model to cnn_from_scratch_fruits.hdf5
Epoch 9/15
48398/48398 [==============================] - 15s 306us/step - loss: 0.0136 - accuracy: 0.9966 - val_loss: 0.0500 - val_accuracy: 0.9866

Epoch 00009: val_loss did not improve from 0.03983
Epoch 10/15
48398/48398 [==============================] - 15s 306us/step - loss: 0.0141 - accuracy: 0.9967 - val_loss: 0.0182 - val_accuracy: 0.9948

Epoch 00010: val_loss improved from 0.03983 to 0.01823, saving model to cnn_from_scratch_fruits.hdf5
Epoch 11/15
48398/48398 [==============================] - 15s 308us/step - loss: 0.0100 - accuracy: 0.9974 - val_loss: 0.2736 - val_accuracy: 0.9198

Epoch 00011: val_loss did not improve from 0.01823
Epoch 12/15
48398/48398 [==============================] - 15s 306us/step - loss: 0.0092 - accuracy: 0.9980 - val_loss: 0.1821 - val_accuracy: 0.9526

Epoch 00012: val_loss did not improve from 0.01823
Epoch 13/15
48398/48398 [==============================] - 15s 303us/step - loss: 0.0090 - accuracy: 0.9977 - val_loss: 0.7217 - val_accuracy: 0.8621

Epoch 00013: val_loss did not improve from 0.01823
Epoch 14/15
48398/48398 [==============================] - 15s 305us/step - loss: 0.0075 - accuracy: 0.9980 - val_loss: 0.0138 - val_accuracy: 0.9956

Epoch 00014: val_loss improved from 0.01823 to 0.01377, saving model to cnn_from_scratch_fruits.hdf5
Epoch 15/15
48398/48398 [==============================] - 15s 304us/step - loss: 0.0068 - accuracy: 0.9981 - val_loss: 0.1471 - val_accuracy: 0.9557

Epoch 00015: val_loss did not improve from 0.01377

LOAD WEIGHTS

Since restore_best_weights was left at False, we reload the best weights saved by the checkpoint callback before evaluating.

model.load_weights('cnn_from_scratch_fruits.hdf5')
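
As a quick sanity check (an optional step, not part of the original run), we can re-evaluate the restored model on the validation set:

# Optional: confirm the restored weights on the held-out validation set.
val_loss, val_acc = model.evaluate(x_val, y_val, verbose=0)
print('Restored model - val_loss: %.4f, val_acc: %.4f' % (val_loss, val_acc))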

PLOTTING

Plot the training and validation accuracy of the trained model against the epochs.

plt.plot(history.history['accuracy'])
plt.plot(history.history['val_accuracy'])
plt.title('Model accuracy')
plt.ylabel('Accuracy')
plt.xlabel('Epoch')
plt.legend(['Train', 'Validation'], loc='upper left')
plt.show()

Then plot the training and validation loss against the epochs.

plt.plot(history.history['loss'])
plt.plot(history.history['val_loss'])
plt.title('Model loss')
plt.ylabel('Loss')
plt.xlabel('Epoch')
plt.legend(['Train', 'Validation'], loc='upper left')
plt.show()

PREDICTION

Finally, we run the trained model on the test images and measure accuracy against the true labels; an illustrative snippet for displaying a few predictions follows the score.

y_test_pred = model.predict(x_test)
accuracy_score(np.argmax(y_test_pred,axis=1), np.argmax(y_test,axis=1))
0.9739113568034138
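
As an illustrative sketch (not part of the original run), the following displays five random test images with their predicted and true class names:

# Visualize a few test predictions alongside the ground truth.
pred_labels = np.argmax(y_test_pred, axis=1)
true_labels = np.argmax(y_test, axis=1)

fig, axes = plt.subplots(1, 5, figsize=(15, 3))
for ax, idx in zip(axes, np.random.choice(len(x_test), 5, replace=False)):
    ax.imshow(x_test[idx])
    ax.set_title('pred: %s\ntrue: %s'
                 % (y_test_names[pred_labels[idx]],
                    y_test_names[true_labels[idx]]), fontsize=8)
    ax.axis('off')
plt.show()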

FINAL THOUGHTS

In this article, we discussed the SE-Residual Network, a variant of the Residual Network that enhances performance without much extra computation: for less than a 1% increase in cost, it improves the accuracy of the network. Its basic idea, explained in the introduction, is to recalibrate the channel weights of the feature maps so that useful features contribute more to training. We implemented the network on a simple fruit multi-classification computer vision problem, built, compiled, and trained it, and observed its prediction rate and accuracy.

The source code for the fruit classification can be found and downloaded from here.

To go through an article on Residual Networks, click here.

To learn from my other blogs, refer here.

REFERENCES:

[1]. J. Hu, L. Shen, and G. Sun, "Squeeze-and-Excitation Networks," CVPR 2018 – the original paper with full insight into the SE Network.

Thank you. Hope this article was helpful!
