Squeeze-Excitation Residual Network using Keras
In general, once the depth of a convolutional network grows beyond a point, accuracy degrades rather than improves. To address this degradation problem, AI researchers introduced Residual Networks (ResNets), which route the input of a block around its convolutions through a skip connection, thereby increasing the accuracy of deep CNNs. Later, to enhance the performance of Residual Networks with little additional computational cost, Squeeze-and-Excitation networks (SE Networks) were proposed in 2018; fusing the two yields the SE-Residual Network.
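To make the idea concrete, a residual (skip) connection in its simplest form can be sketched in Keras as follows. This is our own minimal illustration, not the article's final network: the helper name residual_block is hypothetical, and the input x is assumed to already have filters channels so the addition is valid.

import keras

def residual_block(x, filters):
    # F(x): the residual branch, two stacked 3x3 convolutions
    f = keras.layers.Conv2D(filters, (3, 3), padding='same', activation='relu')(x)
    f = keras.layers.Conv2D(filters, (3, 3), padding='same')(f)
    # Skip connection: the block only has to learn the residual F(x),
    # since the identity x is added back before the final activation
    # (assumes x already has `filters` channels)
    return keras.layers.Activation('relu')(keras.layers.add([x, f]))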
The Squeeze-and-Excitation (SE) block recalibrates the weights of the channels in a feature map. The sensitivity to useful features is enhanced, which ultimately improves the efficiency of the network. This is done in two phases. The "squeeze" phase uses a Global Average Pooling layer to compress each channel's spatial map into a single value, producing a channel descriptor vector. The "excitation" phase then learns a weight for each channel from this descriptor and rescales the feature map channel-wise.
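Here is a minimal sketch of an SE block as described in the original paper [1]. The helper name se_block and the reduction ratio r are our own illustration; the network built later in this article uses a slightly simpler single Dense gate.

import keras

def se_block(feature_map, channels, r=4):
    # Squeeze: average each channel's HxW map down to a single value
    s = keras.layers.GlobalAveragePooling2D()(feature_map)
    # Excitation: a bottleneck Dense (ReLU) followed by a gating Dense (sigmoid)
    s = keras.layers.Dense(units=channels // r, activation='relu')(s)
    s = keras.layers.Dense(units=channels, activation='sigmoid')(s)
    s = keras.layers.Reshape((1, 1, channels))(s)
    # Recalibrate: scale each channel of the feature map by its learned weight
    return keras.layers.multiply([feature_map, s])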
When fused with a Residual Network in this way, the SE block enhances the performance of the traditional Residual Network while adding less than 1% to the computational cost of the system. In this article, we will implement this SE-Residual Network for fruit multi-class classification and observe its performance.
The flow of the algorithm is as follows:
- Importing Dataset and libraries
- Pre-processing and splitting of the dataset
- Building the SE Residual Network
- Compilation and Training of the network
- Plotting and prediction
Let’s dive into the algorithm.
Happy Reading!!!
LIBRARIES
Import necessary libraries into the notebook.
import os
import numpy as np
import pandas as pd
import sklearn.datasets
import sklearn.model_selection
import keras.preprocessing.image
import keras.utils
import keras.callbacks
import matplotlib.pyplot as plt
from keras.preprocessing.image import ImageDataGenerator
from skimage import color
from sklearn.metrics import accuracy_score
import cv2

print(os.listdir("../input"))
['fruits-360_dataset']
We are using the Keras deep learning library for Python.
PRE-PROCESSING
Importing the dataset and converting the image pixels to arrays for easy processing. Each image is resized to 50x50 pixels on load.
train_dir = '../input/fruits-360_dataset/fruits-360/Training'
trainData = sklearn.datasets.load_files(train_dir, load_content=False)
test_dir = '../input/fruits-360_dataset/fruits-360/Test'
testData = sklearn.datasets.load_files(test_dir, load_content=False)

y_train = np.array(trainData['target'])
y_train_names = np.array(trainData['target_names'])
y_test = np.array(testData['target'])
y_test_names = np.array(testData['target_names'])

nclasses = len(np.unique(y_train))
target_size = 50

# Load each image at 50x50 and convert it to a pixel array
x_train = []
for filename in trainData['filenames']:
    x_train.append(
        keras.preprocessing.image.img_to_array(
            keras.preprocessing.image.load_img(filename, target_size=(target_size, target_size))
        )
    )

x_test = []
for filename in testData['filenames']:
    x_test.append(
        keras.preprocessing.image.img_to_array(
            keras.preprocessing.image.load_img(filename, target_size=(target_size, target_size))
        )
    )
Using TensorFlow backend.
SPLIT
Splitting of the dataset into training and validation sets using the scikit-learn library. The pixel values are also normalized to the range [0, 1] and the labels are one-hot encoded.
# Normalize pixel values to [0, 1] and one-hot encode the labels
x_train = np.array(x_train)
x_train = x_train / 255
y_train = keras.utils.np_utils.to_categorical(y_train, nclasses)

x_test = np.array(x_test)
x_test = x_test / 255
y_test = keras.utils.np_utils.to_categorical(y_test, nclasses)

# Hold out 20% of the training data for validation
x_train, x_val, y_train, y_val = sklearn.model_selection.train_test_split(
    x_train, y_train, test_size=0.2
)
print(y_train.shape)
print(y_val.shape)
(48398, 120) (12100, 120)
BUILD MODEL
The building of the Squeeze-Excitation Residual Network. We have already discussed the significance of the SE Network in the introduction; the implementation of that concept is demonstrated here. The network applies three SE-residual blocks, each built from convolution, batch normalization, ReLU activation, and max-pooling layers, with an SE gate fused into the residual branch. A dropout layer near the end reduces overfitting. Finally, a summary of the network is printed for the reader's ease of understanding.
images = keras.layers.Input(x_train.shape[1:])

# Block 1
x = keras.layers.Conv2D(filters=16, kernel_size=[1, 1], padding='same')(images)
block = keras.layers.Conv2D(filters=16, kernel_size=[3, 3], padding='same')(x)
block = keras.layers.BatchNormalization()(block)
block = keras.layers.Activation('relu')(block)
block = keras.layers.Conv2D(filters=16, kernel_size=[3, 3], padding='same')(block)
# Squeeze and Excitation 1
sq = keras.layers.GlobalAveragePooling2D()(block)
sq = keras.layers.Reshape((1, 1, 16))(sq)
sq = keras.layers.Dense(units=16, activation='sigmoid')(sq)
block = keras.layers.multiply([block, sq])
# End of Squeeze and Excitation 1
net = keras.layers.add([x, block])
net = keras.layers.BatchNormalization()(net)
net = keras.layers.Activation('relu')(net)
net = keras.layers.MaxPooling2D(pool_size=(2, 2), name='block_1')(net)
# End of block 1

# Block 2
x = keras.layers.Conv2D(filters=32, kernel_size=[1, 1], padding='same')(net)
block = keras.layers.Conv2D(filters=32, kernel_size=[3, 3], padding='same')(x)
block = keras.layers.BatchNormalization()(block)
block = keras.layers.Activation('relu')(block)
block = keras.layers.Conv2D(filters=32, kernel_size=[3, 3], padding='same')(block)
# Squeeze and Excitation 2
sq = keras.layers.GlobalAveragePooling2D()(block)
sq = keras.layers.Reshape((1, 1, 32))(sq)
sq = keras.layers.Dense(units=32, activation='sigmoid')(sq)
block = keras.layers.multiply([block, sq])
# End of Squeeze and Excitation 2
net = keras.layers.add([x, block])
net = keras.layers.BatchNormalization()(net)
net = keras.layers.Activation('relu')(net)
net = keras.layers.MaxPooling2D(pool_size=(2, 2), name='block_2')(net)
# End of block 2

# Block 3
x = keras.layers.Conv2D(filters=64, kernel_size=[1, 1], padding='same')(net)
block = keras.layers.Conv2D(filters=64, kernel_size=[3, 3], padding='same')(x)
block = keras.layers.BatchNormalization()(block)
block = keras.layers.Activation('relu')(block)
block = keras.layers.Conv2D(filters=64, kernel_size=[3, 3], padding='same')(block)
# Squeeze and Excitation 3
sq = keras.layers.GlobalAveragePooling2D()(block)
sq = keras.layers.Reshape((1, 1, 64))(sq)
sq = keras.layers.Dense(units=64, activation='sigmoid')(sq)
block = keras.layers.multiply([block, sq])
# End of Squeeze and Excitation 3
net = keras.layers.add([x, block])
net = keras.layers.Activation('relu', name='block_3')(net)
net = keras.layers.BatchNormalization()(net)

# Classification head with dropout to reduce overfitting
net = keras.layers.Dropout(0.25)(net)
net = keras.layers.GlobalAveragePooling2D()(net)
net = keras.layers.Dense(units=nclasses, activation='softmax')(net)

model = keras.models.Model(inputs=images, outputs=net)
model.summary()
Model: "model_1" __________________________________________________________________________________________________ Layer (type) Output Shape Param # Connected to ================================================================================================== input_1 (InputLayer) (None, 50, 50, 3) 0 __________________________________________________________________________________________________ conv2d_1 (Conv2D) (None, 50, 50, 16) 64 input_1[0][0] __________________________________________________________________________________________________ conv2d_2 (Conv2D) (None, 50, 50, 16) 2320 conv2d_1[0][0] __________________________________________________________________________________________________ batch_normalization_1 (BatchNor (None, 50, 50, 16) 64 conv2d_2[0][0] __________________________________________________________________________________________________ activation_1 (Activation) (None, 50, 50, 16) 0 batch_normalization_1[0][0] __________________________________________________________________________________________________ conv2d_3 (Conv2D) (None, 50, 50, 16) 2320 activation_1[0][0] __________________________________________________________________________________________________ global_average_pooling2d_1 (Glo (None, 16) 0 conv2d_3[0][0] __________________________________________________________________________________________________ reshape_1 (Reshape) (None, 1, 1, 16) 0 global_average_pooling2d_1[0][0] __________________________________________________________________________________________________ dense_1 (Dense) (None, 1, 1, 16) 272 reshape_1[0][0] __________________________________________________________________________________________________ multiply_1 (Multiply) (None, 50, 50, 16) 0 conv2d_3[0][0] dense_1[0][0] __________________________________________________________________________________________________ add_1 (Add) (None, 50, 50, 16) 0 conv2d_1[0][0] multiply_1[0][0] __________________________________________________________________________________________________ batch_normalization_2 (BatchNor (None, 50, 50, 16) 64 add_1[0][0] __________________________________________________________________________________________________ activation_2 (Activation) (None, 50, 50, 16) 0 batch_normalization_2[0][0] __________________________________________________________________________________________________ block_1 (MaxPooling2D) (None, 25, 25, 16) 0 activation_2[0][0] __________________________________________________________________________________________________ conv2d_4 (Conv2D) (None, 25, 25, 32) 544 block_1[0][0] __________________________________________________________________________________________________ conv2d_5 (Conv2D) (None, 25, 25, 32) 9248 conv2d_4[0][0] __________________________________________________________________________________________________ batch_normalization_3 (BatchNor (None, 25, 25, 32) 128 conv2d_5[0][0] __________________________________________________________________________________________________ activation_3 (Activation) (None, 25, 25, 32) 0 batch_normalization_3[0][0] __________________________________________________________________________________________________ conv2d_6 (Conv2D) (None, 25, 25, 32) 9248 activation_3[0][0] __________________________________________________________________________________________________ global_average_pooling2d_2 (Glo (None, 32) 0 conv2d_6[0][0] __________________________________________________________________________________________________ reshape_2 (Reshape) (None, 1, 1, 32) 0 
global_average_pooling2d_2[0][0] __________________________________________________________________________________________________ dense_2 (Dense) (None, 1, 1, 32) 1056 reshape_2[0][0] __________________________________________________________________________________________________ multiply_2 (Multiply) (None, 25, 25, 32) 0 conv2d_6[0][0] dense_2[0][0] __________________________________________________________________________________________________ add_2 (Add) (None, 25, 25, 32) 0 conv2d_4[0][0] multiply_2[0][0] __________________________________________________________________________________________________ batch_normalization_4 (BatchNor (None, 25, 25, 32) 128 add_2[0][0] __________________________________________________________________________________________________ activation_4 (Activation) (None, 25, 25, 32) 0 batch_normalization_4[0][0] __________________________________________________________________________________________________ block_2 (MaxPooling2D) (None, 12, 12, 32) 0 activation_4[0][0] __________________________________________________________________________________________________ conv2d_7 (Conv2D) (None, 12, 12, 64) 2112 block_2[0][0] __________________________________________________________________________________________________ conv2d_8 (Conv2D) (None, 12, 12, 64) 36928 conv2d_7[0][0] __________________________________________________________________________________________________ batch_normalization_5 (BatchNor (None, 12, 12, 64) 256 conv2d_8[0][0] __________________________________________________________________________________________________ activation_5 (Activation) (None, 12, 12, 64) 0 batch_normalization_5[0][0] __________________________________________________________________________________________________ conv2d_9 (Conv2D) (None, 12, 12, 64) 36928 activation_5[0][0] __________________________________________________________________________________________________ global_average_pooling2d_3 (Glo (None, 64) 0 conv2d_9[0][0] __________________________________________________________________________________________________ reshape_3 (Reshape) (None, 1, 1, 64) 0 global_average_pooling2d_3[0][0] __________________________________________________________________________________________________ dense_3 (Dense) (None, 1, 1, 64) 4160 reshape_3[0][0] __________________________________________________________________________________________________ multiply_3 (Multiply) (None, 12, 12, 64) 0 conv2d_9[0][0] dense_3[0][0] __________________________________________________________________________________________________ add_3 (Add) (None, 12, 12, 64) 0 conv2d_7[0][0] multiply_3[0][0] __________________________________________________________________________________________________ block_3 (Activation) (None, 12, 12, 64) 0 add_3[0][0] __________________________________________________________________________________________________ batch_normalization_6 (BatchNor (None, 12, 12, 64) 256 block_3[0][0] __________________________________________________________________________________________________ dropout_1 (Dropout) (None, 12, 12, 64) 0 batch_normalization_6[0][0] __________________________________________________________________________________________________ global_average_pooling2d_4 (Glo (None, 64) 0 dropout_1[0][0] __________________________________________________________________________________________________ dense_4 (Dense) (None, 120) 7800 global_average_pooling2d_4[0][0] 
================================================================================================== Total params: 113,896 Trainable params: 113,448 Non-trainable params: 448 __________________________________________________________________________________________________
COMPILING
Compilation, and declaration of the callback functions to save computation and time. Categorical cross-entropy is the loss function and Adadelta the optimizer, a common pairing for multi-class recognition. The model checkpoint saves the best model seen so far, and early stopping halts training if the validation loss fails to improve for 5 consecutive epochs.
model.compile(loss='categorical_crossentropy',
              optimizer='adadelta',
              metrics=['accuracy'])

# Save the model whenever the validation loss improves
checkpointer = keras.callbacks.ModelCheckpoint(filepath='cnn_from_scratch_fruits.hdf5',
                                               verbose=1,
                                               save_best_only=True)

# Stop training early if the validation loss stops improving for 5 epochs
earlystopper = keras.callbacks.EarlyStopping(monitor='val_loss',
                                             min_delta=0,
                                             patience=5,
                                             verbose=0,
                                             mode='auto',
                                             baseline=None,
                                             restore_best_weights=False)
TRAINING
Training of the network for 15 epochs with a batch size of 64. The best model seen so far is saved simultaneously as a result of the checkpoint callback.
history = model.fit(x_train, y_train,
                    batch_size=64,
                    epochs=15,
                    validation_data=(x_val, y_val),
                    callbacks=[checkpointer, earlystopper],
                    shuffle=True)
Train on 48398 samples, validate on 12100 samples
Epoch 1/15
48398/48398 [==============================] - 19s 391us/step - loss: 1.2028 - accuracy: 0.7227 - val_loss: 2.4619 - val_accuracy: 0.3789
Epoch 00001: val_loss improved from inf to 2.46188, saving model to cnn_from_scratch_fruits.hdf5
Epoch 2/15
48398/48398 [==============================] - 15s 305us/step - loss: 0.1818 - accuracy: 0.9555 - val_loss: 1.0984 - val_accuracy: 0.7123
Epoch 00002: val_loss improved from 2.46188 to 1.09844, saving model to cnn_from_scratch_fruits.hdf5
Epoch 3/15
48398/48398 [==============================] - 15s 304us/step - loss: 0.0852 - accuracy: 0.9776 - val_loss: 0.6442 - val_accuracy: 0.7993
Epoch 00003: val_loss improved from 1.09844 to 0.64418, saving model to cnn_from_scratch_fruits.hdf5
Epoch 4/15
48398/48398 [==============================] - 15s 316us/step - loss: 0.0492 - accuracy: 0.9871 - val_loss: 1.9674 - val_accuracy: 0.6103
Epoch 00004: val_loss did not improve from 0.64418
Epoch 5/15
48398/48398 [==============================] - 15s 305us/step - loss: 0.0393 - accuracy: 0.9898 - val_loss: 0.8358 - val_accuracy: 0.7798
Epoch 00005: val_loss did not improve from 0.64418
Epoch 6/15
48398/48398 [==============================] - 15s 307us/step - loss: 0.0307 - accuracy: 0.9918 - val_loss: 0.9295 - val_accuracy: 0.7955
Epoch 00006: val_loss did not improve from 0.64418
Epoch 7/15
48398/48398 [==============================] - 15s 306us/step - loss: 0.0223 - accuracy: 0.9943 - val_loss: 0.7090 - val_accuracy: 0.8130
Epoch 00007: val_loss did not improve from 0.64418
Epoch 8/15
48398/48398 [==============================] - 15s 305us/step - loss: 0.0171 - accuracy: 0.9959 - val_loss: 0.0398 - val_accuracy: 0.9845
Epoch 00008: val_loss improved from 0.64418 to 0.03983, saving model to cnn_from_scratch_fruits.hdf5
Epoch 9/15
48398/48398 [==============================] - 15s 306us/step - loss: 0.0136 - accuracy: 0.9966 - val_loss: 0.0500 - val_accuracy: 0.9866
Epoch 00009: val_loss did not improve from 0.03983
Epoch 10/15
48398/48398 [==============================] - 15s 306us/step - loss: 0.0141 - accuracy: 0.9967 - val_loss: 0.0182 - val_accuracy: 0.9948
Epoch 00010: val_loss improved from 0.03983 to 0.01823, saving model to cnn_from_scratch_fruits.hdf5
Epoch 11/15
48398/48398 [==============================] - 15s 308us/step - loss: 0.0100 - accuracy: 0.9974 - val_loss: 0.2736 - val_accuracy: 0.9198
Epoch 00011: val_loss did not improve from 0.01823
Epoch 12/15
48398/48398 [==============================] - 15s 306us/step - loss: 0.0092 - accuracy: 0.9980 - val_loss: 0.1821 - val_accuracy: 0.9526
Epoch 00012: val_loss did not improve from 0.01823
Epoch 13/15
48398/48398 [==============================] - 15s 303us/step - loss: 0.0090 - accuracy: 0.9977 - val_loss: 0.7217 - val_accuracy: 0.8621
Epoch 00013: val_loss did not improve from 0.01823
Epoch 14/15
48398/48398 [==============================] - 15s 305us/step - loss: 0.0075 - accuracy: 0.9980 - val_loss: 0.0138 - val_accuracy: 0.9956
Epoch 00014: val_loss improved from 0.01823 to 0.01377, saving model to cnn_from_scratch_fruits.hdf5
Epoch 15/15
48398/48398 [==============================] - 15s 304us/step - loss: 0.0068 - accuracy: 0.9981 - val_loss: 0.1471 - val_accuracy: 0.9557
Epoch 00015: val_loss did not improve from 0.01377
LOAD WEIGHTS
Loading of the best weights saved during training (here, those from epoch 14, which achieved the lowest validation loss).
model.load_weights('cnn_from_scratch_fruits.hdf5')
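As a quick sanity check (our own optional addition, assuming the x_val and y_val arrays defined earlier), we can confirm that the restored weights reproduce the best validation score:

# Evaluate the restored best model on the validation set
val_loss, val_acc = model.evaluate(x_val, y_val, verbose=0)
print('Restored model - val_loss: %.4f - val_accuracy: %.4f' % (val_loss, val_acc))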
PLOTTING
Plotting of the training and validation accuracy against the epochs of the trained model.
plt.plot(history.history['accuracy'])
plt.plot(history.history['val_accuracy'])
plt.title('Model accuracy')
plt.ylabel('Accuracy')
plt.xlabel('Epoch')
plt.legend(['Train', 'Validation'], loc='upper left')
plt.show()
Similarly, the plotting of the training and validation loss against the epochs.
plt.plot(history.history['loss'])
plt.plot(history.history['val_loss'])
plt.title('Model loss')
plt.ylabel('Loss')
plt.xlabel('Epoch')
plt.legend(['Train', 'Validation'], loc='upper left')
plt.show()
PREDICTION
Predicting the classes of the test images with the trained model, and scoring the predictions against the true labels. A quick way to display some of the outputs is sketched after the accuracy score below.
y_test_pred = model.predict(x_test)
accuracy_score(np.argmax(y_test_pred, axis=1), np.argmax(y_test, axis=1))
0.9739113568034138
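To display a few of the outputs, a short sketch like the following (our own illustration, reusing the x_test, y_test, y_test_pred, and y_test_names arrays defined above) shows a handful of test images with their predicted and true labels:

# Pick five random test images and compare predicted vs. true class names
idx = np.random.choice(len(x_test), 5, replace=False)
fig, axes = plt.subplots(1, 5, figsize=(15, 3))
for i, ax in zip(idx, axes):
    ax.imshow(x_test[i])  # pixel values are already scaled to [0, 1]
    pred_name = y_test_names[np.argmax(y_test_pred[i])]
    true_name = y_test_names[np.argmax(y_test[i])]
    ax.set_title('pred: %s\ntrue: %s' % (pred_name, true_name), fontsize=8)
    ax.axis('off')
plt.show()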
FINAL THOUGHTS
In this article, we discussed the SE-Residual Network, a variant of the Residual Network that enhances performance with little extra computation: for less than a 1% increase in cost, the SE block improves the accuracy of the network. Its basic idea is to reweight the channels of the feature maps so that the most useful features carry more influence during training; the introduction section explains the mechanism in detail. We implemented the network on a simple fruit multi-class classification problem, then built, compiled, and trained it, and finally observed its predictions and accuracy.
The source code for the fruit detection can be found and downloaded from here.
To go through an article based on Residual network, click here.
To learn from my other blogs, refer here.
REFERENCES:
[1] J. Hu, L. Shen, G. Sun, "Squeeze-and-Excitation Networks", CVPR 2018 – the original paper with full insight into the SE Network.
Thank you. Hope this article was helpful!