Image Classification Using a Convolutional Neural Network (CNN) in Python
In this article, we are going to explore image classification. For this task, we will use the Horses or Humans dataset. Our goal is to build a binary classifier in Python that uses a CNN to correctly categorize images as horses or humans.
The dataset consists of 500 images of horses and 527 images of humans, for a total of 1,027 training images. It is available at https://www.kaggle.com/sanikamal/horses-or-humans-dataset.
I hope you already know about Convolutional Neural Networks. Now let’s continue, starting with a brief introduction…
INTRODUCTION
- The major advantage of a CNN is that it extracts features from images automatically. A computer perceives an image as pixels whose values range from 0 to 255. To get useful results, we take an input image and apply a filter to it; the filter extracts the features essential for training. These values are numeric and are stored in 2D or 3D arrays, depending on the input image.
- Images are either black and white, represented as 2D arrays, or RGB (Red, Green, Blue) color images, represented as 3D arrays; the short sketch below illustrates both cases.
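As a quick illustration of these array shapes, here is a minimal sketch of my own (not part of the original tutorial); it assumes Pillow and NumPy are installed and uses a hypothetical file name horse.jpg:

import numpy as np
from PIL import Image

# Hypothetical file name, for illustration only
img = Image.open('horse.jpg')

gray = np.array(img.convert('L'))    # black and white -> 2D array
rgb = np.array(img.convert('RGB'))   # color -> 3D array

print(gray.shape)              # e.g. (300, 300)
print(rgb.shape)               # e.g. (300, 300, 3)
print(rgb.min(), rgb.max())    # pixel values lie in the 0-255 range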
Data Preparation
Importing the data is the first step. We extract two zip files: one containing the training images and one containing the validation images. Below is our Python code:
import zipfile

# Extract the training images
local_zip = '/tmp/humans-horses.zip'
with zipfile.ZipFile(local_zip, 'r') as zip_ref:
    zip_ref.extractall('/tmp/horse-or-human')

# Extract the validation images
local_zip = '/tmp/validation-humans-horses.zip'
with zipfile.ZipFile(local_zip, 'r') as zip_ref:
    zip_ref.extractall('/tmp/validation-horse-or-human')
import tensorflow as tf
Building the Model
Now we will use Keras to build the model. The Python program for doing this is given below:
model = tf.keras.models.Sequential([
    tf.keras.layers.Conv2D(16, (3, 3), activation='relu', input_shape=(300, 300, 3)),
    tf.keras.layers.MaxPooling2D(2, 2),
    tf.keras.layers.Conv2D(32, (3, 3), activation='relu'),
    tf.keras.layers.MaxPooling2D(2, 2),
    tf.keras.layers.Conv2D(64, (3, 3), activation='relu'),
    tf.keras.layers.MaxPooling2D(2, 2),
    tf.keras.layers.Conv2D(64, (3, 3), activation='relu'),
    tf.keras.layers.MaxPooling2D(2, 2),
    tf.keras.layers.Conv2D(64, (3, 3), activation='relu'),
    tf.keras.layers.MaxPooling2D(2, 2),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(512, activation='relu'),
    tf.keras.layers.Dense(1, activation='sigmoid')
])
Explanation: Let’s take a closer look at each layer:
- Conv2D: The convolutional layers perform the feature extraction; the first one also defines the input shape of 300×300 RGB images.
- MaxPooling2D: Keeps only the maximum value in each 2×2 window, thereby reducing the spatial size of the feature maps.
- Flatten: Transforms the stacked feature maps into a single vector that can be fed into the fully connected (Dense) layers.
- Sigmoid activation: Used in the output layer for binary classification; it outputs a value between 0 and 1.
- ReLU activation: Handles non-linearity in the convolutional and hidden Dense layers.
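To see where the shapes in the summary below come from, here is a small worked check of my own: with Keras’s default 'valid' padding, a 3×3 convolution shrinks each spatial dimension by 2, and 2×2 max pooling halves it (rounding down).

def conv_out(size, kernel=3):
    # 'valid' convolution: output = input - kernel + 1
    return size - kernel + 1

def pool_out(size, window=2):
    # 2x2 max pooling halves each dimension, rounding down
    return size // window

size = 300
for _ in range(5):    # five Conv2D + MaxPooling2D pairs
    size = pool_out(conv_out(size))

print(size)           # 7, matching the (None, 7, 7, 64) row in the summary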
model.summary()
Model: "sequential" _________________________________________________________________ Layer (type) Output Shape Param # ================================================================= conv2d (Conv2D) (None, 298, 298, 16) 448 _________________________________________________________________ max_pooling2d (MaxPooling2D) (None, 149, 149, 16) 0 _________________________________________________________________ conv2d_1 (Conv2D) (None, 147, 147, 32) 4640 _________________________________________________________________ max_pooling2d_1 (MaxPooling2 (None, 73, 73, 32) 0 _________________________________________________________________ conv2d_2 (Conv2D) (None, 71, 71, 64) 18496 _________________________________________________________________ max_pooling2d_2 (MaxPooling2 (None, 35, 35, 64) 0 _________________________________________________________________ conv2d_3 (Conv2D) (None, 33, 33, 64) 36928 _________________________________________________________________ max_pooling2d_3 (MaxPooling2 (None, 16, 16, 64) 0 _________________________________________________________________ conv2d_4 (Conv2D) (None, 14, 14, 64) 36928 _________________________________________________________________ max_pooling2d_4 (MaxPooling2 (None, 7, 7, 64) 0 _________________________________________________________________ flatten (Flatten) (None, 3136) 0 _________________________________________________________________ dense (Dense) (None, 512) 1606144 _________________________________________________________________ dense_1 (Dense) (None, 1) 513 ================================================================= Total params: 1,704,097 Trainable params: 1,704,097 Non-trainable params: 0
Compiling The Model
from tensorflow.keras.optimizers import RMSprop

model.compile(loss='binary_crossentropy',
              optimizer=RMSprop(learning_rate=0.001),
              metrics=['accuracy'])
The loss function is ‘binary_crossentropy’, which suits binary classification. The RMSprop optimizer is used, and the learning rate of 0.001 controls the step size as the optimizer converges toward a minimum of the loss.
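To get a feel for what binary cross-entropy rewards and penalizes, here is a tiny sketch of my own comparing a confident correct prediction with a confident wrong one:

import tensorflow as tf

bce = tf.keras.losses.BinaryCrossentropy()

# True label 1, confident and correct -> small loss
print(bce([1.0], [0.9]).numpy())   # ~0.105

# True label 1, confident and wrong -> large loss
print(bce([1.0], [0.1]).numpy())   # ~2.303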
Using The Image Generator and Image Augmentation
from tensorflow.keras.preprocessing.image import ImageDataGenerator

train_datagen = ImageDataGenerator(rescale=1/255,
                                   rotation_range=40,
                                   width_shift_range=0.2,
                                   shear_range=0.2,
                                   horizontal_flip=True)

validation_datagen = ImageDataGenerator(rescale=1/255,
                                        rotation_range=40,
                                        width_shift_range=0.2,
                                        shear_range=0.2,
                                        horizontal_flip=True)

train_generator = train_datagen.flow_from_directory(
    '/tmp/horse-or-human/',
    target_size=(300, 300),
    batch_size=128,
    class_mode='binary'
)

validation_generator = validation_datagen.flow_from_directory(
    '/tmp/validation-horse-or-human/',
    target_size=(300, 300),
    batch_size=32,
    class_mode='binary'
)
Output
Found 1027 images belonging to 2 classes.
Found 256 images belonging to 2 classes.
Explanation: We create a training generator and a validation generator, which feed the images to the model in batches; this is why batch_size is specified for each. Rescaling by 1/255 normalizes the pixel values from the 0-255 range to [0, 1]. The augmentation arguments (rotation_range, width_shift_range, shear_range, horizontal_flip) produce varied versions of the images so the model learns to handle different poses and orientations. class_mode is ‘binary’ because this is binary classification. (Augmentation is usually applied only to the training data; for the validation generator, rescaling alone would typically suffice.)
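If you want to verify the rescaling and the binary labels, you can pull a single batch from the generator; this quick check is my own addition:

# Fetch one augmented batch from the training generator
images, labels = next(train_generator)

print(images.shape)                 # (128, 300, 300, 3)
print(images.min(), images.max())   # pixel values now lie in [0.0, 1.0]
print(labels[:10])                  # 0.0 = horses, 1.0 = humans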
Training The Model
history = model.fit(train_generator,
                    steps_per_epoch=8,
                    epochs=15,
                    verbose=1,
                    validation_data=validation_generator,
                    validation_steps=8)
Explanation: We now fit the model on the training set. epochs is the number of full passes over the training data (not individual training steps), while steps_per_epoch is the number of batches drawn in each epoch.
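As a quick arithmetic check (my own addition), steps_per_epoch=8 with a batch size of 128 covers most of the 1,027 training images in each epoch:

import math

# Batches needed to see every one of the 1,027 training images once
print(math.ceil(1027 / 128))   # 9; the tutorial uses 8, so a few images
                               # are skipped in each epoch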
Output
Epoch 1/15
8/8 [==============================] - 7s 933ms/step - loss: 0.8665 - accuracy: 0.5061 - val_loss: 0.6717 - val_accuracy: 0.5078
Epoch 2/15
8/8 [==============================] - 7s 851ms/step - loss: 0.5905 - accuracy: 0.7175 - val_loss: 5.5931 - val_accuracy: 0.5000
Epoch 3/15
8/8 [==============================] - 7s 847ms/step - loss: 1.2182 - accuracy: 0.7942 - val_loss: 0.4081 - val_accuracy: 0.8711
Epoch 4/15
8/8 [==============================] - 7s 858ms/step - loss: 0.2603 - accuracy: 0.8954 - val_loss: 0.7934 - val_accuracy: 0.8047
Epoch 5/15
8/8 [==============================] - 8s 953ms/step - loss: 0.1816 - accuracy: 0.9377 - val_loss: 1.5857 - val_accuracy: 0.7891
Epoch 6/15
8/8 [==============================] - 7s 859ms/step - loss: 0.1765 - accuracy: 0.9288 - val_loss: 0.4917 - val_accuracy: 0.8867
Epoch 7/15
8/8 [==============================] - 7s 849ms/step - loss: 0.1663 - accuracy: 0.9333 - val_loss: 0.5318 - val_accuracy: 0.8633
Epoch 8/15
8/8 [==============================] - 7s 851ms/step - loss: 0.3250 - accuracy: 0.8888 - val_loss: 0.9239 - val_accuracy: 0.8438
Epoch 9/15
8/8 [==============================] - 7s 857ms/step - loss: 0.0661 - accuracy: 0.9800 - val_loss: 1.0600 - val_accuracy: 0.8555
Epoch 10/15
8/8 [==============================] - 7s 858ms/step - loss: 0.1797 - accuracy: 0.9377 - val_loss: 13.9662 - val_accuracy: 0.5000
Epoch 11/15
8/8 [==============================] - 7s 853ms/step - loss: 2.0378 - accuracy: 0.8676 - val_loss: 1.3739 - val_accuracy: 0.7852
Epoch 12/15
8/8 [==============================] - 8s 959ms/step - loss: 0.0597 - accuracy: 0.9822 - val_loss: 1.5067 - val_accuracy: 0.8008
Epoch 13/15
8/8 [==============================] - 7s 860ms/step - loss: 0.0518 - accuracy: 0.9789 - val_loss: 2.4157 - val_accuracy: 0.7773
Epoch 14/15
8/8 [==============================] - 7s 858ms/step - loss: 0.0391 - accuracy: 0.9889 - val_loss: 1.2087 - val_accuracy: 0.8398
Epoch 15/15
8/8 [==============================] - 7s 901ms/step - loss: 0.0088 - accuracy: 0.9990 - val_loss: 1.6605 - val_accuracy: 0.8398
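The History object returned by model.fit() (stored in history above) records these per-epoch metrics, so we can plot the learning curves. A minimal sketch, assuming matplotlib is available:

import matplotlib.pyplot as plt

acc = history.history['accuracy']
val_acc = history.history['val_accuracy']
epochs = range(1, len(acc) + 1)

plt.plot(epochs, acc, label='Training accuracy')
plt.plot(epochs, val_acc, label='Validation accuracy')
plt.xlabel('Epoch')
plt.ylabel('Accuracy')
plt.legend()
plt.show()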
We get a validation accuracy of around 84%, which reflects the model’s ability to deal with new images. For instance, let’s see how the model responds to an image we feed it.
We will upload a photo of a horse and see how the model classifies it.
import numpy as np
from google.colab import files
from tensorflow.keras.preprocessing import image

uploaded = files.upload()

for fn in uploaded.keys():
    # Predict each uploaded image
    path = '/content/' + fn
    img = image.load_img(path, target_size=(300, 300))
    x = image.img_to_array(img)
    x = x / 255.0                 # match the rescaling applied during training
    x = np.expand_dims(x, axis=0)
    images = np.vstack([x])
    classes = model.predict(images, batch_size=10)
    print(classes[0])
    if classes[0] > 0.5:
        print(fn + " IS A HUMAN")
    else:
        print(fn + " IS A HORSE")
Output
white-3010129_1920.jpg(image/jpeg) - 319543 bytes, last modified: 9/23/2020 - 100% done
Saving white-3010129_1920.jpg to white-3010129_1920 (3).jpg
[0.]
white-3010129_1920.jpg IS A HORSE
Our model categorizes the image as a horse.
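Why does a value below 0.5 mean horse? flow_from_directory assigns labels alphabetically by folder name, so horses map to 0 and humans to 1; you can confirm the mapping yourself:

# Check which class was assigned to which label
print(train_generator.class_indices)   # {'horses': 0, 'humans': 1}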
This marks the end of the article.
Thanks for reading!