Sign Language Recognition Using TensorFlow in Python
In this article, we will build a multiclass classifier in Python using a convolutional neural network (CNN) with TensorFlow's Keras API. Our goal is to classify images of hand gestures representing sign-language letters.
We will use the Sign Language MNIST dataset from https://www.kaggle.com/datamunge/sign-language-mnist.
The dataset contains hand-gesture images for the English alphabet. It has 24 classes rather than 26 because the letters J and Z require motion and are therefore excluded. The images are 28x28 grayscale with pixel values ranging from 0 to 255.
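To get a feel for the file layout before writing the loader, you can peek at the CSV with pandas. This is a minimal sketch, assuming the standard Sign Language MNIST layout (a label column followed by 784 pixel columns); the path is an assumption and should point to wherever you extracted the Kaggle files.

import pandas as pd

# Assumed path: adjust to wherever sign_mnist_train.csv was extracted.
df = pd.read_csv('sign_mnist_train.csv')

print(df.shape)                      # (27455, 785): one label column + 784 pixel columns
print(df.columns[:3].tolist())       # expected: ['label', 'pixel1', 'pixel2']
print(sorted(df['label'].unique()))  # 24 distinct labels; 9 (J) and 25 (Z) never appear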
DATA PREPARATION
Importing Libraries
import csv
import numpy as np
import matplotlib.pyplot as plt
import tensorflow as tf
from tensorflow.keras.preprocessing.image import ImageDataGenerator
from os import getcwd
Now we will extract the data and check the shape of our labels and images.
def get_data(filename):
    with open(filename) as training_file:
        csv_reader = csv.reader(training_file, delimiter=',')
        first_line = True
        temp_labels = []
        temp_images = []
        for row in csv_reader:
            if first_line:
                # Skip the header row
                first_line = False
            else:
                temp_labels.append(row[0])
                image_data = row[1:785]
                image_array = np.array_split(image_data, 28)
                temp_images.append(image_array)
        images = np.array(temp_images).astype('float')
        labels = np.array(temp_labels).astype('float')
    return images, labels

path_sign_mnist_train = f"{getcwd()}/../tmp2/sign_mnist_train.csv"
path_sign_mnist_test = f"{getcwd()}/../tmp2/sign_mnist_test.csv"

training_images, training_labels = get_data(path_sign_mnist_train)
testing_images, testing_labels = get_data(path_sign_mnist_test)

print(training_images.shape)
print(training_labels.shape)
print(testing_images.shape)
print(testing_labels.shape)
OUTPUT
(27455, 28, 28)
(27455,)
(7172, 28, 28)
(7172,)
Image Generator And Augmentation
In this step, we use Keras's ImageDataGenerator to rescale the pixel values to the 0-1 range and to augment the training images (rotation, shifts, shear, zoom, and flips). We also add a channel dimension so the images match the (28, 28, 1) input shape expected by the convolutional layers.
# Add a channel dimension: (28, 28) -> (28, 28, 1)
training_images = np.expand_dims(training_images, axis=-1)
testing_images = np.expand_dims(testing_images, axis=-1)

train_datagen = ImageDataGenerator(
    rescale=1./255,
    rotation_range=40,
    width_shift_range=0.2,
    height_shift_range=0.2,
    shear_range=0.2,
    zoom_range=0.2,
    horizontal_flip=True,
    fill_mode='nearest')

validation_datagen = ImageDataGenerator(rescale=1./255)

print(training_images.shape)
print(testing_images.shape)
OUTPUT
(27455, 28, 28, 1)
(7172, 28, 28, 1)
Observing The Training Images
x_train = training_images.reshape(-1, 28, 28, 1)
x_test = testing_images.reshape(-1, 28, 28, 1)
Next, let's plot a few of the training images to see what the hand gestures look like.
f, ax = plt.subplots(2, 5)
f.set_size_inches(12, 10)
k = 0
for i in range(2):
    for j in range(5):
        ax[i, j].imshow(x_train[k].reshape(28, 28), cmap="gray")
        k += 1
plt.tight_layout()
OUTPUT

A 2x5 grid of sample grayscale hand-gesture images from the training set is displayed.
BUILDING THE MODEL
We build the model with convolutional layers. The input shape is (28, 28, 1) since the images are single-channel grayscale. Each Conv2D layer is followed by max pooling, which keeps the maximum value in each 2x2 window and halves the spatial dimensions. The Flatten layer then unrolls the feature maps into a single vector.
This vector is passed to the fully connected layers. The output layer has 26 neurons, one per English letter (label indices 0-25), even though only 24 letters appear in the data, and uses 'softmax' activation to produce class probabilities.
model = tf.keras.models.Sequential([
    tf.keras.layers.Conv2D(64, (3, 3), activation='relu', input_shape=(28, 28, 1)),
    tf.keras.layers.MaxPooling2D(2, 2),
    tf.keras.layers.Conv2D(64, (3, 3), activation='relu'),
    tf.keras.layers.MaxPooling2D(2, 2),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(128, activation='relu'),
    tf.keras.layers.Dense(26, activation='softmax')
])

model.summary()
OUTPUT
Model: "sequential"
_________________________________________________________________
Layer (type)                 Output Shape              Param #
=================================================================
conv2d (Conv2D)              (None, 26, 26, 64)        640
_________________________________________________________________
max_pooling2d (MaxPooling2D) (None, 13, 13, 64)        0
_________________________________________________________________
conv2d_1 (Conv2D)            (None, 11, 11, 64)        36928
_________________________________________________________________
max_pooling2d_1 (MaxPooling2 (None, 5, 5, 64)          0
_________________________________________________________________
flatten (Flatten)            (None, 1600)              0
_________________________________________________________________
dense (Dense)                (None, 128)               204928
_________________________________________________________________
dense_1 (Dense)              (None, 26)                3354
=================================================================
Total params: 245,850
Trainable params: 245,850
Non-trainable params: 0
_________________________________________________________________
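As a quick sanity check, the parameter counts in the summary can be reproduced by hand: each Conv2D layer has (kernel height * kernel width * input channels + 1 bias) weights per filter, and each Dense layer has (inputs + 1 bias) weights per neuron. A small sketch of that arithmetic:

# Conv2D params = (kernel_h * kernel_w * in_channels + 1) * filters
print((3 * 3 * 1 + 1) * 64)    # conv2d:   640
print((3 * 3 * 64 + 1) * 64)   # conv2d_1: 36928

# Dense params = (inputs + 1) * units
print((5 * 5 * 64 + 1) * 128)  # dense:    204928 (5x5x64 feature maps flattened to 1600)
print((128 + 1) * 26)          # dense_1:  3354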
Compiling And Training The Model
model.compile(loss='sparse_categorical_crossentropy',
              optimizer='rmsprop',
              metrics=['accuracy'])

# Note: in newer TensorFlow versions, model.fit also accepts generators directly.
history = model.fit_generator(
    train_datagen.flow(training_images, training_labels, batch_size=32),
    steps_per_epoch=len(training_images) / 32,
    epochs=15,
    validation_data=validation_datagen.flow(testing_images, testing_labels, batch_size=32),
    validation_steps=len(testing_images) / 32)

# Rescale the test images the same way the generators did before evaluating
model.evaluate(testing_images / 255.0, testing_labels)
Note that the loss function is 'sparse_categorical_crossentropy', the multiclass loss to use when the labels are plain integers (0-25) rather than one-hot vectors. We then fit the model on the augmented training data, using the test set for validation.
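As a quick illustration of why the 'sparse' variant fits here (a standalone sketch, not part of the training pipeline): sparse_categorical_crossentropy takes integer labels directly, while categorical_crossentropy expects one-hot vectors, and both yield the same loss values.

import numpy as np
import tensorflow as tf

y_true_int = np.array([3, 0, 1])                  # integer labels, as stored in our CSV
y_true_onehot = tf.one_hot(y_true_int, depth=26)  # equivalent one-hot encoding

# A fake batch of softmax-like outputs (rows sum to 1)
y_pred = tf.constant(np.random.rand(3, 26), dtype=tf.float32)
y_pred = y_pred / tf.reduce_sum(y_pred, axis=1, keepdims=True)

sparse = tf.keras.losses.sparse_categorical_crossentropy(y_true_int, y_pred)
dense = tf.keras.losses.categorical_crossentropy(y_true_onehot, y_pred)
print(np.allclose(sparse.numpy(), dense.numpy()))  # True: same loss, different label format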
OUTPUT
Epoch 1/15
858/857 [==============================] - 74s 86ms/step - loss: 2.8322 - accuracy: 0.1489 - val_loss: 1.8764 - val_accuracy: 0.4380
Epoch 2/15
858/857 [==============================] - 73s 85ms/step - loss: 2.1461 - accuracy: 0.3344 - val_loss: 1.3909 - val_accuracy: 0.5569
Epoch 3/15
858/857 [==============================] - 76s 89ms/step - loss: 1.7422 - accuracy: 0.4503 - val_loss: 1.0818 - val_accuracy: 0.6247
Epoch 4/15
858/857 [==============================] - 72s 84ms/step - loss: 1.4632 - accuracy: 0.5297 - val_loss: 0.8895 - val_accuracy: 0.6980
Epoch 5/15
858/857 [==============================] - 73s 85ms/step - loss: 1.2390 - accuracy: 0.5966 - val_loss: 0.7915 - val_accuracy: 0.7310
Epoch 6/15
858/857 [==============================] - 74s 86ms/step - loss: 1.0986 - accuracy: 0.6432 - val_loss: 0.7142 - val_accuracy: 0.7568
Epoch 7/15
858/857 [==============================] - 74s 86ms/step - loss: 0.9707 - accuracy: 0.6825 - val_loss: 0.5822 - val_accuracy: 0.7943
Epoch 8/15
858/857 [==============================] - 72s 83ms/step - loss: 0.8814 - accuracy: 0.7093 - val_loss: 0.4680 - val_accuracy: 0.8391
Epoch 9/15
858/857 [==============================] - 73s 85ms/step - loss: 0.8113 - accuracy: 0.7322 - val_loss: 0.4106 - val_accuracy: 0.8448
Epoch 10/15
858/857 [==============================] - 72s 84ms/step - loss: 0.7423 - accuracy: 0.7602 - val_loss: 0.3527 - val_accuracy: 0.8643
Epoch 11/15
858/857 [==============================] - 74s 86ms/step - loss: 0.7019 - accuracy: 0.7712 - val_loss: 0.3366 - val_accuracy: 0.8787
Epoch 12/15
858/857 [==============================] - 71s 82ms/step - loss: 0.6674 - accuracy: 0.7821 - val_loss: 0.2687 - val_accuracy: 0.8922
Epoch 13/15
858/857 [==============================] - 73s 85ms/step - loss: 0.6297 - accuracy: 0.7941 - val_loss: 0.3121 - val_accuracy: 0.8823
Epoch 14/15
858/857 [==============================] - 69s 81ms/step - loss: 0.5916 - accuracy: 0.8029 - val_loss: 0.5081 - val_accuracy: 0.8289
Epoch 15/15
858/857 [==============================] - 73s 85ms/step - loss: 0.5574 - accuracy: 0.8163 - val_loss: 0.2541 - val_accuracy: 0.9046
We reach a validation accuracy of around 90%, which is a good result for such a simple architecture and only 15 epochs of training.
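To see individual predictions, you can run a few test images through the trained model and map the predicted class indices back to letters. A minimal sketch, assuming the model and arrays defined above; remember to rescale by 255 just as the generators did.

import string
import numpy as np

letters = list(string.ascii_uppercase)  # index 0 -> 'A', ..., 25 -> 'Z'

sample = testing_images[:5] / 255.0     # rescale to match the training pipeline
probs = model.predict(sample)
pred_classes = np.argmax(probs, axis=1)

for i, (pred, true) in enumerate(zip(pred_classes, testing_labels[:5].astype(int))):
    print(f"image {i}: predicted {letters[pred]}, actual {letters[true]}")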
Plotting And Visualising The Results
%matplotlib inline
import matplotlib.pyplot as plt

acc = history.history['accuracy']
val_acc = history.history['val_accuracy']
loss = history.history['loss']
val_loss = history.history['val_loss']
epochs = range(len(acc))

plt.plot(epochs, acc, 'r', label='Training accuracy')
plt.plot(epochs, val_acc, 'b', label='Validation accuracy')
plt.title('Training and validation accuracy')
plt.legend()
plt.figure()

plt.plot(epochs, loss, 'r', label='Training Loss')
plt.plot(epochs, val_loss, 'b', label='Validation Loss')
plt.title('Training and validation loss')
plt.legend()
plt.show()
OUTPUT

Two plots are displayed: training versus validation accuracy, and training versus validation loss, over the 15 epochs.
We can see that the validation accuracy keeps improving through the later epochs and stays above the training accuracy (the training data is heavily augmented), so the classifier generalises well to the validation set.
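Finally, if you want to reuse the classifier without retraining, you can save it to disk and reload it later. A short sketch (the file name is just an example):

# Save the trained model (architecture + weights + optimizer state)
model.save('sign_mnist_cnn.h5')

# Later, in a new session:
reloaded = tf.keras.models.load_model('sign_mnist_cnn.h5')
reloaded.summary()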
To learn about binary classifiers in addition to multiclass classifiers, you can also check out:
Image Classification Using Convolution Neural Network (CNN) in Python
Thanks for reading!