Number plate detection using MNIST and Keras in Python

In this article, autonomous number plate detection with the MNIST dataset is explained in detail from scratch, from training the model to developing the user interface, using Python and the Keras API of TensorFlow.

We train the digit recognizer on the MNIST dataset, which consists of 60,000 handwritten-digit images of 28×28 pixels each and is commonly used for training image processing systems. The trained model is then deployed into a customized graphical user interface, where testing is done with various widgets and contouring tools. Techniques like these can help improve security and upgrade surveillance accuracy and ease.

The two main constituents of this article are:

  1. Training on the MNIST dataset, resulting in approximately 99.27% accuracy.
  2. Creating the custom user interface: the widgets, the contouring, and recognizing the digits with the trained MNIST model from Part I.

Hope this tutorial will be helpful to the readers.

Let’s dive into the code and stay tuned.

Happy Reading!!!

PART I

Here in this part, we train the model on the MNIST dataset of handwritten digits. The dataset consists of 60,000 handwritten digits, each image 28×28 pixels.

The steps involved in this PART I are:

  • Load the dataset into the project
  • Build the model by adding layers
  • Normalize the images
  • Compile and train the model
  • Evaluate the trained model
  • Save the model for easy use in Part II

Thus, this part consists of six sections, built for easy understanding and ease of explanation.

IMPORTING LIBRARIES

Required Python libraries for this section are imported.

from keras import layers
from keras import models
from keras.datasets import mnist
from keras.utils import to_categorical
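
If the standalone keras imports fail on a newer TensorFlow installation, the equivalent imports bundled with TensorFlow can be used instead (an alternative import style, not the one used in the rest of this article):

from tensorflow.keras import layers, models
from tensorflow.keras.datasets import mnist
from tensorflow.keras.utils import to_categorical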

LOADING THE DATASET AND ADDING LAYERS

mnist.load_data() downloads the dataset through Keras and returns the training and test images along with their labels. We then create the model using 2D convolutional, max-pooling, and dense layers.

(train_images, train_labels), (test_images, test_labels) = mnist.load_data()

model = models.Sequential()
model.add(layers.Conv2D(32, (3, 3), activation='relu', input_shape=(28, 28, 1)))
model.add(layers.MaxPooling2D((2, 2)))
model.add(layers.Conv2D(64, (3, 3), activation='relu'))
model.add(layers.MaxPooling2D((2, 2)))
model.add(layers.Conv2D(64, (3, 3), activation='relu'))

model.add(layers.Flatten())
model.add(layers.Dense(64, activation='relu'))
model.add(layers.Dense(10, activation='softmax'))
model.summary()
Downloading data from https://storage.googleapis.com/tensorflow/tf-keras-datasets/mnist.npz
11493376/11490434 [==============================] - 0s 0us/step
Model: "sequential"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
conv2d (Conv2D)              (None, 26, 26, 32)        320       
_________________________________________________________________
max_pooling2d (MaxPooling2D) (None, 13, 13, 32)        0         
_________________________________________________________________
conv2d_1 (Conv2D)            (None, 11, 11, 64)        18496     
_________________________________________________________________
max_pooling2d_1 (MaxPooling2 (None, 5, 5, 64)          0         
_________________________________________________________________
conv2d_2 (Conv2D)            (None, 3, 3, 64)          36928     
_________________________________________________________________
flatten (Flatten)            (None, 576)               0         
_________________________________________________________________
dense (Dense)                (None, 64)                36928     
_________________________________________________________________
dense_1 (Dense)              (None, 10)                650       
=================================================================
Total params: 93,322
Trainable params: 93,322
Non-trainable params: 0
_________________________________________________________________
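
As a sanity check on the summary, the first Conv2D layer's 320 parameters follow directly from its configuration: 32 filters of size 3×3 over 1 input channel give 3×3×1×32 = 288 weights, plus 32 biases, for 320 parameters in total.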

NORMALIZING IMAGES

Normalizing the images is an important step, especially when using big datasets. In simple words, it means reshaping each image to the model's input shape of 28×28×1 and scaling the pixel values from 0–255 down to the range 0–1. The labels are also one-hot encoded with to_categorical().

train_images = train_images.reshape((60000, 28, 28, 1))
train_images = train_images.astype('float32') / 255
test_images = test_images.reshape((10000, 28, 28, 1))
test_images = test_images.astype('float32') / 255
train_labels = to_categorical(train_labels)
test_labels = to_categorical(test_labels)
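
To make the label encoding concrete, here is a minimal illustration of what to_categorical() produces for a single label (expected output shown in the comment):

# The label 3 becomes a one-hot vector of length 10
print(to_categorical([3], num_classes=10))
# [[0. 0. 0. 1. 0. 0. 0. 0. 0. 0.]]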

COMPILING AND TRAINING THE MODEL

The model is compiled with model.compile() and trained on the dataset with model.fit().

model.compile(optimizer='rmsprop', loss='categorical_crossentropy', metrics=['accuracy'])
model.fit(train_images, train_labels, epochs=5, batch_size=64)
Epoch 1/5
938/938 [==============================] - 52s 56ms/step - loss: 0.1651 - accuracy: 0.9480
Epoch 2/5
938/938 [==============================] - 54s 57ms/step - loss: 0.0467 - accuracy: 0.9859
Epoch 3/5
938/938 [==============================] - 56s 60ms/step - loss: 0.0328 - accuracy: 0.9891
Epoch 4/5
938/938 [==============================] - 52s 55ms/step - loss: 0.0245 - accuracy: 0.9924
Epoch 5/5
938/938 [==============================] - 51s 55ms/step - loss: 0.0194 - accuracy: 0.9942
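
Note that each epoch runs 938 steps because the 60,000 training images are split into batches of 64: 60,000 / 64 = 937.5, which rounds up to 938 batches.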

EVALUATION OF THE MODEL

The trained model reaches about 99.27% accuracy when evaluated on the test set. Increasing the number of training epochs may improve accuracy further.

test_loss, test_acc = model.evaluate(test_images, test_labels)
print(test_acc)
313/313 [==============================] - 3s 10ms/step - loss: 0.0258 - accuracy: 0.9927
0.9926999807357788
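
As a quick sanity check before saving (a minimal sketch, reusing the arrays and model defined above), we can predict a single test image and compare the result with its true label:

import numpy as np

# Predict the first test image; slicing keeps the batch dimension
probs = model.predict(test_images[:1])
print('predicted:', np.argmax(probs[0]), 'actual:', np.argmax(test_labels[0]))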

SAVING THE MODEL

model.save() writes the trained model to an HDF5 file, which we will load again in Part II.

model.save('mnist.h5')
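
To confirm the saved file round-trips correctly, an optional check (assuming the test arrays from above are still in scope) reloads it and re-runs the evaluation:

from keras.models import load_model

# Reload the HDF5 file and verify the metrics match the live model
restored = load_model('mnist.h5')
print(restored.evaluate(test_images, test_labels, verbose=0))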

PART II

The prime goal of Part II is to build a customized graphical user interface. The steps involved are listed below:

  • Setting up the graphical user interface
  • Creating the canvas
  • Making the widgets, including drawing lines, clearing the canvas, predicting, etc.
  • Contouring the digits with OpenCV and NumPy tools
  • Connecting the widget buttons to the functions

Thus, this part consists of a total of 10 sections, split for ease of understanding.

IMPORTING THE LIBRARIES

Required libraries are imported based on the needs of the project.

from tkinter import *

import cv2
import numpy as np
from PIL import ImageGrab
from keras.models import load_model

LOADING THE MODEL

load_model() loads the model saved in Part I.

model = load_model('C:/Users/Jerrin/Desktop/REC/mnist.h5')
image_folder = "C:/Users/Jerrin/Desktop/REC"

GUI SETUP

Tk() creates the root window of the application; tkinter is Python's standard interface to the Tk GUI toolkit. Running these commands opens a simple, non-resizable window titled "HDR".

root = Tk()
root.resizable(0, 0)
root.title("HDR")

lastx, lasty = None, None
image_number = 0

DECLARATION OF CANVAS

Canvas() creates the drawing area inside the window opened by Tk(); this is the space where the digits will be drawn and where graphics can be placed.

grid() arranges the widgets in a table-like structure; it will also position the Clear Widget and Recognize Digit buttons later.

cv = Canvas(root, width=640, height=480, bg='white')
cv.grid(row=0, column=0, pady=2, sticky=W, columnspan=2)

CLEARING WIDGETS

Function to delete the handwritten digits from the grid window.

def clear_widget():
    global cv
    cv.delete('all')

DRAWING LINES

create_line() draws line segments on the canvas as the mouse moves, which is how the handwritten digits take shape.

def draw_lines(event):
    global lastx, lasty
    x, y = event.x, event.y
    cv.create_line((lastx, lasty, x, y), width=8, fill='black', capstyle=ROUND, smooth=TRUE, splinesteps=12)
    lastx, lasty = x, y

ACTIVATING THE EVENT

When the left mouse button is pressed, the handler records the starting coordinates of the event and binds the mouse-drag motion to draw_lines().

def activate_event(event):
    global lastx, lasty
    cv.bind('<B1-Motion>', draw_lines)
    lastx, lasty = event.x, event.y
cv.bind('<Button-1>', activate_event)

IDENTIFYING AND CROPPING DIGITS

The canvas area is isolated from the window by grabbing the screen and cropping it to the widget's bounding box; save() then writes the cropped image to disk. The image is read back, converted to grayscale, and thresholded before the contours are found. The function continues in the next section.

def Recognize_Digit():
    global image_number
    filename = '/predict1.jpg'
    widget = cv

    x = root.winfo_rootx() + widget.winfo_rootx()
    y = root.winfo_rooty() + widget.winfo_rooty()
    x1 = x + widget.winfo_width()
    y1 = y + widget.winfo_height()
    print(x, y, x1, y1)

    # get image and save
    ImageGrab.grab().crop((x, y, x1, y1)).save(image_folder + filename)

    # Read the saved image back, convert to grayscale, and apply inverted
    # Otsu thresholding so the digits become white on a black background
    image = cv2.imread(image_folder + filename, cv2.IMREAD_COLOR)
    gray = cv2.cvtColor(image.copy(), cv2.COLOR_BGR2GRAY)
    ret, th = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)

    # Find the external contours; with OpenCV 4.x, findContours returns
    # (contours, hierarchy), so index 0 is the list of contours
    contours = cv2.findContours(th, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)[0]

CONTOURING OF DIGITS

The following loop is the continuation of the Recognize_Digit() function from the previous section. Contouring happens in a few distinct steps; the four important ones are:

  • Cropping the digit out of the image corresponding to the current contour in the for loop
  • Resizing that digit to (18, 18)
  • Padding the digit with 5 pixels of black (zeros) on each side to produce a (28, 28) image
  • Predicting the handwritten digit

    for cnt in contours:
        # Bounding box of the current contour
        x, y, w, h = cv2.boundingRect(cnt)
        # Draw a rectangle around each detected digit
        cv2.rectangle(image, (x, y), (x + w, y + h), (255, 0, 0), 1)

        # Crop the digit out of the thresholded image
        digit = th[y:y + h, x:x + w]

        # Resize to 18x18, then pad 5 black pixels on each side -> 28x28
        resized_digit = cv2.resize(digit, (18, 18))
        padded_digit = np.pad(resized_digit, ((5, 5), (5, 5)), "constant", constant_values=0)

        # Reshape to the model's input shape and scale pixels to [0, 1]
        digit = padded_digit.reshape(1, 28, 28, 1)
        digit = digit / 255.0

        # Predict the digit and read off the most likely class
        pred = model.predict(digit)[0]
        final_pred = np.argmax(pred)

        data = str(final_pred) + ' ' + str(int(max(pred) * 100)) + '%'

        # Write the prediction and confidence above the bounding box
        font = cv2.FONT_HERSHEY_SIMPLEX
        fontScale = 0.5
        color = (255, 0, 0)
        thickness = 1
        cv2.putText(image, data, (x, y - 5), font, fontScale, color, thickness)

    cv2.imshow('image', image)
    cv2.waitKey(0)
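
One refinement worth noting for number plates, where several digits sit side by side: cv2.findContours() returns contours in no guaranteed order, so the predictions may appear out of sequence. Sorting the contours by the x coordinate of their bounding boxes before the loop (a one-line addition, not part of the original code) makes the digits read left to right:

    # Sort contours left to right by bounding-box x coordinate
    contours = sorted(contours, key=lambda c: cv2.boundingRect(c)[0])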

BUTTONS

Button() creates the buttons for the interface. The function takes two important parameters:

  • text – The text displayed on the button, also termed the button's label.
  • command – The function or method to be invoked when the button is clicked.
btn_save = Button(text='Recognize Digit', command=Recognize_Digit)
btn_save.grid(row=2, column=0, pady=1, padx=1)
button_clear = Button(text='Clear Widget', command=clear_widget)
button_clear.grid(row=2, column=1, pady=1, padx=1)

root.mainloop()

FINAL THOUGHTS

Number plate detection with the help of the MNIST dataset is thus done and elaborated in this article. To summarize, we trained a convolutional model with max-pooling layers on the MNIST dataset of 60,000 images and saved it. We then created a custom user interface to draw digits, recognize them, and clear the drawings from the grid window. Techniques like these can support autonomous number plate detection from a live feed in day-to-day surveillance systems.

Hope this tutorial, in general, helped the readers in understanding how to train datasets, create a custom user interface, and deploy the trained model in a customized graphical user interface.

To download the source code, click here.

Click here to check out more of my blogs.

Thank you!!!
