Number plate detection with MNIST using Keras in Python

In this article, autonomous number plate detection with the MNIST dataset is built and explained in detail from scratch, starting from training the model through to developing the user interface, using Python with the Keras API on TensorFlow.
We train on the MNIST dataset, which consists of 60,000 handwritten digit images of 28×28 pixels each and is commonly used for training image processing systems. The trained model is then deployed into a customized graphical user interface, where testing is done with various widgets and contouring tools. Algorithms like these help improve security and make surveillance more accurate and convenient.
The two main constituents of this article are:
- Training on the MNIST dataset, resulting in 99.26% accuracy.
- Creating the custom user interface: the widgets, the contouring, and recognizing the digits with the trained MNIST model from PART I.
Hope this tutorial will be helpful to the readers.
Let’s dive into the code and stay tuned.
Happy Reading!!!
PART I
Here in PART I, we train the model on the MNIST dataset of handwritten digits, which consists of 60,000 images of 28×28 pixels each.
The steps involved in this PART I are:
- Load the dataset into the project
- Add layers / build the model
- Normalize the accumulated images
- Compile and train the model
- Evaluate the trained model and study the results
- Save the model for easy use in PART II
Thus, this part consists of six sections, organized for easy understanding and ease of explanation.
IMPORTING LIBRARIES
Required Python libraries for this section are imported.
from keras import layers
from keras import models
from keras.datasets import mnist
from keras.utils import to_categorical
LOADING THE DATASET AND ADDING LAYERS
The dataset is downloaded from Keras and unpacked into training and test image and label arrays. The model is then built from 2D convolutional layers, max-pooling layers, and dense layers.
(train_images, train_labels), (test_images, test_labels) = mnist.load_data()

model = models.Sequential()
model.add(layers.Conv2D(32, (3, 3), activation='relu', input_shape=(28, 28, 1)))
model.add(layers.MaxPooling2D((2, 2)))
model.add(layers.Conv2D(64, (3, 3), activation='relu'))
model.add(layers.MaxPooling2D((2, 2)))
model.add(layers.Conv2D(64, (3, 3), activation='relu'))
model.add(layers.Flatten())
model.add(layers.Dense(64, activation='relu'))
model.add(layers.Dense(10, activation='softmax'))
model.summary()
Downloading data from https://storage.googleapis.com/tensorflow/tf-keras-datasets/mnist.npz
11493376/11490434 [==============================] - 0s 0us/step
Model: "sequential"
_________________________________________________________________
Layer (type)                 Output Shape              Param #
=================================================================
conv2d (Conv2D)              (None, 26, 26, 32)        320
_________________________________________________________________
max_pooling2d (MaxPooling2D) (None, 13, 13, 32)        0
_________________________________________________________________
conv2d_1 (Conv2D)            (None, 11, 11, 64)        18496
_________________________________________________________________
max_pooling2d_1 (MaxPooling2 (None, 5, 5, 64)          0
_________________________________________________________________
conv2d_2 (Conv2D)            (None, 3, 3, 64)          36928
_________________________________________________________________
flatten (Flatten)            (None, 576)               0
_________________________________________________________________
dense (Dense)                (None, 64)                36928
_________________________________________________________________
dense_1 (Dense)              (None, 10)                650
=================================================================
Total params: 93,322
Trainable params: 93,322
Non-trainable params: 0
_________________________________________________________________
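As a sanity check on the summary above, the parameter counts follow directly from the layer shapes: a Conv2D layer has (kernel height × kernel width × input channels × filters) weights plus one bias per filter, and a Dense layer has (inputs × units) weights plus one bias per unit:

(3 × 3 × 1) × 32 + 32 = 320        # conv2d
(3 × 3 × 32) × 64 + 64 = 18,496    # conv2d_1
(3 × 3 × 64) × 64 + 64 = 36,928    # conv2d_2
576 × 64 + 64 = 36,928             # dense (576 = 3 × 3 × 64, flattened)
64 × 10 + 10 = 650                 # dense_1

These add up to the 93,322 trainable parameters reported.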
NORMALIZING IMAGES
Normalization is an especially important step when working with big datasets. In simple words, it means reshaping each image into the (28, 28, 1) tensor shape the network expects and rescaling the pixel values from the 0–255 range down to 0–1. The labels are also one-hot encoded with to_categorical() to match the 10-way softmax output layer.
train_images = train_images.reshape((60000, 28, 28, 1))
train_images = train_images.astype('float32') / 255
test_images = test_images.reshape((10000, 28, 28, 1))
test_images = test_images.astype('float32') / 255

train_labels = to_categorical(train_labels)
test_labels = to_categorical(test_labels)
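To make the one-hot encoding concrete, here is a minimal standalone illustration (the label values below are made up for the example):

import numpy as np
from keras.utils import to_categorical

labels = np.array([5, 0, 9])
print(to_categorical(labels, num_classes=10))
# [[0. 0. 0. 0. 0. 1. 0. 0. 0. 0.]
#  [1. 0. 0. 0. 0. 0. 0. 0. 0. 0.]
#  [0. 0. 0. 0. 0. 0. 0. 0. 0. 1.]]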
COMPILING AND TRAINING THE MODEL
The model is compiled and trained on the dataset with model.compile() and model.fit().
model.compile(optimizer='rmsprop',
              loss='categorical_crossentropy',
              metrics=['accuracy'])
model.fit(train_images, train_labels, epochs=5, batch_size=64)
Epoch 1/5
938/938 [==============================] - 52s 56ms/step - loss: 0.1651 - accuracy: 0.9480
Epoch 2/5
938/938 [==============================] - 54s 57ms/step - loss: 0.0467 - accuracy: 0.9859
Epoch 3/5
938/938 [==============================] - 56s 60ms/step - loss: 0.0328 - accuracy: 0.9891
Epoch 4/5
938/938 [==============================] - 52s 55ms/step - loss: 0.0245 - accuracy: 0.9924
Epoch 5/5
938/938 [==============================] - 51s 55ms/step - loss: 0.0194 - accuracy: 0.9942
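Note that the 938 steps per epoch in the log follow directly from the batch size: 60,000 training images split into batches of 64 gives ceil(60,000 / 64) = 938 batches per epoch.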
EVALUATION OF THE MODEL
This trained model is 99.26% accurate when evaluated on the held-out test set. Training for more epochs may improve the accuracy slightly, though at some point the model will begin to overfit.
test_loss, test_acc = model.evaluate(test_images, test_labels)
print(test_acc)
313/313 [==============================] - 3s 10ms/step - loss: 0.0258 - accuracy: 0.9927
0.9926999807357788
SAVING THE MODEL
model.save() writes the trained model to a single HDF5 file so it can be reloaded in PART II.
model.save('mnist.h5')
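As a quick optional sanity check (a minimal sketch, assuming the test arrays from the earlier sections are still in scope and mnist.h5 was written to the working directory), the saved model can be reloaded and asked to classify one test image:

import numpy as np
from keras.models import load_model

restored = load_model('mnist.h5')
probs = restored.predict(test_images[:1])[0]  # softmax probabilities for one image
print('predicted digit:', np.argmax(probs))
print('actual digit:', np.argmax(test_labels[0]))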
PART II
The prime goal of PART II is to build the customized graphical user interface. The steps involved are listed below:
- Setting up the Graphical user interface.
- Creating the canvas.
- Creating the widgets, including drawing lines, clearing the canvas, predicting, etc.
- Then, contouring the digits with OpenCV and NumPy tools
- Connecting the widget buttons to the functions
Thus, this part consists of a total of 10 sections, split up for the reader's ease of understanding.
IMPORTING THE LIBRARIES
Required libraries are imported based on the needs of the project.
from tkinter import *
import cv2
import numpy as np
from PIL import ImageGrab
from keras.models import load_model
LOADING THE MODEL
load_model() loads the model saved in PART I.
model = load_model('C:/Users/Jerrin/Desktop/REC/mnist.h5')
image_folder = "C:/Users/Jerrin/Desktop/REC"
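Note that these absolute paths are specific to the author's machine. A more portable sketch (assuming mnist.h5 sits in the same folder as the script) builds the paths relative to the script's own location:

import os

base_dir = os.path.dirname(os.path.abspath(__file__))  # folder containing this script
model = load_model(os.path.join(base_dir, 'mnist.h5'))
image_folder = base_dir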
GUI SETUP
Tk() is the Python interface to the Tcl/Tk GUI toolkit. Running these commands opens a simple, non-resizable window titled "HDR".
root = Tk()
root.resizable(0, 0)
root.title("HDR")

lastx, lasty = None, None
image_number = 0
DECLARATION OF CANVAS
Canvas() creates the drawing area inside the window created by the Tk command; this is the space where drawing is done and graphics are placed.
grid() arranges the widgets in a table-like structure, leaving room underneath the canvas for the Clear Widget and Recognize Digit buttons.
cv = Canvas(root, width=640, height=480, bg='white')
cv.grid(row=0, column=0, pady=2, sticky=W, columnspan=2)
CLEARING WIDGETS
A small helper function that erases everything drawn on the canvas.
def clear_widget():
    global cv
    cv.delete('all')
DRAWING LINES
create_line() draws short line segments on the canvas between the previous and current mouse positions, which is how the handwritten digits take shape.
def draw_lines(event):
    global lastx, lasty
    x, y = event.x, event.y
    cv.create_line((lastx, lasty, x, y), width=8, fill='black',
                   capstyle=ROUND, smooth=TRUE, splinesteps=12)
    lastx, lasty = x, y
ACTIVATING THE EVENT
Activating the event by tracking the mouse coordinates: pressing the left button records the starting point of a stroke, and dragging with the button held calls draw_lines() for every motion event.
def activate_event(event):
    global lastx, lasty
    cv.bind('<B1-Motion>', draw_lines)
    lastx, lasty = event.x, event.y

cv.bind('<Button-1>', activate_event)
IDENTIFYING AND CROPPING DIGITS
This function grabs the canvas region of the screen, saves it to disk, then converts the saved image to grayscale, thresholds it (inverted binary with Otsu's method), and finds the external contours of the drawn digits.
def Recognize_Digit():
    global image_number
    filename = '/predict1.jpg'
    widget = cv

    # winfo_rootx()/winfo_rooty() already return the canvas's absolute
    # screen coordinates, so they are used directly for the crop box
    x = widget.winfo_rootx()
    y = widget.winfo_rooty()
    x1 = x + widget.winfo_width()
    y1 = y + widget.winfo_height()
    print(x, y, x1, y1)

    # grab the canvas region of the screen and save it as an image
    ImageGrab.grab().crop((x, y, x1, y1)).save(image_folder + filename)

    image = cv2.imread(image_folder + filename, cv2.IMREAD_COLOR)
    gray = cv2.cvtColor(image.copy(), cv2.COLOR_BGR2GRAY)
    ret, th = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)
    contours = cv2.findContours(th, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)[0]
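One caveat worth flagging: the return signature of cv2.findContours differs between OpenCV versions (4.x returns (contours, hierarchy), while 3.x returned (image, contours, hierarchy)), so indexing [0] only yields the contours on 4.x. A version-agnostic sketch of the unpacking:

found = cv2.findContours(th, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
contours = found[0] if len(found) == 2 else found[1]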
CONTOURING OF DIGITS
Contouring the digits happens in a few distinct steps. The four important steps in this section are:
- Cropping the digit out of the image at the bounding box of the current contour in the for loop
- Resizing that digit to (18, 18)
- Padding the digit with 5 pixels of black (zeros) on each side to produce a (28, 28) image (18 + 5 + 5 = 28)
- Predicting the handwritten digit with the trained model
    # ...continuing inside Recognize_Digit()
    for cnt in contours:
        # make a rectangle box around each contour
        x, y, w, h = cv2.boundingRect(cnt)
        cv2.rectangle(image, (x, y), (x + w, y + h), (255, 0, 0), 1)

        # crop out the digit, resize it to 18x18, then pad to 28x28
        digit = th[y:y + h, x:x + w]
        resized_digit = cv2.resize(digit, (18, 18))
        padded_digit = np.pad(resized_digit, ((5, 5), (5, 5)),
                              "constant", constant_values=0)

        # reshape and rescale to match the network's input
        digit = padded_digit.reshape(1, 28, 28, 1)
        digit = digit / 255.0

        # predict, then annotate the image with the digit and its confidence
        pred = model.predict(digit)[0]
        final_pred = np.argmax(pred)
        data = str(final_pred) + ' ' + str(int(max(pred) * 100)) + '%'
        font = cv2.FONT_HERSHEY_SIMPLEX
        fontScale = 0.5
        color = (255, 0, 0)
        thickness = 1
        cv2.putText(image, data, (x, y - 5), font, fontScale, color, thickness)

    cv2.imshow('image', image)
    cv2.waitKey(0)
BUTTONS
Button() creates the buttons for the interface. Two of its parameters matter here:
- text – the text to display on the button, also termed the button's label.
- command – the function or method to be invoked when the button is clicked.
btn_save = Button(text='Recognize Digit', command=Recognize_Digit)
btn_save.grid(row=2, column=0, pady=1, padx=1)

button_clear = Button(text='Clear Widget', command=clear_widget)
button_clear.grid(row=2, column=1, pady=1, padx=1)

root.mainloop()
FINAL THOUGHTS
Number plate detection with the help of the MNIST dataset has thus been worked through and elaborated in this article. Summarizing what we studied here: we trained a model on the MNIST dataset of 60,000 images using convolution and max-pooling layers, then saved the model. Following that, we created a custom user interface to draw digits, recognize them with the trained model, and clear the drawn digits from the window. Techniques like this feed directly into autonomous number plate detection from a live feed, supporting surveillance systems in day-to-day circumstances.
Hope this tutorial helped the readers understand how to train a model on a dataset, create a custom user interface, and deploy the trained model into that interface.
Thank you!!!