Implementation of a Captcha Solver Using Keras

INTRODUCTION

Hi everyone. In the previous tutorial, we saw how to build a basic captcha solver in TensorFlow using Keras. In this tutorial, we will look at how to turn that into a real captcha solver. Previously, we built an MNIST image classification model using a custom CNN in Keras; here, we will use that model to solve a real captcha.

IMPORT THE REQUIRED LIBRARIES

import cv2 
import matplotlib.pyplot as plt
import numpy as np
from PIL import Image
import pandas as pd
import glob
import imutils
from imutils import paths
import os
import os.path
import collections
import tensorflow as tf
from tensorflow import keras

As usual, we have used the familiar data science libraries: NumPy and pandas for data handling, Matplotlib for visualisations, and Keras for building our deep learning model. cv2 (OpenCV) is the library we rely on for the image processing steps, while PIL and imutils provide convenient helpers for loading and manipulating images.

PREPROCESSING INPUTS

img = cv2.imread("64.png")                                            # load the captcha image (BGR)
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)                          # convert to grayscale
ret, thresh1 = cv2.threshold(gray, 10, 255, cv2.THRESH_BINARY_INV)    # binarise and invert
rect_kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (5, 5))
dilation = cv2.dilate(thresh1, rect_kernel, iterations=1)             # thicken strokes so each digit forms one blob
# imutils.grab_contours handles the different return signatures of OpenCV 3 and 4
contours = imutils.grab_contours(
    cv2.findContours(dilation, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_NONE))

First, download any captcha image from the internet and load it into your Colab environment or Jupyter notebook. Then convert the RGB image into a grayscale image using the cvtColor function in cv2. The idea is to find the individual digits in the captcha and extract each of them for identification. Thresholding turns the image into a clean black-and-white mask, dilation joins the strokes of each digit into a single blob, and the contours found on that mask give us the bounding boxes around the digits. Finally, we copy the original image so we can crop the regions of interest from it.

im2 = img.copy()   # keep a copy of the original image to crop from
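
To check that the contours actually enclose the digits, a quick sketch like the one below (using the contours and im2 from the code above) draws each bounding box on a throwaway copy of the image:

# Sanity check: draw the detected bounding boxes on a throwaway copy.
boxed = im2.copy()
for cnt in contours:
    x, y, w, h = cv2.boundingRect(cnt)
    cv2.rectangle(boxed, (x, y), (x + w, y + h), (0, 255, 0), 2)   # green box around each digit
plt.imshow(cv2.cvtColor(boxed, cv2.COLOR_BGR2RGB))                 # convert BGR to RGB for matplotlib
plt.show()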

EXTRACTING INDIVIDUAL DIGITS

letters = dict()
position = list()
for cnt in contours:
    x, y, w, h = cv2.boundingRect(cnt)      # bounding box of this contour
    cropped = im2[y:y + h, x:x + w]         # crop the digit from the original image
    position.append(x)                      # remember where the digit sits horizontally
    letters[x] = cropped                    # map x-coordinate -> cropped digit image

We created a dictionary to keep track of the order of the digits in the input image. cv2.boundingRect extracts the bounding box coordinates from each contour, and those coordinates are used to crop the digit out of the image. Each crop is stored in the dictionary keyed by its x-coordinate, so the digits can later be sorted in increasing order of x, i.e. from left to right.

plt.imshow(thresh1,cmap='binary')

The plot above shows the thresholded version of the captcha we downloaded.

position = sorted(position)                                   # left-to-right order of the digits
letters = collections.OrderedDict(sorted(letters.items()))    # cropped digits in the same order
len(position)
5

The output above shows the number of individual digits extracted from the captcha, five in this case.
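
If you try this on a noisier captcha, stray specks can show up as extra contours. A minimal sketch for filtering them out, assuming the same contours variable and a hand-picked area threshold, could look like this:

# Keep only contours large enough to plausibly be a digit.
# MIN_AREA is an assumption here and will likely need tuning per captcha.
MIN_AREA = 50
digit_contours = [cnt for cnt in contours if cv2.contourArea(cnt) > MIN_AREA]
print(len(digit_contours), "of", len(contours), "contours kept")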

CHECKING THE EXTRACTED DIGITS

plt.imshow(cv2.resize(letters[position[0]],(28,28),interpolation=cv2.INTER_LINEAR))

plt.imshow(cv2.resize(letters[position[1]],(28,28),interpolation=cv2.INTER_LINEAR))

plt.imshow(cv2.resize(letters[position[2]],(28,28),interpolation=cv2.INTER_LINEAR))

plt.imshow(cv2.resize(letters[position[3]],(28,28),interpolation=cv2.INTER_LINEAR))

plt.imshow(cv2.resize(letters[position[4]],(28,28),interpolation=cv2.INTER_LINEAR))

The above code plots each extracted digit, resized to 28 x 28, in the left-to-right order stored in our dictionary.

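Instead of calling plt.imshow once per digit, a short loop over the sorted positions can display all the extracted digits in a single figure:

# Show every extracted digit side by side, in left-to-right order.
fig, axes = plt.subplots(1, len(position), figsize=(10, 2))
for ax, x in zip(axes, position):
    ax.imshow(cv2.resize(letters[x], (28, 28), interpolation=cv2.INTER_LINEAR))
    ax.axis('off')
plt.show()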

HELPER FUNCTION

def solver(model, weights, letters, position):
    n = len(position)
    model.load_weights(weights)                               # load the weights trained on MNIST
    for i in range(n):
        im = letters[position[i]]                             # take the digits left to right
        im = cv2.cvtColor(im, cv2.COLOR_BGR2GRAY)             # the model expects a single channel
        im = cv2.resize(im, (28, 28), interpolation=cv2.INTER_LINEAR)
        im = np.expand_dims(im, axis=-1)                      # add the channel axis -> (28, 28, 1)
        im = np.expand_dims(im, axis=0)                       # add the batch axis  -> (1, 28, 28, 1)
        print(np.argmax(model.predict(im), axis=-1))          # index of the highest softmax score

We created a helper function for our captcha solver. It loads the model weights that we saved in the previous tutorial, converts each extracted image into a tensor of the shape the model expects, and prints the predicted digit for each one.
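
One caveat: MNIST digits are white strokes on a black background, and depending on how the inputs were scaled during training, the raw crop may not look like what the model saw. If the predictions look off, a hedged variant of the per-image preprocessing, assuming the model was trained on normalised MNIST-style inputs, could be:

# Optional extra preprocessing, assuming the model expects normalised,
# MNIST-style inputs (white digit on a black background, values in [0, 1]).
def preprocess_crop(crop):
    gray = cv2.cvtColor(crop, cv2.COLOR_BGR2GRAY)
    _, inv = cv2.threshold(gray, 10, 255, cv2.THRESH_BINARY_INV)       # invert: white digit, black background
    resized = cv2.resize(inv, (28, 28), interpolation=cv2.INTER_LINEAR)
    scaled = resized.astype("float32") / 255.0                         # scale pixel values to [0, 1]
    return scaled.reshape(1, 28, 28, 1)                                # add batch and channel axes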

MODEL

model = keras.models.Sequential()
model.add(keras.layers.Conv2D(64, 7, activation='relu', padding='same', input_shape=[28, 28, 1]))
model.add(keras.layers.MaxPooling2D(2))
model.add(keras.layers.Conv2D(128, 3, activation='relu', padding='same'))
model.add(keras.layers.Conv2D(128, 3, activation='relu', padding='same'))
model.add(keras.layers.MaxPooling2D(2))
model.add(keras.layers.Conv2D(256, 3, activation='relu', padding='same'))
model.add(keras.layers.Conv2D(256, 3, activation='relu', padding='same'))
model.add(keras.layers.MaxPooling2D(2))
model.add(keras.layers.Flatten())
model.add(keras.layers.Dense(128, activation='relu'))
model.add(keras.layers.Dropout(0.5))
model.add(keras.layers.Dense(64, activation='relu'))
model.add(keras.layers.Dropout(0.5))
model.add(keras.layers.Dense(10, activation='softmax'))    # one output per digit class 0-9

This is the model we built in the previous tutorial, so no explanation is needed on how it is constructed. Refer to the previous tutorial, linked at the end of this article, for the full model-building procedure.

model.compile(loss='sparse_categorical_crossentropy', optimizer=keras.optimizers.Adam(learning_rate=0.001), metrics=['accuracy'])

We compiled our model with sparse_categorical_crossentropy as the loss function and the Adam optimiser, using accuracy as the evaluation metric. We don't need to train the model here because it was already trained in the previous tutorial and the weights were saved; just upload the weights file to your Colab notebook and load it whenever you need it.
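
If you are working in Colab, one way to get the weights file into the runtime is the files helper; this is just a sketch, and you can skip the upload step if the file is already on disk:

# Upload the saved weights into the Colab runtime, then load them.
from google.colab import files
files.upload()                            # pick 'Captcha Solver.h5' in the file dialog
model.load_weights('Captcha Solver.h5')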

PREDICTION

weights='Captcha Solver.h5'
solver(model,weights,letters,position)
[2]
[1]
[4]
[8]
[5]

We pointed the solver at our saved weights file and ran predictions on the extracted individual digits. The model predicted every digit correctly, so our captcha solver is ready.
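
As a small extension, the per-digit predictions can be joined into a single string so the solver returns the whole captcha at once; this is a sketch built on the same logic as the helper function above:

# Variant of the helper that returns the captcha as one string
# instead of printing each digit on its own line.
def solve_to_string(model, weights, letters, position):
    model.load_weights(weights)
    digits = []
    for x in sorted(position):
        im = cv2.cvtColor(letters[x], cv2.COLOR_BGR2GRAY)
        im = cv2.resize(im, (28, 28), interpolation=cv2.INTER_LINEAR)
        im = im.reshape(1, 28, 28, 1)
        digits.append(str(np.argmax(model.predict(im), axis=-1)[0]))
    return ''.join(digits)

print(solve_to_string(model, weights, letters, position))   # '21485' for our example captcha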

CONCLUSION

In this tutorial, we implemented the captcha solver we discussed in the previous tutorial. We extracted the individual digits from the captcha, fed them to the model we had previously trained on the MNIST digits dataset, and the model predicted the captcha perfectly. See the previous tutorial to learn more about how we built and trained the custom CNN model.

Training a Captcha solver using TensorFlow CNN architecture

 
