Implementation of a captcha solver using Keras
Hi everyone. In the previous tutorial, we saw how to build a basic captcha solver in TensorFlow using Keras. In this tutorial, we will look at how to turn it into a real captcha solver. Previously, we built an MNIST image classification model using a custom CNN in Keras; here, we will use that model to implement a real-time captcha solver.
IMPORT THE REQUIRED LIBRARIES
import cv2
import matplotlib.pyplot as plt
import numpy as np
from PIL import Image
import pandas as pd
import glob
import imutils
from imutils import paths
import os
import os.path
import collections
import tensorflow as tf
from tensorflow import keras
As usual, we use the familiar data science libraries NumPy and pandas, Matplotlib for visualisations, and Keras for building our deep learning model. cv2 (OpenCV) is commonly used for image analysis. We also use a new library called imutils, which provides convenience functions for common OpenCV image-processing operations.
img = cv2.imread("64.png")
im2 = img.copy()  # working copy from which the digits will be cropped
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
ret, thresh1 = cv2.threshold(gray, 10, 255, cv2.THRESH_BINARY_INV)
rect_kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (5, 5))
dilation = cv2.dilate(thresh1, rect_kernel, iterations=1)
# cv2.findContours returns (contours, hierarchy) in OpenCV 4
contours, hierarchy = cv2.findContours(dilation, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_NONE)
First, download any random captcha from the internet and load it into your Colab environment or Jupyter notebook. Convert the RGB image into a grayscale image using the cvtColor function in cv2. Our idea is to find the individual digits in the captcha and extract them for identification. We apply a threshold and dilation, then use contours to identify the bounding boxes around the digits. Finally, we copy the image to another image (im2) from which the regions of interest will be cropped.
EXTRACTING INDIVIDUAL DIGITS
letters = dict()
position = list()
for cnt in contours:
    x, y, w, h = cv2.boundingRect(cnt)
    # rect = cv2.rectangle(im2, (x, y), (x + w, y + h), (0, 255, 0), 2)
    cropped = im2[y:y + h, x:x + w]
    # text = pytesseract.image_to_string(cropped)
    # print(text)
    position.append(x)
    letters[x] = cropped
We created a dictionary to keep track of the order of the digits in the input image. boundingRect extracts the bounding-box coordinates from each contour. Each cropped digit image is stored in the dictionary keyed by the x-coordinate of its contour, so the digits can later be sorted in increasing order of x.
The captcha image we downloaded can be found here.
position = sorted(position)
letters = collections.OrderedDict(sorted(letters.items()))
len(position)
The above code outputs the number of individual digits detected in the captcha.
CHECKING THE EXTRACTED DIGITS
for i in range(len(position)):
    plt.subplot(1, len(position), i + 1)
    plt.imshow(cv2.resize(letters[position[i]], (28, 28), interpolation=cv2.INTER_LINEAR))
plt.show()
The above code plots the extracted digit images whose coordinates were stored in the dictionary we created earlier.
def solver(model, weights, letters, position):
    n = len(position)
    model.load_weights(weights)
    for i in range(n):
        im = letters[position[i]]
        im = cv2.cvtColor(im, cv2.COLOR_BGR2GRAY)
        im = cv2.resize(im, (28, 28), interpolation=cv2.INTER_LINEAR)
        im = np.expand_dims(im, axis=-1)  # add channel dimension -> (28, 28, 1)
        im = np.expand_dims(im, axis=0)   # add batch dimension -> (1, 28, 28, 1)
        print(np.argmax(model.predict(im), axis=-1))
We created a helper function for our captcha solver. It loads the model weights that we downloaded in the previous tutorial. Then, for each extracted image, it converts the crop into a tensor of shape (1, 28, 28, 1) that the model accepts as input. Finally, it runs the model on that tensor and prints the predicted digit.
model = keras.models.Sequential()
model.add(keras.layers.Conv2D(64, 7, activation='relu', padding='same', input_shape=[28, 28, 1]))
model.add(keras.layers.MaxPooling2D(2))
model.add(keras.layers.Conv2D(128, 3, activation='relu', padding='same'))
model.add(keras.layers.Conv2D(128, 3, activation='relu', padding='same'))
model.add(keras.layers.MaxPooling2D(2))
model.add(keras.layers.Conv2D(256, 3, activation='relu', padding='same'))
model.add(keras.layers.Conv2D(256, 3, activation='relu', padding='same'))
model.add(keras.layers.MaxPooling2D(2))
model.add(keras.layers.Flatten())
model.add(keras.layers.Dense(128, activation='relu'))
model.add(keras.layers.Dropout(0.5))
model.add(keras.layers.Dense(64, activation='relu'))
model.add(keras.layers.Dropout(0.5))
model.add(keras.layers.Dense(10, activation='softmax'))
This is the model that we created in our previous tutorial. So no explanation is needed on how we built the model. Refer to my previous tutorial to know more about the model-building procedure. The link can be found here.
We compiled our model using sparse_categorical_crossentropy as the loss function and the Adam optimiser, with accuracy as the evaluation metric. We don't need to train the model, as we already trained it in the previous tutorial and downloaded the model weights. You can upload the weights file to your Colab notebook and reuse it anytime.
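The compile step described above can be sketched as follows. The small stand-in model here is only for illustration; the article's full CNN would be compiled the same way.

```python
from tensorflow import keras

# Minimal stand-in model for demonstration; the article's full CNN
# is compiled with exactly the same arguments.
model = keras.models.Sequential([
    keras.layers.Flatten(input_shape=[28, 28, 1]),
    keras.layers.Dense(10, activation='softmax'),
])

# sparse_categorical_crossentropy expects integer labels (0-9),
# matching the ten MNIST digit classes; accuracy is the metric.
model.compile(loss='sparse_categorical_crossentropy',
              optimizer='adam',
              metrics=['accuracy'])
```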
weights = 'Captcha Solver.h5'
solver(model, weights, letters, position)
We pointed the solver at our model weights file and made predictions on the extracted individual digits. Our model predicted all the digits correctly, and that's great. Now our captcha solver is ready.
In this tutorial, we implemented the captcha solver that we discussed in the previous tutorial. We extracted individual digits from our captcha and fed them to the model that we previously trained on the MNIST digits dataset, and the model predicted the captcha perfectly. You can also look at my previous tutorial to learn more about how we built and trained our custom CNN model.