Hand tracking system using OpenCV, Mediapipe, and Pyfirmata

Wave your hand in front of your smart TV and instantly access your favorite show for the evening. Set an alarm for that essential Monday morning meeting by tapping your finger to your smartwatch. It’s no longer so far-fetched: hand tracking and gesture recognition technology are permeating a variety of industries.

But obstacles remain, from the awkwardness of gesturing in front of a smartphone’s tiny screen to the sophisticated machine learning algorithms needed to recognize gestures beyond a simple wave.

In this tutorial, we’ll look at how to use OpenCV, Mediapipe, and Pyfirmata to build our own hand tracking system in Python.

So, let’s get this tutorial started…

Before writing any code, we must first install the required Python libraries. The leading ! runs pip from inside a notebook cell; drop it when installing from a terminal.

!pip install mediapipe
!pip install pyfirmata
!pip install opencv-python

To create the project, we’ll import the following modules in our code.

#import necessary modules
import mediapipe as mp
import pyfirmata
import numpy as np
import cv2

We’ll create an object called ‘hands’ from the Hands() class in mp.solutions.hands.
The min_detection_confidence argument is set to 0.8, so a hand is reported only when the detector is at least 80% confident.

mpHand = mp.solutions.hands
mpDraw = mp.solutions.drawing_utils
hands = mpHand.Hands(min_detection_confidence=0.8)
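
Hands() accepts a few other keyword arguments besides min_detection_confidence. As a rough sketch, assuming the legacy mp.solutions API used in this tutorial, the options can be collected in one place. HAND_OPTIONS and make_hands are names made up for this example, and every value other than 0.8 is illustrative rather than taken from the tutorial.

```python
# Options for mp.solutions.hands.Hands (legacy Solutions API).
# HAND_OPTIONS and make_hands are illustrative names, not part of Mediapipe.
HAND_OPTIONS = {
    "static_image_mode": False,       # treat input as a video stream and track hands between frames
    "max_num_hands": 1,               # the tutorial only reads the first detected hand
    "min_detection_confidence": 0.8,  # same threshold as in the tutorial
    "min_tracking_confidence": 0.5,   # library default
}


def make_hands(options=None):
    """Create a Hands object from a dict of keyword arguments."""
    import mediapipe as mp  # imported here so the options can be inspected without Mediapipe
    return mp.solutions.hands.Hands(**(options or HAND_OPTIONS))
```

Calling make_hands() would then stand in for mpHand.Hands(min_detection_confidence=0.8), with the other settings spelled out explicitly.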

To run our program, we’ll open a video file or the webcam and process it in a while loop. Here, we read each frame, mirror it, and convert it from BGR to RGB. We then use the hands.process() function to find any hands in the frame. If hands are detected, mpDraw.draw_landmarks draws their landmarks and the connections between them, and cv2.circle highlights the tip of the index finger (landmark 8).
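
Mediapipe reports each landmark with x and y normalized to the frame size, so the loop below multiplies them by the frame width and height to get pixel positions. A minimal standalone sketch of that step follows. to_pixel and INDEX_FINGER_TIP are names made up here (8 is the index Mediapipe assigns to the tip of the index finger), and the clamping is an added assumption, since a landmark can fall slightly outside the frame.

```python
INDEX_FINGER_TIP = 8  # Mediapipe hand landmark index for the tip of the index finger


def to_pixel(x_norm, y_norm, width, height):
    """Convert normalized landmark coordinates to pixel coordinates inside the frame."""
    px = int(x_norm * width)
    py = int(y_norm * height)
    # Clamp to the valid pixel range (an addition, not in the tutorial's loop).
    px = min(max(px, 0), width - 1)
    py = min(max(py, 0), height - 1)
    return px, py


# Example: a landmark at the center of a 640x480 frame
print(to_pixel(0.5, 0.5, 640, 480))  # (320, 240)
```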

# cap = cv2.VideoCapture(0)  # use this line instead for a webcam
cap = cv2.VideoCapture("vid.mp4")

while cap.isOpened():
    success, img = cap.read()
    if not success:
        break  # end of the video, or no frame could be read
    img = cv2.flip(img, 1)
    img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)
    results = hands.process(img)
    img = cv2.cvtColor(img,cv2.COLOR_RGB2BGR)

    #Detecting Hands
    multiHandDetection = results.multi_hand_landmarks 
    lmList = []

    if multiHandDetection:
        #Hand Visualization
        for handLms in multiHandDetection:
            mpDraw.draw_landmarks(img, handLms, mpHand.HAND_CONNECTIONS,
                                  mpDraw.DrawingSpec(color=(0, 255, 255), thickness=5, circle_radius=8),
                                  mpDraw.DrawingSpec(color=(0, 0, 0), thickness=5))

        #Hand Tracking process
        singleHandDetection = multiHandDetection[0]
        for lm in singleHandDetection.landmark:
            h, w, c = img.shape
            lm_x, lm_y = int(lm.x*w), int(lm.y*h)
            lmList.append([lm_x, lm_y])


        # Draw the index fingertip (landmark 8) with its coordinates and crosshair lines
        myLP = lmList[8]
        px, py = myLP[0], myLP[1]
        cv2.circle(img, (px, py), 15, (255, 0, 255), cv2.FILLED)
        cv2.putText(img, str((px, py)), (px + 10, py - 10), cv2.FONT_HERSHEY_PLAIN, 2, (0, 0, 255), 3)
        cv2.line(img, (0, py), (w, py), (0, 0, 0), 2)  # horizontal line through the fingertip
        cv2.line(img, (px, h), (px, 0), (0, 0, 0), 2)  # vertical line through the fingertip

        print(f'Hand Position x: {px} y: {py}')
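
We have completed the hand detection step and now write the output code. These lines still belong inside the while loop; once the loop ends, we release the capture and close the window.

    # cv2.imshow() displays the current frame in a window
    cv2.imshow("Image", img)
    key = cv2.waitKey(1)
    if key == ord('q'):  # press q to quit
        break

cap.release()
cv2.destroyAllWindows()
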
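The tutorial installs and imports pyfirmata, but the loop only prints the fingertip position. One way the position could be passed on to an Arduino is sketched here under several assumptions: a servo on digital pin 9, a board running the StandardFirmata sketch, and a serial port name you would replace with your own. x_to_angle and open_servo are hypothetical helpers, not part of the original code.

```python
def x_to_angle(px, frame_width, max_angle=180):
    """Map a pixel x position (0..frame_width-1) linearly to a servo angle (0..max_angle)."""
    if frame_width < 2:
        raise ValueError("frame_width must be at least 2")
    px = min(max(px, 0), frame_width - 1)  # keep the position inside the frame
    return round(px * max_angle / (frame_width - 1))


def open_servo(port, pin=9):
    """Connect to an Arduino and return (board, servo_pin). Port and pin are assumptions."""
    import pyfirmata  # imported here so x_to_angle works without hardware attached
    board = pyfirmata.Arduino(port)
    servo = board.get_pin(f"d:{pin}:s")  # d = digital pin, s = servo mode
    return board, servo
```

With board, servo = open_servo("COM3") created once before the loop (the port name is only an example), calling servo.write(x_to_angle(px, w)) right after px is computed would turn the servo as the fingertip moves across the frame.
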
It’s all done now, hurrah! Let’s have a look at the results.

So, we have finally learned how to build a hand tracking system. You now have all of the concepts you need to create a hand tracking program of your own. We hope you had fun with this tutorial.
