Pytesseract is very slow for real time OCR, any way to optimise my code?

  ocr, opencv, python, python-tesseract, tesseract

I’m trying to create a real time OCR in python using mss and pytesseract.

So far, I’ve been able to capture my entire screen which has a steady FPS of 30. If I wanted to capture a smaller area of around 500×500, I’ve been able to get 100+ FPS.

However, as soon as I include this line of code, text = pytesseract.image_to_string(img), boom 0.8 FPS. Is there any way I could optimise my code to get a better FPS? Also the code is able to detect text, its just extremely slow.

from mss import mss
import cv2
import numpy as np
from time import time
import pytesseract

pytesseract.pytesseract.tesseract_cmd = r'C:UsersVamsiAppDataLocalProgramsTesseract-OCRtesseract.exe'

with mss() as sct:
    # Part of the screen to capture
    monitor = {"top": 200, "left": 200, "width": 500, "height": 500}

    while "Screen capturing":
        begin_time = time()

        # Get raw pixels from the screen, save it to a Numpy array
        img = np.array(sct.grab(monitor))

        # Finds text from the images
        text = pytesseract.image_to_string(img)

        # Display the picture
        cv2.imshow("Screen Capture", img)

        # Display FPS
        print('FPS {}'.format(1 / (time() - begin_time)))

        # Press "q" to quit
        if cv2.waitKey(25) & 0xFF == ord("q"):
            cv2.destroyAllWindows()
            break

Source: Python Questions

LEAVE A COMMENT