PYTHON FOR CHARACTER RECOGNITION – TESSERACT

codabas54 (25) in #opencv • 3 years ago

Tesseract is an open-source optical character recognition (OCR) engine, and pytesseract is the Python wrapper used to call it. It is used to recognize characters embedded in an image. Combined with a powerful library like OpenCV, Tesseract can handle both localizing text in an image (text detection) and understanding what that text says (text recognition).

INSTALLATION PYTHON (3.X):
Open a terminal/command prompt and type:
pip install pytesseract
pip install opencv-python
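
pytesseract only wraps the Tesseract OCR engine, so the engine itself has to be installed separately and be reachable from the command line. A minimal sketch of a check, assuming the executable has its default name tesseract (the helper name find_tesseract is made up for this example):

```python
import shutil

def find_tesseract(name="tesseract"):
    """Return the full path of the Tesseract executable, or None if it is not on the PATH."""
    return shutil.which(name)

path = find_tesseract()
if path is None:
    print("Tesseract engine not found; install it and add it to the PATH.")
else:
    print("Tesseract engine found at", path)
```

If the engine lives in a non-standard location, pytesseract can be pointed at it by setting pytesseract.pytesseract.tesseract_cmd to the full path of the executable.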

OPENING A SIMPLE IMAGE:

Import cv2.
Import pytesseract.
Save the test image in the same directory as the script.
Read the image into a variable with cv2.imread(), passing the file name as the parameter.
To resize the image, use cv2.resize() and pass the required size as (width, height).
Show the image with cv2.imshow('window_name', image_name).
Add cv2.waitKey(0) to keep the window open until a key is pressed.

 import pytesseract
 import cv2
 img = cv2.imread('test.jpg')
 img = cv2.resize(img, (720, 480))
 cv2.imshow('Result', img)
 cv2.waitKey(0)
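
Resizing to a fixed size such as (720, 480) stretches the image when its proportions differ, which can distort the characters. Below is a rough sketch of computing a size that fits inside a target box while keeping the aspect ratio. The helper fit_within is hypothetical and not part of OpenCV. Note that img.shape gives (height, width, channels), while cv2.resize() expects (width, height).

```python
def fit_within(width, height, max_width, max_height):
    """Return (new_width, new_height) that fits in the box and keeps the aspect ratio."""
    scale = min(max_width / width, max_height / height)
    return max(1, round(width * scale)), max(1, round(height * scale))

# Example: a 1600x1200 photo fitted into the 720x480 size used above
new_size = fit_within(1600, 1200, 720, 480)
print(new_size)  # (640, 480)
# With OpenCV this would be used as: img = cv2.resize(img, new_size)
```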

CONVERTING IMAGE TO STRING

Import cv2 and pytesseract.
Save the test image in the same directory as the script.
Read the image into a variable with cv2.imread(), passing the file name as the parameter.
Show the image with cv2.imshow('window_name', image_name).
To convert it to a string, call pytesseract.image_to_string(image_name) with the image variable (not a quoted name) and store the result in a variable.
Print the string.
Add cv2.waitKey(0) to keep the window open until a key is pressed.

    import pytesseract
    import cv2
    img = cv2.imread('test.jpg')
    img = cv2.resize(img, (600, 360))
    print(pytesseract.image_to_string(img))
    cv2.imshow('Result', img)
    cv2.waitKey(0)
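
The string returned by image_to_string() is raw OCR output: it usually contains blank lines and stray whitespace, and some Tesseract versions append a form-feed character at the end. A small sketch of tidying it up before further use follows. clean_ocr_text is a hypothetical helper, and the sample string is made up to mimic such output.

```python
def clean_ocr_text(raw):
    """Strip whitespace and form feeds, and drop empty lines from OCR output."""
    lines = (line.strip() for line in raw.replace("\x0c", "").splitlines())
    return "\n".join(line for line in lines if line)

# Invented sample that imitates typical raw OCR output
sample = "  Hello World \n\n   Tesseract OCR\n\x0c"
print(clean_ocr_text(sample))
```

An empty result is worth checking for, since it means no text was recognized in the image.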

CONVERTING IMAGE-TEXT TO AUDIO
To convert an image to audio, we first convert the image to text and then the text to audio.

Import pytesseract and cv2.
Import os.
Open a command prompt and type pip install gtts.
From gtts import gTTS.
Follow the above steps to convert the image to a string.
Store the extracted string in a variable.
Create the audio with the gTTS() function, passing the text and the language as parameters.
Save the audio using the save() function.
Play the audio using os.system('file_name').
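
Passing a bare file name to os.system() relies on the shell opening the file with its associated program, which works in the Windows command prompt but not in most Linux or macOS shells. A hedged sketch of choosing an opener per platform is shown here. It assumes the common openers start (Windows), open (macOS) and xdg-open (Linux desktops) are available; open_command is a made-up helper name and speech.mp3 is just an example file name.

```python
import sys

def open_command(path, platform=sys.platform):
    """Build a shell command that opens the file with the system's default program."""
    quoted = '"' + path + '"'
    if platform.startswith("win"):
        # The empty "" is the window title argument that start expects first
        return 'start "" ' + quoted
    if platform == "darwin":
        return "open " + quoted
    # Assumption: a Linux desktop with xdg-utils installed
    return "xdg-open " + quoted

print(open_command("speech.mp3"))
# To actually play the file: os.system(open_command("speech.mp3"))
```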

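 import pytesseract
 import cv2
 from gtts import gTTS
 import os

 img = cv2.imread('test.jpg')
 img = cv2.resize(img, (600, 360))
 hImg, wImg, _ = img.shape

 # Run OCR before drawing on the image, so the annotations are not read as text
 boxes = pytesseract.image_to_boxes(img)
 xy = pytesseract.image_to_string(img)

 # Each line of boxes is "char left bottom right top page", origin at bottom-left
 for b in boxes.splitlines():
     b = b.split(' ')
     x, y, w, h = int(b[1]), int(b[2]), int(b[3]), int(b[4])
     # OpenCV's origin is top-left, so flip the y coordinates
     cv2.rectangle(img, (x, hImg - y), (w, hImg - h), (50, 50, 255), 1)
     cv2.putText(img, b[0], (x, hImg - y + 13), cv2.FONT_HERSHEY_SIMPLEX, 0.4, (50, 205, 50), 1)

 cv2.imshow('Detected text', img)

 # gTTS produces MP3 data, so save with an .mp3 extension
 audio = gTTS(text=xy, lang='en', slow=False)
 audio.save("saved_audio.mp3")
 # Opens the file with the default player on Windows
 os.system("saved_audio.mp3")
 # Keep the image window open until a key is pressed
 cv2.waitKey(0)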
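
The box-drawing code above relies on the layout of the image_to_boxes() output: one line per character in the form "char left bottom right top page", with the origin at the bottom-left corner of the image. OpenCV puts the origin at the top-left, which is why the y values are subtracted from the image height. A minimal sketch of that conversion in isolation follows. parse_boxes is a hypothetical helper, and the sample text is invented rather than real OCR output.

```python
def parse_boxes(boxes_text, img_height):
    """Convert image_to_boxes() lines into (char, top_left, bottom_right) in OpenCV coordinates."""
    result = []
    for line in boxes_text.splitlines():
        parts = line.split(" ")
        if len(parts) < 5:
            continue  # skip malformed or empty lines
        char = parts[0]
        left, bottom, right, top = (int(v) for v in parts[1:5])
        # Flip the y axis: Tesseract counts up from the bottom, OpenCV down from the top
        result.append((char, (left, img_height - top), (right, img_height - bottom)))
    return result

# Invented sample in the box-file layout, for a 360-pixel-high image
sample = "H 10 300 25 330 0\ni 28 300 33 325 0"
for char, top_left, bottom_right in parse_boxes(sample, img_height=360):
    print(char, top_left, bottom_right)
```

Each pair of corners can be passed directly to cv2.rectangle(img, top_left, bottom_right, color, thickness).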

#phyton #image2text #image2audio #robot #electronic #raspberrypi
