How to Use Machine Learning to Recognize Text in Images

in #programming, 10 years ago

Here I will show you how to use IBM Watson and Python to read text from images. Practicality? Yes, I'm glad you asked:

  • license plate recognition (from real-time video feeds: CCTV cams, IP cams, etc.)
  • recognizing and storing ID information from photos (sites that use photo-ID verification)
  • converting photos of book pages to documents, PDFs, etc.
  • basically, any kind of text gathering from images and video feeds.

You could write a script that watches the video feed from an IP cam pointed at your parking spot and takes a snapshot whenever it sees a car whose plate number differs from yours (that is, a car parked illegally in your spot). What you do with that information afterwards is up to you :)
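
Here is a minimal sketch of the plate comparison step only, assuming the plate text has already been read from a frame. The names `MY_PLATE`, `normalize_plate` and `is_foreign_plate` are made up for illustration, and the plate value is just a placeholder.

```python
import re

MY_PLATE = "MOS PJ 15"  # hypothetical: your own plate, used only for illustration


def normalize_plate(text):
    """Uppercase the text and drop everything except letters and digits."""
    return re.sub(r"[^A-Z0-9]", "", text.upper())


def is_foreign_plate(recognized_text, my_plate=MY_PLATE):
    """Return True when a plate was read and it is not ours."""
    plate = normalize_plate(recognized_text)
    if not plate:
        # Nothing readable in this frame, so there is nothing to flag.
        return False
    return plate != normalize_plate(my_plate)
```

Normalizing both sides first means that spacing and letter case in the recognized text do not trigger a false alarm.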

Text Recognition with Watson and Python

We will use the IBM Watson Visual Recognition API and we'll call it from Python. I'm doing this under Windows 8 64-bit OS.

Watson is an IBM product. It's basically a complex computer algorithm that makes use of machine learning technology to reveal insights from unstructured data, usually very large amounts of data. To avoid repeating myself, please see the first tutorial I did for Watson and Python. It is a prerequisite if you want to follow along with this one.

Hence, I'm going to make a couple of assumptions:

  • you have a Bluemix account (free)
  • you have Python installed
  • you have the watson_developer_cloud module in Python
  • you have set up the Visual Recognition API in Bluemix

All these prerequisites are explained in detail in part 1.

Now, here comes the easy part.

1. Open the Python command line and run the following:

from watson_developer_cloud import VisualRecognitionV3 as vr
instance = vr(api_key='paste your api_key here', version='2016-05-20')

2. Select an image (local or URL) and run the text recognition feature. For this tutorial I'm using this image:

And I run the following command:

img = instance.recognize_text(images_url='url-path-to-img.jpg')
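
Since the image can be local or a URL, here is a rough sketch of a helper that picks the right argument. `recognize_text_from` is a made-up name, and the `images_file` parameter is an assumption on my part about the module (only `images_url` is used in this tutorial), so check `help(instance.recognize_text)` before relying on it.

```python
def recognize_text_from(instance, source):
    """Call recognize_text with a URL or a local file, depending on `source`."""
    if source.lower().startswith(("http://", "https://")):
        return instance.recognize_text(images_url=source)
    # Assumption: recognize_text also accepts an open binary file
    # through an `images_file` keyword argument.
    with open(source, "rb") as image_file:
        return instance.recognize_text(images_file=image_file)
```

With this, `img = recognize_text_from(instance, 'url-path-to-img.jpg')` would treat the argument as a local path, while anything starting with http:// or https:// goes through `images_url`.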

If I type 'img' at the command line, I get the full result:

It gives us the location of the characters and other relevant information. But we only need the text. So, to get it, I run the following command:

img['images'][0]['text']

Which returns the text from the image ('mos pj 15').
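
If you send more than one image, or an image has no readable text, indexing with [0] and ['text'] directly can fail. Here is a small sketch that collects the text for every image. It assumes the response is a plain dict shaped like the one above; `extract_texts` and the sample response are made up for illustration.

```python
def extract_texts(response):
    """Return the recognized text of every image in the response, in order."""
    texts = []
    for image in response.get("images", []):
        # An image with no readable text may lack the key; treat that as empty.
        texts.append(image.get("text", ""))
    return texts


# Hypothetical response, trimmed to the only field used in this tutorial.
sample = {"images": [{"text": "mos pj 15"}, {}]}
print(extract_texts(sample))
```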

Additionally...

There are other features of the Visual Recognition API that you can use:

  • 'instance.classify(...)' to recognize objects and themes in the images (tutorial 1)
  • 'instance.detect_faces(...)'
  • and others.
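
To see what the "others" are without reading through everything, you can list the public methods of any Python object. This is a generic sketch, not something specific to Watson, and `public_methods` is a made-up helper name.

```python
def public_methods(obj):
    """Return the sorted names of an object's public callable attributes."""
    return sorted(
        name
        for name in dir(obj)
        if not name.startswith("_") and callable(getattr(obj, name))
    )
```

Calling `public_methods(instance)` should give you the method names you can then look up one by one.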

In your Python command line, you can run:

help(instance)

And you'll get the full documentation, often with examples. Happy coding!


To stay in touch, follow @cristi

#programming #machinelearning #python

Cristi Vlad, Self-Experimenter and Author

lately I started thinking about two problems:

a) matching same images with different resolution
b) matching images, which one is part of another

Do you know any solutions, which could help me with that?

I have a vague idea about what you're trying to say. could you give me a specific example please?

@cristi This is an awesome post! So simple and understandable. Hope you'll consider joining us at STEEMBOTS we could use a mind like yours! Upvoted and following you now!

I added the post to my reading list. will get into it later. thank you @williambanks

More more! this is excellent @cristi!

great! i've to brush up on python and start playing with watson api! followed ya!

it's not difficult. you can catch up easily. plus, there are so many api in watson that can be used. it's crazy! followed you back!

thanks! watson is just incredible! jeopardy is one thing, but all the deep learning and ai he does in lots of different fields is incredible

Upped and followed, thanks! Bookmarking with cashtags $b.machinelearning $b.ibm $b.python $b.dev $b.gpu

thanks! followed you back! :)

did someone say bashtags? #!dev #!gpu LOL

nicely done, following you now to see more about this topic

thank you!

Great article!

thank you Luke! :)
