Using Image Processing for Business Card Layout and Text Recognition

Image processing, a general term for the many methods of using computing power to extract data from pictures, is a hot topic among developers. Over the last decades it has greatly impacted the medical industry, space exploration, geology, and oceanography, and today, with a powerful digital camera in every phone, its applications are quickly becoming relevant not only to scientists but to everyone.

Having completed a number of projects using image processing techniques, we can confidently state that image and video processing has become one of the most interesting areas in digital signal processing. This article is dedicated to an image processing technique used to automatically recognize and extract information from business cards.

Accuracy of business card recognition systems

Several automatic recognition systems and business card reader apps already exist, showing 80% recognition accuracy with a recall of 80% and a precision of 70% (we also recommend the article about the Best Business Card Reader App for the iPhone). Still, we dare to say that the algorithm our team suggests can reach 90% accuracy.

Stages of business card recognition

Since text recognition takes up quite a bit of CPU power, we suggest a standard client-server approach: a client (e.g., a mobile device) takes a photo of a business card, finds the text areas in the image, and sends them to a more powerful server for optical recognition. Once recognized, the text can be sent back to the phone, stored in a database on the server, passed on to a third-party app, etc. The task thus has two main stages:

  1. Business card layout recognition
  2. Optical character recognition (OCR)

Note: As the OCR engine we propose Tesseract, probably the most accurate open source OCR engine available. Tesseract's OCR accuracy is near 98% for character recognition and 95-97% for word recognition.

How the system works: step-by-step explanation

Now we are going to describe a simple algorithm, implemented in MATLAB, that recognizes a business card layout. The algorithm works on a grayscale image, so we start by converting the color image to grayscale. To detect text areas we apply a special filter: a modified sliding-window standard deviation calculation. The output of this filter is converted into a binary image using Otsu's thresholding algorithm. After that, we pick the blobs that satisfy certain criteria for length, width, and direction, and build a bounding box for each of them. The resulting set of bounding boxes gives us a mask for finding text areas. The pictures below illustrate the effectiveness of the suggested approach. Note that this is only a prototype and all control parameters are hardcoded for this type of image.
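The first two steps, grayscale conversion and the sliding-window standard deviation filter, can be sketched roughly as follows. This is a minimal pure-Python illustration, not the MATLAB prototype itself: the modification applied to the standard deviation filter is not reproduced here, and the function names, the luminance weights (ITU-R BT.601), and the window radius are assumptions made for the example.

```python
import math


def to_gray(rgb):
    """Convert rows of (r, g, b) pixels to luminance (assumed BT.601 weights)."""
    return [[0.299 * r + 0.587 * g + 0.114 * b for (r, g, b) in row]
            for row in rgb]


def local_std(gray, radius=1):
    """Standard deviation of each pixel's square neighbourhood.

    The window is (2 * radius + 1) pixels wide and is clipped at the
    image borders. Plain standard deviation only; no modification applied.
    """
    h, w = len(gray), len(gray[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            vals = [gray[j][i]
                    for j in range(max(0, y - radius), min(h, y + radius + 1))
                    for i in range(max(0, x - radius), min(w, x + radius + 1))]
            mean = sum(vals) / len(vals)
            out[y][x] = math.sqrt(sum((v - mean) ** 2 for v in vals) / len(vals))
    return out


# Tiny synthetic "card": white background with a short dark stroke.
white, black = (255, 255, 255), (0, 0, 0)
image = [[white] * 6 for _ in range(5)]
image[2][2] = image[2][3] = black

response = local_std(to_gray(image))
# Flat background gives a response near zero; pixels near the stroke respond strongly.
```

The filter responds to local contrast rather than to absolute brightness, which is why the same step can handle both dark text on a light background and light text on a dark one.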

1.1. Original color image

1.2. Gray-scaled image

1.3. Filtered image

1.4. Thresholded image
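The thresholding step relies on Otsu's method, which picks the threshold that maximizes the variance between the two resulting classes of pixels. The sketch below is a plain-Python illustration under the assumption that the filter output has already been scaled to integers in the range 0..255; the function names are hypothetical and not taken from the prototype.

```python
def otsu_threshold(values, levels=256):
    """Return the threshold t that maximizes between-class variance.

    `values` are integers in [0, levels). Pixels with value > t are
    treated as foreground.
    """
    hist = [0] * levels
    for v in values:
        hist[v] += 1
    total = len(values)
    sum_all = sum(i * c for i, c in enumerate(hist))

    best_t, best_var = 0, -1.0
    w0 = 0    # number of pixels with value <= t
    sum0 = 0  # sum of intensities of those pixels
    for t in range(levels):
        w0 += hist[t]
        sum0 += t * hist[t]
        if w0 == 0:
            continue  # no pixels in the lower class yet
        w1 = total - w0
        if w1 == 0:
            break     # upper class is empty from here on
        m0 = sum0 / w0
        m1 = (sum_all - sum0) / w1
        between = w0 * w1 * (m0 - m1) ** 2
        if between > best_var:
            best_var, best_t = between, t
    return best_t


def binarize(image):
    """Threshold a 2-D list of 0..255 integers into a 0/1 image."""
    t = otsu_threshold([v for row in image for v in row])
    return [[1 if v > t else 0 for v in row] for row in image]


# Two clearly separated groups of filter responses.
values = [10, 12, 11, 13, 200, 205, 210]
t = otsu_threshold(values)
binary = binarize([[10, 12, 200], [11, 205, 13]])
```

Because the threshold is derived from the histogram of each image, no fixed cut-off value has to be chosen for this step.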

1.5. Filtered blobs

1.6. Bounding boxes for found blobs

1.7. Found text areas
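The last three steps (blob filtering, bounding boxes, and the text mask) can be illustrated with the following pure-Python sketch. It labels 8-connected blobs with a breadth-first search, keeps the boxes that pass a size and orientation check, and fills them into a mask. The helper names and the numeric criteria in `looks_like_text_line` are hypothetical placeholders; the actual hardcoded parameters of the prototype are not given here.

```python
from collections import deque


def find_blobs(binary):
    """Bounding boxes (top, left, bottom, right), inclusive, of 8-connected blobs."""
    h, w = len(binary), len(binary[0])
    seen = [[False] * w for _ in range(h)]
    boxes = []
    for y in range(h):
        for x in range(w):
            if not binary[y][x] or seen[y][x]:
                continue
            seen[y][x] = True
            queue = deque([(y, x)])
            top, left, bottom, right = y, x, y, x
            while queue:
                cy, cx = queue.popleft()
                top, bottom = min(top, cy), max(bottom, cy)
                left, right = min(left, cx), max(right, cx)
                # Visit the 3x3 neighbourhood, clipped at the image borders.
                for ny in range(max(0, cy - 1), min(h, cy + 2)):
                    for nx in range(max(0, cx - 1), min(w, cx + 2)):
                        if binary[ny][nx] and not seen[ny][nx]:
                            seen[ny][nx] = True
                            queue.append((ny, nx))
            boxes.append((top, left, bottom, right))
    return boxes


def looks_like_text_line(box, min_width=3, max_height=4):
    """Illustrative criteria only: keep wide, short, horizontal blobs."""
    top, left, bottom, right = box
    width, height = right - left + 1, bottom - top + 1
    return width >= min_width and height <= max_height and width > height


def boxes_to_mask(boxes, height, width):
    """Fill every bounding box with 1s to obtain the text-area mask."""
    mask = [[0] * width for _ in range(height)]
    for top, left, bottom, right in boxes:
        for y in range(top, bottom + 1):
            for x in range(left, right + 1):
                mask[y][x] = 1
    return mask


binary = [
    [0, 0, 0, 0, 0, 0, 0],
    [0, 1, 1, 1, 1, 0, 0],
    [0, 1, 0, 1, 1, 0, 0],
    [0, 0, 0, 0, 0, 0, 1],
    [0, 0, 0, 0, 0, 0, 1],
]
boxes = find_blobs(binary)
text_boxes = [b for b in boxes if looks_like_text_line(b)]
mask = boxes_to_mask(text_boxes, len(binary), len(binary[0]))
```

In this toy input the wide blob is kept as a text candidate, while the narrow vertical blob on the right edge is rejected by the width check. Filling the whole bounding box also covers the gaps between individual strokes.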

Here are a few examples of the algorithm at work:

Card 1 - horizontal layout, dark text on light background

Card 2 - horizontal layout, light text on dark background

Card 3 - vertical layout, combination of dark/light text and backgrounds

With the algorithm described above, we can efficiently find text areas on business cards and make reasonable guesses about the purpose of each one. Then, using Tesseract for optical character recognition (the second stage of our task), we can reliably achieve 90% precision in business card layout and text recognition.
