Image processing, a general term for the methods that use computing power to extract data from pictures, is a hot topic among developers. Over the last few decades it has greatly impacted the medical industry, space exploration, geology, and oceanography, and today, with a powerful digital camera in every phone, its applications are quickly reaching not only scientists but everyone. Having accomplished a number of projects using image processing techniques (here are just some examples:

2. Optical character recognition (OCR)

Note: For the OCR engine we propose Tesseract, probably the most accurate open source OCR engine available. Tesseract’s OCR accuracy is near 98% for character recognition and 95-97% for word recognition.

Now we describe a simple algorithm, implemented in MATLAB, for recognizing a business card layout. The algorithm works on a grayscale image, so we start by converting the color image to grayscale. To detect text areas we use a special filter: a modified standard deviation calculated over a sliding window. The output of this filter is converted into a binary image with Otsu's thresholding algorithm. After that, we pick the blobs that satisfy certain criteria for length, width, and direction, and build a bounding box for each of them. The resulting set of bounding boxes gives us a mask for finding text areas. The pictures below illustrate the effectiveness of the suggested approach. Note that this is only a prototype and all control parameters are hardcoded for this type of image.
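To make the filtering and thresholding steps more concrete, here is a minimal sketch in plain Python rather than MATLAB. It is not the prototype itself: it uses an unmodified sliding-window standard deviation, BT.601 luminance weights for the grayscale conversion, and an assumed window radius of 1. All function names are illustrative.

```python
from math import sqrt

def to_gray(rgb):
    """Convert rows of (r, g, b) tuples to luminance (ITU-R BT.601 weights)."""
    return [[0.299 * r + 0.587 * g + 0.114 * b for r, g, b in row] for row in rgb]

def local_std(gray, radius=1):
    """Standard deviation of each pixel's square neighbourhood (clipped at borders)."""
    h, w = len(gray), len(gray[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            vals = [gray[j][i]
                    for j in range(max(0, y - radius), min(h, y + radius + 1))
                    for i in range(max(0, x - radius), min(w, x + radius + 1))]
            mean = sum(vals) / len(vals)
            out[y][x] = sqrt(sum((v - mean) ** 2 for v in vals) / len(vals))
    return out

def otsu_threshold(img, bins=256):
    """Pick the threshold that maximises the between-class variance (Otsu's method)."""
    flat = [v for row in img for v in row]
    lo, hi = min(flat), max(flat)
    if hi == lo:
        return float("inf")  # perfectly flat response: nothing stands out
    scale = (bins - 1) / (hi - lo)
    hist = [0] * bins
    for v in flat:
        hist[int((v - lo) * scale)] += 1
    total = len(flat)
    sum_all = sum(i * count for i, count in enumerate(hist))
    best_bin, best_var, w0, sum0 = 0, -1.0, 0, 0
    for t in range(bins - 1):
        w0 += hist[t]            # pixels in bins 0..t
        sum0 += t * hist[t]
        w1 = total - w0          # pixels in bins t+1..end
        if w0 == 0 or w1 == 0:
            continue
        var = w0 * w1 * (sum0 / w0 - (sum_all - sum0) / w1) ** 2
        if var > best_var:
            best_var, best_bin = var, t
    return lo + (best_bin + 1) / scale  # lower edge of the first "high" bin

def detect_text_pixels(rgb, radius=1):
    """Grayscale -> local standard deviation -> Otsu -> binary mask (1 = likely text)."""
    response = local_std(to_gray(rgb), radius)
    threshold = otsu_threshold(response)
    return [[1 if v >= threshold else 0 for v in row] for row in response]
```

Calling `detect_text_pixels(card)` on an image given as rows of `(r, g, b)` tuples returns a 0/1 mask of the same size. Flat background gives a local standard deviation near zero, while the sharp transitions around text strokes give high values, which is why a single global threshold separates them reasonably well.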

1.1. Original color image

1.2. Gray-scaled image

1.3. Filtered image

1.4. Thresholded image

1.5. Filtered blobs

1.6. Bounding boxes for found blobs

1.7. Found text areas
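As a rough illustration of steps 1.5 to 1.7, here is a minimal plain-Python sketch that takes a binary image, finds connected blobs, keeps the text-like ones, and paints their bounding boxes into a mask. The criteria are assumptions for illustration only: the minimum size values are hypothetical, and a width-to-height ratio stands in for the "direction" criterion. The prototype's actual hardcoded parameters are not given here.

```python
from collections import deque

def find_blobs(binary):
    """Return bounding boxes (top, left, bottom, right) of 4-connected blobs."""
    h, w = len(binary), len(binary[0])
    seen = [[False] * w for _ in range(h)]
    boxes = []
    for y in range(h):
        for x in range(w):
            if not binary[y][x] or seen[y][x]:
                continue
            # Breadth-first flood fill from the first unseen foreground pixel.
            seen[y][x] = True
            queue = deque([(y, x)])
            top, left, bottom, right = y, x, y, x
            while queue:
                cy, cx = queue.popleft()
                top, bottom = min(top, cy), max(bottom, cy)
                left, right = min(left, cx), max(right, cx)
                for ny, nx in ((cy - 1, cx), (cy + 1, cx), (cy, cx - 1), (cy, cx + 1)):
                    if 0 <= ny < h and 0 <= nx < w and binary[ny][nx] and not seen[ny][nx]:
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            boxes.append((top, left, bottom, right))
    return boxes

def keep_text_like(boxes, min_width=4, min_height=2, min_aspect=1.5):
    """Keep boxes that are large enough and clearly wider than tall (assumed criteria)."""
    kept = []
    for top, left, bottom, right in boxes:
        width, height = right - left + 1, bottom - top + 1
        if width >= min_width and height >= min_height and width / height >= min_aspect:
            kept.append((top, left, bottom, right))
    return kept

def boxes_to_mask(boxes, height, width):
    """Fill every bounding box with 1s to obtain the text-area mask."""
    mask = [[0] * width for _ in range(height)]
    for top, left, bottom, right in boxes:
        for y in range(top, bottom + 1):
            for x in range(left, right + 1):
                mask[y][x] = 1
    return mask
```

A typical call chain would be `boxes_to_mask(keep_text_like(find_blobs(binary)), height, width)`. Filling whole bounding boxes, rather than only the blob pixels, closes small holes inside a text line, so each area can be cropped as a rectangle. A vertical layout would need the aspect test inverted or relaxed.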

Here are a few examples of the algorithm at work:

Card 1 – horizontal layout, dark text on light background

Card 2 – horizontal layout, light text on dark background

Card 3 – vertical layout, combination of dark/light text and backgrounds

With the algorithm described above we can efficiently find text areas on business cards and make reasonable guesses about the purpose of each text area. Then, using Tesseract for optical character recognition (the 2nd stage of our task), we can reliably achieve 90% precision in business card layout and text recognition.
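One possible way to hand each found text area to Tesseract is through its command-line interface. The Python sketch below assumes that every area has already been cropped and saved as its own image file (the file name in the usage note is hypothetical), that a `tesseract` executable from version 4 or later is on the PATH (older releases spell the option `-psm`), and that each area holds a single line of text, hence page segmentation mode 7.

```python
import shutil
import subprocess

def tesseract_command(image_path, lang="eng", psm=7):
    """Build the CLI call; 'stdout' as output base prints the text instead of writing a file."""
    return ["tesseract", image_path, "stdout", "-l", lang, "--psm", str(psm)]

def recognize_region(image_path, lang="eng", psm=7):
    """Run Tesseract on one cropped text area and return the recognized text."""
    if shutil.which("tesseract") is None:
        raise RuntimeError("tesseract executable not found on PATH")
    result = subprocess.run(
        tesseract_command(image_path, lang, psm),
        capture_output=True, text=True, check=True,
    )
    return result.stdout.strip()
```

For example, `recognize_region("area_1.png")` would return the text of one cropped area. Running the engine per area instead of on the whole card keeps the layout information from the first stage attached to each recognized string.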
