Get the Right Number: Image Processing for Object Counting
Automated counting applications for production lines are designed to track, identify, separate, and count products and other objects within a bounded image area, and to deliver fast, highly accurate results. Many of today’s systems follow traditional approaches to image processing, which lowers their efficiency and accuracy. The automated counting system developed by the A-Grade IT dedicated development team for a bakery factory locates, identifies, and counts objects in the input images obtained from video processing. One of its major advantages is that it can separate touching products before counting them, which brings the counting accuracy to 99.5%.
Today’s post, from one of our image processing experts, describes the problems of the traditional approach, introduces a new approach that eliminates them, and touches upon other areas where it could be applied to identify and count products.
Generally speaking, image processing is any form of signal processing whose input is an image, such as a photo or a video frame. The counting problem in image and video processing is the estimation of the number of objects in a still image or video frame. It arises in many real-world applications, including cell counting in microscopic images, crowd monitoring in surveillance systems, or, in our case, counting baked products on production lines.
The initial version of the production counting system was designed and developed in MATLAB and had certain limitations:
- Rather low counting accuracy, which dropped even further when the bakery products were touching or lay in uneven rows on the production line.
- Poorly optimized, the system required too many resources.
- Limited to processing only 4 video channels at once.
So, how can we solve these problems and get an accurate result?
I. Traditional image processing approach
Here we cover object counting with a traditional approach to image processing and the challenges we would face if we followed it.
The first thing we need to do is separate the foreground from the background. The easiest way would be color segmentation, but then we need to deal with uneven illumination. To avoid this problem, it is better to apply adaptive thresholding or, in simple cases, the Otsu segmentation algorithm to one of the RGB channels. As a result, we get a binary image (where every pixel equals 1 if it belongs to an object, and 0 otherwise). This binary image will contain a lot of spurious objects with a small area (about 1-10 pixels), which is why we have to remove them using median filtering or the methods of mathematical morphology. In our case, for counting objects on the production line, the morphological opening operation is the better choice.
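The thresholding and clean-up steps described above can be illustrated with a minimal pure-Python sketch. It is not the production code. It assumes a single-channel image stored as a list of rows of integers in the range 0..255, objects that are brighter than the background, and a 3x3 square structuring element. The function names `otsu_threshold`, `binarize`, and `opening` are hypothetical.

```python
def otsu_threshold(gray):
    """Return the Otsu threshold for a 2-D list of integers in 0..255."""
    hist = [0] * 256
    for row in gray:
        for value in row:
            hist[value] += 1
    total = sum(hist)
    sum_all = sum(i * h for i, h in enumerate(hist))
    best_t, best_var = 0, -1.0
    w0 = 0    # number of pixels at or below the candidate threshold
    sum0 = 0  # sum of the intensities of those pixels
    for t in range(256):
        w0 += hist[t]
        sum0 += t * hist[t]
        w1 = total - w0
        if w0 == 0 or w1 == 0:
            continue  # one class is empty, nothing to compare
        mean0 = sum0 / w0
        mean1 = (sum_all - sum0) / w1
        # Between-class variance (up to a constant factor).
        var_between = w0 * w1 * (mean0 - mean1) ** 2
        if var_between > best_var:
            best_var, best_t = var_between, t
    return best_t


def binarize(gray, t):
    """Return 1 where a pixel is brighter than t (assumed object), else 0."""
    return [[1 if value > t else 0 for value in row] for row in gray]


def _morph(img, keep):
    """Apply `keep` (all = erosion, any = dilation) to every 3x3 window.

    Windows are clipped at the image border, so pixels outside the
    image are simply ignored.
    """
    h, w = len(img), len(img[0])
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            window = [img[yy][xx]
                      for yy in range(max(0, y - 1), min(h, y + 2))
                      for xx in range(max(0, x - 1), min(w, x + 2))]
            out[y][x] = 1 if keep(window) else 0
    return out


def opening(img):
    """Morphological opening: erosion followed by dilation."""
    return _morph(_morph(img, all), any)
```

Calling `opening(binarize(gray, otsu_threshold(gray)))` removes specks smaller than the structuring element while keeping larger objects at roughly their original size. A real system would more likely rely on an optimized image processing library than on nested Python loops.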
At this stage, it is crucial to separate all objects into individual 4-connected areas. We define a 4-connected area as a set of pixels in which any two pixels can be joined by a path that stays inside the set and moves only to a northern, southern, eastern, or western neighbour at each step.
Now that we have cleaned the binary image that corresponds to the given frame, we move on to counting the objects themselves. If we had just one video frame, we could use a wave algorithm to count the products. However, we need to count them over a period of time, so every object will be present in several consecutive frames. Therefore, we have to count every object that crosses a certain imaginary line perpendicular to the direction of movement.
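For the single-frame case, counting with a wave algorithm can be sketched as a breadth-first flood fill over 4-connected neighbours. This is only an illustration under simple assumptions: the binary frame is a list of rows of 0/1 values, and the name `count_objects` is hypothetical.

```python
from collections import deque


def count_objects(binary):
    """Count the 4-connected areas of 1-pixels in a 2-D list of 0/1 values."""
    h = len(binary)
    w = len(binary[0]) if h else 0
    seen = [[False] * w for _ in range(h)]
    count = 0
    for y in range(h):
        for x in range(w):
            if binary[y][x] != 1 or seen[y][x]:
                continue
            # A new, unvisited object pixel starts a new wave.
            count += 1
            seen[y][x] = True
            front = deque([(y, x)])
            while front:
                cy, cx = front.popleft()
                # Spread only to the north, south, west, and east neighbours.
                for ny, nx in ((cy - 1, cx), (cy + 1, cx),
                               (cy, cx - 1), (cy, cx + 1)):
                    if (0 <= ny < h and 0 <= nx < w
                            and binary[ny][nx] == 1 and not seen[ny][nx]):
                        seen[ny][nx] = True
                        front.append((ny, nx))
    return count
```

Pixels that touch only diagonally end up in different areas, which is exactly what distinguishes 4-connectivity from 8-connectivity.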
The main challenge for the traditional approach is to identify every object and track it to determine whether it has crossed the given line or not. In other words, instead of counting the objects, we need to track their trajectories. To do that, we can apply one of the standard tracking techniques, such as the Kalman filter, though this requires large computational resources and the results are not that accurate. For example, a change in illumination causes errors in tracking or in recapturing a lost object. Those are clear disadvantages of the standard approach.
Sometimes rectangular areas of interest are suggested instead of lines. The rectangle is taller than the object, which allows us to count the objects that are completely inside the area. Still, this method only works for counting similar objects that lie in a single row. Therefore, we have to look for another way to count the objects on the production line.
II. Our approach to image processing for object counting
Let’s consider the video as a sequence of frames stacked like sheets of paper. As mentioned before, we have to count every object that crosses the given line. Therefore, only the pixels that lie on this line in each frame need to be taken into account. That is, we need to consider the cross-section of the frame stack along the line. This method is known as the slit-scan camera algorithm and is widely used in sports to register the winner crossing the finish line, as well as for artistic purposes.
How does it work?
1) Over a certain period of time, we simply copy the pixels along the given line from every frame and append them to the resulting image. To keep this simple, we should place the camera so that it records objects moving strictly horizontally or vertically. In this case, it is enough to copy a single column (or row, respectively) from every frame, because the counting line is perpendicular to the direction of movement.
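Step 1 can be sketched in a few lines. The sketch assumes that each frame is a list of rows of pixel values and that the counting line is an exact row or column of the frame. The name `slit_scan` and its parameters are hypothetical.

```python
def slit_scan(frames, line_index, axis="row"):
    """Build the slit-scan image from a sequence of frames.

    One line of pixels is copied from every frame and appended, in time
    order, as a new row of the result.
    """
    if axis not in ("row", "column"):
        raise ValueError("axis must be 'row' or 'column'")
    result = []
    for frame in frames:
        if axis == "row":
            line = list(frame[line_index])               # copy one row
        else:
            line = [row[line_index] for row in frame]    # copy one column
        result.append(line)
    return result
```

In the result, each row corresponds to one moment in time and each column to one fixed point on the counting line, so the whole video is reduced to a single two-dimensional image.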
From the object counting perspective, the resulting image is equivalent to the source video and can be regarded as a kind of hash of it, which is a rather curious property.
2) The binary image is built from the resulting image in the way described above. The background is now easy to estimate as the mean or median of each column of the resulting image, because every column shows the same point of the line at different moments in time. To clean the binary image and separate the objects, we can use the same methods as before.
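The per-column background estimate can be sketched as follows. The sketch assumes a slit-scan image whose rows are moments in time and whose columns are fixed points on the line, and it assumes that the bare conveyor is visible at each point more than half of the time, so that the median reflects the background. The fixed difference threshold `min_diff` and the function names are hypothetical, chosen only for illustration.

```python
from statistics import median


def column_background(scan):
    """Return the median intensity of every column of the slit-scan image."""
    return [median(column) for column in zip(*scan)]


def subtract_background(scan, min_diff=30):
    """Return a binary image: 1 where a pixel differs from the background
    of its own column by more than min_diff, else 0."""
    background = column_background(scan)
    return [[1 if abs(value - bg) > min_diff else 0
             for value, bg in zip(row, background)]
            for row in scan]
```

Because every column gets its own background value, illumination that varies along the line does not disturb the segmentation, as long as it stays stable over time.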
3) Now we have to count the 4-connected areas using a wave algorithm.
And that wraps it up! We have a solution for counting objects on production lines.
In practice, though, we face some additional problems, such as stitching the resulting images together. Objects that fall on the boundary between two consecutive resulting images have to be taken into account. This task is also solved efficiently using mathematical morphology.
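To make the stitching problem concrete, here is one simple way to avoid double counting at the boundary. It is explicitly not the morphology-based method mentioned above, whose details are not given here, but a plain carry-over alternative: an area that touches the last row of the current image is not counted yet, and its pixels are carried over and prepended to the next image. All names are hypothetical, and the input is assumed to be a cleaned binary slit-scan image split into chunks of rows.

```python
from collections import deque


def label_areas(binary):
    """Return the 4-connected areas of 1-pixels as lists of (row, col)."""
    h = len(binary)
    w = len(binary[0]) if h else 0
    seen = [[False] * w for _ in range(h)]
    areas = []
    for y in range(h):
        for x in range(w):
            if binary[y][x] != 1 or seen[y][x]:
                continue
            seen[y][x] = True
            area, front = [], deque([(y, x)])
            while front:
                cy, cx = front.popleft()
                area.append((cy, cx))
                for ny, nx in ((cy - 1, cx), (cy + 1, cx),
                               (cy, cx - 1), (cy, cx + 1)):
                    if (0 <= ny < h and 0 <= nx < w
                            and binary[ny][nx] == 1 and not seen[ny][nx]):
                        seen[ny][nx] = True
                        front.append((ny, nx))
            areas.append(area)
    return areas


def count_chunk(chunk, carry):
    """Count the objects completed in carry + chunk.

    Returns (count, new_carry), where new_carry holds only the pixels of
    areas that touch the last row and may continue in the next chunk.
    """
    image = carry + chunk
    if not image:
        return 0, []
    last, width = len(image) - 1, len(image[0])
    count, open_pixels = 0, []
    for area in label_areas(image):
        if any(y == last for y, _ in area):
            open_pixels.extend(area)  # possibly unfinished object
        else:
            count += 1
    if not open_pixels:
        return count, []
    top = min(y for y, _ in open_pixels)
    new_carry = [[0] * width for _ in range(last - top + 1)]
    for y, x in open_pixels:
        new_carry[y - top][x] = 1
    return count, new_carry


def count_stream(chunks):
    """Count objects over consecutive chunks without double counting."""
    total, carry = 0, []
    for chunk in chunks:
        done, carry = count_chunk(chunk, carry)
        total += done
    # Whatever is still open when the stream ends is counted once.
    return total + len(label_areas(carry))
```

With this scheme, an object split by a chunk boundary is counted exactly once, in the chunk where it ends. The carried rows grow for as long as something keeps touching the last row, so a real system would need an upper bound on them.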
The main goal of this article is to show a new promising approach to implementing an automated counting system for production lines. What are the main advantages of the suggested approach?
- Low computational cost
- High accuracy: the counting accuracy reaches 99.5%, since the system can separate touching products in uneven rows
- Flexible and universal solution: our automated counting system can be applied to counting other manufactured goods on production lines, including pharmaceutical products, food and beverages, cans, parts, and components
Overall, the system developed with this approach proved to be simple, flexible, and highly accurate, and it could be applied to problems in other areas, such as road traffic identification and monitoring. That is a little sneak peek into our future post, which touches upon video processing and how we can apply it to road traffic analysis.
If you found this interesting, contact us by filling in the form below.