Violence Detection for Smart Surveillance Systems
Computer Vision Ensures High Security Levels
We live in a society that relies heavily on CCTV cameras to maintain a high level of security. However, this approach has a major drawback: CCTV footage is usually reviewed only hours or even days after an incident has happened. It provides valuable evidence in court but is rarely used to prevent crime or react to it in real time. The reason for this inefficiency is that huge quantities of CCTV footage are monitored by a limited number of security staff members. Fatigue, boredom, and discontinuity of observation make human supervision unreliable. At Abto Software, we apply our extensive technical expertise in computer vision to automate and facilitate the processing of visual information for security monitoring and the detection of violent scenes.
Main Concepts of Smart Surveillance Systems
An efficient surveillance system is a large-scale computer vision, data analysis, and decision-making challenge. Hence, the ‘smart’ approach to extracting information from surveillance footage combines several innovative technologies that power the following key components of Smart Surveillance Systems:
- Object and face detection: detection of abandoned or stolen objects; face recognition and identification;
- Vehicle identification: license plate recognition; traffic flow monitoring and analysis;
- Action detection: violence detection; motion detection; theft detection; fire and smoke detection; crowd behavior monitoring and analysis;
- Video archive processing: finding related videos in which the person or object of interest appears;
- Detection of camera repositioning or blinding;
- Real-time alarms and notifications on any of the aforementioned events: mobile text messaging; e-mail notifications; direct police alerts;
- Regular reports on the security level in the area.
By applying image processing and machine learning techniques, Smart Surveillance Systems can extract and interpret information from CCTV footage faster and much more efficiently than any human observer. Moreover, all of the components work together in a flexible environment, so they can be customized to serve a specific purpose that may also change over time.
Violence Detection with Computer Vision
The rise of criminal activities, their unexpectedness, and the scope of the harm inflicted have motivated businesses, governments, and law enforcement agencies to use comprehensive surveillance systems to identify dangerous environments and respond effectively to violent interactions. Still, automatic violence detection remains an unresolved issue for most security systems because of its complex nature and the specific features that must be taken into account.
State-of-the-art approaches to violence detection in surveillance videos rely on audio feature extraction, spatiotemporal analysis with action descriptors such as MoSIFT, or the computation of optical flow motion vectors. Despite promising results, with accuracy rates near 85%, current methods still lack precision, are memory-demanding, and carry a considerable computational cost. This makes them impractical for real-world use, particularly in surveillance, where high accuracy must be accompanied by timely results.
R&D engineers at Abto Software have proposed a computer vision approach that detects violent behaviour with up to 95% accuracy on selected datasets and delivers results in real time. It employs established computer vision and image processing techniques as well as cutting-edge deep learning algorithms.
Abto Violence Detection Technology
We evaluated our violence detection technology on a 1000-video hockey fight dataset and a 200-clip collection of action movie scenes from E. Bermejo, O. Deniz, G. Bueno, R. Sukthankar, "Violence Detection in Video using Computer Vision Techniques", 2011. After splitting the data into training and testing subsets at an 80:20 ratio, we obtained the confusion matrices summarized below.
For the hockey dataset, 92.7% of clips are correctly classified as fights, and 93.0% are correctly classified as non-fighting scenes. The false-positive rate is 3.5% and the false-negative rate is 5.5%, so overall 91.0% of the predictions are correct and 9.0% are wrong. For the action movies dataset, the average accuracy of fight detection is 95.0%, with false-positive and false-negative rates of 2.5% each.
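The evaluation protocol and the percentages above can be sanity-checked with a short sketch. It is written in Python for illustration only and is not the original MATLAB code. The helper `stratified_split` and the class `ConfusionMatrix` are hypothetical names. The 500/500 class balance is an assumption, and so are the test-set counts, which were chosen only to be consistent with the reported percentages; they are not published figures.

```python
import random
from dataclasses import dataclass


def stratified_split(labels, test_fraction=0.2, seed=0):
    """Return (train_idx, test_idx), keeping the class ratio in both subsets."""
    rng = random.Random(seed)
    train_idx, test_idx = [], []
    for cls in sorted(set(labels)):
        idx = [i for i, y in enumerate(labels) if y == cls]
        rng.shuffle(idx)
        n_test = round(len(idx) * test_fraction)
        test_idx.extend(idx[:n_test])
        train_idx.extend(idx[n_test:])
    return sorted(train_idx), sorted(test_idx)


@dataclass(frozen=True)
class ConfusionMatrix:
    tp: int  # fights predicted as fights
    fn: int  # fights predicted as non-fights
    fp: int  # non-fights predicted as fights
    tn: int  # non-fights predicted as non-fights

    @property
    def total(self):
        return self.tp + self.fn + self.fp + self.tn

    def accuracy(self):
        return (self.tp + self.tn) / self.total

    def precision(self):
        # Share of "fight" predictions that really are fights.
        return self.tp / (self.tp + self.fp)

    def specificity(self):
        # Share of non-fight clips that are recognized as non-fights.
        return self.tn / (self.tn + self.fp)

    def share_of_total(self, count):
        # Expresses one cell as a fraction of all test clips.
        return count / self.total


# Assumption: 500 fight clips (label 1) and 500 non-fight clips (label 0).
labels = [1] * 500 + [0] * 500
train_idx, test_idx = stratified_split(labels, test_fraction=0.2)

# Hypothetical counts for the 200 test clips, NOT taken from the article.
cm = ConfusionMatrix(tp=89, fn=11, fp=7, tn=93)

print(f"train/test sizes: {len(train_idx)}/{len(test_idx)}")
print(f"accuracy:         {cm.accuracy():.1%}")
print(f"false positives:  {cm.share_of_total(cm.fp):.1%} of all clips")
print(f"false negatives:  {cm.share_of_total(cm.fn):.1%} of all clips")
print(f"precision:        {cm.precision():.1%}")
print(f"specificity:      {cm.specificity():.1%}")
```

With these assumed counts, accuracy is 91.0%, false positives and false negatives make up 3.5% and 5.5% of all test clips, precision for the fight class is 92.7%, and specificity is 93.0%. This suggests the false-positive and false-negative rates are quoted as shares of all predictions, although the actual confusion matrices may differ.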
The following video shows how our technology works in real time on the BEHAVE dataset. Once a fight or any other violent activity is detected, the system sends a notification to the security staff or directly to the police.
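One way the notification step could be wired up is sketched below in Python. This is an illustration under stated assumptions, not the Abto implementation: the class `ViolenceAlerter`, its parameters (`threshold`, `window`, `cooldown`), and the per-frame scores are all hypothetical. The idea is to average the detector's score over a short sliding window and to enforce a cooldown, so that a single noisy frame does not trigger an alarm and one incident does not flood staff with messages.

```python
from collections import deque


class ViolenceAlerter:
    """Send one alert when the smoothed violence score stays above a threshold."""

    def __init__(self, notify, threshold=0.8, window=5, cooldown=30):
        self.notify = notify              # callback, e.g. SMS, e-mail or police alert
        self.threshold = threshold        # minimum mean score that counts as violence
        self.scores = deque(maxlen=window)  # most recent per-frame scores
        self.cooldown = cooldown          # frames to wait before alerting again
        self.frames_since_alert = cooldown  # allows an alert right from the start

    def update(self, frame_index, score):
        """Feed one detector score in [0, 1]; return True if an alert was sent."""
        self.scores.append(score)
        self.frames_since_alert += 1
        window_full = len(self.scores) == self.scores.maxlen
        mean_score = sum(self.scores) / len(self.scores)
        if (window_full
                and mean_score >= self.threshold
                and self.frames_since_alert > self.cooldown):
            self.notify(frame_index, mean_score)
            self.frames_since_alert = 0
            return True
        return False


# Usage with made-up scores; a real system would take them from the detector.
alerts = []
alerter = ViolenceAlerter(
    notify=lambda frame, score: alerts.append((frame, round(score, 2))),
    threshold=0.8,
    window=3,
    cooldown=5,
)
scores = [0.1, 0.2, 0.9, 0.95, 0.9, 0.92, 0.9, 0.1, 0.1]
for frame_index, score in enumerate(scores):
    alerter.update(frame_index, score)

print(alerts)
```

In this run, a single alert is recorded at frame 4, the first frame at which all three scores in the window are high. The following high-scoring frames fall inside the cooldown and do not produce further alerts.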
Technologies & Instruments
The violence detection algorithm described above was developed in MATLAB and employs the following technologies:
- Computer vision
- Video & image processing
- Deep Learning
Benefits of Abto Violence Detection Technology
Integrating Abto violence detection technology into a surveillance system makes it possible to:
- dramatically reduce the processing time of CCTV footage;
- free surveillance staff to focus on major security tasks other than monitoring;
- improve decision-making;
- support law enforcement and crime prevention;
- extract revealing insights from visual data.
Scalable and cost-effective, our technology for real-time fight and violence detection can be of use in a range of fields, from retail and healthcare to disaster management and public safety.
Areas of Application
The scope of possible application areas of the Abto violence detection technology includes:
- Law enforcement;
- Public safety;
- Crowd monitoring and behavioral analysis;
- Surveillance systems installed in:
  - border crossing points;
  - parking lots;
  - shopping malls;
  - entertainment venues;
  - other indoor and outdoor public access areas.
The primary function of this technology is to help law enforcement agencies prevent outbreaks of violence through real-time detection of violent scenes and automatic security alerts. Moreover, our algorithm can be further developed to perform in-depth video analysis for content moderation.