CCTV has advanced by leaps and bounds in the past ten years, and not only in image quality, which was often so grainy that one wondered why anyone bothered. What you can do with the footage and audio you capture is expanding all the time thanks to intelligent analytics: face detection, tampering detection, appearing detection, and audio detection, to name a few.
Hanwha Techwin, which among other things produces the Samsung cameras we use in our customers' security solutions, continues to invest in the research and development of intelligent analysis technology. Below we highlight one of the featured technologies found in Hanwha Techwin network cameras.
Intelligent audio and video analytics is a technology that alerts the operator to abnormal activity detected by analyzing video and audio information. It is designed to help prevent accidental or intentional incidents and to minimize damage through prompt response.
Audio Detection – the basics
Audio detection is a technology that detects audio levels exceeding a user-defined level. Because audio levels are greater in abnormal situations than in normal ones, a level exceeding the set level is treated as an abnormal situation. The camera then notifies the operator via an event signal, allowing the operator to take suitable measures.
Hanwha Techwin's audio detection technology calculates the absolute level of the audio signal collected by the microphone, then normalizes that level onto a scale of 1 to 100. The normalized level is defined as the audio size, and an audio size exceeding the set level is detected as an event. Note that the audio size does not correspond to specific decibel (dB) values.
Configuration involves adjusting the detection threshold for the audio level as needed.
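The normalize-then-compare idea can be sketched in a few lines of Python. This is only an illustration: the article does not say how the camera maps the absolute level onto the 1 to 100 scale, so the linear mapping, the 16-bit sample range, and the names `audio_size` and `is_audio_event` are all assumptions.

```python
def audio_size(samples, full_scale=32768):
    """Map the mean absolute amplitude of PCM samples onto a 1..100 scale.

    Assumption: a linear mapping against a 16-bit full-scale value.
    The camera's real normalization is not documented in this article.
    """
    if not samples:
        return 1
    mean_abs = sum(abs(s) for s in samples) / len(samples)
    level = round(100 * mean_abs / full_scale)
    # Clamp so the result always stays inside the 1..100 range.
    return max(1, min(100, level))


def is_audio_event(samples, threshold):
    """Report an event only when the audio size exceeds the set level."""
    return audio_size(samples) > threshold
```

With this sketch, a block of samples at half of full scale gives an audio size of 50, which triggers an event at a threshold of 40 but not at a threshold of 50.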
Audio Source Classification
Audio source classification is a technology for classifying the audio being input to the camera. Since the audio detection technology discussed above generates alarms based simply on audio size, it may generate events even in normal situations. To overcome this limitation, technologies that classify audio source types have been developed.
When the camera classifies an audio source type that satisfies the criteria defined by the operator, it notifies the operator via an event trigger so that a suitable response can be taken.
Hanwha Techwin features an audio source database which supports the classification of four audio source types: scream, gunshot, explosion, and crashing glass.
The camera extracts the characteristics of the audio source collected by its internal or externally connected microphone and calculates a likelihood for each audio source type based on the pre-defined database. It then selects the audio source with the highest likelihood and generates an event. In short, the algorithm flow is: collect audio, extract characteristics, calculate likelihoods against the database, select the most likely audio source, and generate an event.
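The "highest likelihood wins" step can be illustrated with a toy Python sketch. Everything specific here is hypothetical: the two-element feature vectors, the contents of `REFERENCE_DB`, and the distance-based likelihood are stand-ins, since the article does not describe the camera's actual features or scoring model.

```python
import math

# Hypothetical reference features per audio source type (not the real database).
REFERENCE_DB = {
    "scream": (0.9, 0.2),
    "gunshot": (0.1, 0.9),
    "explosion": (0.3, 0.3),
    "crashing_glass": (0.7, 0.8),
}


def likelihood(features, reference):
    """Toy likelihood: closer feature vectors score nearer to 1.0."""
    dist_sq = sum((f - r) ** 2 for f, r in zip(features, reference))
    return math.exp(-dist_sq)


def classify(features, db=REFERENCE_DB):
    """Score the features against every entry and pick the most likely one."""
    scores = {label: likelihood(features, ref) for label, ref in db.items()}
    best = max(scores, key=scores.get)
    return best, scores
```

Calling `classify((0.85, 0.25))` returns `"scream"` as the best label in this toy setup, because that reference vector is the closest match.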
Hanwha Techwin's audio source classification technology, available in X Series cameras, features three customizable settings (category, noise reduction, and detection level) for optimum performance in a variety of installation environments. It also provides a graph that visualizes audio source levels, so the noise reduction and detection level settings can be checked intuitively.
1) Category
The category setting generates events based on the detected audio source type. The operator can select one or more audio source types for detection:
- Scream: Generates events based on the detection of loud voices, such as adults or children screaming and yelling. (90% accuracy at a distance of 53 ft)
- Gunshot: Generates events based on the detection of non-continuous gunshot sounds. (80% accuracy at a distance of 1,969 ft)
- Explosion: Generates events based on the detection of explosion sounds. (90% accuracy at a distance of 1.49 mi)
- Crashing Glass: Generates events based on the detection of crashing glass sounds. (80% accuracy at a distance of 26 ft)
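As a rough illustration of how a category selection could gate events, the Python sketch below only raises an event when the classified type is one the operator has selected. The `AudioSource` enum and `should_raise_event` function are hypothetical names for this example, not part of any Hanwha Techwin API.

```python
from enum import Enum


class AudioSource(Enum):
    """The four audio source types listed above."""
    SCREAM = "scream"
    GUNSHOT = "gunshot"
    EXPLOSION = "explosion"
    CRASHING_GLASS = "crashing_glass"


def should_raise_event(classified, selected):
    """Raise an event only for types the operator has selected."""
    return classified in selected


# Example: the operator selects two of the four types.
selected_types = {AudioSource.SCREAM, AudioSource.GUNSHOT}
```

Here a classified gunshot would raise an event, while crashing glass would be ignored until the operator adds it to the selection.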
2) Noise Reduction
Depending on the environment where the microphone is installed, the operator can enable the Noise Reduction function. This function can reduce background noise greater than 55 dB to 65 dB for increased detection accuracy.
Using the detection level graph, the user can enable or disable noise reduction, view the result, and validate the optimum configuration.
Note that with noise reduction enabled, the system analyzes the attenuated audio source, so audio source classification performance may be degraded or produce errors.
3) Detection Level
The detection level specifies the audio source volume level at which audio source classification is performed. The volume level of the audio source is updated continuously and displayed on a graph, with the most recent readings on the right. Only input audio whose volume level exceeds the set threshold undergoes audio source classification.
A lower threshold results in more audio source classification data and possibly a greater probability of misdetection. A higher threshold results in less audio source classification data and a greater probability of non-detection. The threshold must be set appropriately for the surrounding noise level of the camera.
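The effect of the threshold can be seen in a small Python sketch. The level values and the function name `frames_to_classify` are made up for illustration; the point is simply that a lower detection level lets more audio through to classification.

```python
def frames_to_classify(levels, detection_level):
    """Return the indices of readings whose level exceeds the detection level."""
    return [i for i, level in enumerate(levels) if level > detection_level]


# Hypothetical sequence of volume readings, oldest first.
levels = [12, 18, 35, 80, 22, 64, 15]

low = frames_to_classify(levels, 20)   # more data, more chance of misdetection
high = frames_to_classify(levels, 60)  # less data, more chance of non-detection
```

With these sample readings, a detection level of 20 passes four readings on to classification, while a level of 60 passes only two.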