Design Article
Digital Video Surveillance Integrates Many Technologies
Jack Shandle
2/10/2005 12:00 AM EST
High-end digital-signal processing has another killer app: digital video surveillance.
The overall video surveillance market will grow to approximately $3 billion by 2007, according to market research firm J.P. Freeman & Co. The fastest growing segment will be digital surveillance. This growth will have a significant impact on several technologies including DSPs, imaging software, and mass storage.
The primary reason for moving from analog to digital video is that intelligence can be embedded in the surveillance system itself, as well as in the network, instead of relying on highly caffeinated human beings watching TV monitors, as is generally the case in traditional analog video surveillance.
Although national security is the primary motivation for video surveillance's projected rapid market growth, there are compelling reasons for corporations to migrate to digital video as well.
The average casino, for example, has between 2000 and 3000 surveillance cameras capturing data 24 hours a day, 365 days a year, and a small army of security personnel to monitor the cameras. Federal and state laws require that data to be stored for up to a year, which requires literally terabytes of mass storage.
To cite a simple example, if a camera monitoring a casino exit or entrance can turn itself on when motion is detected and off when there is no motion in the frame, the savings in storage alone makes digital video a profitable proposition, says Yvonne Cager, Video Surveillance Marketing Manager for Texas Instruments.
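The storage argument can be made concrete with a back-of-the-envelope calculation. The sketch below is illustrative only: the 1-Mbit/s per-camera bit rate and the 10% motion duty cycle are assumptions chosen for the example, not figures from J.P. Freeman, Texas Instruments, or any casino operator.

```python
def storage_terabytes(cameras, bitrate_mbps, days, duty_cycle=1.0):
    """Estimate storage in terabytes (10**12 bytes).

    duty_cycle is the fraction of time a camera actually records;
    1.0 means continuous recording.
    """
    seconds = days * 24 * 3600
    bits = cameras * bitrate_mbps * 1e6 * seconds * duty_cycle
    return bits / 8 / 1e12


# Assumed values: 2000 cameras, 1 Mbit/s each, one year of retention.
continuous = storage_terabytes(cameras=2000, bitrate_mbps=1.0, days=365)
# Assumed: motion is present in the frame only 10% of the time.
gated = storage_terabytes(cameras=2000, bitrate_mbps=1.0, days=365, duty_cycle=0.1)
print(f"continuous: {continuous:.0f} TB, motion-gated: {gated:.0f} TB")
```

Whatever the real bit rate turns out to be, the saving scales directly with the fraction of time the camera is idle.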
Applications closely associated with national security and anti-terrorism, including the contentious issue of facial recognition, can stretch the limits of image recognition and motion detection. Examples of this include keeping track of an unattended suitcase or parcel in an airport and reporting that it has been unattended for a specified amount of time or monitoring how a parcel may move from place to place inside a secure area.
Simply put, there is plenty of legacy coaxial cable, cameras, analog muxes, and decoders in the existing infrastructure, says Noam Levine, Business Development Manager at Analog Devices. Each video surveillance application requires a careful assessment of how to marry digital intelligence with analog infrastructure, but in many instances, the answer is to implement the intelligence into the camera itself.
A relatively straightforward application that can be handled by embedding intelligence in the camera is motion detection and identification. A digital camera used to monitor an outdoor scene for suspicious activity might, for example, execute algorithms that can distinguish between trees or shrubs swaying in the wind and an individual or vehicle entering the scene.
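As a rough illustration of the simplest form of in-camera motion detection, the sketch below uses frame differencing: it counts the pixels that changed between two grayscale frames and reports motion only when enough of them changed. The function names and thresholds are hypothetical, and this is far cruder than the algorithms that separate swaying foliage from a person or vehicle, which typically add background modeling and object classification.

```python
def changed_pixels(prev, curr, pixel_threshold=25):
    """Count pixels whose grayscale value changed by more than pixel_threshold."""
    return sum(
        1
        for row_prev, row_curr in zip(prev, curr)
        for p, c in zip(row_prev, row_curr)
        if abs(p - c) > pixel_threshold
    )


def motion_detected(prev, curr, pixel_threshold=25, min_changed=4):
    """Report motion only when enough pixels changed (rejects small flicker)."""
    return changed_pixels(prev, curr, pixel_threshold) >= min_changed


# Frames are lists of rows of 8-bit grayscale values (assumed representation).
background = [[10] * 4 for _ in range(4)]
noisy = [row[:] for row in background]
noisy[0][0] = 12                      # sensor noise: one pixel, small change
intruder = [row[:] for row in background]
for y in (1, 2):
    for x in (1, 2):
        intruder[y][x] = 200          # a bright 2x2 object enters the scene

print(motion_detected(background, noisy))     # small change is ignored
print(motion_detected(background, intruder))  # large change triggers motion
```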
An application whose processing requirements probably exceed what could be built into a camera economically is aerial surveillance, in which both the camera and the target are moving. In some high-end instances, camera resolution could be high-definition.
In these relatively early days of digital video surveillance rollout, it is not unusual for system integrators to acquire data digitally (digital camera) and execute some image processing, perhaps followed by decision making (turn or elevate the camera). Then the data is converted to analog so it can be distributed over the existing infrastructure, only to be converted back to digital data for more sophisticated analysis and, finally, storage.
In the future, we can expect to see end-to-end digital systems that can utilize the IP network where transport is almost free. Each surveillance camera will become a WebCam, and intelligence will be applied to the video data along the communication path.
Contrast is a good example. Since motion detection and image recognition algorithms often depend heavily on identifying the edges of objects, a high-contrast image may be the best one to work with, and an edge-sharpening algorithm could be implemented within the camera system.
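To show why contrast matters to edge-based algorithms, the sketch below applies a Sobel gradient operator, a common textbook edge detector, to two tiny test images that differ only in contrast. The article does not say which edge algorithm any vendor uses, so the choice of Sobel and the helper name are assumptions made for illustration.

```python
def sobel_magnitude(img):
    """Return |gx| + |gy| for each interior pixel of a grayscale image.

    img is a list of rows of intensity values; border pixels are left at 0.
    """
    h, w = len(img), len(img[0])
    out = [[0] * w for _ in range(h)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            # Horizontal gradient: right column minus left column.
            gx = (img[y - 1][x + 1] + 2 * img[y][x + 1] + img[y + 1][x + 1]
                  - img[y - 1][x - 1] - 2 * img[y][x - 1] - img[y + 1][x - 1])
            # Vertical gradient: bottom row minus top row.
            gy = (img[y + 1][x - 1] + 2 * img[y + 1][x] + img[y + 1][x + 1]
                  - img[y - 1][x - 1] - 2 * img[y - 1][x] - img[y - 1][x + 1])
            out[y][x] = abs(gx) + abs(gy)
    return out


def vertical_edge(dark, bright):
    """Build a 4x6 test image: left half dark, right half bright."""
    return [[dark] * 3 + [bright] * 3 for _ in range(4)]


high_contrast = sobel_magnitude(vertical_edge(0, 100))
low_contrast = sobel_magnitude(vertical_edge(0, 10))
print(high_contrast[1])
print(low_contrast[1])
```

The edge response grows in proportion to the intensity step, so a low-contrast scene produces a much weaker signal for the downstream detection stages to work with.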
Machine interpretation of these images leads directly to decision making, which can also be accomplished within the local camera system. Tracking motion through the frame of an object that has already been identified as being of interest can lead to the imaging system giving instructions to the camera positioning system to pan, zoom, or turn the camera so the object can be tracked beyond the initial field of view.
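One simple way to turn a tracked object's position into positioning commands is to compare its centroid with the center of the frame. The sketch below does that; the function name, the command strings, and the 10% deadband are hypothetical choices for illustration, not any vendor's control interface.

```python
def pan_tilt_command(centroid, frame_size, deadband=0.1):
    """Map an object's centroid to (pan, tilt) commands.

    centroid is (x, y) in pixels, frame_size is (width, height).
    Offsets are normalized to -1..1; inside the deadband the camera holds.
    Image y grows downward, so a positive offset means "tilt down".
    """
    cx, cy = centroid
    w, h = frame_size
    dx = (cx - w / 2) / (w / 2)
    dy = (cy - h / 2) / (h / 2)
    pan = "right" if dx > deadband else "left" if dx < -deadband else "hold"
    tilt = "down" if dy > deadband else "up" if dy < -deadband else "hold"
    return pan, tilt


# A CIF frame (352 x 288) with the object drifting toward the right edge.
print(pan_tilt_command((300, 144), (352, 288)))
# The same frame with the object near the top, horizontally centered.
print(pan_tilt_command((176, 20), (352, 288)))
```

The deadband keeps the camera from hunting back and forth when the object sits close to the center of the frame.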
This and other scenarios lead to an important hardware consideration that is virtually unknown in conventional video processing: balancing image processing with control functions.
Most applications that use analog cameras seem to favor MPEG4 encoding of CIF image size, says ADI's Levine. For compression on networked cameras, requirements range from CIF images at 15 frames/sec right up to D1 images at 30 fps. More recent requests are for MPEG4, but some OEMs are transitioning to H.264.
From a DSP perspective, the two leading vendors, Analog Devices and Texas Instruments, have taken somewhat different approaches to digital video surveillance.
Since the processor is programmable, it can be upgraded easily to accommodate new algorithms. Its flexibility is further enhanced by integrating a multithreaded operating system supplied by Quadros Systems.
Although the Blackfin/Quadros RTXC/dm RTOS is a general-purpose RTOS, it offers valuable features for video surveillance, says Marketing Director Steve Martin. It is scalable and can run on multiple cores or multiple processors. This is useful in operations where multiple video feeds are all fed back to a mainframe for processing. It is also useful when the surveillance system itself is expanded, such as adding an additional floor to the security area.
The biggest advantage, however, is multithreading, which matches up with the dual nature of the Blackfin processor. Information can follow one of two paths, each with its own OS kernel. One path uses a kernel characterized by a single stack. This path is sometimes referred to as a lightweight thread because it does not save a context for the operations. This is the path utilized by DSP operations that run until completion, or until they are interrupted.
Figure 1 shows the RTOS architecture. One side utilizes the RTXC/ss (single-stack) kernel; the other side utilizes the RTXC/ms (multi-stack) kernel.
The RTOS also specifies three levels of priority, as shown in Figure 1. Interrupt service routines (ISRs) have the highest priority. Information coming from any peripheral devices, including cameras and other security sensors, can initiate an ISR, which might cause an ongoing DSP operation to stop and reset itself. DSP operations, such as executing algorithms, are in the middle priority level, and control functions that do not have to happen in hard real time, such as tilting the camera, have the lowest priority.
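The three-level ordering can be modeled with a small priority queue. The sketch below is a conceptual toy, not the RTXC API: the Dispatcher class, the level constants, and the task names are all hypothetical, and it models only the order in which pending work is served, not preemption of a running DSP operation.

```python
import heapq
from itertools import count

# Lower number means higher priority, mirroring the three levels in the text.
ISR, DSP, CONTROL = 0, 1, 2


class Dispatcher:
    """Serve pending work strictly by priority level, FIFO within a level."""

    def __init__(self):
        self._queue = []
        self._seq = count()  # tie-breaker that preserves submission order

    def submit(self, level, name):
        heapq.heappush(self._queue, (level, next(self._seq), name))

    def run_all(self):
        """Drain the queue and return task names in the order they were served."""
        order = []
        while self._queue:
            _, _, name = heapq.heappop(self._queue)
            order.append(name)
        return order


d = Dispatcher()
d.submit(CONTROL, "tilt_camera")       # soft real time, lowest priority
d.submit(DSP, "motion_detection")      # algorithm execution, middle priority
d.submit(ISR, "camera_frame_ready")    # peripheral interrupt, highest priority
d.submit(DSP, "edge_sharpening")
order = d.run_all()
print(order)
```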
The TMS320DM64x Digital Media Processors have on-chip video ports and can handle both video and audio. Peripherals are added to fit specific applications. The DM642 has three video ports; the DM641 has two video ports; and the DM640 has one video port. TCP/IP stacks are common to all three chips.
As a guide to the expected requirements for this sort of baseline performance, Texas Instruments' 600-MHz DM642 video processor can handle multiple D1 decodes for MPEG-2, MPEG-4, and H.264. Encoding is more compute intensive and the DM642 drops down to a single channel for encoding all but the MPEG-4 algorithm. The 400-MHz DM640 is capable of handling CIF resolution.
TI has also included its power over Ethernet solution with its digital video surveillance platforms. TI refers to its platforms as IP cameras because they are capable of running over TCP/IP links.
While DSP chips and RTOSs are essential to a functioning digital video surveillance system, the heart of these systems is the imaging algorithms. These are used to parse a video frame, interpret the processed video data, and take specific actions as a result of inferences the software creates from the data and how it is interacting with a set of rules.
Software specific to digital video surveillance has three main components: video analysis, content analysis, and taking action on the results of the content analysis. Video compression for the purpose of transmitting video data over cable also contributes to the computational load on the DSP. But since this utilization of compression is not specific to surveillance applications, it will not be discussed here.
Using TI's 600-MHz, 4800-MIPS DM642 as a benchmark, the image processing part "is a fraction" of the computational power the designer has at his or her disposal, says Paul Brewer, Vice President of Technology at ObjectVideo, Inc. The actual amount varies widely among implementations, but 30% to 40% is a reasonable ballpark figure.
The software that ObjectVideo already has operating in the field runs on a PC, but the company has a functioning prototype of an embedded product that utilizes the DM642.
The first step is video analysis. Examples of this functionality include separating out and discarding background image data such as rain or trees and identifying and extracting moving objects. Once the raw video data is stripped down to its video surveillance essentials, it is converted to metadata. The metadata is derived from MPEG-4 compressed data for object-oriented coding.
Metadata is sent to a content analysis engine where it is evaluated by a set of rules set up by the security officer responsible for the site. This is largely a matter of evaluating subsets. For example, if a moving object is identified as a person and if the person moves past a certain point in the scene designated by the security authority then the software will initiate an action.
ObjectVideo's application software uses a simple point-and-click tool interface to help security personnel create the rule set. Video images of the security area are used to draw security boundaries which are annotated in several ways to implement a set of rules. A sample screen is shown in Figure 2.
The video image in the foreground of Figure 2 shows a shoreline near Reagan Airport in Washington DC. The red line running roughly parallel to the shoreline, drawn with simple graphics tools, represents a virtual trip wire.
The yellow arrow indicates the direction of motion that will initiate a rule violation. If an object of the right class crosses the trip wire in the direction indicated, at least part of the rule sequence will have been violated. Another part of the rule sequence could be the type of object that crosses the tripwire.
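A directional trip wire of this kind can be approximated with basic geometry: test whether an object's movement between two frames crosses the wire segment, and use the sign of a cross product to tell which way it crossed. The sketch below is a generic illustration under that assumption; it is not ObjectVideo's implementation, and the function names, class labels, and coordinates are hypothetical.

```python
def _cross(o, a, b):
    """2-D cross product of vectors (a - o) and (b - o)."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def crossing_direction(wire_start, wire_end, prev_pos, curr_pos):
    """Return +1 or -1 for the side the object ended on, or 0 if no crossing."""
    side_prev = _cross(wire_start, wire_end, prev_pos)
    side_curr = _cross(wire_start, wire_end, curr_pos)
    if side_prev * side_curr >= 0:
        return 0  # stayed on one side of the wire's line, or only touched it
    # The wire's endpoints must also straddle the path of motion; otherwise
    # the object passed beyond the end of the wire.
    if _cross(prev_pos, curr_pos, wire_start) * _cross(prev_pos, curr_pos, wire_end) > 0:
        return 0
    return 1 if side_curr > 0 else -1


def rule_violated(obj_class, direction, watched_classes=("person",), alarm_direction=1):
    """Both parts of the rule must hold: the right class and the right direction."""
    return obj_class in watched_classes and direction == alarm_direction


wire = ((0, 0), (10, 0))
inbound = crossing_direction(*wire, prev_pos=(5, -1), curr_pos=(5, 1))
outbound = crossing_direction(*wire, prev_pos=(5, 1), curr_pos=(5, -1))
missed = crossing_direction(*wire, prev_pos=(20, -1), curr_pos=(20, 1))
print(rule_violated("person", inbound))   # right class, alarm direction
print(rule_violated("person", outbound))  # right class, wrong direction
print(rule_violated("vehicle", inbound))  # wrong class
```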
In addition to tripwires, the software can implement other types of rules, such as loitering rules.
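A loitering rule can be sketched as a dwell-time check: raise a flag when a tracked object stays inside a designated zone for longer than a set time. The example below assumes a rectangular zone and a track of timestamped positions; those data shapes and the function name are illustrative assumptions rather than a description of any shipping product.

```python
def loitering(track, zone, max_seconds):
    """Return True if the object stays in the zone for max_seconds or longer.

    track is a list of (timestamp_seconds, x, y) samples in time order.
    zone is (x_min, y_min, x_max, y_max). Leaving the zone resets the timer.
    """
    x_min, y_min, x_max, y_max = zone
    entered = None
    for t, x, y in track:
        inside = x_min <= x <= x_max and y_min <= y <= y_max
        if inside:
            if entered is None:
                entered = t          # start timing on entry
            if t - entered >= max_seconds:
                return True
        else:
            entered = None           # object left; reset the dwell timer
    return False


zone = (0, 0, 5, 5)
lingering = [(0, 1, 1), (30, 2, 1), (61, 2, 2)]
passing_by = [(0, 1, 1), (30, 9, 9), (61, 2, 2)]
print(loitering(lingering, zone, max_seconds=60))
print(loitering(passing_by, zone, max_seconds=60))
```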
When a rule is violated, an alert appears on the ObjectVideo console. In addition, an email that includes a still image can be sent to the security manager and automated calls can be made to cell phones. The software can require a specific type of acknowledgement that an alert has been received.
The software identifies potential threats, says Brewer. Only security professionals can determine the true nature of the activity that resulted in one or more rules being broken.
Digital video surveillance requires the integration of several technologies that have seldom been integrated before. The trend to embedded systems where much of the intelligence is in or near the camera itself will provide a rapidly growing market for DSPs capable of handling the job.
Contributing writer Jack Shandle is a former chief editor of both Electronic Design magazine and ChipCenter.com. He holds a BSEE degree and has written hundreds of articles on all aspects of the electronics OEM industry. Jack is president of e-ContentWorks, a consultancy that creates high-value content for publishers, eOEM corporations, and industry associations. His email address is jshandle@earthlink.net.



