FORT LAUDERDALE, Fla. - A French technology research company has unwrapped a vision-processing system-on-a-chip that it claims can be manufactured as inexpensively as a microcontroller. The $6 Generic Visual Perception Processor (GVPP) can automatically detect objects and track their movement in real-time, according to Bureau d'Etudes Vision (BEV).
The rights to manufacture the GVPP will be up for grabs at a technology auction slated for next month. Auctioneers PricewaterhouseCoopers LLP estimate that, if properly commercialized, the GVPP could generate a multibillion-dollar gross revenue stream based on 100 proposed applications in 10 industries.
"We couldn't manage multiple licenses to competing companies," said Nabeel Al-Adsani, director of operations at BEV. "Instead we hope to interest major semiconductor manufacturers in licensing the GVPP so that they can supply the application-specific companies with chips."
The GVPP, which crunches 20 billion instructions per second (Bips), models the human perceptual process at the hardware level by mimicking the separate temporal and spatial functions of the eye-to-brain system. The processor sees its environment as a stream of histograms regarding the location and velocity of objects. Those objects could be the white lines on a highway, the football in a televised game or the annotated movement of enemy ground forces from satellite telemetry.
Alongside a CMOS imager on its 2 x 4-inch evaluation board, the GVPP has been demonstrated as capable of learning-in-place to solve a variety of pattern recognition problems. It boasts automatic normalization for varying object size, orientation and lighting conditions, and can function in daylight or darkness.
A complete GVPP system, including the charge-coupled device and all support circuitry, should cost less than $50, the company said. BEV also claimed that the software it provides with the chips permits engineers to develop applications for the GVPP in just a few weeks.
The GVPP was invented in 1992, when BEV founder Patric Pirim saw that it would be relatively simple for a CMOS chip to implement in hardware the separate contributions of temporal and spatial processing in the brain. The brain-eye system uses layers of parallel-processing neurons that pass the signal through a series of preprocessing steps, resulting in real-time tracking of multiple moving objects within a visual scene.
Pirim created a chip architecture that mimicked the work of the neurons, with the help of multiplexing and memory. The result is an inexpensive device that can autonomously "perceive" and then track up to eight user-specified objects in a video stream based on hue, luminance, saturation, spatial orientation, speed and direction of motion, the company claims.
The GVPP tracks an "object," defined as a certain set of hue, luminance and saturation values in a specific shape, from frame to frame in a video stream by anticipating where its leading and trailing edges make "differences" with the background. That means it can track an object through varying light sources or changes in size, as when an object gets closer to the viewer or moves farther away.
The chip houses 23 neural blocks, both temporal and spatial, each consisting of 20 hardware input and output "synaptic" connections. The GVPP multiplexes this neural hardware with off-chip scratchpad memory to simulate as many as 100,000 synaptic connections per neuron. Each of these synapses can be changed through the on-chip microprocessor for a combined processing total of over 6.2 billion synaptic connections/second.
In executing up to 20 Bips to analyze successive frames of a video stream, the temporal neurons identify pixels that have changed over time and generate a 3-bit value indicative of the magnitude of that change. The spatial-processing system analyzes the resulting "difference" histogram to calculate the speed and direction of the motion.
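The two-stage idea described above can be sketched in a few lines of software. The following is only an illustrative approximation, not BEV's hardware design: it assumes 8-bit grayscale frames stored as nested lists, and the function names, the change threshold and the use of histogram centroids to estimate motion are all assumptions made for the example.

```python
def temporal_difference(prev, curr, max_value=255):
    """Quantize each pixel's absolute change between two frames to 3 bits (0-7)."""
    return [
        [(abs(c - p) * 8) // (max_value + 1) for p, c in zip(prev_row, curr_row)]
        for prev_row, curr_row in zip(prev, curr)
    ]


def position_histograms(diff, threshold=1):
    """Count changed pixels per column (x) and per row (y).

    The threshold is a hypothetical tuning value, not a GVPP parameter.
    """
    x_hist = [0] * len(diff[0])
    y_hist = [0] * len(diff)
    for y, row in enumerate(diff):
        for x, value in enumerate(row):
            if value >= threshold:
                x_hist[x] += 1
                y_hist[y] += 1
    return x_hist, y_hist


def centroid(hist):
    """Weighted mean bin index of a histogram, or None if it is empty."""
    total = sum(hist)
    if total == 0:
        return None
    return sum(i * count for i, count in enumerate(hist)) / total


def estimate_shift(diff_a, diff_b):
    """Estimate (dx, dy) in pixels per frame from two successive difference maps."""
    ax, ay = (centroid(h) for h in position_histograms(diff_a))
    bx, by = (centroid(h) for h in position_histograms(diff_b))
    if None in (ax, ay, bx, by):
        return None  # no change detected in at least one map
    return bx - ax, by - ay


def frame_with_bar(col, rows=4, cols=6):
    """Hypothetical test frame: a bright vertical bar on a dark background."""
    return [[255 if x == col else 0 for x in range(cols)] for _ in range(rows)]


if __name__ == "__main__":
    f0, f1, f2 = (frame_with_bar(c) for c in (1, 2, 3))
    shift = estimate_shift(temporal_difference(f0, f1), temporal_difference(f1, f2))
    print("estimated shift (dx, dy):", shift)
```

In this toy case the bar moves one column to the right per frame, so the centroid of the changed pixels moves by the same amount. A real system would presumably use far richer histograms than a single centroid per axis.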
The GVPP's major performance strength over current-day $10,000 vision systems is its automatic adaptation to varying lighting conditions. Today's vision systems dictate uniform, shadowless illumination, and even next-generation prototype systems, designed to work under "normal" lighting conditions, can be used only from dawn to dusk. The GVPP, on the other hand, adapts to real-time changes in lighting without recalibration, day or night.
Divide and conquer
Since processing in each module on the GVPP runs in parallel out of its own memory space, multiple GVPP chips can be cascaded to expand the number of objects that can be recognized and tracked. When set in master-slave mode, any number of GVPP chips can divide and conquer, for instance, complex stereoscopic vision applications.
On the software side, a host operating system running on an external PC communicates with the GVPP's evaluation board via an OS kernel within the on-chip microprocessor. BEV dubs the neural-learning capability of its development environment "programming by seeing and doing," because of its ease of use. The engineer needs no knowledge of the internal workings of the GVPP, the company said, only application-specific domain knowledge.
"Programming the GVPP is as simple as setting a few registers, and then testing the results to gauge the application's success," said Steve Rowe, BEV's director of research and development. "Once debugged, these tiny application programs are loaded directly into the GVPP's internal ROM."
Application programs themselves can use C++, which makes calls to a library of assembly language algorithms for visual perception and tracking of objects. The system's modular approach permits the developer to create a hierarchy of application building blocks that simplify problems with inheritable software characteristics.
"Simple applications can be quickly prototyped in a few days, with medium-size applications taking a few weeks and even big applications only a couple of months," said Rowe.
In applications, each pixel may be described with respect to any of the six domains of information available to it: hue, luminance, saturation, speed, direction of motion and spatial orientation. The GVPP further subcategorizes pixels by ranges, for instance luminance between 10 percent and 65 percent, hue of blue, saturation between 20 and 25 percent, and moving upward in scene.
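A range-based pixel classification of this kind might look like the following sketch. It is not the GVPP's register interface: the `Pixel` class, the `matches` helper, the units, and the numeric bands chosen for "blue," "upward" and "moving" are all assumptions made for illustration. Only the luminance and saturation ranges come from the example above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Pixel:
    """One pixel described in the six domains (units are assumed)."""
    hue: float          # degrees on a 0-360 color wheel
    luminance: float    # percent
    saturation: float   # percent
    speed: float        # pixels per frame
    direction: float    # degrees, with 90 taken to mean "upward in scene"
    orientation: float  # degrees


def matches(pixel, ranges):
    """True if every listed domain of the pixel falls inside its (low, high) range."""
    return all(
        low <= getattr(pixel, domain) <= high
        for domain, (low, high) in ranges.items()
    )


# Hypothetical encoding of the article's example query.
BLUE_MOVING_UP = {
    "luminance": (10, 65),          # from the article
    "saturation": (20, 25),         # from the article
    "hue": (200, 260),              # assumed band for "blue"
    "direction": (45, 135),         # assumed band for "upward"
    "speed": (0.5, float("inf")),   # assumed minimum speed for "moving"
}

if __name__ == "__main__":
    candidate = Pixel(hue=230, luminance=40, saturation=22,
                      speed=2.0, direction=90, orientation=0)
    print(matches(candidate, BLUE_MOVING_UP))
```

Domains left out of the range dictionary, such as spatial orientation here, are simply not tested, which mirrors the idea that an application constrains only the domains it cares about.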
A set of second-level pattern recognition commands permits the GVPP to search for different objects in different parts of the scene, for instance to look for a closed eyelid only within the rectangle bordered by the corners of the eye. Since some applications may also require multiple levels of recognition, the GVPP has software hooks to pass along the recognition task from level to level.
For instance, to detect when a driver is falling asleep (a capability that could find use in California, which is about to mandate that cars sound an "alarm" when drowsy drivers begin to nod off), the GVPP is first programmed to detect the driver's head, for which it creates histograms of head movement. The microprocessor reads these histograms to identify the area for the eye.
Then the recognition task passes to the next level, which searches only within the eye area rectangles. High-speed movement there, normally indicative of blinking, is discounted, but when blinks become slower than a predetermined level, they are interpreted as the driver nodding off, and trigger an alarm.
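The final decision step can be approximated in software as a simple duration test. The sketch below assumes an earlier stage has already reduced the eye-area rectangle to one open-or-closed flag per frame, and it models a "slower" blink as a longer run of closed-eye frames. The function names, the 30-frames-per-second rate and the 0.4-second limit are hypothetical values, not figures from BEV.

```python
def blink_durations(eye_closed, fps):
    """Return the length in seconds of each run of consecutive closed-eye frames."""
    durations = []
    run = 0
    for closed in eye_closed:
        if closed:
            run += 1
        elif run:
            durations.append(run / fps)
            run = 0
    if run:  # sequence ended while the eye was still closed
        durations.append(run / fps)
    return durations


def is_drowsy(eye_closed, fps=30, max_blink_s=0.4):
    """True if any eye closure lasts longer than the predetermined limit.

    Short closures are treated as ordinary blinks and ignored.
    """
    return any(d > max_blink_s for d in blink_durations(eye_closed, fps))


if __name__ == "__main__":
    normal = [False] * 30 + [True] * 5 + [False] * 30    # brief blink
    nodding = [False] * 30 + [True] * 20 + [False] * 30  # prolonged closure
    print(is_drowsy(normal), is_drowsy(nodding))
```

A production system would likely look at trends across many blinks rather than a single long closure, but the threshold test captures the logic the article describes.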
Pirim has long-term plans out to 2006 for the GVPP. "We have a very clear set of upgrades to take advantage of putting more transistors onto our system-on-a-chip," said Pirim.
First, a CMOS imager will be integrated on-chip with the GVPP, enabling watch-size vision systems by 2002. After that, Pirim plans to integrate flash memory that will enable a system the size of a pinky ring by 2004. And by 2006, Pirim has slated an expanded on-chip DRAM plus beefed-up on-chip processing to solve multisensor fusion applications in hat-pin-size vision systems.
Application-specific software libraries are also planned, including optical character recognition, 3-D analysis and spatial organization.
BEV lists possible applications for the GVPP in process monitoring, quality control and assembly; automotive systems such as intelligent air bags that monitor passenger size and traffic congestion monitors; pedestrian detection, license plate recognition, electronic toll collection, automatic parking management, automatic inspection; and medical uses including disease identification. The chip could also prove useful in unmanned air vehicles, miniature smart weapons, ground reconnaissance and other military applications, as well as in security access using facial, iris, fingerprint, or height and gait identification.