Portland, Ore. - As the long development of charge-coupled device (CCD) and CMOS active-pixel sensor technology begins to pay off in the form of affordable all-electronic still and video cameras, a second wave of solid-state imaging chips with very different capabilities is emerging from research labs around the world.
Called "vision chips," these silicon imaging devices are typically parallel computers on a chip implementing a processor per pixel to mimic neural processing circuitry in the retina. Rather than striving for high resolution and faithful color reproduction, vision chips capture other aspects of the eye and brain functions, such as edge and motion detection. Target applications include security systems, autonomous robots, artificial implantable retinas and biochemical analysis.
A few projects have reached the commercial stage, including a real-time in vivo glucose-monitoring system from Array Vision Engineering Co. (Alachua, Fla.) and a security camera being marketed by Pixim Inc. (Mountain View, Calif.), which has commercialized research from a project at Stanford University.
An example of state-of-the-art vision chip technology surfaced at last month's International Symposium on Circuits and Systems in Kobe, Japan, where EE professor Piotr Dudek of the University of Manchester (England) demonstrated a third-generation retina-mimicking device with 16,384 pixel-processors. Called Scamp, for "SIMD current-mode analog-matrix processor," the chip integrates an arithmetic-logic unit, memory, control logic and an input/output circuit behind every pixel. The Scamp-3 vision chip promises to enable robots and automated inspection, surveillance and vehicle-guidance systems to "see" in a manner similar to human sight.
"Our smaller, previous-generation device, the Scamp-2, comprised 1,872 processors and is working very well in the lab," said Dudek. "We had some problems with the fabrication of the Scamp-3 chip, which we believe we corrected before submitting it for another silicon run. We now expect to obtain fully functional Scamp-3 devices in July."
The Scamp-3 is a 1-cm2 chip fabricated in 0.35-micron CMOS that arranges its 16,384 pixel-processors in a 128 x 128 array. Each pixel-processor measures 50 x 50 microns and consumes 12 microwatts when running at 1.25 MHz, giving the chip a computational power efficiency of 104 billion instructions per second per watt.
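The quoted efficiency figure follows directly from the per-pixel numbers. A back-of-the-envelope check, assuming (as the article implies) one instruction per clock cycle per processor:

```python
# Sanity check of the Scamp-3 figures above, assuming one instruction
# per clock per pixel-processor (an assumption, not a stated spec).
processors = 128 * 128            # 16,384 pixel-processors
clock_hz = 1.25e6                 # 1.25 MHz per processor
power_per_proc_w = 12e-6          # 12 microwatts each

ips = processors * clock_hz       # aggregate instructions per second
power_w = processors * power_per_proc_w

print(f"{ips / 1e9:.1f} GIPS total")          # -> 20.5 GIPS total
print(f"{ips / power_w / 1e9:.0f} GIPS/W")    # -> 104 GIPS/W
```

The roughly 20 billion instructions per second over about 200 milliwatts is what yields the 104 GIPS-per-watt figure.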
As is typical of vision chips, the processors are implemented with analog circuitry. In this case, each analog processor is programmable, so that true single-instruction multiple-data operations can be performed as an image is acquired. Current-mode logic is used throughout the pixel-processor design.
A multiplier, a flag register and six general-purpose registers form the heart of the analog processor. Photodetector circuitry, I/O and four communication registers that address neighboring pixels are also implemented in current-mode logic. All circuit components are connected by an analog bus.
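The effect of the SIMD arrangement can be sketched in software. The following NumPy snippet is purely illustrative (it is not the Scamp instruction set): one instruction is applied to every pixel at once, and whole-array shifts stand in for the four communication registers that expose each pixel's north, south, east and west neighbors.

```python
import numpy as np

def simd_edge_map(img):
    # np.roll stands in for the neighbor communication registers: each
    # shift is the single-instruction, all-pixels view of a neighbor value.
    north = np.roll(img, 1, axis=0)
    south = np.roll(img, -1, axis=0)
    west = np.roll(img, 1, axis=1)
    east = np.roll(img, -1, axis=1)
    # A simple Laplacian-style edge measure, computed at every pixel
    # simultaneously rather than by scanning the frame serially.
    return np.abs(4 * img - north - south - east - west)

frame = np.zeros((128, 128))
frame[32:96, 32:96] = 1.0        # a bright square on a dark background
edges = simd_edge_map(frame)
print(edges.max())               # -> 2.0 (strongest response at the corners)
```

On the chip, each "array operation" here corresponds to a single broadcast instruction executed concurrently by all 16,384 processors.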
"We plan to use Scamp-3 as the eyes for an intelligent robotic system that can react to its environment and correct itself without human intervention," said Dudek.
In this project, called Reverse-Engineering the Vertebrate Brain (Reverb), the Scamp-3 chip will serve as the eyes for a BAE Systems Plc platform with the goal of enabling robots to respond to novel events in a manner similar to humans. The Reverb project head, professor Kevin Gurney at the University of Sheffield, will coordinate the efforts of EEs and computational scientists at the University of Sheffield, University of Wales (Aberystwyth), Bristol University, University of Oxford, University of Cambridge, University of Dundee and BAE Systems. The five-year project is funded with about $3.5 million by the U.K. Engineering and Physical Sciences Research Council, a funding agency for research and training in engineering and the physical sciences.
"Scamp silicon retinas are based on the human eye and work in a similar way to give robots excellent peripheral and central vision," said Dudek. "Like the human eye, our chips process very complex images at rapid rates, filtering the raw data before passing the results along to the robot's brain. We hope this will enable robots to react in real-time."
Unlike a "dumb" video camera that merely records light intensity, the human eye integrates neural networks into each retinal cell, providing real-time information processing as photons arrive. As a result, only high-level information, such as edges and motion, is passed up to the brain. Conventional CCD and CMOS pixel sensors can be used to perform the same motion or edge detection, but at the cost of large amounts of digital computation after the images are acquired. By mimicking the retina, vision chips can eliminate most of that added computing hardware.
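A minimal sketch of what per-pixel motion detection means in practice (the function name and threshold are illustrative, not from the Scamp design): each pixel-processor compares the current photodetector sample against a previous-frame value held in one of its local registers, so only a sparse change map, rather than full frames, needs to leave the chip.

```python
import numpy as np

def motion_map(prev, curr, threshold=0.1):
    # At each pixel, flag a change if the intensity difference exceeds
    # the threshold; conceptually this runs in every pixel at once.
    return np.abs(curr - prev) > threshold

prev = np.zeros((128, 128))
curr = prev.copy()
curr[10:20, 10:20] = 1.0          # an object appears in the scene
moved = motion_map(prev, curr)
print(int(moved.sum()))           # -> 100 changed pixels, out of 16,384
```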
Currently, researchers with the Reverb project are attempting to blueprint the manner in which the human eye-brain system translates what it sees into control signals that provoke an immediate, effective response by actuators. This model will be used to wire the Scamp "eyes" to the robot's "brain."
"Reverb enables researchers from a number of disciplines and institutions across the U.K. to work together to understand how the human nervous system integrates sensory information to guide behavior," said project leader Gurney. "Hopefully, we can then transfer these insights to the architecture of robotic platforms."
BAE Systems already has plans to utilize the technology in its laser-guided Crawler robot, which can autonomously machine and inspect high-precision aircraft parts.
"Our basic premise is that nature builds systems very well, and if we can mimic those systems, then we have a chance to build better robots, ones that combine the best of both the computer and the human worlds," said Dudek.
The third-generation Scamp is the result of seven years of research by Dudek and his colleagues. Reverb currently plans to augment Scamp vision chips with two traditional video cameras: a high-resolution camera for taking "close looks" and a lower-resolution peripheral camera for "keeping an eye out."
The prototypes could be easily scaled using commercially available CMOS fab lines, the team said. Dudek estimated that a 256 x 256-pixel array built with a 0.18-micron process would perform 500 billion instructions per second while dissipating only 2 watts.
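Applying the same one-instruction-per-clock assumption as before, the scaled-up estimate implies a modestly higher clock rate per processor and a better power efficiency than the Scamp-3:

```python
# Rough implication of the scaled 256 x 256 estimate quoted above
# (assumes one instruction per clock per processor).
processors = 256 * 256                   # 65,536 pixel-processors
target_ips = 500e9                       # 500 billion instructions/s
power_w = 2.0                            # estimated dissipation

clock_hz = target_ips / processors
print(f"{clock_hz / 1e6:.1f} MHz per processor")       # -> 7.6 MHz
print(f"{target_ips / power_w / 1e9:.0f} GIPS/W")      # -> 250 GIPS/W
```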