LONDON -- Embedded robotic vision has reached a turning point in its development. Not only is equipment of all sorts becoming smarter, but it is becoming increasingly aware of its place in the world, according to Jeff Bier, president of Berkeley Design Technology Inc. and founder of the recently formed Embedded Vision Alliance.
And that has a lot to do with the rapidly falling cost of sensors, particularly image sensors, and of the processing logic needed to make sense of the data.
BDTI has spent a lot of time benchmarking digital signal processors and has noticed that there tends to be one main application driver at a time, at least thus far, Bier said. “In the early 1990s, it was digital wireless; in the late 1990s, it was consumer digital audio; in the early 2000s, it was consumer digital video,” he said. “Embedded vision is ready to be one of the next big drivers.”
Another view is that the 1980s and earlier were an era of military application of DSPs, while the next 30 years were an era of industrial and enterprise applications, such as production lines and computer surveillance.
The next era promises to be one of consumerization, in applications such as gaming and automotive drive-by-wire. The potential applications are exploding as the performance of application processors increases and the cost comes down, said Bier. Vision systems are already being used in safety-critical and lifesaving applications, such as lane departure warning and collision avoidance systems in automobiles, and swimming pool alarms to prevent drowning incidents.
Mobileye NV provides vision-based advanced driver assistance systems. The company was founded in 1999 and developed proprietary image processing algorithms that run on a proprietary processor called the EyeQ, originally manufactured for Mobileye by STMicroelectronics in 0.18-micron CMOS. The chip and software algorithms were first sold to OEM customers such as BMW and Volvo.
The solution has been available as an aftermarket product since about 2007. The architecture comprises two 32-bit ARM946E processor cores, four vision computing engines (VCEs), a multichannel DMA and several peripherals. One of the ARM946Es manages the four VCEs and the multichannel DMA, as well as other peripherals. The four VCEs and the other ARM946E perform all the intensive vision computations required by such applications as tracking and pattern classification.
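The division of labor described here, one control core managing four vision engines that do the heavy computation, is a classic master/worker arrangement. A rough software analogy of that pattern, with plain Python threads standing in for the VCEs (this is an illustrative sketch, not Mobileye code, and the "vision computation" is a placeholder):

```python
from queue import Queue
from threading import Thread

NUM_ENGINES = 4  # analogous to the EyeQ's four vision computing engines

def vision_engine(tasks: Queue, results: Queue) -> None:
    """Worker standing in for one VCE: consumes frame tiles, emits results."""
    while True:
        tile = tasks.get()
        if tile is None:          # sentinel from the control core: shut down
            break
        # placeholder "intensive vision computation": sum of pixel values
        results.put(sum(sum(row) for row in tile))

def control_core(frame_tiles):
    """Stand-in for the managing ARM core: farms tiles out, gathers results."""
    tasks, results = Queue(), Queue()
    workers = [Thread(target=vision_engine, args=(tasks, results))
               for _ in range(NUM_ENGINES)]
    for w in workers:
        w.start()
    for tile in frame_tiles:      # dispatch work to the engine pool
        tasks.put(tile)
    for _ in workers:             # one shutdown sentinel per engine
        tasks.put(None)
    for w in workers:
        w.join()
    return [results.get() for _ in frame_tiles]
```

On the real chip the dispatch is done over a multichannel DMA rather than queues, but the control/compute split is the same idea.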
Bier made the point that for myriad applications, from augmented reality to pick-and-place machines, embedded vision “is not a field that has to be invented; there is a solid body of 30 or 40 years of academic work that can be built upon.” EE Times tipped gesture recognition as one of 10 technologies to watch in 2011, and Microsoft Kinect is bidding to be the game changer in robotic vision.
Kinect, of course, is the 3-D motion-detecting add-on to the Xbox 360. The Xbox solution uses a combination of visible-spectrum image sensing, IR sensing and local processing to identify people and depth in a scene. The hardware essentially comes from Israeli company PrimeSense Ltd.; Microsoft developed the recognition software that can fold up the information into a game.
In June, Microsoft announced a free beta release of the Kinect for Windows software development kit. Developers, academic researchers and enthusiasts can use the SDK to create applications that enable depth sensing, human motion tracking, and voice and object recognition using Kinect technology running Windows 7.
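Depth sensing of the kind Kinect exposes yields a per-pixel distance map, and a common first step in human-tracking pipelines is segmenting the foreground (a player, say) by thresholding on depth. A minimal sketch with a made-up depth map in millimeters (this is a generic technique for illustration, not the Kinect SDK's actual API):

```python
def segment_foreground(depth_mm, near=500, far=2000):
    """Return a binary mask: 1 where a pixel's depth falls in [near, far),
    a plausible 'player' distance band; 0 for background or no reading."""
    return [[1 if near <= d < far else 0 for d in row] for row in depth_mm]

depth = [
    [3000, 1200, 1100, 3000],   # two mid-range pixels: likely the player
    [3000, 1150,    0, 3000],   # 0 = no reading (e.g., an IR shadow)
]
mask = segment_foreground(depth)
```

Real pipelines follow this with connected-component labeling and skeletal fitting, which is where the recognition software Microsoft built on top of the PrimeSense hardware does its work.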
Bier sees three roles for the alliance: to raise awareness of the technology’s possibilities; share practical know-how, including proven approaches to problems and even algorithms and code; and provide a forum in which interested parties can network. “Standardization is needed and may end up part of EVA,” said Bier.
The inclusion of image sensors in all sorts of computer equipment has brought about a “democratization” of computer vision, making every notebook computer a potential development platform, said Bier. “Things like OpenCV are fueling this,” he said. OpenCV (for Open Source Computer Vision) is a library of programming functions aimed mainly at real-time computer vision.
Originated by Intel in 1999, the library is now supported by Willow Garage Inc. (Menlo Park, Calif.), a robotics research laboratory and technology incubator. It is free for use under the open-source Berkeley Software Distribution (BSD) license. OpenCV encompasses more than 500 functions, including general image processing, camera stabilization, stereo and 3-D functions, detection, recognition, fitting, tracking and other machine learning functions.
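To give a flavor of the “general image processing” primitives such a library packages, here is a toy Sobel edge-magnitude filter in plain Python. This is illustrative only; in practice one would call OpenCV’s optimized routines rather than hand-roll the convolution:

```python
def sobel_magnitude(img):
    """Approximate gradient magnitude |Gx| + |Gy| at interior pixels of a
    2-D grayscale image given as a list of lists; borders are left at 0."""
    h, w = len(img), len(img[0])
    out = [[0] * w for _ in range(h)]
    kx = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]   # horizontal-gradient kernel
    ky = [[-1, -2, -1], [0, 0, 0], [1, 2, 1]]   # vertical-gradient kernel
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            gx = sum(kx[j][i] * img[y + j - 1][x + i - 1]
                     for j in range(3) for i in range(3))
            gy = sum(ky[j][i] * img[y + j - 1][x + i - 1]
                     for j in range(3) for i in range(3))
            out[y][x] = abs(gx) + abs(gy)
    return out
```

Run on an image with a sharp vertical boundary, the filter responds strongly at the edge and not at all in the flat regions, which is the basic building block behind lane markings, object outlines and other features the higher-level detection and tracking functions consume.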
Although its origins are with Intel, the library is cross-platform and has C++, C, Python and, soon, Java interfaces running on Windows, Linux, Android and Mac. Willow Garage also hosts the Robot Operating System (ROS), which has OpenCV built in, and is the developer of the PR2 personal robot.
"Digital wireless" (late 1990s), "consumer digital audio" (early 2000s), and "consumer digital video" are basic consumer functions. "Embedded vision" requires sophisticated artificial intelligence software, I'd expect that deployment will be more challenging (and the safety implications are also significant).