Editor's Note: This article is republished from Issue #83 of the Xilinx Xcell Journal with the kind permission of Xilinx.
If you have seen a demonstration of Audi's Automated Parking technology in which the car autonomously finds a parking spot and parks itself without a driver – or if you have played an Xbox 360 game with its Kinect controller or even just bitten into a flawless piece of fruit from your local grocery store – then you can count yourself as an eyewitness to the dawning of the era of smarter vision systems. All manner of products, from the most sophisticated electronic systems down to the humble apple, are affected by smarter vision technologies. And while today's systems are impressive enough, some experts predict that in 10 years' time, a vast majority of electronics systems – from automotive to factory automation, medical, as well as surveillance, consumer, aerospace and defense – will include smarter vision technologies with even more remarkable capabilities.
As smarter vision systems increase in complexity, we'll very likely become passengers in autonomous automobiles flowing in networked highways. Medical equipment such as Intuitive Surgical's amazing robotic-assisted surgical system will advance even further and may enable surgeons to perform procedures from remote locations. Television and telepresence will reach new levels of immersion and interactivity, while the content on screens in theaters, homes and stores will cater to each individual consumer's interests, even our moods.
Xilinx All Programmable solutions for Smarter Vision are at the forefront of this revolution. With the Zynq-7000 All Programmable SoC – the first device to marry an ARM dual-core Cortex-A9 MPCore, programmable logic and key peripherals on a single chip – as the foundation, Xilinx has fielded a supporting infrastructure of tools and IP that will play a pivotal role in enabling the development and faster delivery of these innovations in vision. The supporting infrastructure includes Vivado HLS (high-level synthesis), the new IP Integrator tools, OpenCV (computer vision) libraries, SmartCORE IP and specialized development kits.
"Through Xilinx's All Programmable Smarter Vision technologies, we are enabling our customers to pioneer the next generation of smarter vision systems," said Steve Glaser, senior vice president of corporate strategy and marketing at Xilinx. "Over the last decade, customers have leveraged our FPGAs to speed up functions that wouldn't run fast enough in the processors they were using in their systems. With the Zynq-7000 All Programmable SoC, the processor and FPGA logic are on the same chip, which means developers now have a silicon platform ideally suited for smarter vision applications."
In support of the device, said Glaser, "We've complemented the Zynq-7000 All Programmable SoC with a robust development environment consisting of Vivado HLS, new IP Integrator tools, OpenCV libraries, SmartCORE IP and development kits. With these Smarter Vision technologies, our customers will get a jump on their next design and be able to achieve new levels of efficiency, lower system power, increase system performance and drastically reduce the bill of materials – enriching and even saving lives while increasing profitability as these innovations roll out at an ever faster pace."
From dumb cameras to smarter vision

At the root of Smarter Vision systems is embedded vision. As defined by the rapidly growing industry group the Embedded Vision Alliance (www.embedded-vision.com), embedded vision is the merging of two technologies: embedded systems (any electronic system other than a computer that uses a processor) and computer vision (also sometimes referred to as machine vision).
Jeff Bier, founder of the Embedded Vision Alliance and CEO of consulting firm BDTI, said embedded vision technology has had a tremendous impact on several industries as the discipline has evolved beyond motorized pan-tilt-zoom analog camera-based systems. "We have all been living in the digital age for some time now, and we have seen embedded vision rapidly evolve from early digital systems that excelled in compressing, storing or enhancing the appearance of what cameras are looking at into today's smarter embedded vision systems that are now able to know what they are looking at," said Bier.
Cutting-edge embedded vision systems not only enhance and analyze images, but also trigger actions based on those analyses. As such, the amount of processing and compute power, and the sophistication of the algorithms, have spiked dramatically. A case in point is the rapidly advancing market of surveillance.
Twenty years ago, surveillance systems vendors were in a race to provide the best lenses enhanced by mechanical systems that performed autofocus and tilting for a clearer and wider field of view. These systems were essentially analog video cameras connected via coaxial cables to analog monitors, coupled with video-recording devices monitored by security guards. The clarity, reliability and thus effectiveness of these systems were only as good as the quality of the optics and lenses, and the diligence of the security guards in monitoring what the cameras displayed.
With embedded vision technology, surveillance equipment companies began to use lower-cost cameras based on digital technology. This digital processing gave their systems extraordinary features that outclassed and underpriced analog and lens-based security systems. Fisheye lenses and embedded processing systems with various vision-centric algorithms dramatically enhanced the image the camera was producing. Techniques that correct for lighting conditions, improve focus, enhance color and digitally zoom in on areas of interest also eliminated the need for mechanical motor control to perform pan, tilt and zoom, improving system reliability. Digital signal processing has enabled video resolution of 1080p and higher.
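To make the idea of digital zoom concrete, here is a minimal sketch in plain Python of cropping a region of interest and enlarging it by pixel replication. It is an illustration only, not the method used by any particular camera vendor: the function name digital_zoom, its parameters and the list-of-rows frame layout are all assumptions made for this example, and production systems would typically use better interpolation running in hardware or an optimized library.

```python
def digital_zoom(frame, top, left, height, width, factor):
    """Crop a region of interest and enlarge it by an integer factor.

    `frame` is assumed to be a list of rows, each row a list of pixel values.
    Enlargement uses nearest-neighbor replication (each pixel is repeated
    `factor` times horizontally and vertically); this is the simplest
    possible choice, not what a real camera pipeline necessarily does.
    """
    if factor < 1:
        raise ValueError("factor must be a positive integer")
    if (top < 0 or left < 0 or height < 1 or width < 1
            or top + height > len(frame) or left + width > len(frame[0])):
        raise ValueError("region of interest lies outside the frame")

    # "Pan" and "tilt" amount to choosing where the crop window sits.
    roi = [row[left:left + width] for row in frame[top:top + height]]

    zoomed = []
    for row in roi:
        # Repeat each pixel horizontally, then repeat the whole row vertically.
        wide = [pixel for pixel in row for _ in range(factor)]
        zoomed.extend(list(wide) for _ in range(factor))
    return zoomed


if __name__ == "__main__":
    frame = [
        [1, 2, 3],
        [4, 5, 6],
        [7, 8, 9],
    ]
    # Zoom 2x into the lower-right 2x2 corner of the frame.
    for row in digital_zoom(frame, top=1, left=1, height=2, width=2, factor=2):
        print(row)
```

Because moving the crop window replaces moving the camera, no motor is involved, which is the reliability gain described above.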
But a clearer image that can be manipulated through digital signal processing was just the beginning. With considerably more advanced pixel processing, surveillance system manufacturers began to create more sophisticated embedded vision systems that performed analytics in real time on the high-quality images their digital systems were capturing. The earliest of these embedded vision systems had the capacity to detect particular colors, shapes and movement. This capability rapidly advanced to algorithms that detect whether something has crossed a virtual fence in a camera's field of view; determine if the object in the image is in fact a human; and, through links to databases, even identify individuals.
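A virtual-fence check can be reduced to simple geometry once an object has been detected and tracked. The sketch below, in plain Python, assumes an upstream detector already supplies the object's position in each frame as an (x, y) pixel coordinate, and tests whether the movement between consecutive frames crosses a fence drawn as a line segment. The names crosses_fence and first_crossing are hypothetical and chosen for this example; real analytics products may use quite different approaches.

```python
def _orientation(p, q, r):
    """Return +1, -1 or 0 for the sign of the cross product (q - p) x (r - p)."""
    value = (q[0] - p[0]) * (r[1] - p[1]) - (q[1] - p[1]) * (r[0] - p[0])
    return (value > 0) - (value < 0)


def crosses_fence(prev_pos, curr_pos, fence_start, fence_end):
    """True if the move from prev_pos to curr_pos strictly crosses the fence.

    Both the movement and the fence are treated as line segments. Cases where
    a point lies exactly on the other segment's line are deliberately ignored
    to keep this sketch short.
    """
    d1 = _orientation(fence_start, fence_end, prev_pos)
    d2 = _orientation(fence_start, fence_end, curr_pos)
    d3 = _orientation(prev_pos, curr_pos, fence_start)
    d4 = _orientation(prev_pos, curr_pos, fence_end)
    # The object changes sides of the fence, and the fence endpoints lie on
    # opposite sides of the object's path.
    return d1 * d2 < 0 and d3 * d4 < 0


def first_crossing(track, fence_start, fence_end):
    """Return the index of the first frame after which the fence was crossed."""
    for i in range(1, len(track)):
        if crosses_fence(track[i - 1], track[i], fence_start, fence_end):
            return i
    return None


if __name__ == "__main__":
    fence = ((5, 0), (5, 10))                  # vertical fence in image space
    track = [(1, 5), (3, 5), (7, 5), (9, 5)]   # object positions, one per frame
    print(first_crossing(track, *fence))
```

The per-frame cost here is a handful of multiplications; the heavy computation in a real system lies in the detection and tracking stages that produce the positions.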
I recently got OpenCV up and running on my Zynq-7000-based ZedBoard. The performance of the OpenCV samples I ran was very good even without using the FPGA fabric. I didn't get my OpenCV from Xilinx, but rather downloaded and built it from the generic ARM Linux source.
I really wanted to see how compatible the Zynq is with the rest of the ARM CPU/SoC world. The Linux distribution I'm using (Xillinux) is made for the Zynq, but pretty much everything else I used was generic. I built CMake, Mjpg-Streamer and OpenCV from non-targeted source files. The Mjpg-Streamer source I used was made for the Raspberry Pi. The others were generic ARM Ubuntu sources. I also downloaded and used a large number of generic source libraries, all without modifying a single line of code.