The topic of multi-camera analytics perhaps deserves some additional discussion. It should be possible to combine views from multiple cameras to more precisely determine acceleration, relative position, and object characteristics (a 'person' on a sign vs. a real 3D person). Are there any features that support these requirements?
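For the relative-position part of that question, a minimal sketch of what combining two calibrated views looks like in software, using OpenCV's linear triangulation. The projection matrices and the matched pixel pair are made-up example values, not anything from the article; a real system would get them from calibration and a feature-matching step.

```cpp
#include <opencv2/core.hpp>
#include <opencv2/calib3d.hpp>
#include <iostream>

int main() {
    // 3x4 projection matrices (intrinsics * extrinsics) for two cameras -- example values only.
    cv::Mat P1 = (cv::Mat_<double>(3, 4) <<
        1000, 0, 320, 0,
        0, 1000, 240, 0,
        0,    0,   1, 0);
    cv::Mat P2 = (cv::Mat_<double>(3, 4) <<
        1000, 0, 320, -500,   // second camera shifted along X (the baseline)
        0, 1000, 240, 0,
        0,    0,   1, 0);

    // One matched pixel in each view (2xN matrices, here N = 1) -- placeholder correspondence.
    cv::Mat x1 = (cv::Mat_<double>(2, 1) << 350.0, 240.0);
    cv::Mat x2 = (cv::Mat_<double>(2, 1) << 310.0, 240.0);

    // Linear triangulation; the result is a 4xN matrix of homogeneous coordinates.
    cv::Mat X;
    cv::triangulatePoints(P1, P2, x1, x2, X);
    X.convertTo(X, CV_64F);
    X /= X.at<double>(3, 0);  // normalize the homogeneous coordinate

    std::cout << "Triangulated 3D point: "
              << X.at<double>(0, 0) << ", "
              << X.at<double>(1, 0) << ", "
              << X.at<double>(2, 0) << std::endl;
    return 0;
}
```

Tracking the same object across frames from each camera would then give 3D position over time, from which velocity and acceleration can be estimated.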
There are surely a lot of applications where embedded vision can really shine. I come from academia, where I'm working mainly on vision for mobile robotics and surveillance applications. In the field of robotics, the dominant trend is to pack the machine with a PC and let it handle all the algorithmic heavy lifting. There are, however, emerging applications where using a PC as we know it is a deal-breaker - think UAVs. As for surveillance - at present the dominant paradigm is centralized processing, using some server or even a server cluster. The image data from the cameras has to be transferred for processing, putting considerable pressure on the communication infrastructure. Sometimes the constraints presented by the communication infrastructure are a brick wall - a complete system redesign is necessary to get over it (or go around it). A natural solution to this problem is in-place processing.
Programmable logic really shines when it comes to processing local image information, e.g. using the sliding-window approach. Our stream processors for image filtering and feature detection and matching can crunch hundreds of VGA frames per second. Combine that with a nice, low-power embedded processor and you get a system for (almost) every job. And with Zynq, you get it all in one package. The only problem is that development is significantly more complicated than with pure software designs.
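To illustrate the sliding-window idea, here is a plain C++ sketch of a 3x3 Sobel-style gradient over a VGA frame. This is not our stream-processor code, just an assumed reference structure: on programmable logic the same window is typically built from two line buffers feeding a 3x3 register array, so pixels are processed as they stream in and the full frame never needs to be stored.

```cpp
#include <cstdint>
#include <cstdlib>
#include <vector>

constexpr int WIDTH  = 640;   // VGA resolution, matching the frame rates mentioned above
constexpr int HEIGHT = 480;

// 3x3 Sobel gradient magnitude over a grayscale frame (borders left untouched).
void sobel3x3(const std::vector<uint8_t>& in, std::vector<uint8_t>& out) {
    static const int kx[3][3] = {{-1, 0, 1}, {-2, 0, 2}, {-1, 0, 1}};
    static const int ky[3][3] = {{-1, -2, -1}, {0, 0, 0}, {1, 2, 1}};
    for (int y = 1; y < HEIGHT - 1; ++y) {
        for (int x = 1; x < WIDTH - 1; ++x) {
            // The 3x3 window around (x, y); in an FPGA implementation this window
            // lives in registers fed by line buffers, so only about two image rows
            // of on-chip storage are needed per filter stage.
            int gx = 0, gy = 0;
            for (int j = -1; j <= 1; ++j) {
                for (int i = -1; i <= 1; ++i) {
                    int p = in[(y + j) * WIDTH + (x + i)];
                    gx += kx[j + 1][i + 1] * p;
                    gy += ky[j + 1][i + 1] * p;
                }
            }
            int mag = std::abs(gx) + std::abs(gy);  // cheap magnitude approximation
            out[y * WIDTH + x] = static_cast<uint8_t>(mag > 255 ? 255 : mag);
        }
    }
}

int main() {
    std::vector<uint8_t> frame(WIDTH * HEIGHT, 0), result(WIDTH * HEIGHT, 0);
    sobel3x3(frame, result);  // software loop; the hardware version is pipelined one pixel per clock
    return 0;
}
```

Chaining several such stages in the fabric, with the embedded processor handling the higher-level logic, is what makes the throughput figures above achievable at low power.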