How standalone still and video cameras are being displaced by camera-equipped smartphones and tablets built on new embedded-vision technologies.
Next time you take a selfie or a shot of that great meatball entrée you're about to consume at the corner Italian restaurant, you will be contributing to the collection of 880 billion photos that Yahoo expects will be taken in 2014. Every day, Facebook users upload 350 million pictures and Snapchat users share more than 400 million images. Video is also increasingly popular, with YouTube alone receiving some 100 hours of uploaded video every minute.
These statistics, all forecast to grow in the coming years, point to two fundamental realities: people like to take pictures and shoot video, and the increasingly ubiquitous camera phone makes it easy to do so. Cell phone manufacturers have recognized this opportunity, and camera capabilities have become a significant differentiator between models and, therefore, a notable investment target.
However, image-sensor technology is quickly approaching some fundamental limits. The geometries of sensor pixels are approaching the wavelengths of visible light, making it increasingly difficult to shrink their dimensions further. Latest-generation image sensors, for example, are constructed using 1,100 nm pixels, leaving little spare room for capturing red-spectrum (~700 nm wavelength) light. Also, as each pixel's silicon footprint shrinks, so does the amount of light it can capture and convert to a proportional electrical charge. This decrease in sensitivity increases noise in low-light conditions and reduces dynamic range -- the ability to discern detail in both the shadows and the bright areas of an image. Because a smaller pixel captures fewer photons, each individual photon has a more pronounced impact on that pixel's brightness.
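The noise penalty can be quantified with simple shot-noise arithmetic. The Python sketch below uses purely illustrative photon counts and pixel sizes (not measured sensor data) to show how photon-limited signal-to-noise ratio falls as pixel area shrinks:

```python
import math

# Illustrative assumption: photons collected scale with pixel area.
# Photon arrival follows Poisson statistics, so shot noise = sqrt(signal)
# and the shot-noise-limited SNR = signal / sqrt(signal) = sqrt(signal).

def shot_noise_snr_db(photons):
    """Shot-noise-limited SNR in dB for a given photon count."""
    return 20 * math.log10(math.sqrt(photons))

# Hypothetical photon counts for a fixed exposure and scene brightness:
# a 1.4 um pixel vs. a 1.1 um pixel (area ratio ~ (1.4/1.1)^2 ~ 1.62).
photons_large = 10_000                           # assumed, for a 1.4 um pixel
photons_small = photons_large / (1.4 / 1.1) ** 2 # scaled down by area

print(f"1.4 um pixel: {shot_noise_snr_db(photons_large):.1f} dB")
print(f"1.1 um pixel: {shot_noise_snr_db(photons_small):.1f} dB")
# The smaller pixel loses ~2 dB of SNR purely from collecting fewer photons,
# which is why shrinking pixels hurts low-light performance and dynamic range.
```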
Resolution ceilings prompt refocus on other features
Given the challenges of increasing image-sensor resolution, camera phone manufacturers appear reluctant to promote the feature further. As a case in point, Apple's advertising for the latest iPhone 5s doesn't even mention resolution, instead focusing more generally on image quality and other camera features. Many of these features rely on computational photography -- harnessing ever-increasing processing power and sophisticated vision algorithms to make better photographs. After a picture is taken, such camera phones can process it to remove flaws such as blur, poor lighting, and inaccurate color. In addition, computational photography enables brand-new applications, such as reproducing a photographed object on a 3D printer, automatically labeling pictures so that you can easily find them later, or removing the person who walked in front of you while you were taking an otherwise perfect picture.
High Dynamic Range (HDR) imaging is an example of computational photography now found on many camera phones. A camera without this capability often produces images with under- and/or over-exposed regions. It can be difficult, for example, to capture the detail in a shadow without making the sky look pure white; conversely, capturing detail in the sky can render shadows pitch black. With HDR, multiple pictures are taken at different exposure settings, some optimized for bright regions of the scene (such as the sky), others for dark areas (such as the shadows). HDR algorithms then select and combine the best-exposed details from these pictures, synthesizing a new image that captures both the nuances of the clouds in the sky and the details in the shadows. The merging step must also detect and compensate for camera and subject movement between exposures, and determine the optimal amount of light for each portion of the image. (See image below.)
An example of a high-dynamic-range image created by combining images captured using different exposure times.
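As a minimal sketch of this merging step, the following uses OpenCV's exposure-fusion and frame-alignment APIs; the bracketed-exposure file names are placeholders, and a production camera pipeline would be considerably more elaborate:

```python
import cv2

# Hypothetical bracketed exposures of the same scene (file names assumed).
paths = ["underexposed.jpg", "normal.jpg", "overexposed.jpg"]
frames = [cv2.imread(p) for p in paths]

# Compensate for slight camera movement between shots by aligning the
# frames to one another (median-threshold bitmap alignment).
cv2.createAlignMTB().process(frames, frames)

# Mertens exposure fusion weights each pixel by how well-exposed,
# saturated, and contrasty it is in each frame, then blends the
# frames -- no exposure-time metadata or tone mapping required.
fused = cv2.createMergeMertens().process(frames)  # float32 in [0, 1]

cv2.imwrite("hdr_result.jpg", (fused * 255).clip(0, 255).astype("uint8"))
```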
Another example of computational photography is Super Resolution, wherein multiple images of a given scene are algorithmically combined into a final image that delivers finer detail than is present in any of the originals. A similar technique can transform multiple poorly lit images into one well-illuminated, higher-quality image. Movement between frames, whether caused by the camera or the subject, increases the Super Resolution implementation challenge, since the resulting motion blur must be correctly differentiated from image noise and appropriately compensated for. In such cases, more intelligent (i.e. more computationally intensive) processing, such as object tracking, path prediction, and action identification, is required to combine pixels that may be in different locations in each frame.
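One heavily simplified flavor of the idea is "shift-and-add" super resolution: upsample each frame, estimate its sub-pixel shift relative to a reference, undo the shift, and average. The Python sketch below (function name and parameters are illustrative) assumes grayscale frames with purely global, translational motion; real implementations must also handle rotation, local subject motion, and noise:

```python
import cv2
import numpy as np

def shift_and_add(frames, scale=2):
    """Naive multi-frame super resolution for globally shifted frames.

    frames: list of same-sized grayscale images.
    Assumes purely translational motion between frames.
    """
    # Upsample every frame so sub-pixel shifts become whole-pixel shifts.
    up = [cv2.resize(f.astype(np.float32), None, fx=scale, fy=scale,
                     interpolation=cv2.INTER_CUBIC) for f in frames]
    ref = up[0]
    acc = ref.copy()
    for frame in up[1:]:
        # Phase correlation estimates the (dx, dy) translation vs. the reference.
        (dx, dy), _ = cv2.phaseCorrelate(ref, frame)
        # Warp the frame back onto the reference grid before accumulating.
        m = np.float32([[1, 0, -dx], [0, 1, -dy]])
        acc += cv2.warpAffine(frame, m, (frame.shape[1], frame.shape[0]))
    # Averaging the aligned frames suppresses noise and recovers detail
    # that no single frame contains on its own.
    return (acc / len(up)).clip(0, 255).astype(np.uint8)
```

In practice, camera-phone implementations replace this single global shift estimate with dense or block-based motion estimation, so that moving subjects are tracked and blended rather than smeared.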
Read the full story on EE Times' sister site, Embedded.com.