Today's mobile processors are improving at an astonishing rate -- and consequently delivering visually stunning user experiences at less than 3 watts of power consumption.
Just how is it that mobile processors, running at 5 percent of the power of a game console, can produce the kind of graphics realism we had previously come to expect only from those consoles?
It's all about the pixel and making it shine without drinking a lot of power. Let's look at a few of the techniques used in modern SoC application processors to achieve these fantastic results.
Creating realism within the constraints of mobile devices
Computer graphics (CG) features like hardware tessellation, geometry shading, and high dynamic range (HDR) rendering are developed on big PCs and workstations and then migrated to mobile platforms, which are now getting the larger screens and more powerful processors that advanced CG demands. The challenge is how to get stunning, real-time, workstation-class graphics in a battery-powered device that can run for eight or more hours and not become too hot to hold.
The first step in any power management system, whether it's a phone, your home, or a PC, is to turn things off that aren't being used.
With rudimentary power management under control, SoC developers looked for algorithmic opportunities to deliver realistic images more efficiently. Working in their research labs, and with game developers, movie studios, and university computer scientists, the semiconductor suppliers have produced clever techniques over the past two to three years.
Tiling, chunking, and tile-based deferred rendering
In the mid-1990s, computer scientists at Microsoft Research launched the Talisman project to investigate new techniques for improving rendering time while scaling screen resolution and color depth. The results were promising, but the hardware design proved too challenging for the semiconductor process technology of the day. However, several ideas from the project were successfully developed, including tiling and deferred rendering (concepts pioneered earlier by the Pixel-Planes project at the University of North Carolina), which Microsoft built upon for Talisman. Software renderers, like Pixar's Reyes, also employed tiled methods before Microsoft.
In tiling, the image is broken into small sections (tiles) that are updated only when their contents change. In addition, tiling uses a tiled Z (depth) buffer to determine whether a portion of a polygon is visible; if it isn't, it is not rendered. The process is similar to a traditional 3D graphics pipeline, but reduced to the size of a small tile. Tiling can also take advantage of early Z buffering (depth testing early in the pipeline, after scan conversion but before shading) to further reduce work.
Z-depth rejection rendering example.
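The early-Z idea can be sketched in a few lines. This is a minimal, illustrative model, not any vendor's actual pipeline: a tiny tile keeps a depth value per pixel, and a fragment is shaded only if it is closer than what the tile already holds, so occluded fragments never pay the shading cost.

```python
# Illustrative sketch of per-tile early-Z rejection (names and sizes invented).
TILE = 4  # tiny 4x4 tile for illustration

def render_tile(fragments, shade):
    """fragments: list of (x, y, depth, data); shade: fragment-shader callable.
    Returns (color_buffer, shaded_count) so we can see how much work early-Z saved."""
    depth = [[float("inf")] * TILE for _ in range(TILE)]
    color = [[None] * TILE for _ in range(TILE)]
    shaded = 0
    for x, y, z, data in fragments:
        if z < depth[y][x]:            # early-Z test, before shading
            depth[y][x] = z
            color[y][x] = shade(data)  # only depth-passing fragments are shaded
            shaded += 1
        # else: fragment rejected -- no shading work, no memory traffic
    return color, shaded

# Two fragments overlap at (0, 0); the near one arrives first, so the far one is rejected.
frags = [(0, 0, 0.2, "near"), (0, 0, 0.9, "far"), (1, 0, 0.5, "mid")]
buf, shaded = render_tile(frags, shade=lambda d: d.upper())
print(buf[0][0], shaded)  # NEAR 2
```

Note that plain early-Z only helps when nearer fragments happen to arrive first; TBDR, discussed below, resolves visibility for the whole tile before shading, so submission order no longer matters.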
Imagination Technologies was one of the first companies to exploit tiling, with its PowerVR design of the mid-1990s, later used in Sega's ill-fated Dreamcast. Since then, tiling has been used in many low-power graphics processors, most notably those in the Apple iPhone and iPad.
Tile-based deferred rendering
In 2008 Imagination Technologies introduced the concept of tile-based deferred rendering (TBDR) in its PowerVR SGX (Series5) GPU design. TBDR postpones per-pixel work (shading, texturing, and blending) until hidden-surface removal has determined which fragments are actually visible, so the GPU shades only the pixels that contribute to the final image while maintaining interactive performance.
In deferred rendering, rasterization is postponed until all of the scene's polygons have been submitted. In immediate-mode rendering, polygons are rasterized as soon as they arrive, regardless of where they fall on the screen. Traditional desktop PC and game console GPUs support only immediate-mode rendering, and that is one of the reasons they consume more power.
Immediate mode rendering
Immediate mode, or direct rendering mode, bypasses the internal tile buffers and writes pixels out to the frame buffer in system memory right away, without the batching or other overhead inherent to tile-based rendering.
Direct mode rendering can be more power efficient for frames with minimal or no depth complexity (i.e., a single layer), or for scenes that require lots of mid-frame updates or small partial updates.
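The trade-off between the two modes comes down to external memory traffic. The sketch below is a toy model with invented counters, not real hardware behavior: immediate mode writes every depth-passing pixel straight to the frame buffer, while tiled rendering resolves overdraw in an on-chip tile buffer and writes each visible pixel out exactly once.

```python
# Toy comparison of external frame-buffer writes in the two modes (illustrative only).

def immediate(pixels):
    """Rasterize in submission order; every depth-passing pixel hits external memory."""
    fb, writes = {}, 0
    for (x, y), z, color in pixels:
        if z < fb.get((x, y), (float("inf"), None))[0]:
            fb[(x, y)] = (z, color)
            writes += 1          # external write per depth-passing pixel
    return fb, writes

def tiled_deferred(pixels):
    """Depth-resolve all geometry in an on-chip tile buffer, then write the tile once."""
    tile = {}
    for (x, y), z, color in pixels:
        if z < tile.get((x, y), (float("inf"), None))[0]:
            tile[(x, y)] = (z, color)
    return dict(tile), len(tile)  # one external write per visible pixel

# Three layers over one pixel, submitted back to front (worst case for immediate mode):
pixels = [((0, 0), 0.9, "sky"), ((0, 0), 0.5, "wall"), ((0, 0), 0.1, "player")]
print(immediate(pixels)[1], tiled_deferred(pixels)[1])  # 3 1
```

With a single-layer scene the two counts would be equal, which is why direct mode can win when depth complexity is low.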
When Qualcomm introduced its Adreno 320 GPU in the Snapdragon 600 in October 2012, it was the first SoC able to switch dynamically between the two graphics rendering modes, either at the request of the application or based on a heuristic analysis of the rendered scene. That flexibility provides incremental power savings and better rendering performance.
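A mode-selection heuristic in this spirit might look like the following. The inputs and thresholds here are invented for illustration; Qualcomm's actual heuristics are not public.

```python
# Hypothetical render-mode selector (thresholds and inputs are invented).

def choose_mode(depth_complexity, mid_frame_updates):
    """depth_complexity: average fragments per pixel in the scene;
    mid_frame_updates: partial frame-buffer updates expected this frame."""
    if depth_complexity <= 1.2 or mid_frame_updates > 4:
        # Little overdraw, or frequent partial updates:
        # skip the binning pass and write straight to the frame buffer.
        return "immediate"
    # Heavy overdraw: resolve it in the on-chip tile buffer first.
    return "tiled"

print(choose_mode(1.0, 0))  # immediate  (single-layer UI scene)
print(choose_mode(3.5, 0))  # tiled      (overdraw-heavy 3D scene)
print(choose_mode(3.5, 8))  # immediate  (many mid-frame updates)
```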
Other deferred techniques
The best deferred rendering algorithms prioritize the parts of a scene that are more visually important than others. Deferred polygon rendering, for example, favors larger foreground polygons, while deferred shading postpones lighting calculations until visibility has been resolved.
In addition to exposing new graphics features, the latest version of OpenGL for embedded systems (the "ES" in OpenGL ES) also has features for improving power efficiency.
GLES3 also introduced its own form of deferred rendering, employing a two-pass model. The first pass gathers the data required for shading computations, such as positions, normals, and material properties, and renders it into a geometry buffer (G-buffer) as a series of textures. In the second pass, a pixel shader computes the direct and indirect lighting at each pixel in screen space, using the information stored in the G-buffer textures. This makes it possible to render many lights in a scene at higher frame rates with only one geometry pass, which also lowers power consumption. Qualcomm has made great use of this feature in its Adreno GPU design.
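The two-pass model above can be sketched as follows. This is a simplified illustration, not GLES3 API code: the function names are invented, and the Lambert-style lighting stands in for a real shading model. The key property it demonstrates is that lighting cost scales with pixels times lights, independent of the scene's polygon count.

```python
# Illustrative two-pass deferred shading (invented names, toy lighting model).

def geometry_pass(fragments):
    """Pass 1: depth-resolve geometry into a G-buffer; no lighting is done yet."""
    gbuffer = {}
    for (x, y), z, normal, albedo in fragments:
        if z < gbuffer.get((x, y), (float("inf"),))[0]:
            gbuffer[(x, y)] = (z, normal, albedo)
    return gbuffer

def lighting_pass(gbuffer, lights):
    """Pass 2: per-pixel lighting in screen space from the stored attributes.
    Each directional light contributes a simple Lambert (N dot L) term."""
    image = {}
    for pix, (z, normal, albedo) in gbuffer.items():
        nx, ny, nz = normal
        intensity = sum(max(0.0, nx * lx + ny * ly + nz * lz) for lx, ly, lz in lights)
        image[pix] = albedo * intensity
    return image

frags = [((0, 0), 0.3, (0.0, 0.0, 1.0), 0.5),
         ((0, 0), 0.9, (0.0, 1.0, 0.0), 0.8)]   # occluded fragment: never lit
lights = [(0.0, 0.0, 1.0), (0.0, 0.0, 1.0)]     # two lights facing the surface
print(lighting_pass(geometry_pass(frags), lights))  # {(0, 0): 1.0}
```

Adding a third light would touch only the second pass, which is exactly why the single geometry pass saves work and power as light counts grow.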