Design Article
The basics of HD H.264 and next-generation encoding
Paul Greene, NVISION and Shubha Tuljapurkar, Telairity
3/7/2007 3:00 PM EST
Video compression is a complex, compute-intensive task involving numerous filtering and matrix operations. Current MPEG-2 video codec chips use built-in DSPs or special VLIW engines, plus some hard-coded functionality, to compress video in real time. However, as the industry moves to high-definition (HD) encoding, there is six times as much data to process as in standard definition (SD). In addition, the H.264/AVC compression standard is more than five times as complex as MPEG-2, because of the large number of compression tools and options that AVC offers to deliver a superlative picture at half the bit rate of MPEG-2.
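As a rough check on the six-fold data figure, the sketch below compares raw pixel rates. The frame sizes (720x480 for SD, 1920x1080 for HD) and the shared rate of 30 frames per second are assumptions chosen for illustration, since the article does not say which formats it has in mind. The function names are likewise hypothetical.

```python
# Illustrative frame formats. These are assumptions, not figures from the article.
SD = (720, 480, 30)     # width, height, frames per second (assumed 480-line SD)
HD = (1920, 1080, 30)   # assumed 1080-line HD at the same frame rate


def pixel_rate(width, height, fps):
    """Return the number of pixels per second for a given frame format."""
    return width * height * fps


def data_ratio(hd=HD, sd=SD):
    """Return how many times more pixels per second HD carries than SD."""
    return pixel_rate(*hd) / pixel_rate(*sd)


if __name__ == "__main__":
    print(f"SD pixel rate: {pixel_rate(*SD):,} pixels/s")
    print(f"HD pixel rate: {pixel_rate(*HD):,} pixels/s")
    print(f"HD/SD ratio:   {data_ratio():.1f}x")
```

Under these assumed formats the ratio works out to exactly 6. Other pairings, such as 720x576 SD or 1280x720 HD, give different ratios.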
Broadcast equipment manufacturers developing new encoding products have to address a range of application requirements, including those of the rapidly emerging IPTV segment. Encoding solutions must not only deliver high video quality at a low bit rate, but should also adapt to the feature needs of a quickly proliferating range of video platforms. Several solutions are available to meet the challenges of HD video encoding, including PC platforms, FPGAs, dedicated ASIC/system-on-chip (SoC) codecs, and DSPs. We discuss the benefits and costs of each of these solutions, and draw on Telairity's experience in developing its H.264 HD real-time encoders to highlight the importance of programmable architectures for encoding solutions that deliver long-haul viability.
PC Platforms, FPGAs, SoCs, and DSPs
One solution for HD processing, appealing because of its ubiquity and steadily increasing performance, is the PC platform combined with software and supplemental video hardware. However, because PC CPUs such as the Intel Pentium or AMD Opteron are designed for general PC applications rather than video processing, multiple dual or quad Intel/AMD processors are required to encode HD video. Specialized hardware for motion estimation, typically FPGAs, is needed as well. In real-world video applications, it is difficult to achieve linear performance gains merely by increasing the number of general-purpose CPUs. You quickly reach the point of diminishing returns and have to use application-optimized processors to achieve the highest system performance. Thus neither the economics nor the absolute-reliability requirements of video distribution support the use of general-purpose PC processors.
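The diminishing returns described above can be illustrated with Amdahl's law, a standard model that the article itself does not name. The sketch below is a minimal illustration, assuming a hypothetical workload in which 90% of the encoding work can be spread across CPUs. Both the 90% figure and the function name are assumptions, not measurements of any real encoder.

```python
def amdahl_speedup(n_cpus, parallel_fraction):
    """Ideal speedup under Amdahl's law.

    parallel_fraction is the share of the work (0.0 to 1.0) that can be
    divided evenly across n_cpus; the remainder runs serially.
    """
    if n_cpus < 1:
        raise ValueError("n_cpus must be at least 1")
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be between 0 and 1")
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / n_cpus)


# Hypothetical value for illustration only, not taken from the article.
PARALLEL_FRACTION = 0.90

if __name__ == "__main__":
    for n in (1, 2, 4, 8, 16):
        speedup = amdahl_speedup(n, PARALLEL_FRACTION)
        efficiency = speedup / n
        print(f"{n:2d} CPUs: {speedup:.2f}x speedup, {efficiency:.0%} efficiency")
```

With this assumed fraction, the speedup can never exceed 10x no matter how many CPUs are added, and each added processor contributes less than the one before it.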
FPGAs have recently become more suitable for video applications because of their immense parallel processing capability and programmability. The highest-capacity FPGAs feature a large number of logic gates, and several vendors have implemented HD video encoders using anywhere from one to four high-density FPGAs, along with a few DSPs. FPGAs have the advantage of board density, but their power dissipation is high. Software upgrades using simple "C" programs aren't an option. Instead, any algorithm change on the FPGA takes time and effort because of the complex task of resynthesizing and reassigning gates, which can affect the logic timing and functionality of the solution.
Another possible solution is a dedicated HD AVC codec ASIC or system-on-chip (SoC). The solution typically consists of a combination of hard-wired functions and an embedded processor, which creates an encoding solution that is rigidly defined in terms of how it handles the AVC algorithms. Any change or improvement to the algorithm requires a turn of silicon that is expensive (a 90-nm mask set costs close to $1M) and takes anywhere from six to nine months (or more) to implement and deploy. Overall, the SoC approach lacks the flexibility and software programmability required for encoding applications in which the compression algorithm continues to evolve over time for improved video quality and lower bit rate.
A better way to handle AVC video compression algorithms is with high-performance digital signal processing (DSP) chips. But since the amount of real-time HD data to be processed is very large, several (20+) general-purpose DSPs are required. In addition, these devices are not designed for the high-bandwidth data sharing required for video compression. As a result, these solutions typically also require a number of FPGAs to communicate between video slices and achieve high-quality compressed video. Such a combination of DSPs and FPGAs produces a good picture, but the implementation is expensive to develop and maintain, and occupies a lot of space.



