Design Article
An FPGA design flow for video imaging applications
Suhel Dhanani, Altera
7/3/2007 3:19 PM EDT
Trends driving video applications to FPGAs
FPGAs are an ideal platform for implementing digital signal processing (DSP) algorithms with high computational requirements. Because an FPGA fabric can lay down multiply-accumulate (MAC) resources in parallel, it can deliver DSP performance at least an order of magnitude higher than that of programmable digital signal processors (DSPs).
Two key trends dominate the video design landscape today, and both push the envelope of available DSP power. The first is the inexorable move toward high definition (HD) in everything, from displays and surveillance cameras to medical and military imaging systems. An HD video frame contains approximately 4× to 6× as much data as a standard definition (SD) frame. This growing need for HD data processing is driving video applications onto higher performance platforms such as FPGAs.

1. HD video frame data is significantly larger than the corresponding SD video frame.
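The size gap between HD and SD frames can be checked with simple arithmetic. The sketch below compares active pixels per frame; the resolutions in `FORMATS` are commonly used values assumed here for illustration (the article does not name specific formats), and frame rate and bit depth are ignored.

```python
# Active-pixel dimensions of common frame formats. These resolutions are
# assumptions for illustration; the article does not name specific formats.
FORMATS = {
    "SD 480-line": (720, 480),
    "SD 576-line": (720, 576),
    "HD 720-line": (1280, 720),
    "HD 1080-line": (1920, 1080),
}


def pixels_per_frame(name):
    """Return the number of active pixels in one frame of the named format."""
    width, height = FORMATS[name]
    return width * height


def frame_ratio(hd_name, sd_name):
    """Return how many times more pixels the HD frame holds than the SD frame."""
    return pixels_per_frame(hd_name) / pixels_per_frame(sd_name)


if __name__ == "__main__":
    for hd_name in ("HD 720-line", "HD 1080-line"):
        for sd_name in ("SD 480-line", "SD 576-line"):
            print(f"{hd_name} vs {sd_name}: {frame_ratio(hd_name, sd_name):.2f}x")
```

Under these assumed resolutions, a 1080-line frame holds 5 to 6 times as many pixels as an SD frame, while a 720-line frame holds between 2 and 3 times as many.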
The second trend is the emergence of H.264 as one of the most widely used video compression algorithms, or CODECs (short for COding/DECoding). The H.264 CODEC lets designers accept greater computational complexity in exchange for either higher image quality at a given bit rate or higher compression efficiency. Increasingly, designers are choosing to implement this CODEC at the higher level of computational complexity. This trend also reinforces the need for a high performance DSP engine, such as an FPGA, to meet system quality and bit-rate requirements.
High performance signal chains for HD video processing can involve many blocks, such as video scaling, de-interlacing, and chroma resampling. However, designing and implementing these signal chains in FPGAs is a complex task. The following sections examine several of the reasons behind this complexity and some of the newer tools and solutions that help designers abstract part of it away.
Implementing complex and high-performance video signal chains
Traditionally, designers have turned to Hardware Description Languages (HDLs) to get the highest performance implementation of their DSP algorithms. FPGA vendors provide fairly robust EDA tools for the elaboration, synthesis, and simulation of complex designs written in HDL.
Increasingly, however, model-based design methodologies such as the one MathWorks provides with its Simulink tool are gaining traction for both simulation and design implementation. Model-based design provides a graphical environment and a customizable set of block libraries that allow design specification, simulation, and implementation on a selected platform.
While this methodology does sacrifice some performance, it offers the promise of significant productivity gains, especially when implementing complex video signal chains.
The MathWorks Simulink tool includes a set of libraries with blocks for common functions such as video processing, filtering, and modulation that can be used to quickly build and simulate video datapaths.
For many of the blocks provided by Simulink, FPGA vendors provide equivalent blocks. These vendor blocks have underlying HDL descriptions and can be implemented directly within an FPGA. Once a video datapath has been built and simulated in Simulink, it can be converted to HDL by swapping the MathWorks blocks for the equivalent blocks provided by the FPGA vendor.
Figure 2 shows some of the common video processing functions provided with Simulink and the equivalent functions provided by Altera's Video Image Processing Intellectual Property (IP) Suite.

2. FPGA vendors provide hardware blocks that correspond to the software blocks provided with Simulink.
With FPGA vendor blocks, a designer can build a signal chain in the intuitive Simulink interface, simulate the design in that environment, and then implement the same design in an FPGA by automatically generating an FPGA bitstream.
While implementation may seem easy, there are some real-life problems to solve first. This flow presents three primary challenges:
- The first challenge is connecting the different video processing blocks and managing the handshaking between disparate blocks to maintain data coherency.
- The second challenge is simulation time, which can span several days for a few frames of HD video when the blocks are built on underlying HDL.
- Lastly, there is the integration challenge. FPGA vendors provide tools (such as DSP Builder from Altera) that can generate an FPGA bitstream for video datapaths built from the vendor's blocks. However, the designer must still generate the entire system, which combines the video processing datapath with other system blocks such as a DDR memory controller, an I2C interface, or even an embedded processor.
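The first challenge, handshaking between blocks, can be illustrated with a small behavioral model. The sketch below assumes a generic valid/ready style handshake with a small FIFO between a fast upstream block and a slower downstream block; the article does not specify a protocol, and the function name `simulate_handshake` and its parameters are hypothetical.

```python
from collections import deque


def simulate_handshake(samples, fifo_depth=2, consumer_period=3):
    """Model a valid/ready handshake between two blocks (illustrative only).

    The upstream block offers one sample per cycle. The downstream block
    removes one sample from its input FIFO every `consumer_period` cycles.
    A transfer happens only in cycles where upstream has data (valid) and
    the FIFO has room (ready). Returns the samples in the order received
    and the number of cycles in which upstream had to stall.
    """
    pending = deque(samples)   # samples the upstream block still has to send
    fifo = deque()             # input FIFO of the downstream block
    received = []
    stalls = 0
    cycle = 0
    while pending or fifo:
        # Downstream block consumes a sample on its slower schedule.
        if fifo and cycle % consumer_period == 0:
            received.append(fifo.popleft())
        valid = bool(pending)            # upstream has a sample to offer
        ready = len(fifo) < fifo_depth   # downstream can accept a sample
        if valid and ready:
            fifo.append(pending.popleft())
        elif valid:
            stalls += 1                  # backpressure: upstream holds its data
        cycle += 1
    return received, stalls


if __name__ == "__main__":
    out, stalls = simulate_handshake(range(8))
    print("received in order:", out == list(range(8)))
    print("upstream stall cycles:", stalls)
```

In this model no sample is dropped or reordered even though the two blocks run at different rates, which is the data coherency property that the handshaking has to guarantee. The cost is that the faster block spends cycles stalled.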
Let's look at some of the tools provided by FPGA vendors to resolve these challenges:



