Editor's Note: I am delighted to have the opportunity to present the following piece from the first-quarter 2012 issue of the Xcell Journal, with the kind permission of Xilinx Inc.
This piece compares standalone DSPs with FPGAs and weighs the advantages and disadvantages of each. An FPGA can take on the work of multiple DSPs, though in some cases an FPGA would be overkill. The article goes into detail on these trade-offs and touches on how to implement DSP functions optimally.
DSPs are invaluable in electronic system design for their ability to quickly measure, filter or compress analog signals on the fly. In doing so, they help the digital world communicate with the real, analog world. But as electronic systems become more elaborate and incorporate multiple analog sources to process, engineers are forced to make tough decisions. Is it better to use multiple DSPs and synchronize that functionality with the rest of the system? Or is it better to have one high-performance DSP handling multiple functions with elaborate software?
In many cases, today’s systems are so complex that single-DSP implementations have insufficient processing power. At the same time, system architects simply can’t afford the costs, complexities and power requirements of multiple-chip systems.
FPGAs have now emerged as a great choice for systems requiring high-performance DSP functionality. In fact, FPGA technology can often provide a much simpler solution to difficult DSP challenges than a standalone digital signal processor. To understand why requires a look back at the origins and evolution of the DSP.

MICROPROCESSORS FOR A SPECIALIZED PURPOSE
Over the past two decades, traditional DSP architectures have struggled to keep pace with increasing performance demands. As video systems make greater strides toward high definition and 3D, and communications systems push the limits of current technology to achieve higher bandwidth, designers need alternative implementation strategies. Hardware used to implement digital signal-processing algorithms may be categorized into one of three basic types of device: microprocessors, logic and memory. Some designs may require additional hardware for analog-to-digital (A/D) and digital-to-analog (D/A) conversion, and for high-speed digital interfaces.
Traditional digital signal processors are microprocessors designed for a specialized purpose. They are well-suited to algorithmically intensive tasks but are limited in performance by clock rate and the sequential nature of their internal design. This caps the maximum number of operations per second that they can carry out on the incoming data samples. Typically, three or four clock cycles are required per arithmetic logic unit (ALU) operation. Multicore architectures may increase performance, but they remain bound by the same sequential execution model. Designing with traditional signal processors therefore necessitates the reuse of architectural elements for algorithm implementation. Every operation must cycle through the ALU, fed back either internally or externally, for each addition, multiplication, subtraction or any other fundamental operation performed.
Figure 1 – Traditional DSP architecture
Unfortunately, in dealing with many of today’s high-performance applications, the classical DSP fails to satisfy system requirements. Several solutions have been proposed in the past, including using multiple ALUs within a device or multiple DSP devices on a board; however, such schemes often increase costs significantly and simply shift the problem to another arena. For example, scaling performance by adding devices quickly becomes expensive: doubling performance requires two devices, doubling it again requires four, and so on. In addition, the focus of programmers often shifts from signal-processing functions to task scheduling across the multiple processors and cores. This results in a great deal of additional code that functions as system overhead rather than attacking the digital signal-processing problem at hand.
A solution to the increasing complexity of DSP implementations came with the introduction of FPGA technology. The FPGA was initially developed as a means to consolidate and concentrate discrete memory and logic, enabling higher integration, higher performance and increased flexibility. FPGA technology has since become a significant part of virtually every high-performance system in use today. In contrast to the traditional DSP, FPGAs are massively parallel structures containing a uniform array of configurable logic blocks (CLBs), memory, DSP slices and other elements. They may be programmed using hardware description languages such as VHDL and Verilog, or at a block-diagram level using System Generator. Many dedicated functions and IP cores are available for direct implementation in a highly optimized form within the FPGA.
The main advantage of digital signal processing within an FPGA is the ability to tailor the implementation to match system requirements. In a multiple-channel or high-speed system, you can exploit the parallelism within the device to maximize performance, while in a lower-rate system the implementation may take a more serial form. Thus the designer can shape the implementation to suit the algorithm and system requirements rather than compromising the desired design to fit the limitations of a purely sequential device. Very high-speed I/O further reduces cost and bottlenecks by maximizing data flow from capture right through the processing chain to final output.
As an example of how the FPGA stacks up, let’s consider a FIR filter implementation using both classical DSP and FPGA architectures to illustrate some of the strengths and weaknesses of each solution.
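To ground that comparison, here is a plain-C reference of the FIR computation itself: the serial multiply-accumulate sequence a classical DSP steps through one ALU operation at a time, and the same arithmetic an FPGA can instead unroll across parallel DSP slices. The tap count and coefficients used below are arbitrary illustrative values, not drawn from the article.

```c
#include <stddef.h>

/* Reference FIR filter: y[n] = sum over k of h[k] * x[n-k].
 * On a sequential DSP, the inner loop executes one MAC at a time
 * through the ALU (NUM_TAPS * cycles-per-MAC cycles per sample).
 * On an FPGA, the NUM_TAPS multiplies can be mapped onto parallel
 * DSP slices and pipelined, producing one output per clock. */
#define NUM_TAPS 4

void fir_filter(const double h[NUM_TAPS],
                const double *x, size_t n_in,
                double *y) {
    for (size_t n = 0; n < n_in; ++n) {
        double acc = 0.0;
        for (size_t k = 0; k < NUM_TAPS; ++k) {
            if (n >= k)              /* skip samples before t = 0 */
                acc += h[k] * x[n - k];
        }
        y[n] = acc;                  /* one MAC per tap, per sample */
    }
}
```

Feeding a unit impulse through the filter returns the coefficient set itself, which makes the routine easy to sanity-check against any hardware implementation of the same filter.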