Design Article
Using FPGAs to solve tough DSP design challenges
Reg Zatrepalek, Hardent Inc.
7/23/2012 8:37 AM EDT
Editor's Note: I am delighted to have the opportunity to present the following piece from the first quarter edition 2012 of the Xcell Journal, with the kind permission of Xilinx Inc.
This piece compares standalone DSPs with FPGAs and discusses the advantages and disadvantages of each. An FPGA can take on the functions of multiple DSPs, but in some cases it would be overkill. The article goes into detail on the trade-offs and offers some guidance on implementing DSP functions optimally.

DSPs are invaluable in electronic system design for their ability to quickly measure, filter or compress analog signals on the fly. In doing so, they help the digital world communicate with the real world, the analog world. But as electronic systems become more elaborate, incorporating multiple analog sources to process, engineers are forced to make tough decisions. Is it better to use multiple DSPs and synchronize that functionality with the rest of the system? Or is it better to have one high-performance DSP handling multiple functions with elaborate software?
In many cases, today’s systems are so complex that single-DSP implementations have insufficient processing power. At the same time, system architects simply can’t afford the costs, complexities and power requirements of multiple-chip systems.
FPGAs have now emerged as a great choice for systems requiring high-performance DSP functionality. In fact, FPGA technology can often provide a much simpler solution to difficult DSP challenges than a standalone digital signal processor. To understand why requires a look back at the origins and evolution of the DSP.
MICROPROCESSORS FOR A SPECIALIZED PURPOSE
Over the past two decades, traditional DSP architectures have struggled to keep pace with increasing performance demands. As video systems make greater strides toward high definition and 3D, and communications systems push the limits of current technology to achieve higher bandwidth, designers need alternative implementation strategies. Hardware used to implement digital signal-processing algorithms may be categorized into one of three basic types of device: microprocessors, logic and memory. Some designs may require additional hardware for analog-to-digital (A/D) and digital-to-analog (D/A) conversion, and for high-speed digital interfaces.
Traditional digital signal processors are microprocessors designed for a specialized purpose. They are well suited to algorithm-intensive tasks but are limited in performance by clock rate and the sequential nature of their internal design. This limits the maximum number of operations per second that they can carry out on the incoming data samples. Typically, three or four clock cycles are required per arithmetic logic unit (ALU) operation. Multicore architectures may increase performance, but these are still limited. Designing with traditional signal processors therefore necessitates the reuse of architectural elements for algorithm implementation. Every addition, multiplication, subtraction or other fundamental operation must cycle through the ALU, with results fed back either internally or externally.
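As a rough illustration of the limit described above, the sketch below estimates the peak output sample rate of a filter that pushes every tap through a single ALU, one operation after another. Only the three-to-four cycles per operation comes from the text; the 1 GHz clock, the 256-tap filter length and the helper name `max_sample_rate_hz` are illustrative assumptions, not figures from this article.

```python
def max_sample_rate_hz(clock_hz: float, taps: int, cycles_per_op: int) -> float:
    """Peak output rate when each tap costs one ALU operation, executed sequentially.

    Simplified model: ignores memory stalls, loop overhead and any
    single-cycle MAC instructions a real device might offer.
    """
    if taps <= 0 or cycles_per_op <= 0:
        raise ValueError("taps and cycles_per_op must be positive")
    return clock_hz / (taps * cycles_per_op)


if __name__ == "__main__":
    clock_hz = 1e9  # assumed clock rate (hypothetical device)
    taps = 256      # assumed filter length
    for cycles_per_op in (3, 4):
        rate = max_sample_rate_hz(clock_hz, taps, cycles_per_op)
        print(f"{cycles_per_op} cycles/op -> {rate / 1e6:.2f} MSPS")
```

Under these assumptions the achievable sample rate falls in direct proportion to both the filter length and the cycles spent per operation, which is the sequential bottleneck the paragraph describes.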

Figure 1 – Traditional DSP architecture
Unfortunately, in dealing with many of today’s high-performance applications, the classical DSP fails to satisfy system requirements. Several solutions have been proposed in the past, including using multiple ALUs within a device or multiple DSP devices on a board; however, such schemes often increase costs significantly and simply shift the problem to another arena. For example, the number of devices needed grows exponentially with each doubling of performance: doubling the performance requires two devices, doubling it again takes four, and so on. In addition, the focus of programmers often shifts from signal-processing functions to task scheduling across the multiple processors and cores. This results in much additional code that functions as system overhead rather than attacking the digital signal-processing problem at hand.
A solution to the increasing complexity of DSP implementations came with the introduction of FPGA technology. The FPGA was initially developed as a means to consolidate and concentrate discrete memory and logic to enable higher integration, higher performance and increased flexibility. FPGA technology has become a significant part of virtually every high-performance system in use today. In contrast to the traditional DSP, FPGAs are massively parallel structures containing a uniform array of configurable logic blocks (CLBs), memory, DSP slices and some other elements. They may be programmed using hardware description languages such as VHDL and Verilog, or at a block diagram level using System Generator. Many dedicated functions and IP cores are available for direct implementation in a highly optimized form within the FPGA.
The main advantage to digital signal processing within an FPGA is the ability to tailor the implementation to match system requirements. This means in a multiple-channel or high-speed system, you can take advantage of the parallelism within the device to maximize performance, while in a lower-rate system the implementation may have a more serial nature. Thus the designer can tailor the implementation to suit the algorithm and system requirements rather than compromising the desired ideal design to conform to the limitations of a purely sequential device. Very high-speed I/O further reduces cost and bottlenecks by maximizing data flow from capture right through the processing chain to final output.
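The trade-off between parallel and serial implementations can be sketched with a small behavioural model. The Python below is not FPGA code and not the author's method; it is a simplified model, under the assumption that each multiplier completes one product per cycle, of how a sum of products can be spread across a chosen number of multipliers. The function name `sum_of_products` and the cycle-counting scheme are hypothetical, and adder-tree latency and pipelining are ignored.

```python
from typing import Sequence, Tuple


def sum_of_products(
    coeffs: Sequence[float], samples: Sequence[float], multipliers: int
) -> Tuple[float, int]:
    """Return (result, cycles) for a sum of products using `multipliers` units.

    Each modelled cycle forms up to `multipliers` products at once and adds
    them to the accumulator. multipliers=1 is the fully serial case;
    multipliers=len(coeffs) is the fully parallel case.
    """
    if multipliers < 1:
        raise ValueError("multipliers must be at least 1")
    if len(coeffs) != len(samples):
        raise ValueError("coeffs and samples must have the same length")

    acc = 0.0
    cycles = 0
    for start in range(0, len(coeffs), multipliers):
        stop = start + multipliers
        # One modelled cycle: products in this slice are computed in parallel.
        acc += sum(h * x for h, x in zip(coeffs[start:stop], samples[start:stop]))
        cycles += 1
    return acc, cycles


if __name__ == "__main__":
    coeffs = [1, 2, 3, 4]
    samples = [1, 1, 1, 1]
    for m in (1, 2, 4):
        result, cycles = sum_of_products(coeffs, samples, m)
        print(f"{m} multiplier(s): result={result}, cycles={cycles}")
```

The result is the same in every configuration; only the number of cycles changes. In this simplified view, a high-rate design spends more multipliers to finish in fewer cycles, while a low-rate design can reuse one multiplier over many cycles.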
As an example of how the FPGA stacks up, let’s consider a FIR filter implementation using both classical DSP and FPGA architectures to illustrate some of the strengths and weaknesses of each solution.
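For reference, the standard textbook form of an N-tap FIR filter is given below. The symbols are the conventional ones and are an assumption here, since this excerpt does not define its own notation: x[n] is the input sample at time n, h[k] is the k-th filter coefficient and y[n] is the output sample.

```latex
y[n] = \sum_{k=0}^{N-1} h[k] \, x[n-k]
```

Each output sample therefore requires N multiplications and N-1 additions, so the number of multiply-accumulate operations per sample is the quantity to watch when comparing implementations.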


anne-francoise.pele
7/23/2012 10:30 AM EDT
Another piece, originating from the first quarter edition 2012 of the Xcell Journal, is "Embedded Vision: FPGAs’ Next Notable Technology Opportunity".
To access the article, click here: http://www.eetimes.com/design/military-aerospace-design/4376567/Embedded-vision--FPGAs--next-technology-opportunity
Dr DSP
7/23/2012 1:48 PM EDT
There are some very useful concepts covered here and the summary is generally spot on; however, it is important to consider the DSP function in the system context.
If other functions are required in addition to the DSP it may push the solution of choice into an FPGA. For example, a low performance DSP function that is part of a sensor or motor control system need not be implemented in a stand alone DSP device. An FPGA might be the right solution in this case.
ReneCardenas
7/26/2012 11:44 AM EDT
Dr. DSP,
Couldn't the same be said in the other direction? If there are other considerations more suitable for a DSP processor, then the tilt can just as well move in the other direction.
In my opinion, it has to be a case by case decision of the designer, given a set of resources and time constraints.
Just another point of view
Greg.Dee
7/26/2012 5:02 PM EDT
No offence, but "For example, a low performance DSP function that is part of a sensor or motor control system need not be implemented in a stand alone DSP device. An FPGA might be the right solution in this case." doesn't make sense to me. If it's low performance, you go down the performance chain, not up it. So one would consider a general microcontroller with its obvious advantages.
glen.herrmannsfeldt
7/26/2012 3:10 AM EDT
The FIR equation is wrong.
ReneCardenas
7/26/2012 11:32 AM EDT
Glen,
It is too easy to criticize and rush to judgment in haste, when it is just as easy to offer the correction of such a typo, which appears in many publications that are transcribed by non-technical people.
Simply stating the transgression, in this case that the indices are transposed between the constant term and the discrete variable term, would have accomplished more and been more informative to others who may not have seen this simple transgression. The article has good merits otherwise, in my opinion.
nicolas.mokhoff
7/26/2012 11:51 AM EDT
ReneCardenas: your comment on striving toward positive criticism is welcome.
ReneCardenas
7/27/2012 4:29 PM EDT
Thanks Nic, that is my motto: be wise enough to know that in no way can we master the universe alone, but each of us should attempt to make the universe a much friendlier place for everyone, especially newcomers to engineering.
There are lots of complexities and tough problems in the world, and nothing is gained by being destructive.
Medina
7/26/2012 11:42 AM EDT
Glen, would you care to point out the anomaly in the equation? It wasn't very evident when I looked at it.
Thanks
EricC
7/26/2012 8:52 AM EDT
Further information on how Xilinx System Generator and MathWorks HDL Coder enable Model-Based Design for targeting Xilinx FPGAs is available at http://www.mathworks.com/xilinx.
EricC
7/26/2012 8:56 AM EDT
Further information on how Xilinx System Generator and other HDL code generation tools may be used with MATLAB and Simulink, including examples, demos, and videos, is available from http://www.mathworks.com/fpga.
EricC
7/26/2012 10:09 AM EDT
Corrected link is http://www.mathorks.com/fpga
Krutsch
7/26/2012 4:05 PM EDT
If only one had to do FIR filters only! Fact is, the software stack is more complicated, and it is really a pain to do it on an FPGA. For radar and some high-end medical applications it might be a good choice. For many, many applications it is a pain; try to get a solution certified for some automotive and avionics applications and you will see.
Alxx123
7/27/2012 12:48 AM EDT
That's all well and good, but there is no mention or comparison of the power used in DSP vs. FPGA.
agk
7/29/2012 6:56 AM EDT
With FPGAs we can create massively parallel processing so that DSP algorithms can deliver useful results.
kinnar
7/29/2012 2:37 PM EDT
Actually, what we are trying to implement using the FPGA is already there in the DSP processor. What matters is the portability and the size reduction of the final product gained by implementing some DSP functionality in an FPGA; this way one will be able to reduce the use of DSPs in many designs. The real disadvantage of this method is that it is totally hardware dependent.