Regardless of whether you are using VHDL, System Verilog, or a different design capture language (see also 10 Ways to Program Your FPGA), there are a number of universal design techniques with which FPGA engineers should be familiar, from the very simple to the most advanced.
In this column we'll take a look at 10 I believe to be important and discuss why I feel this way.
State Machine Design
Example state machines: Moore (left) and Mealy (Right)
FPGAs are often called upon to perform sequence- and control-based actions, but how do we actually make the FPGA do this? The answer is a Finite State Machine (FSM) -- a logical construct that transitions between a finite number of states.
The FSM will only be in one state at any particular point in time, and it will transition between states depending upon one or more triggers. There are two main classes of state machine as follows:
- Moore: The state machine outputs are a function only of the current state. A classic example of this form of state machine is a counter.
- Mealy: The outputs from the state machine are a function of both the current state and one or more inputs. A classic example of this form is the Richards Controller.
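To make the distinction concrete, here is a minimal behavioural sketch of a Moore machine in Python (the state names and the "101" sequence-detector example are illustrative, not from the column; in a real design this would be an HDL process):

```python
# A Moore FSM sketch: a "101" sequence detector. Note that the output is
# looked up from the current state alone -- the defining Moore property.
TRANSITIONS = {
    ("IDLE", 0): "IDLE",  ("IDLE", 1): "GOT1",
    ("GOT1", 0): "GOT10", ("GOT1", 1): "GOT1",
    ("GOT10", 0): "IDLE", ("GOT10", 1): "FOUND",
    ("FOUND", 0): "GOT10", ("FOUND", 1): "GOT1",
}
OUTPUT = {"IDLE": 0, "GOT1": 0, "GOT10": 0, "FOUND": 1}  # one output per state

def run_fsm(bits):
    state, outputs = "IDLE", []
    for b in bits:
        state = TRANSITIONS[(state, b)]  # next state = f(current state, input)
        outputs.append(OUTPUT[state])    # Moore: output = f(state) only
    return outputs
```

A Mealy version would instead look the output up from the (state, input) pair, typically reacting one cycle earlier at the cost of combinatorial paths from input to output.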
Due to their deterministic nature, state machines are very popular in safety-critical applications. However, even in more mainstream applications, you will find state machines at the heart of many FPGA designs, performing tasks like controlling and monitoring processing pipelines.
See also How to Implement State Machines in Your FPGA
Basics of FPGA Math
For many applications we want our FPGAs to be able to perform mathematical operations like addition, subtraction, multiplication, and division. Unlike conventional processors, FPGAs do not have arithmetic logic units (ALUs) to perform integer math operations or floating-point units (FPUs) to perform floating-point operations for us. We do, however, have digital signal processing (DSP) elements that implement multiplier-accumulator functions, but we still need to know how to use these functions effectively to implement meaningful math.
Enter fixed-point mathematics. Fixed-point representations maintain the binary point at a fixed position, thereby allowing for straightforward arithmetic operations. The major drawback of fixed-point representations is that, in order to represent larger numbers or to achieve a more accurate result with fractional numbers, a larger number of bits is required.
A fixed-point number consists of two parts, called the integer and fractional parts. The normal way of representing the split between the integer and fractional bits within a fixed-point number is x,y, where x represents the number of integer bits and y the number of fractional bits. For example, 8,8 represents 8 integer bits and 8 fractional bits, while 16,0 represents 16 integer bits and 0 fractional bits. This format is often called Q format and is given as Qm.n, where m represents the number of integer bits and n represents the number of fractional bits.
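As a quick illustration of how this works in practice, here is a Q8.8 sketch using plain Python integers (standing in for the integer arithmetic an FPGA would perform):

```python
# Q8.8 fixed-point sketch: values are stored as integers scaled by 2**8.
# Multiplying two Q8.8 values yields 16 fractional bits, so the product
# must be shifted right by 8 to restore the binary point.
FRAC_BITS = 8
SCALE = 1 << FRAC_BITS

def to_fixed(value):
    return int(round(value * SCALE))

def to_float(fixed):
    return fixed / SCALE

def fixed_mul(a, b):
    return (a * b) >> FRAC_BITS  # realign the binary point after the multiply

a = to_fixed(3.5)    # 896 in Q8.8
b = to_fixed(2.25)   # 576 in Q8.8
product = fixed_mul(a, b)
```

Here `to_float(product)` recovers 7.875, exactly 3.5 × 2.25, because both operands are representable in Q8.8.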
See also Fixed and Floating-Point Packages and The Basics of FPGA Mathematics.
First In, First Out (FIFO) Buffers
(Source: Adam Taylor)
First In, First Out (FIFO) buffers are one of the most useful structures within an FPGA. A very common application is to use them to buffer data as it passes from one clock domain to another. We can also use them as buffers and line buffers for signal and image processing applications. They also remove the need to use ping-pong RAMs to transfer data between two elements if one is being read while the other is being written. In short, FIFOs are very useful tools.
One important point to consider is that we must take care not to overflow the FIFO; this can occur when the reading side cannot empty data fast enough to prevent the writing side from filling the FIFO. In order to address this situation, we can calculate the required FIFO depth from the write rate, the read rate, and the burst length.
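One common rule of thumb (a sketch, not the column's own formula) is that during a burst the FIFO accumulates entries at the difference between the write and read rates, so it must be deep enough to absorb that excess:

```python
import math

# Hedged FIFO-sizing sketch: during a burst of burst_length words, reads
# drain a fraction read_rate/write_rate of what is written, so the FIFO
# must hold the remainder. Real designs add margin for latency and flags.
def min_fifo_depth(burst_length, write_rate_hz, read_rate_hz):
    if read_rate_hz >= write_rate_hz:
        return 1  # the reader keeps up; only minimal buffering is needed
    excess = burst_length * (1.0 - read_rate_hz / write_rate_hz)
    return math.ceil(excess)
```

For example, a 120-word burst written at 100 MHz and read at 80 MHz leaves 20% of the burst, 24 words, resident in the FIFO at its fullest.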
See also FIFOs: A Tutorial Description.
The CORDIC Algorithm
(Source: Adam Taylor)
Normally, when we wish to perform trigonometry, we reach for our calculator and use the inbuilt functions, but what if we want to implement trigonometric functions in an FPGA?
In fact there are a variety of different approaches, including representing the function as a lookup table or as a Taylor Series. In many cases, however, the best solution is the CORDIC (COordinate Rotation DIgital Computer) algorithm, which has been around since 1956.
The CORDIC algorithm was conceived by Jack Volder while he worked on the B-58 program. It was also the algorithm that was used in the first scientific calculator -- the HP-35.
This algorithm is both very powerful and very simple, requiring only shift and add operations, so it is a shame that not many engineers are familiar with it.
The unified CORDIC algorithm can operate in one of three coordinate systems: Linear, Circular, or Hyperbolic. Within each of these systems, the algorithm functions in one of two modes: rotation or vectoring.
All of this provides support for a wide range of functions that we may wish to implement in our FPGA, and once we have these we can go off in a number of wonderful directions.
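To show how little hardware the algorithm really needs, here is a rotation-mode, circular-coordinate CORDIC sketch in Python. Floating point is used for clarity; a hardware version would replace the `2.0 ** -i` multiplies with arithmetic right shifts:

```python
import math

# Rotation-mode CORDIC (circular system): rotate the vector (x, y) by a
# sequence of fixed micro-angles atan(2**-i), steering each rotation so
# the residual angle z converges to zero.
def cordic_sin_cos(theta, iterations=24):
    atan_table = [math.atan(2.0 ** -i) for i in range(iterations)]
    gain = 1.0
    for i in range(iterations):
        gain /= math.sqrt(1.0 + 2.0 ** (-2 * i))  # CORDIC gain ~ 1/1.6468
    x, y, z = gain, 0.0, theta  # pre-apply the gain so no final scaling is needed
    for i in range(iterations):
        d = 1.0 if z >= 0.0 else -1.0  # rotate toward zero residual angle
        x, y = x - d * y * 2.0 ** -i, y + d * x * 2.0 ** -i
        z -= d * atan_table[i]
    return y, x  # (sin(theta), cos(theta))
```

The basic iteration converges for angles up to roughly ±1.74 radians; larger angles need a quadrant pre-rotation first.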
See also What is all this CORDIC stuff anyhow?
Metastability
(Source: Adam Taylor & Max Maxfield)
Sadly, not everything is synchronous to the clocks that we are using within our FPGA. As good FPGA engineers, we want to develop a synchronous system and -- as such -- we must take precautions to accommodate asynchronous signals entering the device.
As these external signals transition asynchronously relative to the clock we are using to capture them, there is a risk that a signal could change during the capture flip-flop's setup and hold window and send it into a metastable condition. In order to prevent that, we need to employ a two-stage flip-flop synchroniser to ensure the flip-flop has time to recover.
We can also use this technique internally within the FPGA when data signals cross between clock domains.
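The structure itself is tiny; this behavioural Python sketch shows only the two-cycle latency the synchroniser introduces (software cannot, of course, model the metastable settling that the second flop exists to hide):

```python
# Behavioural model of a two-stage flip-flop synchroniser. In hardware,
# ff1 may briefly go metastable when the async input violates setup/hold;
# ff2 samples it a full clock period later, by which time it has settled.
class TwoFlopSynchroniser:
    def __init__(self):
        self.ff1 = 0  # first stage: exposed to the asynchronous input
        self.ff2 = 0  # second stage: the clean, synchronous output

    def clock(self, async_in):
        self.ff2 = self.ff1   # both flops update on the same clock edge
        self.ff1 = async_in
        return self.ff2
```

Note that a change on the input only appears at the output two clock edges later, which is the price paid for reliable capture.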
See also Wrapping One's Brain Around Metastability.
Discrete & Fast Fourier Transforms
For many readers, the topic of Fourier transforms and analysis will cause flashbacks to the nightmares and sleepless nights of their university studies. However, the ability to analyse things within the frequency domain is of vital importance for many FPGA applications.
Some signal parameters can only be accessed by analysis within the frequency domain, where we can identify the frequency components of the signal along with the amplitude and phase of each component. Working within the frequency domain also makes it much simpler to manipulate signals, due to the ease with which convolution can be performed in this domain.
Depending upon the type of the signal -- repetitive or non-repetitive; discrete or non-discrete -- there are a number of methods one can use to convert between the time and frequency domains, including Fourier series, Fourier transforms, or Z transforms. Within electronic signal processing in general and FPGA applications in particular, the engineer is most often interested in one type of transform, the Discrete Fourier Transform (DFT), and its most popular variant, the Fast Fourier Transform (FFT).
Of course, in order to eventually return to the time domain, we will also need to use the inverse DFT or FFT.
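The DFT definition itself is short enough to write directly; this sketch computes it naively in O(N²) (an FFT produces the identical result in O(N log N), which is why it is the variant used in practice):

```python
import cmath
import math

# Direct evaluation of the DFT definition:
#   X[k] = sum_{t=0}^{N-1} x[t] * exp(-2j*pi*k*t/N)
def dft(samples):
    n = len(samples)
    return [
        sum(samples[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]

# A pure sine at 3 cycles per frame should light up bin 3 (and its mirror)
# with magnitude N/2, and leave the other bins essentially empty.
signal = [math.sin(2.0 * math.pi * 3 * t / 32) for t in range(32)]
spectrum = dft(signal)
```

The magnitude of each bin gives the amplitude of that frequency component, and its argument gives the phase, exactly the information the section above describes.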
See also Getting to Grips with the Frequency Domain.
Mathematically Complex Functions
What if we need to implement some very complex math function within the FPGA? We could take the time to implement this function fully using fixed-point maths, but this will add significant complexity and development time. Alternatively, we can plot the function and implement its polynomial trend line in the FPGA.
We can work out the polynomial trend line by plotting the transfer function using a mathematical program such as MATLAB or even a spreadsheet application like Excel. The engineer can then add a polynomial trend line to this dataset such that the equation for the trend line can be implemented within the FPGA in place of the mathematically complex function (provided the trend line equation meets the accuracy requirements, of course).
The accuracy can be easily verified by calculating the output using the polynomial equation and comparing it against the result of the transfer function. If we are struggling to achieve the required accuracy, there are a number of techniques we can use to increase the accuracy, such as piecewise polynomial approximation.
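As a simple illustration (the coefficients here are just the Taylor series terms for sin(x), not a fitted trend line, which would typically do slightly better), a fifth-order polynomial evaluated with Horner's method maps directly onto a chain of multiply-accumulate operations:

```python
import math

# Fifth-order polynomial approximation of sin(x) on [0, pi/2], written in
# Horner form: each step is one multiply plus one add, i.e. one MAC.
def sin_poly(x):
    x2 = x * x
    return x * (1.0 + x2 * (-1.0 / 6.0 + x2 * (1.0 / 120.0)))

# Verify the accuracy by sweeping the range and comparing against the
# reference transfer function, exactly as the text describes.
max_error = max(
    abs(sin_poly(i * math.pi / 200.0) - math.sin(i * math.pi / 200.0))
    for i in range(101)
)
```

Over [0, π/2] the worst-case error is about 0.0045, at the top of the range; if that were not accurate enough, piecewise polynomial approximation would split the range into segments with a polynomial per segment.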
See also Calculating Mathematically Complex Functions
Infinite Impulse Response Filters
(Source: Adam Taylor)
It is common to want to perform signal processing techniques inside an FPGA; for example, filtering a signal or a band of signals prior to further processing.
With traditional analog methods such as Butterworth, Bessel, and Chebyshev, we use operational amplifiers to implement the filter. Within an FPGA, we can implement these filters using an Infinite Impulse Response (IIR) function.
IIR filters are one of the most efficient digital filter implementations for achieving the required stopband attenuation, passband ripple, and roll-off. Their efficient implementation consumes fewer FPGA resources and also offers a reduced processing latency; however, due to their feedback, they can be unstable if not designed properly and they do not have a linear phase (unlike an FIR filter).
In order to implement an IIR filter and ensure the resulting filter is stable, we need to understand fixed-point FPGA mathematics and filter theory.
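The simplest possible example is a single-pole low-pass filter, sketched here in Python; the single feedback term is both the source of the IIR filter's efficiency (one MAC per sample) and of its potential instability:

```python
# Single-pole IIR low-pass filter: y[n] = alpha*x[n] + (1-alpha)*y[n-1].
# Stable for 0 < alpha <= 1; smaller alpha gives a lower cutoff frequency.
def iir_lowpass(samples, alpha):
    y, out = 0.0, []
    for x in samples:
        y = alpha * x + (1.0 - alpha) * y  # feedback from the previous output
        out.append(y)
    return out

# Step response: the output rises smoothly toward the input level.
step_response = iir_lowpass([1.0] * 100, alpha=0.1)
```

The infinite impulse response is visible here: the step response only ever approaches 1.0 asymptotically, never reaching it exactly.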
See also The Scientist and Engineer's Guide to Digital Signal Processing (Chapter 19: Recursive Filters).
Finite Impulse Response Filters
(Source: Adam Taylor)
FPGAs provide the ability to design and implement filters that would be very hard to achieve using more traditional analog methods based on operational amplifiers and passive components.
Finite Impulse Response (FIR) filters are used when we want to guarantee a stable filter in which the phase of the filter will remain constant. This comes at the cost of an increased size (order) to obtain the required roll off and attenuation when compared to an IIR filter.
FIR filters can be implemented using a variety of structures (e.g., polyphase); however, they typically take full advantage of the parallel nature of FPGAs along with any multiply-accumulate blocks.
We can commonly use a simple FIR filter as a compensation filter to correct for digital-to-analog converter sinc roll-off.
When I was at university, we used to have to design these by hand; these days, however, there are a number of high-level tools like MATLAB or Octave that can be used to generate the coefficients needed to implement the desired filter type.
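Whatever tool generates the coefficients, the direct-form structure itself is just a shift register of taps feeding multiply-accumulate operations, as this sketch shows (a four-tap moving average is used as the example coefficient set):

```python
# Direct-form FIR filter: each sample shifts into a tap delay line, and
# the output is the dot product of the taps with the coefficients. Each
# coefficient maps onto one FPGA multiply-accumulate (DSP) block, and all
# of them can operate in parallel.
def fir_filter(samples, coeffs):
    taps = [0.0] * len(coeffs)
    out = []
    for x in samples:
        taps = [x] + taps[:-1]  # shift register of past samples
        out.append(sum(c * t for c, t in zip(coeffs, taps)))
    return out

# Four-tap moving average: unity DC gain, so once the taps fill,
# a constant input passes through unchanged.
result = fir_filter([1.0] * 8, [0.25, 0.25, 0.25, 0.25])
```

The impulse response really is finite: it lasts exactly as long as the tap delay line, which is why stability is guaranteed.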
See also The Scientist and Engineer's Guide to Digital Signal Processing (Chapter 16: Windowed-Sinc Filters).
Image Processing Filters
(Source: Adam Taylor)
We can use the parallel nature of an FPGA to implement extremely fast complex image processing pipelines. These take the raw image sensor output and enhance and process it, doing things like object detection, edge detection, and a variety of other complex functions.
As most information stored within an image is spatial in nature, we can employ convolution filters. Commonly we apply a small filter kernel across the image, where this filter kernel uses inputs from the pixels surrounding the target pixel. This means we need to buffer lines above and below the pixel currently being processed, which is where the use of FIFOs comes in. If we need to perform a convolution with a larger filter kernel, we can use FFT-based convolution instead.
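This sketch shows the arithmetic of a 3x3 kernel sliding across an image; an FPGA pipeline would feed the same nine-pixel window from line-buffer FIFOs rather than random-access indexing, and border handling is simplified here by leaving edge pixels untouched:

```python
# Apply a 3x3 convolution kernel across the interior of a 2D image
# (a list of lists of pixel values). Each output pixel is the weighted
# sum of the 3x3 neighbourhood centred on it.
def convolve_3x3(image, kernel):
    height, width = len(image), len(image[0])
    out = [row[:] for row in image]  # borders are copied through unchanged
    for r in range(1, height - 1):
        for c in range(1, width - 1):
            acc = 0.0
            for kr in range(3):
                for kc in range(3):
                    acc += kernel[kr][kc] * image[r + kr - 1][c + kc - 1]
            out[r][c] = acc
    return out

identity = [[0, 0, 0], [0, 1, 0], [0, 0, 0]]   # passes the image through
blur = [[1 / 9.0] * 3 for _ in range(3)]        # simple averaging kernel
```

Swapping in different kernel weights gives edge detection (e.g., Sobel), sharpening, or smoothing without changing the datapath at all, which is what makes this structure such a good fit for an FPGA pipeline.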
See also The Scientist and Engineer's Guide to Digital Signal Processing (Chapter 24: Linear Image Processing).
Of course, this column just scratches the surface of design techniques with which FPGA engineers should be familiar. In reality, there are as many techniques as there are FPGA use cases. What techniques would you suggest I cover in a future column?