Design Article
Tell us What You Think
We want to know what you thought about this Design. Let us know by adding a comment.
How to use the CORDIC algorithm in your FPGA design
Adam P. Taylor, EADS Astrium
5/12/2012 11:14 AM EDT
Editor's Note: This article first appeared in the Second Quarter 2012 issue of Xcell Journal, and is reproduced here with the kind permission of Xilinx (Click Here to see a PDF of this issue).
Most engineers tasked with implementing a mathematical function such as sine, cosine or square root within an FPGA may initially think of doing so by means of a lookup table, possibly combined with linear interpolation or a power series if multipliers are available. However, in cases like this the CORDIC algorithm is one of the most important tools in your arsenal, albeit one that few engineers are aware of.
Invented by Jack Volder while designing a new navigation computer at Convair for the B-58A Hustler program in 1959, CORDIC – it stands for Coordinate Rotation Digital Computer – is a simple algorithm designed to calculate mathematical, trigonometric and hyperbolic mathematical functions.
The real beauty of this algorithm is that you can implement it with a very small FPGA footprint. CORDIC requires only a small lookup table, along with logic to perform shifts and additions. Importantly, the algorithm requires no dedicated multipliers or dividers.
This algorithm is one of the most helpful for DSP and industrial and control applications. Perhaps its most common use is in implementing traditional mathematical functions as shown in Table 1, where multipliers, dividers or more interesting functions are required in devices that lack dedicated multipliers or DSP blocks.
For example, designers use CORDICs in many small industrial controllers to implement mathematical transfer functions and true RMS measurement. Engineers are also using CORDICs in biomedical applications to compute fast Fourier transforms (FFTs) to analyze the frequency content of many physiological signals. In this application, along with the traditional mathematical functions, designers use the CORDIC to implement the FFT twiddle factors.
Inside CORDIC
The CORDIC algorithm can operate in one of three configurations: linear, circular or hyperbolic. Within each of these configurations the algorithm functions in one of two modes – rotation or vectoring. In rotation mode, the input vector is rotated by a specified angle, while in vectoring mode the algorithm rotates the input vector to the x axis while recording the angle of rotation required.
Additionally, you can derive other functions from the CORDIC outputs as well. In many cases you could even implement these through the use of another CORDIC in a different configuration, as shown here.
The unified algorithm seen below covers all three CORDIC configurations. The algorithm has three inputs, X, Y and Z. Table 2 defines the required lookup table precalculated values depending upon configuration, while Table 3 addresses how these are initialized at startup depending upon the mode of operation (vectoring or rotation).
where m defines the configuration for either hyperbolic (m = –1), linear (m = 0) or circular (m = 1). Correspondingly, the value of ei as the angle of rotation changes depending upon the configuration. The value of ei is normally implemented as a small lookup table within the FPGA and is defined in Table 2.
In the equation, di is the direction of rotation, which depends upon the mode of operation. For rotation mode di = –1 if Zi < 0 else +1, while in vectoring mode di = +1 if Yi < 0 else –1.
When configured in either a circular or hyperbolic configuration using rotation mode, the output results will have gain that can be precalculated using the number of rotations defined in the equation
This gain is typically fed back into the initial setting of the algorithm to remove the need for post-scaling of the result.
Working designers must keep in mind that the CORDIC algorithm only operates within a strict convergence zone. You may have to do some prescaling to ensure it performs as expected. It is worth noting that the algorithm will get more accurate the more iterations (serial) or stages (parallel) you decide to implement. A general rule of thumb is that for n bits of precision, you will require n iterations or stages. All of this is, however, easily modeled in simple tools such as Excel or MATLAB prior to cutting code to ensure you will obtain accurate results with the selected iterations.
The CORDIC algorithm as defined will only converge (work) across a limited range of input values. For circular configurations of CORDIC algorithms, convergence is guaranteed for the angles below the sum of the angles in the lookup table – that is, between –99.7 and 99.7 degrees. For angles outside of this, you must use a trigonometric identity to translate an angle within this range. This is also true for convergence within the linear configuration. However, to gain convergence in hyperbolic mode, you must repeat certain iterations (4, 13, 40, K… 3K+1). In this case the maximum input of θ is approximately 1.118 radians.
Where is CORDIC used?
Designers use CORDIC algorithms in a wide range of applications, from digital signal processing and image processing to industrial control. The most basic way of using a CORDIC is to combine it with a phase accumulator and generate sine and cosine waves for use in I and Q modulation. Using the algorithm to generate these waveforms can, if correctly done, result in a high spurious-free dynamic range. Good SFDR performance is required for most signal-processing applications.
Within the field of robotics, CORDICs are used within kinematics, where they are useful for determining the position and movement of robotic joints and limbs. In this application, you can use a circular CORDIC in vectoring mode to easily add coordinate values with new coordinate values. Within the field of image processing, three-dimensional operations such as lighting and vector rotation are the perfect candidates for implementation using the algorithm.
Modeling in Excel
One of the most straightforward methods of modeling your CORDIC algorithm before cutting code is to put together a simple Excel spreadsheet. It allows you to model the number of iterations and gain (An) initially using a floating-point number system and later in a scaled, fixed-point system, providing a reference for verification of the code during simulation.
As you can see from the Excel implementation in Table 4, the initial X input is set to An to reduce the need for postprocessing the result. The initial argument is set in Z, which is defined in radians, as are the results.
Implementing your CORDIC
Unless there is a good reason not to, the simplest method of implementing a CORDIC algorithm within an FPGA is to utilize a tool such as Xilinx CORE Generator. CORE Generator provides a comprehensive interface that allows you to define the exact functionality of the CORDIC (rotate, vector, etc.), as seen in Figure 1.
Unfortunately, the CORE Generator does not provide options for working with CORDICs within the linear mode (the tool does provide separate cores to perform these functions). However, you can write the VHDL code required to implement the algorithm in very few lines, as the simple example of a circular implementation below shows.
There are two basic topologies for implementing a CORDIC in an FPGA: either a state machine-based approach or a pipelined approach.
If the processing time is not critical, implement the algorithm as a state machine that computes one CORDIC iteration per cycle until the desired number of cycles has been completed. If you require a high calculation speed, then a parallel architecture is more appropriate. The code above implements a 15-stage parallel CORDIC operating within the rotation mode. It uses a simple lookup table of ArcTan(2-i) as demonstrated in Table 2 for a circular configuration, coupled with a simple array structure to implement the parallel stages.
In all of these ways, the CORDIC algorithm proves its merit as a simple but powerful algorithm that all FPGA designers should be aware of. Even better, you can implement the CORDIC very simply using either core-generation tools or hand coding.
About the author
Adam P. Taylor is Principal Engineer at EADS Astrium (www.astrium.eads.net). Adam can be contacted via email at aptaylor@theiet.org
If you found this article to be of interest, visit Programmable Logic Designline where you will find the latest and greatest design, technology, product, and news articles with regard to programmable logic devices of every flavor and size (FPGAs, CPLDs, CSSPs, PSoCs...).
Also, you can obtain a highlights update delivered directly to your inbox by signing up for my weekly newsletter – just Click Here to request this newsletter using the Manage Newsletters tab (if you aren't already a member you'll be asked to register, but it's free and painless so don't let that stop you [grin]).
Most engineers tasked with implementing a mathematical function such as sine, cosine or square root within an FPGA may initially think of doing so by means of a lookup table, possibly combined with linear interpolation or a power series if multipliers are available. However, in cases like this the CORDIC algorithm is one of the most important tools in your arsenal, albeit one that few engineers are aware of.
Invented by Jack Volder while designing a new navigation computer at Convair for the B-58A Hustler program in 1959, CORDIC – it stands for Coordinate Rotation Digital Computer – is a simple algorithm designed to calculate mathematical, trigonometric and hyperbolic mathematical functions.
The real beauty of this algorithm is that you can implement it with a very small FPGA footprint. CORDIC requires only a small lookup table, along with logic to perform shifts and additions. Importantly, the algorithm requires no dedicated multipliers or dividers.
This algorithm is one of the most helpful for DSP and industrial and control applications. Perhaps its most common use is in implementing traditional mathematical functions as shown in Table 1, where multipliers, dividers or more interesting functions are required in devices that lack dedicated multipliers or DSP blocks.
Table 1. Functions resulting from
CORDIC configurations and mode.
CORDIC configurations and mode.
For example, designers use CORDICs in many small industrial controllers to implement mathematical transfer functions and true RMS measurement. Engineers are also using CORDICs in biomedical applications to compute fast Fourier transforms (FFTs) to analyze the frequency content of many physiological signals. In this application, along with the traditional mathematical functions, designers use the CORDIC to implement the FFT twiddle factors.
Inside CORDIC
The CORDIC algorithm can operate in one of three configurations: linear, circular or hyperbolic. Within each of these configurations the algorithm functions in one of two modes – rotation or vectoring. In rotation mode, the input vector is rotated by a specified angle, while in vectoring mode the algorithm rotates the input vector to the x axis while recording the angle of rotation required.
Additionally, you can derive other functions from the CORDIC outputs as well. In many cases you could even implement these through the use of another CORDIC in a different configuration, as shown here.
The unified algorithm seen below covers all three CORDIC configurations. The algorithm has three inputs, X, Y and Z. Table 2 defines the required lookup table precalculated values depending upon configuration, while Table 3 addresses how these are initialized at startup depending upon the mode of operation (vectoring or rotation).
where m defines the configuration for either hyperbolic (m = –1), linear (m = 0) or circular (m = 1). Correspondingly, the value of ei as the angle of rotation changes depending upon the configuration. The value of ei is normally implemented as a small lookup table within the FPGA and is defined in Table 2.
Table 2. CORDIC angle of rotation.
In the equation, di is the direction of rotation, which depends upon the mode of operation. For rotation mode di = –1 if Zi < 0 else +1, while in vectoring mode di = +1 if Yi < 0 else –1.
When configured in either a circular or hyperbolic configuration using rotation mode, the output results will have gain that can be precalculated using the number of rotations defined in the equation
This gain is typically fed back into the initial setting of the algorithm to remove the need for post-scaling of the result.
Working designers must keep in mind that the CORDIC algorithm only operates within a strict convergence zone. You may have to do some prescaling to ensure it performs as expected. It is worth noting that the algorithm will get more accurate the more iterations (serial) or stages (parallel) you decide to implement. A general rule of thumb is that for n bits of precision, you will require n iterations or stages. All of this is, however, easily modeled in simple tools such as Excel or MATLAB prior to cutting code to ensure you will obtain accurate results with the selected iterations.
Table 3. Initialization settings depending upon
mathematical operation as demonstrated in Table 1.
mathematical operation as demonstrated in Table 1.
The CORDIC algorithm as defined will only converge (work) across a limited range of input values. For circular configurations of CORDIC algorithms, convergence is guaranteed for the angles below the sum of the angles in the lookup table – that is, between –99.7 and 99.7 degrees. For angles outside of this, you must use a trigonometric identity to translate an angle within this range. This is also true for convergence within the linear configuration. However, to gain convergence in hyperbolic mode, you must repeat certain iterations (4, 13, 40, K… 3K+1). In this case the maximum input of θ is approximately 1.118 radians.
Where is CORDIC used?
Designers use CORDIC algorithms in a wide range of applications, from digital signal processing and image processing to industrial control. The most basic way of using a CORDIC is to combine it with a phase accumulator and generate sine and cosine waves for use in I and Q modulation. Using the algorithm to generate these waveforms can, if correctly done, result in a high spurious-free dynamic range. Good SFDR performance is required for most signal-processing applications.
Within the field of robotics, CORDICs are used within kinematics, where they are useful for determining the position and movement of robotic joints and limbs. In this application, you can use a circular CORDIC in vectoring mode to easily add coordinate values with new coordinate values. Within the field of image processing, three-dimensional operations such as lighting and vector rotation are the perfect candidates for implementation using the algorithm.
Modeling in Excel
One of the most straightforward methods of modeling your CORDIC algorithm before cutting code is to put together a simple Excel spreadsheet. It allows you to model the number of iterations and gain (An) initially using a floating-point number system and later in a scaled, fixed-point system, providing a reference for verification of the code during simulation.
As you can see from the Excel implementation in Table 4, the initial X input is set to An to reduce the need for postprocessing the result. The initial argument is set in Z, which is defined in radians, as are the results.
Table 4. Excel Model of CORDIC configured for circular rotation.
Implementing your CORDIC
Unless there is a good reason not to, the simplest method of implementing a CORDIC algorithm within an FPGA is to utilize a tool such as Xilinx CORE Generator. CORE Generator provides a comprehensive interface that allows you to define the exact functionality of the CORDIC (rotate, vector, etc.), as seen in Figure 1.
Figure 1. CORE Generator allows simple implementations
of many popular CORDIC configurations.
of many popular CORDIC configurations.
Unfortunately, the CORE Generator does not provide options for working with CORDICs within the linear mode (the tool does provide separate cores to perform these functions). However, you can write the VHDL code required to implement the algorithm in very few lines, as the simple example of a circular implementation below shows.
There are two basic topologies for implementing a CORDIC in an FPGA: either a state machine-based approach or a pipelined approach.
If the processing time is not critical, implement the algorithm as a state machine that computes one CORDIC iteration per cycle until the desired number of cycles has been completed. If you require a high calculation speed, then a parallel architecture is more appropriate. The code above implements a 15-stage parallel CORDIC operating within the rotation mode. It uses a simple lookup table of ArcTan(2-i) as demonstrated in Table 2 for a circular configuration, coupled with a simple array structure to implement the parallel stages.
In all of these ways, the CORDIC algorithm proves its merit as a simple but powerful algorithm that all FPGA designers should be aware of. Even better, you can implement the CORDIC very simply using either core-generation tools or hand coding.
About the author
Adam P. Taylor is Principal Engineer at EADS Astrium (www.astrium.eads.net). Adam can be contacted via email at aptaylor@theiet.org
If you found this article to be of interest, visit Programmable Logic Designline where you will find the latest and greatest design, technology, product, and news articles with regard to programmable logic devices of every flavor and size (FPGAs, CPLDs, CSSPs, PSoCs...).
Also, you can obtain a highlights update delivered directly to your inbox by signing up for my weekly newsletter – just Click Here to request this newsletter using the Manage Newsletters tab (if you aren't already a member you'll be asked to register, but it's free and painless so don't let that stop you [grin]).
Navigate to related information

