IMO, CORDIC is a great way to compute sine and cosine, and also arctan. CORDIC only uses shifts and ADD/SUB. I'd also stick to fixed-point arithmetic, since the values of sin and cos only go between -1 and +1.
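For anyone who hasn't seen it, here's a minimal Python sketch of rotation-mode CORDIC for sine and cosine. The word sizes and iteration count are my own choices (Q2.14, 14 iterations), not anything from the thread; the per-iteration work is just shifts and add/sub, with the arctan table and gain constant pre-computed.

```python
import math

# Hypothetical fixed-point format: Q2.14, so 1.0 == 1 << 14.
FRAC = 14
ONE = 1 << FRAC
N = 14                          # number of CORDIC iterations

# Pre-computed lookup table: atan(2^-i) in Q2.14 radians.
ATAN = [round(math.atan(2.0 ** -i) * ONE) for i in range(N)]

# CORDIC gain compensation: start x at K = prod(1/sqrt(1 + 2^-2i))
# so the final (x, y) come out unscaled.
K = round(ONE * math.prod(1.0 / math.sqrt(1.0 + 2.0 ** (-2 * i))
                          for i in range(N)))

def cordic_sincos(theta):
    """theta in Q2.14 radians, |theta| <= pi/2. Returns (cos, sin) in Q2.14."""
    x, y, z = K, 0, theta
    for i in range(N):
        if z >= 0:              # rotate to drive the residual angle z to zero
            x, y, z = x - (y >> i), y + (x >> i), z - ATAN[i]
        else:
            x, y, z = x + (y >> i), y - (x >> i), z + ATAN[i]
    return x, y
```

Each iteration is only an arithmetic shift and an add/subtract, which is why it maps so nicely to hardware without multipliers.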

As far as representing angles, I like Binary Angular Measurement (BAM). You represent an angle as a fixed-point binary fraction of a circle. You use every bit of your number representation, and BAM automatically calculates angles modulo 360 degrees due to the modulo 2^N nature of binary arithmetic.

BAM angles have the nifty property that unsigned and two's complement arithmetic are equivalent. You can think of angles as being from 0 to just less than 360 degrees using unsigned arithmetic, or from -180 to just less than 180 degrees using two's complement arithmetic.
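A quick sketch of both properties, using a 16-bit BAM word (the width is my choice for illustration):

```python
# 16-bit BAM: one LSB = 360/65536 degrees. Arithmetic wraps modulo 2^16,
# which is exactly wrapping modulo 360 degrees -- no range reduction needed.

BITS = 16
MASK = (1 << BITS) - 1

def deg(d):
    """Convert degrees to a 16-bit BAM angle."""
    return round(d * (1 << BITS) / 360.0) & MASK

def bam_add(a, b):
    # e.g. 270 deg + 180 deg wraps automatically to 90 deg
    return (a + b) & MASK

def bam_to_deg_signed(a):
    """Interpret the same bits as -180..+180 (the two's complement view)."""
    if a >= 1 << (BITS - 1):
        a -= 1 << BITS
    return a * 360.0 / (1 << BITS)
```

The same bit pattern works either way: `deg(270)` reads as 270 degrees unsigned, or -90 degrees in the two's complement view.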

That's the 1st I've heard of "BAM." Thanks for the info!

As for CORDIC, I still haven't quite wrapped my head all the way around it yet, but I see it takes some pre-computed lookup tables to work. Which is okay, I suppose, but the purpose of this set of exercises was to get some decent fixed-point libraries put together.

The other thing about this silly project was that it got the computation done in a single (albeit slow) clock tick.

The pure Taylor series works, but it just eats up too much logic. Here's how it's normally done: you start with a lookup table in blockRAM; there's plenty of it in a Xilinx part. You then do a Taylor expansion to interpolate between the LUT values. First-order (linear interpolation) or second-order is usually good. You need the derivative of sine at the LUT points. The derivative is cosine, which you get from a different address in the same LUT. Since the blockRAM is dual-port, you read it out on the second port in the same cycle. You then multiply this by the lower-order address bits (left over after using the upper bits for the LUT address), bit-shift, and add to the LUT value. Booyah, first-order Taylor. If you want to do higher orders, the derivatives are easy (either sin or cos from the LUT, or their negatives), and you use Horner's Method to reduce the number of multiplies necessary. This method is also super-fast when you use blockRAMs and DSP48s, and is usually what you'll get if you use a core.

CORDIC is OK if you want a lot of precision and you have a lot of time to blow. It's not as valuable as it was in the pre-DSP48 days.

Beware of failing to maintain mathematical properties such as monotonicity. A classic problem with rough estimates from tables is that the interpolations at the joints between segments can be discontinuous or even take a slight backwards step. Even when you think you have the right formula, all it takes is one joint falsely rounding down and the next falsely rounding up, which can happen with a fraction of an LSB of inaccuracy. Since an implementation can end up being reused somewhere else, you probably want to be sure it is suitable for unexpected uses.

Given the small word length of your inputs you can afford to run tests which exhaustively check every value for glitches like that.
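That kind of exhaustive sweep is cheap to write. Here's a sketch; `approx_sin` is just a stand-in reference (my choice of a 12-bit phase for the example), and you'd swap in the implementation under test:

```python
import math

PHASE_BITS = 12                 # small enough to sweep every input value

def approx_sin(phase):
    # Stand-in for the implementation under test.
    return round(32767 * math.sin(2 * math.pi * phase / (1 << PHASE_BITS)))

def check_monotonic():
    """Over the first quadrant the output must never step backwards,
    including across the joints between LUT segments."""
    prev = approx_sin(0)
    for p in range(1, 1 << (PHASE_BITS - 2)):
        cur = approx_sin(p)
        assert cur >= prev, f"backwards step at phase {p}"
        prev = cur

check_monotonic()
```

Similar sweeps can check symmetry between quadrants and that the peak value is actually reached, which are exactly the properties that break from fraction-of-an-LSB rounding at segment joints.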

You can do arctan with a modified version of CORDIC. You can also do something similar to the method I described below, but using the Farrow technique. It's not as easy to get the derivative of arctan as it is for sine, so you use a blockRAM LUT for arctan, another for the derivative of arctan, and possibly another for the second derivative. Then you can do a Taylor expansion to interpolate between the values in the first LUT by using the derivatives in the other LUTs. Use Horner's Method again to save multiplies. The Farrow method works for arbitrary functions.

So if you want to know the arctan halfway between LUT entries N and N+1, you take the Nth value from the first LUT, add 1/2 times the Nth value in the second LUT, and add (1/2)^2 / 2! times the Nth value in the third LUT. (Not taking Horner's Method into account.) There are probably a lot of other techniques for arctan that smart people have come up with.
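A floating-point sketch of the three-LUT version, with the general offset d in place of the 1/2 above (table size is my choice; fixed-point scaling omitted for clarity). The derivative tables use d/dx atan(x) = 1/(1+x^2) and its derivative:

```python
import math

LUT_SIZE = 64                   # atan(x) tabulated for x in [0, 1)
STEP = 1.0 / LUT_SIZE

ATAN   = [math.atan(n * STEP) for n in range(LUT_SIZE + 1)]
DATAN  = [1.0 / (1.0 + (n * STEP) ** 2)                     # first derivative
          for n in range(LUT_SIZE + 1)]
DDATAN = [-2.0 * (n * STEP) / (1.0 + (n * STEP) ** 2) ** 2  # second derivative
          for n in range(LUT_SIZE + 1)]

def atan_interp(x):
    """atan(x) for x in [0, 1): second-order Taylor about the LUT point
    just below x, evaluated in Horner form to save a multiply."""
    n = int(x * LUT_SIZE)
    d = x - n * STEP            # offset from the LUT point, 0 <= d < STEP
    # Horner form of ATAN[n] + d*DATAN[n] + (d^2 / 2!)*DDATAN[n]:
    return ATAN[n] + d * (DATAN[n] + d * 0.5 * DDATAN[n])
```

In hardware the three tables would be three blockRAM reads in parallel, with the Horner evaluation chained through DSP slices.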

If you're doing arctan on a complex IQ value in order to find its phase, for demodulation purposes, you can take the absolute value to map it into the first quadrant, and then use one 2-dimensional lookup table to find phase angle and another for magnitude. (There are also simple algebraic functions which approximate magnitude pretty well. That might be another interesting topic for the next article in this series.) By 2-dimensional, I mean that half of the address bits are used for I and half for Q and the output of the LUT gives you arctan(Q/I).
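A small model of that 2-D lookup, with the address width my own pick (6 bits each of |I| and |Q|), plus the quadrant unfolding after the abs() mapping:

```python
import math

BITS = 6                        # 6 bits of |I| and 6 bits of |Q| -> 12-bit address

# 2-D LUT: upper half of the address is |I|, lower half is |Q|;
# the stored value is arctan(Q/I) for the first quadrant, in radians.
PHASE_LUT = {}
for i in range(1 << BITS):
    for q in range(1 << BITS):
        PHASE_LUT[(i << BITS) | q] = math.atan2(q, i)

def iq_phase(I, Q):
    """Phase of a complex sample: abs() folds into the first quadrant,
    then the signs of I and Q unfold the LUT result."""
    a = PHASE_LUT[(abs(I) << BITS) | abs(Q)]
    if I < 0:
        a = math.pi - a
    return -a if Q < 0 else a
```

A magnitude LUT would be addressed identically, and the same abs() folding works for it since magnitude is quadrant-independent.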
