# Doing Math in FPGAs, Part 1

There are a number of tricks we can use to perform mathematical operations efficiently in FPGA hardware.

For a recent project, I explored doing "real" (that is, non-integer) math on a Spartan 3 FPGA. FPGAs, by their nature, do integer math. That is, there's no floating-point unit (or even fixed-point) in there unless you "roll your own" (I think many of the newer devices have dedicated math functions in them).

This led me (temporarily) down the path of trying to convert the "real" numbers I was working with (that are fairly small) into integers (by multiplication by 10 -- basically multiplying by powers) to perform the function. Why would I want to do this? Well, I was trying to get over the pains of (large) combinational multipliers and cumbersome dividers. Of course, multiplication is just repeated addition; but it appears that my synthesis tools prefer to generate them from combinational logic. This makes for faster speed at the sacrifice of real estate (or so I suppose).

The synthesis tools, however, do not seem to do this for division. This leaves us with a potential problem, then. If I roll my own divider, it will likely do successive additions/subtractions to compute a result. This means that I would have a stand-alone unit that would need to incorporate "ready" signaling to the next higher assembly. That is, I would have to hand it the numbers to divide and then wait for the module to signal that it had completed the operation before I could continue on with my process.

But, if we can contrive our math such that we are always using powers of two, then we can use shifts, right? That is, a left shift of one bit is equivalent to a multiply by two, while a right shift of one bit is equivalent to a divide by two:

This is handy if we can keep ourselves in the binary world, but sometimes it's nice to work in the more familiar decimal world. To this end, I thought I'd share a couple of tricks I've learned over the years.

The first trick is a fast way to multiply by 10. Even these days, this is a faster way to perform a "multiply by 10" function than what your CPU will normally do when it sees X * 10. So, are you ready to do some math?

Yep -- if you perform a left-shift of one bit and add it to a left shift of three bits (2^{3} = 8), then you've got a multiply by 10. In an FPGA, this is easily done in a single clock cycle... (on an older Intel CPU, it takes three clock cycles, while a normal multiply takes five or more).

Now, dividing is not so easy, but there is a way to do it. Allegedly, this is what the Microsoft compilers do when they see a ÷10, as it's much faster than doing an actual division (which is slow, slow, slow). What I'm about to show you is all "smoke and mirrors" to me as I've never seen the derivation, but it surely seems to work. Are you ready to scratch your head a lot and then (on your own) try a bunch of examples to prove it really works?

Here's what to do... whatever your word length, fill a register with 19...9Ah. That is, if you have a 16-bit word, load a register with 199Ah (199A in hexadecimal). Similarly, if you have a 32-bit system, then fill a register with 1999999Ah. Now multiply this by your number (1 clock in the FPGA) to generate a double-word result (that is, a 32-bit result on a 16-bit system or a 64-bit result on a 32-bit system). If you throw away the lower half of the result, you'll find that the upper half is identical to dividing the original number by 10. Here's a pictorial depiction of the process:

Simple, huh? I'm sure most of you already know this, but there's always a few of us that could use an occasional reminder. I mentioned that I had started down the multiply-by-10 and divide-by-10 path to try to get around the need for "real" number representations in my project. Well, I was never quite able to "tease" that out -- it got too complex too fast. I'm sure one of you folks that's a lot smarter than myself (which is pretty much any of you, I'm sure) could have figured it out, but I'm me, and that's where we'll leave it for now (our editor, Max Maxfield, hates it when I get long-winded, which is pretty easy for me to do).

Next time I'll talk about the implementation I settled on for my project, along with some of the reasons I chose the options I did. Until then, please post any questions or comments below.

**Related posts:**

- Altera & Enpirion: Great Things Have Arrived
- Lattice Always-On Sensor Solutions for Context-Aware Mobile Devices
- Altera's Secret Processor Unveiled: a Quad-Core ARM Cortex-A53
- Handling Asynchronous Clock Groups in SDC
- Cloud-Based FPGA Verification from OneSpin & Plunify
- Altera Announces 20nm FPGA/SoC Tools
- Xilinx Announces 20-nm FPGA/SoC Devices

Author

paragdighe 12/10/2013 4:45:31 AM

Author

DrFPGA 12/9/2013 2:37:47 PM

Anyone else have tools with detailed enough docs that they can tell ahead of time what is being done?

Author

betajet 12/9/2013 2:20:02 PM

Lesson 1: Don't trust automatic state assignment. Some synthesizers make it hard to turn off automatic state assignment, but it's worth the effort.

Lesson 2: Don't trust simulation. It only simulates theoretical models, not the real world.

Lessons 1+2 combined and generalized: Know the limitations of your tools.

Author

DrFPGA 12/9/2013 11:02:27 AM

Maybe there should be a 'list' of tricks that users suggest would be good to put into synthesis tools. Perhaps via a discussion thread on Programmable Logic Design Line?

Author

Max The Magnificent 12/9/2013 10:04:24 AM

Pretty cool math...I wonder whether they teach something like that in vlsi classes...

I'd be interested to knwo that myself -- but I fear not. When I started out, everyone knew and "swapped" tips and tricks like this -- it was key to making programs run as fast as possible when you were limited in terms of memory size, clock frequency, and the fac tthat CPUs too multiple cycles to do anything.Magazines like BYTE were always publishing stuff like this... I love this stuff

Author

Max The Magnificent 12/9/2013 10:01:11 AM

There's some nice material about constant division on the companion website for "Hacker's Delight"I was going to mention that book Hacker's Delight -- it's a great book -- lots of useful tricks in it. I just now found there's a second edition (the link above) -- I have th efirst -- I'll have to add this second edition to my wish list :-)

Author

Adam-Taylor 12/9/2013 4:04:30 AM

This is then scaled by integer(2**16 * 0.1) = 6554 or 0x199A doing the multiply then works out the same as doing the divide. This trick can be used for any number when the divisor is fixed.

Author

Adam-Taylor 12/9/2013 3:47:20 AM

The more complex things like filters and CORDICs etc are covered in other articles

I find a simple example often helps to demonstrate the principles

Author

DrFPGA 12/8/2013 11:49:13 PM

Thanx for posting the link to your article. Very detailed and even had some code!

Author

tom-ii 12/8/2013 10:31:35 AM

Of course I don't mind!