Sounds a bit like Goldschmidt division: converting the factor 1/10 into X/(2^n). Multiply by X, shift N bits, done. Taking the upper 16 bits of a 32-bit word is equal to a 16-bit shift.

I suspect 1/10 doesn't translate perfectly to binary so, just like 1/3 becomes 0.33... you get a hex factor of 0x1999... in your division.

These are clever tricks, but why are you stuck on dealing with base 10, when ultimately you're implementing all the operations with shifts, adds & subtracts in base 2?

These are clever tricks, but why are you stuck on dealing with base 10, when ultimately you're implementing all the operations with shifts, adds & subtracts in base 2?

I didn't intend to imply that I was stuck with base 10. Base 10 is just natural for us humans, and there are plenty of applications that use it a lot, especially when the end result is decimal math. In this specific application, I am sending the computation engine decimal numbers, and expecting them in return. It was worth my while to run down the rabbit hole to see if there was an immediately easy way to get it done this way.

@Kris: Pretty cool math...I wonder whether they teach something like that in vlsi classes...

I'd be interested to knwo that myself -- but I fear not. When I started out, everyone knew and "swapped" tips and tricks like this -- it was key to making programs run as fast as possible when you were limited in terms of memory size, clock frequency, and the fac tthat CPUs too multiple cycles to do anything.

Magazines like BYTE were always publishing stuff like this... I love this stuff

Remember when we had to do many of the FPGA tricks by hand? Register retiming, one hot state machines, if then else to mux conversions? Now the synthesis tools do all this for us. Hopefully tricks like the multiply by a constant will get folded in too. Just depends on how common the need is for these tricks I guess.

Maybe there should be a 'list' of tricks that users suggest would be good to put into synthesis tools. Perhaps via a discussion thread on Programmable Logic Design Line?

I remember the first time I synthesized an FPGA design for a real product. The synthesizer automatically came up with a one-hot state machine coding. Everything simulated fine. Unfortunately, this particular design did not have a well-formed clock when connecting or disconnecting the input signal, which would cause it to lose the one-hot code occasionally when reconnecting. Switching to a manual state coding that automatically recovered from bad states fixed the problem -- and saved logic cells in this case.

Lesson 1: Don't trust automatic state assignment. Some synthesizers make it hard to turn off automatic state assignment, but it's worth the effort.

Lesson 2: Don't trust simulation. It only simulates theoretical models, not the real world.

Lessons 1+2 combined and generalized: Know the limitations of your tools.

Very important point and one difficult to do in practice. How many tools actually document (in detail) what the tool will do or not do. I find I need to do small example designs and look over th output of the place and route tool to figure out what is going on. Widh the vendors did this for me!

Anyone else have tools with detailed enough docs that they can tell ahead of time what is being done?

@Brian: There's some nice material about constant division on the companion website for "Hacker's Delight"

I was going to mention that book Hacker's Delight -- it's a great book -- lots of useful tricks in it. I just now found there's a second edition (the link above) -- I have th efirst -- I'll have to add this second edition to my wish list :-)

Hi, Tom: As you said, multiply is a series of additions -- but not dependent on the magnitude of the multiplier, only the number of 1 bits after making both operands positive by complementing if negative.

It is a shift and add sequence starting with the low order bit of the multiplier if it is a 1 add the multiplicand to the double wide product high order and shift right 1 into low order. if multiplier bit is zero, just shift product right one.

Repeat until higher multiplier bits are zero or shifts equal to multipl;ier width. If the high bits are all 0s, then just shift for the remaining word width.

Division was trial subtraction by subtracting the divisore fron the dividend, if the result was positive shift 1 into the quotient high bit else shift 0. Remainder is left in the reg that held the high order dividend and the quotient in the low order.

If the signs of the dividend and divisor were different, complement at the end.

This is best I remember, maybe a few details missing. The shifts amounted to *2 and /2 and the add/subtract would wind up in the appropriate power of two positions.

I think the constant used in the compiler is 1/10 so they are multiplying by the reciprical of 10 to effectively divide by 10.

The way I look at this is as a modified form of Q15 arithmetic. For starters, I can represent a fractional number as a 16 bit fixed point signed integer by the following relationship:

-32768 <--> -.5 32768 <--> .5

Thus .5 is 0x 7FFF (almost). To get .1, I divide by 5 and .1 is 0x199A. This is where the 0x199A factor comes from. When I multiply 2 Q15 numbers, I get a 32 bit result a 32-bit Q31 result. This means .25*.1 is as follows:

.25 => 0x4000 .1 => 0x199A

0x4000 * 0x199A => 0x06668000 0x666800 >> 16 is 0x0666 => which corresponds to .025

Floating-point numbers are not real numbers. Real numbers obey the associative law of addition. Floating-point numbers do not. Try adding 1 to an accumulator 10^9 times with six-digit floating point. Once the accumulator has reached 10^6, adding more ones doesn't change the accumulator, so the sum of 10^9 ones is 10^6 instead of 10^9. If you add the 1's in groups of 10, and then add those sums in groups of 10, and so on, you'll get the correct value. However, since the result depends on the order of addition, the floating-point numbers violate the associative law. Don't expect floating-point to behave like real numbers without considering these effects.

Non-negative integers, OTOH, do behave mathemically like modulo 2^n numbers so you do get the correct result modulo 2^n.

I agree with the above poster regarding using a decimal radix. Why not use base 2 like IEEE floating point or base 16 like IBM/360?

Not only do floating point numbers not obey associativity, they also lack precision. Sure, if 24 bits of precision doesn't meet your needs, you can go to double precision, but both formats are wasteful when your doing arithmetic in hardware.

Fractional fixed point (often referred to as "Q" format) is efficient -- you choose exactly the precision you need, no less and no more -- and the bookkeeping exercise of keeping track of the radix point is not a big deal.

Fractional fixed point (often referred to as "Q" format) is efficient -- you choose exactly the precision you need, no less and no more -- and the bookkeeping exercise of keeping track of the radix point is not a big deal.

This is actually what I used in my final implementation - We'll get to that in a few weeks, I suppose.... (Max willing)

The divider works because the 0x199A for a 16 bit number is the reciprocal of 10 which is 1/10 or 0.1

This is then scaled by integer(2**16 * 0.1) = 6554 or 0x199A doing the multiply then works out the same as doing the divide. This trick can be used for any number when the divisor is fixed.

David Patterson, known for his pioneering research that led to RAID, clusters and more, is part of a team at UC Berkeley that recently made its RISC-V processor architecture an open source hardware offering. We talk with Patterson and one of his colleagues behind the effort about the opportunities they see, what new kinds of designs they hope to enable and what it means for today’s commercial processor giants such as Intel, ARM and Imagination Technologies.

To save this item to your list of favorite EE Times content so you can find it later in your Profile page, click the "Save It" button next to the item.

If you found this interesting or useful, please use the links to the services below to share it with other readers. You will need a free account with each service to share an item via that service.