Sounds a bit like Goldschmidt division: converting the factor 1/10 into X/(2^n). Multiply by X, shift N bits, done. Taking the upper 16 bits of a 32-bit word is equal to a 16-bit shift.

I suspect 1/10 doesn't translate perfectly to binary so, just like 1/3 becomes 0.33... you get a hex factor of 0x1999... in your division.

Hi, Tom: As you said, multiply is a series of additions -- but not dependent on the magnitude of the multiplier, only the number of 1 bits after making both operands positive by complementing if negative.

It is a shift and add sequence starting with the low order bit of the multiplier if it is a 1 add the multiplicand to the double wide product high order and shift right 1 into low order. if multiplier bit is zero, just shift product right one.

Repeat until higher multiplier bits are zero or shifts equal to multipl;ier width. If the high bits are all 0s, then just shift for the remaining word width.

Division was trial subtraction by subtracting the divisore fron the dividend, if the result was positive shift 1 into the quotient high bit else shift 0. Remainder is left in the reg that held the high order dividend and the quotient in the low order.

If the signs of the dividend and divisor were different, complement at the end.

This is best I remember, maybe a few details missing. The shifts amounted to *2 and /2 and the add/subtract would wind up in the appropriate power of two positions.

I think the constant used in the compiler is 1/10 so they are multiplying by the reciprical of 10 to effectively divide by 10.

These are clever tricks, but why are you stuck on dealing with base 10, when ultimately you're implementing all the operations with shifts, adds & subtracts in base 2?

The way I look at this is as a modified form of Q15 arithmetic. For starters, I can represent a fractional number as a 16 bit fixed point signed integer by the following relationship:

-32768 <--> -.5 32768 <--> .5

Thus .5 is 0x 7FFF (almost). To get .1, I divide by 5 and .1 is 0x199A. This is where the 0x199A factor comes from. When I multiply 2 Q15 numbers, I get a 32 bit result a 32-bit Q31 result. This means .25*.1 is as follows:

.25 => 0x4000 .1 => 0x199A

0x4000 * 0x199A => 0x06668000 0x666800 >> 16 is 0x0666 => which corresponds to .025

Floating-point numbers are not real numbers. Real numbers obey the associative law of addition. Floating-point numbers do not. Try adding 1 to an accumulator 10^9 times with six-digit floating point. Once the accumulator has reached 10^6, adding more ones doesn't change the accumulator, so the sum of 10^9 ones is 10^6 instead of 10^9. If you add the 1's in groups of 10, and then add those sums in groups of 10, and so on, you'll get the correct value. However, since the result depends on the order of addition, the floating-point numbers violate the associative law. Don't expect floating-point to behave like real numbers without considering these effects.

Non-negative integers, OTOH, do behave mathemically like modulo 2^n numbers so you do get the correct result modulo 2^n.

I agree with the above poster regarding using a decimal radix. Why not use base 2 like IEEE floating point or base 16 like IBM/360?

Not only do floating point numbers not obey associativity, they also lack precision. Sure, if 24 bits of precision doesn't meet your needs, you can go to double precision, but both formats are wasteful when your doing arithmetic in hardware.

Fractional fixed point (often referred to as "Q" format) is efficient -- you choose exactly the precision you need, no less and no more -- and the bookkeeping exercise of keeping track of the radix point is not a big deal.

These are clever tricks, but why are you stuck on dealing with base 10, when ultimately you're implementing all the operations with shifts, adds & subtracts in base 2?

I didn't intend to imply that I was stuck with base 10. Base 10 is just natural for us humans, and there are plenty of applications that use it a lot, especially when the end result is decimal math. In this specific application, I am sending the computation engine decimal numbers, and expecting them in return. It was worth my while to run down the rabbit hole to see if there was an immediately easy way to get it done this way.

David Patterson, known for his pioneering research that led to RAID, clusters and more, is part of a team at UC Berkeley that recently made its RISC-V processor architecture an open source hardware offering. We talk with Patterson and one of his colleagues behind the effort about the opportunities they see, what new kinds of designs they hope to enable and what it means for today’s commercial processor giants such as Intel, ARM and Imagination Technologies.

To save this item to your list of favorite EE Times content so you can find it later in your Profile page, click the "Save It" button next to the item.

If you found this interesting or useful, please use the links to the services below to share it with other readers. You will need a free account with each service to share an item via that service.