# Arduino Adventures: Does 255/256 = 0 or 1?

Max Maxfield tries to wrap his brain around the problems associated with truncating the results from integer division operations.

As you may recall from my previous blogs, I recently purchased a 4x4x4 3D tri-colored LED cube kit powered by an Arduino-compatible controller. Creating different effects for this rascal is helping me reacquaint myself with the C programming language (well, the Arduino version thereof). In turn, this is helping prepare me to program my Arduino-powered robot.

I will be writing a more in-depth column on my cube in the not-so-distant future. Suffice it to say, for the moment, that I started out with some very simple test patterns, and that I've gradually increased the complexity and sophistication of my effects.

Each LED has three color channels: red, green, and blue. Each channel is controlled using pulse-width modulation. The values that can be assigned to each channel range from 0x00 to 0xFF (0 to 255 in decimal). Those two values equate to being fully off and fully on, respectively.

Up until now, my effects have involved driving the various channels hard on or hard off, thus allowing me to assign seven colors (red, green, blue, red + green, red + blue, green + blue, and red + green + blue) to each LED. Now I'm at the stage where I want to experiment with a wider gamut of colors and to transition gradually from one color to another, and therein lies my problem.

For the purpose of this discussion, let's assume that we are interested in a single color channel for a single LED, and that I have two global variables of type "int" (integer): oldColor and newColor. Each will be set initially to either 0 or 255. Let's also assume that I have a function that starts off looking a bit like this.

The idea is that, when this function is called, there are four possible value combinations for oldColor and newColor.

This explains why I'm using integer variables instead of byte (eight-bit) variables. I want to be able to employ negative values. The howMany parameter specifies how many steps we wish to use to transition from the old color value to the new color value. The parameter is also used to calculate the modColor value, which is the amount by which we wish to keep incrementing or decrementing the oldColor (starting) value until it reaches the newColor (ending) value.

Purely for the purposes of these discussions, let's assume that we pass a value of 4 into the howMany parameter when we call out the lightLEDs function. If the old and new colors start off with the same value (both 0 or both 255), then modColor will end up as 0, so nothing will change as we circle around our for loop. Of course, this is what we want in this case.

Now suppose that we call this function when the old color is 0 and the new color is 255. In this case, modColor will be calculated as 255/4 = 63.73. This will be truncated to 63, since modColor was defined as an integer. (We certainly don't want to use floating-point values. They can consume lots of memory, and performing operations on them can involve a lot of compute cycles if our processor doesn't have a floating-point unit.) In this particular case, this is close enough for government work, as they say. This also explains why I used "i <= howMany" as the control in our for loop. If I had used "i < howMany," we would have gone around our for loop four times (i = 0, 1, 2, and 3). Our tmpColor variable would have been 4x63 = 252. By performing the loop an extra cycle while capping the value at 255, we ensure that we leave the LED fully on when we exit the loop.

Now let's consider what happens when the old color is 255 and the new color is 0. Assuming howMany is 4, then modColor will be calculated as -255/4 = -63.73, which will be rounded up (or down, depending on your point of view) to -64. In this case, we really need to perform the loop only four times -- 255-(4x-64) = -1, which will be bottom capped to 0 -- not five times, as the current code would have it. Of course, since we *are* bottom capping the value at 0, the extra cycle won't hurt us, but it's a waste of computing resources, which niggles at me.

The problem is that things get worse the more steps we take. Let's assume the old color is 0, the new color is 255, and we decide to perform the transition using 128 steps. In this case, modColor will be calculated as 255/128 = 1.99, which will be truncated to 1. Even if we perform our for loop 129 times (0 to 128), we'll still end up with a final color value of 129 instead of 255. Or, as an absolute worst case, suppose the old color is 0, the new color is 255, and we perform the transition using 256 steps. Now we end up with modColor as 255/256 = 0.99, which will be truncated to 0. That means we won't change the color at all.

From the above, we know that we don't have to worry about fading the color down from 255 to 0, which involves modColor having negative values, because two's complement values automatically round (truncate) toward negative infinity. It's the positive values we have to worry about. So, how about if I modify my function to look like the following?

By Jove, I think I've solved it. As you can see, if we are fading the LED up (modColor has positive values), we start off multiplying modColor by 2 before dividing it by howMany. Next we extract the least significant bit. If this is a 1, then we know the system will round our value downward, so after we've divided by 2 (to counteract our original multiplication by 2), we add this 1 back in.

Remembering that multiplications and divisions by 2 involve shifting the two's complement value one bit left or right, respectively, this doesn't add much of a computational burden. It also means that we can use "i < howMany" as our loop control.

To see how this works, let's return to our original example (old color = 0, new color = 255, number of steps = 4). When we calculate our modColor, we start by multiplying 255 by 2 to generate 510. Next, we divide this value by 4 to generate 127.5, which will be truncated to 127, or 01111111 in binary. This is the point where we extract the least significant bit -- in this case, 1. Now we divide 127 by 2 to generate 63.5, which will be truncated to 63. Finally, we add the 1, representing the old least significant bit, and get a modColor value of 64. When we proceed around our loop for i = 0, 1, 2, and 3, our corresponding tmpColor values will be 64, 128, 129, and 256. The last one will be capped at 255.

What do you think? Have I missed anything, or have I cracked this problem? Also, can you think of a more elegant way to achieve what I'm trying to do?

**Related posts:**

- Silicon Labs' Energy-Friendly ARM Cortex-M0+ MCUs
- See Bugblat's TIF (Tiny FPGA Board) & PIF (FPGA for Raspberry Pi)
- What's the Best Robot Base for My Arduino?
- Infographic: The AVR MCU — Atmel's Secret Maker Sauce
- The Internet of Things That Go Bump in the Night
- Do Arduinos Dream of Electric Sheep?
- Be an Atmel AVR Hero & Win $1,000

Author

Garcia-Lasheras 10/24/2013 8:07:50 PM

But there are a lot of problems in using integers for this purpose too, as we cannot deal with decimal values. One of them is that the LED intensity increase/decrease is not going to be linear along the cycles -- it may reach the final value far before the for loop ends.

I propose you a bit more computational intensive solution, but I believe the LED regulation may be nicer:

void lightLEDs (int howMany,

int howLong)

{

int tmpColor = oldColor;

for (int i = 1; i <= howMany; i++)

{

if (newColor != oldColor)

{

tmpColor = 255 * i;

tmpColor = tmpColor / howMany;

if (newColor > oldColor)

tmpColor = 0 + tmpColor;

else

tmpColor = 255 - tmpColor;

}

// Use current tmpColor to drive LED

delay(howLong);

}

}

Author

salbayeng 10/25/2013 1:10:56 AM

I use the delta addition approach in Servo systems as well as LED drivers.

I call it "phase accumulation" , the idea has been around for years , embodied in DDS chips.

In your case I would set up an integer "phase accumulator" and then overlay the "led brightness" byte on the high 8 bits. You then add "ramping speed" to "phase accumulator"

All of the accumulating is done in an interrupt routine, so you can do multiple LED's , and other system timers at once , and can have long fade times. You can also interrupt a fade and fade smoothly off to something else.

--------------------------

Re the "missing bit" , yeh you get this issue with fixed point multiply (and divide) . From memory if you do it in assembler you can keep track of the missing bit, something like this: (plucked from code I wrote years ago, and the detail forgotton)

mulsu r23, r20

SBC r19, _zero

ADD r17, r0

ADC r18, r1

ADC r19, _zero

I can't remember exactly how this works , this piece is transferring the partial results from a multiply into the result. Note the

SBC r19, _zero, this seems to be pointlessly subtracting zero from a register, but its actually propagating the missing bit. Note the Atmel chips have 6 flavors of multiplyMUL Rd, RrMultiply Unsigned R1:R0 ← Rd x RrMULS Rd, RrMultiply Signed R1:R0 ← Rd x RrMULSU Rd, RrMultiply Signed with Unsigned R1:R0 ← Rd x RrFMUL Rd, RrFractional Multiply Unsigned R1:R0 ← (Rd x Rr) << 1FMULS Rd, RrFractional Multiply Signed R1:R0 ← (Rd x Rr) << 1FMULSU Rd, RrFractional Multiply Signed with Unsigned R1:R0 ← (Rd x Rr) << 1So if you go to the trouble of writing in assembler,and pick and choose the type of MUL you use, you can make 255/256=1 , It's also useful to see whether "negative zero" can appear in your algorithm and whether it is handled correctly typically -255/-256 = 1 while 255/256=0 but what about 0/-256 ??

A possible fix to your problem is to use a condition test like <= in one direction and > in the other

Author

Max The Magnificent 10/25/2013 10:02:01 AM

It's also useful to see whether "negative zero" can appear in your algorithmNow you are making my brain ache -- surely one of the main points about the twos complement form of representation is that you cannot get negative zero...

Author

Max The Magnificent 10/25/2013 9:58:34 AM

Author

Max The Magnificent 10/25/2013 10:10:12 AM

if (newColor > oldColor)

tmpColor = 0 + tmpColor;

else

tmpColor = 255 - tmpColor;

Isn't the first part of this redundant? Couldn't we just say:

if (newColor < oldColor)

tmpColor = 255 - tmpColor;

Or are we doing it your way to keep the time pretty much consistent for fade-ups and fade-downs by performing roughy the same number/type of calculations each way?

Author

Garcia-Lasheras 10/25/2013 10:19:34 AM

In addition, about the timing coherence, I was assuming that delay(howLong) dominates the fading, so I was not thinking on exactly matching the number of CPU cycles -- I just love symmetry ;-)

The code could be then:

void lightLEDs (int howMany,

int howLong)

{

int tmpColor = oldColor;

for (int i = 1; i <= howMany; i++)

{

if (newColor != oldColor)

{

tmpColor = 255 * i;

tmpColor = tmpColor / howMany;

if (newColor < oldColor)

tmpColor = 255 - tmpColor;

}

// Use current tmpColor to drive LED

delay(howLong);

}

}

Author

Max The Magnificent 10/25/2013 2:55:31 PM

Of course, I'm handling all 64 LEDs each with three color channels, but the way I'm doing it is by means of a plan so cunning we could pin a tail on it and call it a weasel :-)

Author

Garcia-Lasheras 10/25/2013 5:42:26 PM

"I cannot wait to go home and try your solution thsi evening..."Please, give me feedback once you try this -- if I can help you with other chunk of C code, just let me know ;-)

Author

Max The Magnificent 10/25/2013 5:53:17 PM

Please, give me feedback once you try thisI will do so

If I can help you with other chunk of C code, just let me know ;-)Don't worry ... you can count on it LOL

Author

Wnderer 10/25/2013 10:54:12 AM

if (modColor > 0) modColor = (modColor + 1) / howMany

else if (modColor < 0) modColor = (modColor - 1)/howMany

if modColor is either 0, 255 or -255 and howMany is always even then you end up with a couple of compares, an increment, a decrement and some bit shifts. If I recall correctly that switch statement will create a jump table, so you're better off with if statements if you're looking for compact assembly code.

Author

Max The Magnificent 10/25/2013 2:52:19 PM

I might be missing something but why not just add or subtract 1?I've not tried this yet, but I think that if I try to fade the LEDs up over 256 increments / levels then this will consume a lot of processing cycles and may be overkill (changes not discernable to the human eye). Even if I transition the color using say 16 steps with 100 milliseconds (0.1 seconds) between each step, this still equates to 1.6 seconds.

My idea of being able to specify the "howMany" increments value and the delay between incremnents will let me experiment a bit. Also, setting the "howMany" parameter to 1 should make the LED immediately turn hard on or hard off. The idea is that I can vary the pace of the display either randomly or in a controlled manner...

I'm having a lot of fun with this.

Author

Alvie 10/25/2013 3:21:18 PM

Sometimes using Fixed-Point arithmetic can help with these issues.

The basics behind FixedPoint are quite simple, and easy to implement in most architectures. Let's have a look at your concrete problem (255/256) and see how we could get a somewhat accurate output from it.

The easiest way to implement this would be to use a 10.6 fixed point, where 10 bits are used for the real part (signed number range would be between +511 and -512), and 6 bits for the "fractionary" part - since 6 bits can represent 64 different values, each bit can represent 1/64, 0.01625. This is your resolution in terms of the fractionary part.

The beauty about fixed-point is that most of the arithmetic is done with the integer representation. This includes addition/subtraction and multiplication.

In order to add two 10.6 fixed-point values, represented by a 16-bit integer, just... add them. The result will also be a 10.6 fixed point value. Same for subtraction. For multiplication, the result will not be 10.6, but rather 20.12 - 32 bits. In order to convert this multiplication back to 10.6, just shift it by 6 bits, and truncate the result (note: if you're dealing with signed numbers, you must do this in a slightly different way, in order to preserve the highest bit signal).

So, let's take a look at your case: 255/256.

First, let's represent them as fixed-point, 10.6 numbers, in binary

:

255 => 0011111111.000000

256 => 0100000000.000000

and:

1 => 0000000001.000000

In the following explanations, I'll prefix the numbers with "f" if they are representations of fixed-point values, and no prefix for their integer representations.

Now, let's invert f256. This is done by computing (f1/f256), also in fixed-point format. In order to perform this division, we use the integer representations to perform an integer division. But, similarly to the multiplication (which doubles the number of fractionary bits), the division will also change the fractionary bits - not only that, it will indeed remove them.

Let's take a look at a simple (f1/f1) division. This ought to result in f1, of course. If we take a clear look at the f1 representation above, and compute it's integer value, we get 2^6 == 64. If we divide the values using their integer representation, we get 64/64, which gives us 1. But this integer "1" does not represent "f1", it represents "0000000000.000001". So, we just lost a lot of precision (6 bits actually) by performing the division. To fix this, we shift the result by 6 bits to the left, and get the correct value: "0000000001.000000".

Now, another division. Let's divide f3 by f2, the result should be f1.5. The representation of f3 is "0000000011.000000", and f2 is "0000000010.000000". The integer representations are 192 and 128, respectively. If we perform the integer division, we get 1. If we shift this by 6-bits as in previous example, we get "0000000001.000000", which is f1 - not f1.5. So, we just lost all precision !!!

There are a few solutions to overcome this. One of them is to get a few more resolution by extending the numbers to the next representation, which in this case would be 32-bit. Basically, what we will be doing is to transform the equation:

x = a / b;

into

x = ( (a * N) / b ) / N

which is equivalent. The "N" value would be ((2^6)-1) in this case - which matches the 6-bit shift we have to perform to correct the division result - if we multiply the first argument by the same "shift amount" we have to perform at the end, we don't need to perform the final shift. We can also use a shift to perform the computation of the first operand.

So, for the case of f3/f2, we will get:

a*N (f3 times N): 0000000011.000000[000000] (integer representation: 12288)

b: [000000]0000000001.000000 (integer representation: 128)

And the result is 96 (12288 / 128). In binary form (and fixed point form) it's : [000000]0000000001.100000.

Integer part is 1, and fractionary part is '100000' in binary, 32 in decimal. If we multiply 32 by the fractionary resolution we computed in first place (0.01625), we get... 0.52, close enough to the "0.5" we need to add to the integer part.

Doing the same for your 255/256 values, we get a nicely looking value (result of (255<<6)/256):

0000000000.111111

We are very close to f1, and we can also truncate the result. But that would not be wise : the result is way more close to f1 than to f0.

So, we want to do rounding before converting that to an integer (which means shifting by 6 to the right). In order to do so, we add a '1' in the MSB of the to-be-discarded part. We are discarding 6 bits, so we have to add '100000' to the result:

0000000000.111111

+0000000000.100000

= 0000000001.011111.

Shift by 6 to the right.... gives "1".

So, here's your answer, and a quick, dirty and confuse explanation of fixed point arithmetic :)

Author

Max The Magnificent 10/25/2013 3:40:24 PM

Sometimes using Fixed-Point arithmetic can help with these issues...It all sounds so simple the way you say it LOL

Actually, you are correct in that I had completely neglected to consider using something like a 10.6 fixed-point representation. For some tasks this would be very appropriate -- but for my current application I think I'll stick with the solution from Javi (Garcia).

Author

DrFPGA 10/25/2013 4:43:48 PM

Author

Max The Magnificent 10/25/2013 5:52:15 PM

Author

josyb 10/26/2013 7:03:14 AM

To obtain the same behaviour,

rounding away from zerowhen dividing positive numbers as you get when dividing negative numbers:if ( modcolor > 0 ) modcolor = (modcolor + howmany - 1) / howmany ;else modcolor /= howmany ;

You can write this in a single line perhaps generating less code (as only one division operation is specified):modcolor = (modcolor + (modcolor > 0 ? howmany - 1 : 0)) / howmany ;

This trick can also be modified forrounding half away from zero:if (modcolor < 0) {

sign = 1 ;

modcolor = -modcolor ;

}

else sign = 0 ;

modcolor = (modcolor + (howmany / 2) - 1) / howmany ;

if (sign = 1) modcolor = -modcolor ;

If you want e.g. 1.5 to be rounded to 2 you can leave out the '-1'Author

Max The Magnificent 10/26/2013 5:58:30 PM

To obtain the same behaviour...Wonderful -- thanks for this (now I'll try to wrap my brain around it :-)

Author

pviry750 10/26/2013 11:09:21 AM

You're in effect drawing a straight line in a discontinous space, the coordinates being time and color value. The standard way to do this efficiently is the Bresenham algorithm. The idea is to incrementally compute the current error between the next integer color value (y coordinate) and the next "ideal" color value, and when this error becomes too large to update the current color value accordingly. Here is the code for the inner loop (excerpt from the wikipedia page) :

Note that there is no integer division in the loop, hence no room for rounding errors. And it's efficient: never more than a comparison and two additions at each iteration.

Author

Max The Magnificent 10/26/2013 6:01:09 PM

The standard way to do this efficiently is the Bresenham algorithm...Cool beans -- I will bounce over to peruse and ponder the Wikipedia page on this -- I'm sure I will be doing a lot more of this in the future -- for example, with regard to my current robot project.

Author

Bastian.Schick 10/27/2013 2:50:53 AM

Author

U. Dreher 10/26/2013 7:18:27 PM

The solution is really simple:

result = (dividend + (divisor/2))/divisor or

result = (dividend + (divisor>>1))/divisor

This might lead us to the next level of discussion: what if divisor is odd?

You simply have to decide whether you want to round up (x.5 => x+1) or down (x.5 => x). In one of the cases the above equations will read ((divisor+1)/2) resp. ((divisor+1)>>1).

There's always a last philosophical question or, in other words, the freedom of choice :)

Author

Max The Magnificent 10/28/2013 4:38:12 PM

hough the Bresenham algorithm might be appropriate...I know this is not complex -- but I can imagine the problems that woudl ensue trying to explain all of this to an absolute beginner :-)

Author

another nickname 10/26/2013 8:29:22 PM

Just keep 2 integers for each number numerator and denominator) and rewrite your algorithm using 2 numbers instead of one for all calcluations and do multiplications first and divisions last. It's pretty much the same as Fixed Point arithmetics .

Author

Max The Magnificent 10/28/2013 4:39:53 PM

Your truncation problem would not be there if you stick with fractions.True enough -- but the problem also goes away if we use the solution proposed by Javi (Garcia) in the early comments to this blog.

Author

Bastian.Schick 10/27/2013 2:49:14 AM

- The distance between 0 and 255 is (255-0)+1

- In the ancient times we used a modified Bresenham algorithm to fade between colours.

Cheers

Author

Max The Magnificent 10/28/2013 4:41:18 PM

In the ancient times we used a modified Bresenham algorithm to fade between colours...You make it sound as though you are talking about the last millennium .... Oh, wait a moment... LOL

Author

jpessin 10/28/2013 4:26:42 PM

http://arduino.cc/en/Reference/Map

This gives:

void lightLEDs (int howMany,

int howLong)

{

for (int i = 1; i <= howMany; i++)

{

int tmpColor = map(i, 0, howMany, oldColor, newColor);

// Use current tmpColor to drive LED

delay(howLong);

}

}

Author

Max The Magnificent 10/28/2013 4:46:34 PM

It's probably not the most efficient option computationally, but perhaps the simplest solution would be to use the Arduino language's map() function.This just goes to show that I really need to spend some time on the Arduino website looking to see what's there. I'm currently ver yhappy with the solution proposed by Javi (Garcia) earlier in these comments, but it's great to know that this (and other) functions are available.