There is a lot of talk about low power these days, and not surprisingly, given the way that handheld devices are becoming the must-have things to be seen with, to play with, and to keep our productivity high.
Power has become one of the principal attributes that designs are optimized for. We see power optimization techniques popping up at all abstraction levels of the design, from a new fabrication technology that reduces transistor leakage, through clock gating, power gating, variable voltage and frequency and everything in between.
The other day I wrote a blog that talked about the differences between accuracy and fidelity when it comes to power estimation. Accuracy can only come with detail, but detail often gets in the way of seeing the bigger picture from which we make other important decisions. When we cannot have accuracy, we need fidelity.
But even today, we do not see the power optimization button in synthesis tools. Where is the “optimize for low power” option? The problem is that when trying to minimize power consumption there are so many related factors that spread across the whole design and cannot be localized, as can most of the tradeoffs between area and timing. And we also have to realize that optimizing for total energy consumption for a particular task involves optimization across large spaces of time.
Sure, inserting a smaller driver on a device will decrease power, but this is fiddling in the mud. Power optimization starts at the system level, and we then try to fine-tune it as we go down the levels of abstraction.
So let’s take a look at power and FPGAs. FPGAs have a reputation for being power hogs, but does that mean that no battery-operated device should use an FPGA? Not so fast. While it is true that an equivalent ASIC implementation of the same functionality would consume considerably less power, that is not always an economic option, or one that gets the product out during the necessary time window.
So let’s assume that an ASIC is not viable. The choices are now to implement the functionality in an FPGA, in a DSP, in a general purpose processor or in a customized processor – or a combination of them.
If we assume that all implementations use exactly the same underlying technology (not sure that is possible, but need to simplify things a bit), then total power consumed, to perform a particular task, is proportional to the total number of transistors or gates that switch in the course of performing that action (plus the leakage power for all of those instantiated).
The active power is a combination of the total number of transistors, the number of clocks and many other factors. While a general purpose processor may be very flexible, it does not use gates efficiently for any particular task and is likely to take more clock cycles than any other solution. In addition, additional gates are taken up by the memories, buses and other functions necessary to implement the solution.
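As a rough illustration of the model described above, here is a minimal Python sketch. The function name, the parameter names and every number in it are assumptions made up for illustration, not measured data. It also assumes that every gate toggle costs the same energy and that every instantiated gate leaks the same amount, which real devices do not quite do.

```python
def task_energy(gates_switched_per_cycle, cycles, energy_per_toggle_j,
                instantiated_gates, leakage_per_gate_w, clock_hz):
    """Energy in joules to finish one task under a first-order model."""
    # Active part: each gate that switches costs a fixed amount of energy.
    active = gates_switched_per_cycle * cycles * energy_per_toggle_j
    # Leakage part: every instantiated gate leaks for as long as the task runs.
    run_time_s = cycles / clock_hz
    leakage = instantiated_gates * leakage_per_gate_w * run_time_s
    return active + leakage


# Hypothetical comparison: a flexible processor that needs many cycles
# versus a more specialized engine that needs far fewer.
flexible = task_energy(5_000, 1_000_000, 1e-12, 2_000_000, 1e-9, 100e6)
specialized = task_energy(20_000, 25_000, 1e-12, 500_000, 1e-9, 100e6)
print(f"flexible: {flexible:.3e} J, specialized: {specialized:.3e} J")
```

The point of the sketch is only that cycle count shows up in both terms, so a solution that finishes in fewer clocks saves on switching and on leakage time.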
A DSP may well be optimized for a particular type of application and thus provide a much better utilization of gates and get the job done a lot faster. In a similar manner a customized processor may take this even further providing only what is absolutely necessary for the desired task and get the job done at the fastest possible rate for a software programmable solution. The more customized the solution is, the less flexible it becomes and may not be as efficient at tasks other than the one it was optimized for.
Now, while an FPGA is going to have much less optimal power consumption per gate actually used (it is never possible to “fill” an FPGA exactly, and thus we have to bear the extra static power consumption of the unused gates), it can get the job done with far fewer clocks and fewer total real gates than any software programmable solution (yes – I know most FPGAs are programmable as well, but not in the same way as a processor). A study conducted by BDTi (www.bdti.com) at the beginning of this year (see Figure 1) showed that an FPGA solution is likely to have 40X the performance of a top of the line DSP based solution.
Figure 1: relative performance between DSP and FPGA solutions
With this much extra processing power, the DSP can either be eliminated, or if it is still required for other tasks, it can probably be replaced by a much smaller one. The same would go for a general purpose CPU being replaced by this type of solution. So while an FPGA may consume more peak power than a DSP, the FPGA could be shut off for about 97% of the time and still provide the same compute capacity assuming the FPGA were not being used for any other activities.
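The 97% figure follows directly from the 40X number: a device that is 40 times faster only needs to be active for 1/40 of the time to deliver the same throughput. A minimal sketch of that arithmetic is below. The function names and the wattages are hypothetical, and it assumes the FPGA can be switched fully off between bursts with no wake-up cost, which is optimistic.

```python
def off_fraction(speedup):
    """Fraction of time a device can stay off while matching a 1X baseline."""
    if speedup < 1:
        raise ValueError("speedup must be at least 1 to match the baseline")
    return 1.0 - 1.0 / speedup


def average_power(peak_w, idle_w, duty):
    """Time-averaged power for a device active for `duty` of the time."""
    return duty * peak_w + (1.0 - duty) * idle_w


speedup = 40                     # the BDTi figure quoted above
duty = 1.0 / speedup             # active for 2.5% of the time
print(f"off for {off_fraction(speedup):.1%} of the time")

# Hypothetical wattages, for illustration only.
fpga_avg = average_power(peak_w=8.0, idle_w=0.0, duty=duty)
dsp_avg = average_power(peak_w=1.0, idle_w=0.0, duty=1.0)
print(f"FPGA average: {fpga_avg:.2f} W, DSP average: {dsp_avg:.2f} W")
```

Under these assumptions the FPGA comes out ahead on average power as long as its peak power is less than 40 times that of the DSP. Any idle or wake-up power narrows that margin.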
Now, while I have no hard data to compare these two solutions, it would appear as if total power per unit of computation is probably less with FPGAs, for their current generations, but you have already heard some of the caveats that go along with that.
So my point is that we are unlikely to see any synthesis tool able to optimize for power in the near future, when it is so difficult just to compare these mega design alternatives, which could differ in power consumption by factors of 10 or more. It makes those 5% savings from tool XYZ look like the icing on the cake rather than the cake itself. This is why we still need good designers who can work through these kinds of problems, and why we are not all going to be replaced by software engineers who can program in C.
Brian Bailey – keeping you covered