Given the surging importance of power in today's shrinking process technologies, low-power verification is taking on a vital role. Design teams can no longer afford to worry only about dynamic power; they must pay close attention to leakage power as well, and so must implement power reduction methodologies starting at the system architecture stage. Here are some of the power reduction techniques in use today, along with the possible hazards associated with each.
Power Reduction Techniques
Most of the complexity of using these techniques comes from the need to create the necessary low-power structures virtually. Any time a design tool has to create logic based on implied functionality, there is added complexity in verifying that logic. An alternative to the virtual creation of low-power structures is to add them explicitly to the RTL, but this causes a different set of issues (see the Power Intent Specification section).
Clock gating and operand isolation are two methods for reducing dynamic power that have been in common use for some time now. Clock gating is efficient for registers that do not change values often, while operand isolation is used when combinational results are not used in the current context. Both of these methods require the addition of enable signals in the RTL, but current synthesis tools recognize these enables and automatically create the necessary structures. The verification impact is low for both methods, and the dynamic power savings can be significant. Clock gating does add latches to the design, along with the need to route the high-fanout enable signals, so verification and place-and-route tools need to deal with enable-latch cloning and decloning.
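As a rough illustration (the module and signal names here are hypothetical), enable-style RTL gives a synthesis tool the hooks it needs to infer both structures:

```verilog
// Hypothetical sketch: enable-coded RTL that synthesis tools can map
// to clock-gating and operand-isolation structures.
module mac_unit (
  input  wire        clk,
  input  wire        en,        // register enable: candidate for clock gating
  input  wire        mult_used, // operand-isolation qualifier
  input  wire [15:0] a, b,
  output reg  [31:0] acc
);
  // Operand isolation: force the multiplier inputs to zero when the
  // result is not consumed, so the multiplier does not toggle.
  wire [15:0] a_iso = mult_used ? a : 16'd0;
  wire [15:0] b_iso = mult_used ? b : 16'd0;

  // Enabled register: synthesis can replace the enable mux with an
  // integrated clock-gating (latch plus AND) cell on acc's clock.
  always @(posedge clk)
    if (en)
      acc <= acc + a_iso * b_iso;
endmodule
```

Functionally, the gated and ungated versions are equivalent, which is why the verification impact stays low; the hazards are structural (the inferred latch and the high-fanout enable net), not behavioral.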
Synthesis optimization is used to reduce dynamic power, but may or may not be effective, depending on the synthesis tool used. If your synthesis tool cannot perform concurrent timing, area, and power optimizations, then by the time power optimization runs, a structure has already been created that negates most of its benefits. Optimizations such as gate sizing and merging, pin swapping, buffer optimization, and instance reduction have often been applied only after the basic structure is synthesized for performance or area; at that point, it is too late to capture significant power savings. The impact on verification is very low, and there is no need to add any virtual structures.
Multi-voltage threshold (MVT) synthesis is used to reduce leakage current: the synthesis tool is allowed to replace low-threshold (fast but leaky) cells with high-threshold (slower but low-leakage) cells on non-critical paths. This method is effective in reducing leakage power but, again, depends on the quality of your synthesis tool. If your synthesis tool does not use global, concurrent synthesis, the structure already in place has more critical paths, leaving less room for high-threshold cells. There is no impact on verification and no need to virtually create logic.
Substrate biasing is a technique used to reduce sub-threshold leakage current by decreasing the performance of functional blocks when speed is not critical. Its effectiveness is diminished below the 90-nm process node, since gate tunneling leakage becomes more of a factor there. There is no verification impact, but you do pay a high implementation cost because of the extra power routing required.
The use of multiple supply voltages (MSV) allows for reduced dynamic power on non-critical functionality. Using multiple voltages can provide substantial power savings but adds a high degree of difficulty to the design process. A large amount of "what-if" analysis is necessary to make sure lower-voltage blocks will still meet timing and area requirements. This is especially difficult with a bottom-up synthesis flow, where the actual timing is not known until all the blocks are assembled for chip-level analysis, and it can result in iterations to solve timing, area, and power problems. The synthesis tool needs to be aware of power domains when doing logical, physical, and clock-tree optimizations.
There is also the need for level-shifter insertion on all signals crossing voltage domains. A synthesis tool that allows top-down compiles and multi-library support can greatly reduce the number of iterations. Place-and-route (P&R) becomes much more difficult as well: all power domains need to remain separate when considering power-rail, clock, and signal routing, and signal integrity can become a huge issue if signals from one power domain are routed close to signals from another. Test insertion needs to be power-domain-aware as well. The test insertion tool may need to reorder scan chains based on power domain crossings, insert level shifters in the scan chain, and create special test modes to account for high switching activity on the scan chains. Automatic test pattern generation (ATPG) may need to reduce the switching caused by random auto-fill of the don't-care bits. MSV is considered a more advanced method because it has a high impact on architectural analysis, synthesis, P&R, test, and verification, and it does require the virtual insertion of level shifters.
The use of power shut-off (PSO) is the most effective way to save both dynamic and leakage power; the penalty is much higher verification and implementation cost. The system architect needs to determine which portions of the design can actually be shut off, and how the state of a shut-off block will be maintained. Is there enough time to scan the register contents out to a memory and scan them back in? Can the state simply be reset? Are state retention power gates (SRPGs) needed? Will outputs of the shut-off block need to be isolated? And then, how is this all verified? How will always-on power be routed? How many power switches will be necessary for shut-off, and where will they be placed? Will you now need to route high-fanout nets for power shut-off control? Will the chip be tested with all power domains on at the same time? If not, you may need to implement special scan-chain bypass circuitry and controllers. Will this cause power problems on the tester? PSO affects every design, implementation, and verification step. And of course the isolation cells, power switches, scan-chain bypass logic, and SRPGs all need to be inserted virtually.
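To make the isolation requirement concrete, here is a conceptual sketch (cell and signal names are hypothetical) of what a clamp-low isolation cell that a low-power flow virtually inserts on each shut-off output behaves like:

```verilog
// Conceptual behavior of a clamp-low isolation cell. A real flow
// inserts a library cell with this function virtually; it is not
// coded in the design RTL.
module iso_clamp_low (
  input  wire iso_en, // asserted before power-down, released after power-up
  input  wire d,      // output driven from inside the shut-off domain
  output wire q       // safe, isolated value seen by always-on logic
);
  // While the domain is off, d floats to an unknown value; the clamp
  // guarantees downstream always-on logic sees a known 0.
  assign q = iso_en ? 1'b0 : d;
endmodule
```

Verifying PSO means checking, among much else, that iso_en-style controls are asserted before power is removed and released only after power is restored; getting that sequencing wrong lets unknown values leak into always-on logic.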
Dynamic voltage and frequency scaling (DVFS) can save substantial dynamic and leakage power, but comes with added implementation and verification complexity. The architects need to determine which functions can run at reduced speed, and whether the power savings justify the latency of switching from one voltage/frequency setting to another. During implementation, multi-mode, multi-corner analysis needs to be run to ensure performance is met at all voltage levels. How will the clocks be synchronized? What will control the voltage/frequency mode switching? How will that controller be verified? Using DVFS in a design necessitates the virtual insertion of level shifters.
Adaptive voltage scaling (AVS) reduces the supply voltage by monitoring process quality and temperature: monitors on the die feed performance data back to a controller, and if the temperature or process quality is good, the voltage can be reduced with no impact on performance. This can affect die area and verification, but it reduces dynamic as well as leakage power. Like DVFS, it may require virtual level-shifter insertion.
Power Intent Specification
Once the decision has been made to use low-power design methods, this intent has to be specified somehow. Early approaches embedded the power intent in the RTL, design constraint (SDC), and library (Liberty) source files. This translated into having many copies of the same functional code to express different power requirements, and it necessitated complete functional regression runs simply because an isolation value was found to be incorrect. Fortunately, led by the Common Power Format (CPF) and later the Unified Power Format (UPF), power intent specification has been stripped completely out of the RTL. Given my familiarity with CPF, I will speak in terms of CPF.
The idea behind CPF is to let the user completely specify low-power intent in a human-readable, Tcl-based format and, more importantly, to keep the power requirements separate from the functional (RTL) requirements. Separate point tools then interpret the CPF commands and translate them into their own native commands. This means the user defines low-power intent in one place instead of many tool-dependent places. With the advent of these power-intent formats, the need to verify that intent has become vital.
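As a rough sketch of what such a file looks like (design, instance, and signal names are hypothetical, and exact command options vary across CPF versions and tools), a CPF description of a design with one always-on domain and one shut-off domain might read:

```tcl
# Hypothetical CPF sketch: an always-on default domain plus a
# switchable domain with isolation and state retention.
set_cpf_version 1.1
set_design top

# Power domains: PD_core is always on; PD_au (a hypothetical audio
# unit) is shut off when the power controller asserts power_down.
create_power_domain -name PD_core -default
create_power_domain -name PD_au -instances {u_audio} \
    -shutoff_condition {u_pmu/power_down}

# Operating voltages and the legal domain/voltage combinations.
create_nominal_condition -name on  -voltage 1.2
create_nominal_condition -name off -voltage 0.0
create_power_mode -name PM_run   -domain_conditions {PD_core@on PD_au@on} -default
create_power_mode -name PM_sleep -domain_conditions {PD_core@on PD_au@off}

# Clamp PD_au outputs low while it is powered down.
create_isolation_rule -name iso_au -from PD_au \
    -isolation_condition {u_pmu/iso_enable} -isolation_output low

# Retain PD_au register state across shut-off via SRPG cells.
create_state_retention_rule -name ret_au -domain PD_au \
    -restore_edge {!u_pmu/power_down}

end_design
```

Because this intent lives outside the RTL, a change such as flipping the iso_au clamp value from low to high touches only this file; the functional source is untouched, which is exactly the separation that makes full functional regressions for power-intent fixes unnecessary.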