Ensuring low-power design success at 65 nm
Tom Chau and Cheng Shi
7/17/2006 9:00 AM EDT
The holy grail of power management is to achieve the lowest power, best timing and smallest area while delivering a billion-dollar product to market first. This is especially true for wireless, mobile and consumer ICs at the 90- and 65-nanometer nodes, where the biggest design challenge is to deliver power-efficient devices without sacrificing performance. So what combination of low-power techniques should be used, and how can the engineer minimize trade-offs to implement high-performance and power-savvy designs?
Following are some tips, compiled from lessons learned when working with advanced low-power designs at 90 nm and 65 nm, to help ensure low-power design success.
Do
• Partition multivoltage design at the register-transfer level by logic hierarchy and domain. Consider clocks as separate synchronous domains. "Code in" module-level isolation cells and level shifters to ensure successful formal verification and simulation of the power control sequence. If physical-block hardening is desired, provide design constraints with the necessary access to the block ports.
• Implement clock gating during RTL synthesis to reduce dynamic power and area, not just on logic but on the clock tree.
• Place level shifters close to the voltage area boundary, to simplify routing of the secondary power rail and ensure that all level shifters transition simultaneously when there is a voltage crossover.
• Cluster and spread registers intelligently during placement. Clumping registers saves power; spreading them can enhance performance.
• Optimize for leakage early using multithreshold libraries. Consider leakage power during RTL synthesis.
• Power down an active block with all techniques at your disposal: isolation cells, state-retention registers, power-gating cells, multithreshold (MT) CMOS cells and "always on" buffers.
• Power up a sleeping block in phases. A sudden jump-start can cause rush currents and compromise reliability.
• Understand the trade-offs of fine- and coarse-grain MT CMOS. Fine-grain turns on and off faster; coarse-grain can enable more leakage reduction.
• Make one corner the most timing-critical.
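The advice to power up a sleeping block in phases can be illustrated with a toy model. The sketch below is not from the authors: it assumes an idealized RC picture in which the block is a single capacitance charged through identical, purely resistive sleep switches, and every name and number in it (peak_rush_current, the 100-ohm switches, the 10 nF block) is hypothetical. Real rush current also depends on switch nonlinearity, package inductance and the power grid, so treat it only as a way to see why staging helps.

```python
def peak_rush_current(vdd, r_switch, c_block, schedule, dt=1e-11, t_end=2e-7):
    """Forward-Euler simulation of a gated block's virtual rail charging from 0 V.

    schedule(t, v) returns how many identical switches are on at time t
    when the virtual rail sits at voltage v. Returns (peak current, final v).
    """
    v = 0.0
    peak = 0.0
    for step in range(int(t_end / dt)):
        t = step * dt
        k = schedule(t, v)
        # Parallel resistive switches: current scales with the number turned on.
        i = (vdd - v) * k / r_switch
        peak = max(peak, i)
        v += i * dt / c_block
    return peak, v


# Hypothetical values chosen only for illustration.
VDD = 1.0          # volts
R_SWITCH = 100.0   # ohms per sleep switch
C_BLOCK = 10e-9    # farads of block capacitance
N_SWITCHES = 1000


def all_at_once(t, v):
    # Sudden jump-start: every switch turns on at t = 0.
    return N_SWITCHES


def phased(t, v):
    # Phase 1: a tenth of the switches trickle-charge the rail.
    # Phase 2: the rest turn on once the rail is within 10% of VDD.
    return N_SWITCHES // 10 if v < 0.9 * VDD else N_SWITCHES


peak_all, v_all = peak_rush_current(VDD, R_SWITCH, C_BLOCK, all_at_once)
peak_phased, v_phased = peak_rush_current(VDD, R_SWITCH, C_BLOCK, phased)

print(f"all at once: peak {peak_all:.2f} A, rail ends at {v_all:.3f} V")
print(f"phased:      peak {peak_phased:.2f} A, rail ends at {v_phased:.3f} V")
```

In this model the phased schedule cuts the peak current by roughly the ratio of switches enabled in the first phase, at the cost of a longer wake-up time. That is the same latency-versus-rush-current trade-off the tip describes.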
Don't
• Overdo clock gating. Optimal low-power clock tree implementation must consider timing, skew, insertion delay, area and congestion, as well as dynamic and leakage power. Designers who focus on only one or two of these metrics can end up with a design that is worse overall.
• Forget about design-for-test voltage-aware scan insertion. In this DFT process, scan chains are not allowed to cross between voltage islands. Adaptive voltage scan techniques can also minimize the routing congestion and number of level shifters.
• Underestimate signoff timing, crosstalk and power analysis. The worst corner for power is often different from the worst (or best) corner for timing, and power consumption can vary widely in different operational modes. That is why the signoff flow needs to be augmented using multimode/multicorner techniques that analyze all modes simultaneously using many independent corners.
• Go overboard when using high-threshold cells during leakage optimization. If you set the timing-critical range too tightly in physical synthesis, many timing-noncritical paths inevitably become timing-critical because of process variation. In addition, from the perspective of automatic test pattern generation (ATPG), more timing-critical paths warrant more at-speed test patterns, thereby increasing test costs.
• Get "retention-flop happy." Retention registers (also known as balloon flops) have a flip-flop and a retaining latch built into a single library cell, which means a retention register is usually larger than a standard register. Synthesis tools must therefore choose which registers to replace selectively with retention registers, to provide the best leakage savings without compromising area or the routability of the sleep rail and sleep signals.
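The caution about high-threshold cells can be made concrete with a small sketch. It is an illustrative assumption, not how any particular synthesis tool works: each path is reduced to a slack, a delay penalty for swapping its cells to high-threshold versions and a leakage saving, and a swap is accepted only if a guard-band margin of slack remains. The Path class, the numbers and the 20 ps "near-critical" threshold are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Path:
    name: str
    slack_ps: float        # slack with the original low-threshold cells
    hvt_penalty_ps: float  # extra delay if swapped to high-threshold cells
    leak_saving_uw: float  # leakage saved by the swap


def summarize(paths, margin_ps, near_critical_ps=20.0):
    """Swap a path only if margin_ps of slack remains afterward.

    Returns (total leakage saved, number of paths left near-critical).
    """
    saving = 0.0
    near_critical = 0
    for p in paths:
        slack = p.slack_ps
        if p.slack_ps - p.hvt_penalty_ps >= margin_ps:
            slack -= p.hvt_penalty_ps
            saving += p.leak_saving_uw
        # Paths with little slack left are the ones process variation
        # can push into failing, and the ones that need at-speed tests.
        if slack < near_critical_ps:
            near_critical += 1
    return saving, near_critical


# Hypothetical path data, for illustration only.
paths = [
    Path("a", 10.0, 30.0, 5.0),
    Path("b", 45.0, 40.0, 8.0),
    Path("c", 120.0, 50.0, 12.0),
    Path("d", 60.0, 45.0, 9.0),
    Path("e", 300.0, 60.0, 20.0),
]

for margin in (0.0, 20.0):
    saved, risky = summarize(paths, margin)
    print(f"margin {margin:4.0f} ps: saves {saved:.0f} uW, "
          f"{risky} near-critical paths")
```

With these made-up numbers, swapping with no guard band saves the most leakage but leaves three of five paths near-critical, while a 20 ps margin gives up some savings and leaves only one. The right margin in a real flow depends on the process variation and the test budget.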
By Tom Chau, group director of corporate application engineering for RTL-to-GDS flow and methodology, and Cheng Shi, director of corporate application engineering for low-power and reliability products, Synopsys Inc.



