Successive advances in CMOS technology scaling (to 35 nanometers by 2008) and in packaging technology are increasing the number of functions, gates and I/O interfaces packed on a die. Shrinking device dimensions, coupled with the use of low-dielectric-constant materials, is accelerating performance gains. Supply voltage is scaling down as well. Everything looks great, until you look at the power picture and at the signal integrity challenges that these advances pose.
Industry estimates indicate that power demand for a typical system is growing by 25 percent to 30 percent annually, while the energy storage capacity of batteries is increasing at only 10 percent to 15 percent. To achieve longer battery life and to avoid the spiraling costs of cooling systems and of expensive packaging, power budgeting and power management are a must.
A breakdown of the total power consumption of a leading industry laptop indicates that 21 percent of the total power is consumed in digital circuitry, while the display, drive and LAN account for 36, 18 and 18 percent, respectively. Those figures indicate that even if we save 50 percent on IC power consumption, the total power savings is capped at 10 percent. It's therefore clear that if power is to be managed effectively at a system level, a top-down approach must be implemented that starts with software at the operating-system level and that includes power optimization at the behavioral, RTL, logic, circuit and library element levels.
This article will focus mainly on the technology and hardware design aspects of power management. The total power consumption can be expressed as the sum of four terms: dynamic switching power, short-circuit switching power, static power and leakage power.
Static power is a consequence of circuit design biasing techniques and is not technology-dependent, and short-circuit switching power is, in reality, part of dynamic power. Thus we will focus on the dynamic and leakage power terms.
The dynamic power term is defined as K × C × F × V². It is quadratic in voltage and linear in all the remaining terms. The quadratic dependence of power on voltage indicates that scaling down the operating voltage is the factor with the highest payoff in power savings. And it can be accomplished, to a certain extent, without sacrificing performance, by using parallel processing to make up for the performance lost at the reduced voltage.
However, there are two limitations to that approach. One is noise margins; the other is the point at which the overall power cost of parallel processing exceeds the benefits of scaling down Vdd. The idea behind parallel processing is to minimize the power × delay product.
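The voltage-scaling trade-off described above can be sketched numerically. The activity factor, capacitance, frequency and voltage values below are illustrative assumptions, not measured data:

```python
# Illustrative sketch of the dynamic-power equation P = K * C * F * V^2
# and of voltage scaling with parallelism. All parameter values are assumed.

def dynamic_power(k, c, f, v):
    """Dynamic power in watts: activity factor k, switched capacitance c (F),
    clock frequency f (Hz), supply voltage v (V)."""
    return k * c * f * v ** 2

# Baseline: one block at full voltage and frequency.
base = dynamic_power(k=0.15, c=600e-12, f=400e6, v=1.8)

# Scale Vdd (and, proportionally, the achievable frequency) down by half,
# then duplicate the block to recover the lost throughput.
half_v = dynamic_power(k=0.15, c=600e-12, f=200e6, v=0.9)
parallel = 2 * half_v  # two copies, same total throughput

print(f"baseline: {base * 1e3:.1f} mW, "
      f"2-way parallel at Vdd/2: {parallel * 1e3:.1f} mW")
# Halving Vdd cuts power 4x; even after doubling the hardware, total power
# drops by half at equal throughput -- minimizing the power x delay product.
```

The area cost of the duplicated block, and the shrinking noise margins at the lower supply, are the limits the article notes.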
The interconnect dielectric constant (k) has been scaling down from over 3 in 0.25-micron technology to 2.2 in 0.1-micron technology, with a target of 1.5 at 35 nm. Also, with shallower junctions, source and drain capacitances have been dropping. Narrower metal pitches and lower k translate to smaller metal-line area and the appearance of fringing capacitance; but tighter metal pitches also bring two adverse factors: higher coupling (crosstalk) capacitance and higher total capacitance per square millimeter. It is estimated that the total switched capacitance will grow from 600 pF/mm² in 0.18-micron technology to more than 2,000 pF/mm² in 35-nm technology. It is also worth remembering that while device dimensions are shrinking, die size is still growing as more functions and more memory are crammed onto the die.
With every technology step, channel electric-field saturation occurs at lower values of Vdd. Thus Vdd continues to scale lower, since there is no performance gain in operating at voltages above saturation (power is simply wasted). For performance (clock frequencies) to continue improving while Vdd is scaled down, another parameter, the threshold voltage (Vt), is scaled down as well. That enables the gains in performance but increases the leakage of the device at the rate of a full order of magnitude per 80-mV decrease in Vt.
Thus, for a decrease in Vt from 0.62 V to 0.18 V, the corresponding leakage of an N device increases from 0.01 pA/micron to 100 pA/micron, an increase of four orders of magnitude. And leakage power grows from being negligible compared with dynamic power to being responsible for roughly 50 percent of the total power consumption of a chip at 35 nm.
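The order-of-magnitude-per-80-mV rule can be sketched as a simple exponential. The function below takes the article's reference point (0.01 pA/micron at Vt = 0.62 V) as its defaults; the swept Vt values are illustrative only, not device data:

```python
# Subthreshold leakage growth as Vt is lowered, using the rule of thumb of
# one decade of leakage per 80 mV of Vt reduction. Defaults mirror the
# article's reference device; this is an illustration, not a device model.

def leakage_pa_per_um(vt, vt_ref=0.62, i_ref=0.01, mv_per_decade=80.0):
    """Leakage (pA/micron) at threshold voltage vt (V), given a reference
    device leaking i_ref pA/micron at vt_ref volts."""
    decades = (vt_ref - vt) * 1000.0 / mv_per_decade
    return i_ref * 10 ** decades

# Each 80-mV step down in Vt multiplies leakage by 10.
for vt in (0.62, 0.54, 0.46):
    print(f"Vt = {vt:.2f} V -> {leakage_pa_per_um(vt):.3g} pA/um")
```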
In a typical chip, a handful of critical paths and functions need the high performance made possible by the low-Vt devices, while the performance of the rest of the circuits could be achieved with higher-Vt devices that exhibit much lower static leakage. Technologies supporting multiple Vt on the same die are becoming common practice at 0.13 micron and below.
Total IC power consumption is growing roughly 15 percent per technology node. That increase by itself doesn't seem too bad, until we examine local power densities. Basic calculations indicate that local power densities of critical functions will increase by a factor of 4 to 7. Without careful attention to every minute detail in such blocks and without using every possible effort to reduce that localized power consumption, it's only a matter of time before metal migration as well as device degradation and ultimate failure occur. Thus, such techniques as parallel processing at a lowered voltage (and at an area cost) cannot be overlooked.
Clock Power Consumption
Before addressing a series of potential solutions that can help with the overall power picture, we would like to single out one circuit element that will always be a great power consumer: the clock network. Clock network power consumption can account for as much as 40 percent of the total power consumption of a chip. Clock network management and design, including gating at the block level, enabled flip-flops, clock on demand and local clock networks, continue to gain in importance.
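A first-order model shows why block-level clock gating pays off: the clock toggles every cycle, so any portion of the tree that can be stopped saves its full switching power. The capacitance, frequency and gated fraction below are assumptions for illustration:

```python
# First-order model of clock-network power with and without block-level
# gating. All parameter values are illustrative assumptions.

def clock_power(c_clock, f, vdd, active_fraction=1.0):
    """Clock-tree dynamic power (W): the clock has an activity factor of 1,
    but gating stops the portion of the tree feeding idle blocks."""
    return active_fraction * c_clock * f * vdd ** 2

c_clock = 2e-9       # total switched clock capacitance in farads (assumed)
f, vdd = 300e6, 1.5  # clock frequency and supply (assumed)

ungated = clock_power(c_clock, f, vdd)
gated = clock_power(c_clock, f, vdd, active_fraction=0.4)  # 60% gated off

print(f"ungated: {ungated:.2f} W, gated: {gated:.2f} W "
      f"(saves {100 * (1 - gated / ungated):.0f}%)")
```

With 40 percent of the tree active on average, the clock network's power drops by 60 percent, which is why gating is applied wherever the usage pattern allows it.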
We have already touched on several techniques that will help with the power picture in CMOS technology. Reducing the supply voltage below the nominal value for a technology node remains the single most effective solution, given the quadratic relationship between dynamic power and supply voltage. This need not happen at the expense of performance. It can be done at the level of the whole chip, where parallel processing will compensate for the lost performance. The technique can be carried out until parallel processing is no longer cost-effective, or until most noncritical paths approach criticality or noise margins get too close for comfort.
A variation on the same theme is to have "voltage islands" where noncritical blocks are operated at a lower voltage than the rest of the chip. But that approach brings the added cost of the extra power supply as well as the effort of routing two power grids inside the chip.
Reduced swing circuit techniques exploit the same theme of reduced voltage as the means of reducing power, albeit by reducing the voltage level to which internal capacitances are charged rather than by reducing the supply voltage. The literature is full of such techniques, but again they all come at an added cost in area, noise margins, additional reference voltages and routing effort. Reduced swing need not result in reduced performance; on-chip differential signaling is but one example.
Another circuit technique that has been around for some time is the generation of body bias to lower the Vt when needed to improve performance and to bring it back to its high nominal value (rendering leakage insignificant) when the circuit is idle.
Smart chip architecture in general (including power grid design and clock network design), and smart memory architecture in particular, can be a significant contributor to chip power management. The total size of on-chip memory is growing with every technology node. Memory accesses are expensive. Optimizing local memories to minimize misses is one way of saving power.
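The energy argument for minimizing misses can be sketched as follows. The per-access energy figures are rough illustrative assumptions (a local SRAM hit versus a far more expensive off-chip access), not measurements:

```python
# Why minimizing misses in local memories saves power: a miss that goes
# off-chip costs far more energy than a local hit. Energy figures are
# rough illustrative assumptions.

E_HIT_NJ = 0.5     # energy per on-chip SRAM access in nJ (assumed)
E_MISS_NJ = 50.0   # extra energy per off-chip access on a miss in nJ (assumed)

def memory_energy_nj(accesses, miss_rate):
    """Total memory energy (nJ) for a run of accesses at a given miss rate."""
    hits = accesses * (1.0 - miss_rate)
    misses = accesses * miss_rate
    return hits * E_HIT_NJ + misses * (E_HIT_NJ + E_MISS_NJ)

accesses = 1_000_000
for miss_rate in (0.10, 0.02):
    mj = memory_energy_nj(accesses, miss_rate) / 1e6
    print(f"miss rate {miss_rate:.0%}: {mj:.2f} mJ")
```

Under these assumptions, cutting the miss rate from 10 percent to 2 percent cuts memory energy by more than a factor of 3, because the misses dominate the budget.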
Advances in silicon-on-insulator (SOI) technology hold the promise of reducing power consumption by 10 percent to 15 percent compared with CMOS of equivalent performance. Another way of stating it is that SOI promises to buy us one generation of performance at each technology node (e.g., 0.18-micron SOI will have the power/performance of 0.13-micron CMOS). SOI is similar to CMOS; the main difference is that the devices are built into a thin film of silicon isolated from the traditional substrate by a film of silicon oxide. That leaves the device body floating, resulting in lower junction capacitances and thus yielding the performance advantage.
SOI comes in two styles: partially depleted (the more common) and fully depleted. A few challenges result from the floating body: The main ones are the history effect (i.e., switching history affects future behavior) and the difficulty of modeling the devices properly.
SOI has ardent proponents, who see great potential in it, and detractors, who see it as an expensive technology with diminishing advantages over CMOS as we scale past 0.1 micron. While it is certain that SOI will find many uses, only time will tell how broad its usage will become.
In addition to the pure power-saving aspect of managing IC power, there are compelling reliability issues. The most important among them are product lifetime (mean time between failures, or MTBF) and electromigration. MTBF is inversely proportional to power dissipation, which is especially important in light of the local power densities discussed earlier. Another related factor is the design of the power grid to ensure proper reliability and proper performance (IR drop and switching L × di/dt).
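The two supply-noise terms just mentioned combine in a simple first-order model. The current, grid resistance, inductance and switching step below are assumed values for illustration only:

```python
# Sketch of the two supply-noise terms: resistive IR drop and inductive
# L*di/dt drop on the power grid. All parameter values are assumed.

def supply_droop(i, r, l, di, dt):
    """Total voltage droop (V) for current i (A) through grid resistance r
    (ohms), plus inductance l (H) seen by a current step di (A) over dt (s)."""
    return i * r + l * (di / dt)

# A block drawing 2 A through 25 mOhm of grid resistance, with 0.5 nH of
# effective supply inductance and a 1-A current step over 5 ns:
droop = supply_droop(i=2.0, r=0.025, l=0.5e-9, di=1.0, dt=5e-9)
print(f"supply droop: {droop * 1e3:.0f} mV")  # 50 mV IR + 100 mV L*di/dt
```

Note that the inductive term depends on how fast the current changes, not on its steady-state value, which is why fast simultaneous switching stresses the grid even at modest average power.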
Before discussing operating-system and software techniques for managing power, we need to mention power modeling and power management tools. Power estimation must be established early in the design cycle. The higher levels of abstraction afford higher degrees of freedom in design and power optimization. Fully leveraging that freedom requires reasonably accurate power models at every level. That is not the case today.
Power characterization at the library level is common practice now. It is accurate enough for gate-level power optimization, but it certainly does not capture the available power savings of transformations at higher levels.
Also, growing design complexity calls for the automation of power analysis. That includes measuring peak power, average power, metal migration and power-related signal integrity aspects. Many EDA tools address those needs, and designers continue to use those tools as part of their flows.
A large part of the power budget can be consumed by functions outside of digital logic. The power management of displays and disk drives is necessarily an operating-system function. When not in use, displays can be turned off, and disks can be spun down.
Even in embedded systems without such subsystems, power management is increasingly mandatory. Managing power effectively in such environments requires detailed knowledge of the usage modes. That way, a coherent set of power-down states can be crafted that can provide large power savings without affecting performance and without requiring the user to pick from a confusing array of "nap, doze and sleep" modes.
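Crafting such a set of power-down states amounts to budgeting average power over the known usage profile. The states, power levels and residency fractions below are assumptions for illustration:

```python
# Budgeting a coherent set of power states for an embedded part against a
# known usage profile. States, power levels and residencies are assumed.

STATES = {             # average power in each state, mW (assumed)
    "run":   450.0,
    "idle":  120.0,    # clocks gated, logic still powered
    "sleep":   2.0,    # most of the chip powered down
}

def average_power_mw(residency):
    """Average power (mW) given the fraction of time spent in each state."""
    assert abs(sum(residency.values()) - 1.0) < 1e-9, "fractions must sum to 1"
    return sum(STATES[s] * frac for s, frac in residency.items())

# Usage profile known from the application: short bursts of work, long idles.
profile = {"run": 0.10, "idle": 0.30, "sleep": 0.60}
print(f"average power: {average_power_mw(profile):.1f} mW")
```

The payoff of knowing the usage modes is visible here: with 60 percent residency in sleep, average power is a small fraction of run power, without the user ever choosing a mode by hand.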
Power sign-off is fast becoming an integral part of the design flow. The higher the abstraction level, the greater the degrees of freedom, and the less accurate the available power models. That calls for more research on power modeling at the behavioral and RTL levels. Power optimization should be carried out at every stage of a design.
R&D director Jamil Kawa, who holds MScEE and MBA degrees, has major interests in circuit design and modeling. Vice president Don MacMillen, who holds a PhD degree, leads 30 EDA researchers in the Advanced Technology Group of Synopsys Inc. (Mountain View, Calif.).
© 2001 CMP Media LLC.
12/1/01, Issue # 13150, page 32.