Meeting today’s tough power specifications involves advanced low-power design techniques that go well beyond clock gating or other power optimizations automated in synthesis and implementation tools. These advanced techniques require architectural decisions that must be made early in the design flow, introducing two big areas of uncertainty that the designer needs to overcome:
- How can I get predictability early in the flow to decide which low-power techniques will actually meet the power specification?
- Low-power techniques introduce a new level of complexity, so how can I implement those techniques and verify that I did them correctly, ensuring no bug escapes?
This design article uses a case study to illustrate how these twin uncertainties can be overcome. The authors explain how the design is split into power domains, and they show how to apply the following advanced low-power techniques in turn: multi-supply voltage, power shutoff with state retention, and multiple-threshold-voltage optimization. The article discusses the resultant dynamic and leakage power numbers estimated at various stages of the design flow, and it shows the importance of realistic activity vectors for accurate power estimation. The article also explains how the techniques are specified in a power intent file, and how that file is used, together with the design, by power-aware simulation and formal verification tools to verify the correct implementation of the power architecture.
The need for predictability
Only a few years ago, optimizing digital designs for low power usually meant no more than clock gating (controlling dynamic power by disabling the clock to parts of the circuitry that could be idle) and choosing an operating frequency no faster than was needed to meet performance. Then came processes and library cells with multiple threshold voltages (MVt): a cell with a low threshold voltage switches faster but uses more power, while a cell with a higher threshold voltage uses less power at the expense of performance. These MVt cells helped with both dynamic and leakage power reduction.
As we progressed down successive process nodes, we were able to increase frequency (which is good for performance but not for power) while reducing supply voltage (since dynamic switching energy is proportional to the square of the supply voltage, this made up for the increase in frequency). We could also use processes and libraries with multiple supply voltages (MSV), choosing different levels as dictated by the balance of performance needs versus power. Beyond the 65nm node, leakage started to become a major issue, and the only sure way to deal with leakage is to turn the circuit off when not in use. Hence power shutoff (PSO), also known as power gating (with or without state retention), has become a common low-power technique in today’s designs.
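The voltage and frequency trade-off described above can be sketched with the standard first-order switching-power relation, P = a · C · Vdd² · f. This is a minimal illustration, not the estimation method used by any tool in this article; the function name and all numeric values below are hypothetical and chosen only to show the scaling.

```python
def dynamic_power(activity, capacitance_f, vdd_v, freq_hz):
    """First-order switching power in watts: activity * C * Vdd^2 * f."""
    return activity * capacitance_f * vdd_v ** 2 * freq_hz


# Hypothetical baseline operating point (illustrative values only).
base = dynamic_power(activity=0.2, capacitance_f=1e-9, vdd_v=1.2, freq_hz=100e6)

# Raise the frequency by 50% while lowering the supply from 1.2 V to 0.9 V.
scaled = dynamic_power(activity=0.2, capacitance_f=1e-9, vdd_v=0.9, freq_hz=150e6)

# Ratio below 1.0 means the voltage reduction more than offset the faster clock.
ratio = scaled / base
print(f"power ratio after scaling: {ratio:.3f}")
```

Because the supply voltage enters as a square, a 25% voltage reduction outweighs a 50% frequency increase in this model. Leakage is not captured here, which is why it needs separate treatment at smaller nodes.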
While clock gating and MVt are optimizations that can be applied to reduce power after the RTL design and during the implementation process, MSV and PSO are advanced low-power techniques that apply to selected power domains, are architectural in nature, and need to be specified early in the design flow. Additionally, they carry significant overhead in component cost, implementation difficulty, and especially the verification complexity they introduce.
These advanced techniques thus introduce two areas of uncertainty that the designer needs to deal with. First, how can the designer get predictability early in the flow to decide on a low-power architecture that will actually meet the power specification? Second, how can the designer implement those techniques correctly, and verify correct implementation, to ensure no bug escapes?
In this article, we will use a simple design example, namely a dual Ethernet MAC, to illustrate how these two challenges can be met. This design example was implemented using the Cadence® Low-Power Solution.
The dual Ethernet MAC design
The block diagram for the dual Ethernet MAC is shown in Figure 1. The design consists of two media access control (MAC) channels, implemented with identical RTL code, a direct memory access (DMA) bridge, and a DMA block. The example is designed to be implemented in a 65nm LP process, with a nominal supply voltage of 1.08V.
Figure 1: Dual Ethernet MAC
We took the design through the following design steps:
- Baseline power estimation using probabilistic activity data
- RTL simulation and generation of realistic activity data
- Implementation of MSV and repeat power estimation
- Implementation of PSO and repeat power estimation
- Full synthesis and MVt optimization
- Correlation with signoff power analysis
At every implementation step, we repeated power estimation to check the result of the applied technique and verified its correct implementation.
1. Baseline power estimation
We read the RTL into the Cadence Encounter® RTL Compiler synthesis tool. This tool also serves as an RTL power estimation tool and can create a comprehensive power consumption report, based on quickly synthesizing the RTL to the intended standard cell library for the target silicon process. In the absence of any real activity data, the tool allows designers to specify toggle rates on the design’s primary input ports, and will propagate the resultant activities and duty cycles throughout the design using probabilistic techniques. If no toggle rates are specified, some simple default values are applied.
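To make the idea of probabilistic propagation concrete, here is a minimal sketch of how a signal probability and a toggle rate on gate inputs can be carried to the gate output. It uses the textbook transition-density rules and assumes statistically independent inputs; it is not a description of how the synthesis tool computes activity internally, and the `Activity` class, gate helpers, and input values are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Activity:
    prob_one: float     # probability the signal is at logic 1 (duty cycle)
    toggle_rate: float  # expected transitions per clock cycle


def not_gate(a):
    # Inversion flips the probability; every input toggle reaches the output.
    return Activity(1.0 - a.prob_one, a.toggle_rate)


def and_gate(a, b):
    # A toggle on one input is visible at the output only while the other is 1.
    return Activity(
        a.prob_one * b.prob_one,
        b.prob_one * a.toggle_rate + a.prob_one * b.toggle_rate,
    )


def or_gate(a, b):
    # A toggle on one input is visible at the output only while the other is 0.
    return Activity(
        a.prob_one + b.prob_one - a.prob_one * b.prob_one,
        (1.0 - b.prob_one) * a.toggle_rate + (1.0 - a.prob_one) * b.toggle_rate,
    )


# Hypothetical primary inputs: a busy data bit and a rarely asserted enable.
data = Activity(prob_one=0.5, toggle_rate=0.2)
enable = Activity(prob_one=0.1, toggle_rate=0.02)

gated = and_gate(data, enable)
print(gated)
```

The result depends entirely on the input assumptions: with a rarely asserted enable, the gated signal toggles far less than the raw data bit. This is why default toggle rates can misrepresent a design whose real inputs behave differently.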
We used a low-effort compile to get a quick mapping of the design to the target technology library for a quick initial estimate. The resultant estimate of the power consumption of the design using the 65nm LP process, and default toggle probability, is shown in Table 1. We have broken the numbers into dynamic and static power, and for two different operating modes – when both MACs are running, and when both are idle. This will give us a basis for comparison for the various techniques that we will apply in each step. In addition, we’ll be able to see how our estimation and analysis results converge throughout the flow.
Table 1: Baseline Estimate
So, what’s wrong with these numbers? Note that since we are using default probabilistic activity, we really can’t get an accurate estimate for what the power consumption looks like in idle mode. Clearly, we need real activity data for that mode. Apart from that, we don’t have much basis for comparison to know whether these numbers are good or bad, but we suspect from experience that they are much too high.
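The gap between default probabilistic activity and real activity can be illustrated with a small sketch that measures duty cycle and toggle rate from a per-cycle sample trace, which is what simulation-derived activity data provides. The function name and both traces are hypothetical; real activity files carry this information per net in tool-specific formats.

```python
def measure_activity(samples):
    """Return (duty_cycle, toggle_rate) for a per-cycle list of 0/1 samples."""
    if len(samples) < 2:
        raise ValueError("need at least two samples")
    # Count cycle-to-cycle value changes.
    toggles = sum(1 for prev, cur in zip(samples, samples[1:]) if prev != cur)
    duty_cycle = sum(samples) / len(samples)
    toggle_rate = toggles / (len(samples) - 1)
    return duty_cycle, toggle_rate


# Hypothetical traces for one net in the two operating modes.
active_trace = [0, 1, 1, 0, 1, 0, 0, 1]
idle_trace = [0] * 8

active_duty, active_rate = measure_activity(active_trace)
idle_duty, idle_rate = measure_activity(idle_trace)

print(f"active: duty={active_duty:.2f}, toggles/cycle={active_rate:.2f}")
print(f"idle:   duty={idle_duty:.2f}, toggles/cycle={idle_rate:.2f}")
```

A default toggle assumption applied to the idle net would report switching activity that never occurs, so the idle-mode dynamic power would be overstated.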