Speeding power estimation from weeks to hours
Ophir Turbovich, Cambridge Silicon Radio and Thomas Li, SpringSoft
11/19/2012 10:59 AM EST
The new methodology developed jointly by CSR and SpringSoft:
- Generates accurate gate-level waveforms automatically;
- Enables design teams to correlate and analyze results in the RTL environment, eliminating the necessity to bring up the gate-level environment;
- Eliminates the necessity to simulate from time zero for every power analysis run; and
- Imposes no changes on the established power estimation methodology and tool flow.
It combines SpringSoft’s established Siloti Visibility Automation System with map files that CSR scripts generate from regular synthesis and place and route tools. Like some emulation and FPGA design tools, the visibility automation system can reconstruct full waveforms from the waveforms of the essential signals (flip-flops and inputs). In addition, it maps every gate-level flip-flop to the original flip-flop in the RTL. The combination of these features is used to extract the essential signals from the RTL. The result is a gate-level waveform that is identical (or almost identical) to the waveform that a gate-level (GTL) simulation would produce. Thus, it can be used to derive the toggle rate of every port of every cell in the netlist, which is precisely the data needed for accurate power estimation.
This ability to automatically generate port activity data significantly reduces both the time and the effort to generate the gate-level waveforms, and eliminates the need to bring up the gate-level testbench environment. After waveform generation, power consumption is analyzed using CSR’s established power estimation flow — the new methodology requires no changes to it.
Using the new methodology, gate-level waveforms can be generated for every relevant scope and time interval, eliminating the need to run simulation from time zero. It can also be applied in verification runs where internal testbench stubs are used to drive the design, for example a CPU stub instead of the CPU RTL.
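As a rough illustration of how toggle rates follow from a waveform over a chosen time interval, here is a minimal Python sketch. It assumes the waveform has already been exported as plain (time, signal, value) tuples; the function name and the data layout are hypothetical and are not part of the SpringSoft or CSR tools.

```python
from collections import defaultdict


def toggle_rates(changes, t_start, t_end):
    """Return toggles per time unit for each signal within [t_start, t_end).

    `changes` is an iterable of (time, signal, value) tuples
    (a hypothetical export format, assumed for this sketch).
    """
    if t_end <= t_start:
        raise ValueError("empty time window")
    last_value = {}
    counts = defaultdict(int)
    for time, signal, value in sorted(changes, key=lambda c: c[0]):
        if time >= t_end:
            break
        previous = last_value.get(signal)
        # Count a toggle when the value differs from the previous known
        # value and the change falls inside the window.
        if time >= t_start and previous is not None and previous != value:
            counts[signal] += 1
        last_value[signal] = value
    duration = t_end - t_start
    # Signals that never toggled in the window are reported with rate 0.
    return {sig: counts[sig] / duration for sig in last_value}
```

Changes before `t_start` only establish the initial value of each signal, so the window can start anywhere in the run without counting a spurious first toggle.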
Visibility automation technology
The new methodology employs two main tool features — “What-if Replay” and “Correlation” — to re-run gate-level simulation with RTL simulation waveforms as the input stimuli.
“What-if Replay” re-simulates the user-specified design block with the previously-generated waveform database as the input stimulus. The user can also specify the time window for the simulation.
“Correlation” maps the signals from the RTL to the gate level and vice versa. For power analysis, this technology is used to map the RTL signal waveform to the gate-level signals in order to drive the re-simulation.
The waveform database — known as the Fast Signal Database (FSDB) — is an open component in SpringSoft’s Verdi Automated Debug System, and is supported by many third-party tools.
Just as the new methodology requires no changes to CSR’s power estimation methodology, so the visibility automation technology requires no changes to CSR’s tool flow. The flow continues to use physical design tools such as Synopsys’ Design Compiler, IC Compiler, and PrimeTime PX, while simulation continues to use tools such as the Cadence Incisive Unified Simulator.
The visibility automation tool inputs, mapping file generation and execution flow are as follows:
Tool inputs
The tool uses the following input data in order to generate gate-level waveforms (see also figure 1):
- RTL FSDB with waveforms from RTL simulation, including the scope that must be regenerated.
- Gate-level to RTL mapping file that points from every essential gate-level signal to the corresponding RTL signal in the RTL FSDB. The user can also map a signal to a constant in order to tie it to a fixed value during the re-simulation.
- Gate-level file list containing the commands for compiling the gate-level netlist (minus any testbench). The file can also include “defines” and any other compilation flags.
- Configuration file that specifies the user settings for the What-If Replay run. Key parameters include:
  - The gate-level design scope on which the simulation is executed and from which the gate-level waveform is generated
  - The time scope with start and end times of the gate-level waveforms
  - The simulation compile script with simulation compile settings
  - The simulation run script with simulation run settings
  - The SDF file with the path to be used, if any
Figure 1: Tool inputs and flow
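The key parameters listed above can be pictured as a small record with a few sanity checks. The following Python sketch is only an illustration: the class name, field names and checks are assumptions made for this example and do not reflect the actual configuration file syntax of the tool.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ReplayConfig:
    """Hypothetical container for the What-If Replay settings."""

    scope: str                      # gate-level design scope to re-simulate
    start_time: int                 # start of the time scope
    end_time: int                   # end of the time scope
    compile_script: str             # path to the simulation compile script
    run_script: str                 # path to the simulation run script
    sdf_file: Optional[str] = None  # SDF file, if any

    def __post_init__(self):
        # Reject settings that could not describe a valid replay run.
        if not self.scope:
            raise ValueError("a gate-level design scope is required")
        if self.start_time < 0 or self.end_time <= self.start_time:
            raise ValueError("time window must satisfy 0 <= start < end")
```

For example, `ReplayConfig("top.u_core", 1000, 5000, "compile.sh", "run.sh")` describes a replay of one scope over one interval with no SDF file.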
Mapping file generation
The mapping file generation tool — developed by CSR — contains both flip-flop mapping and input mapping.
Flip-flop mapping can be derived from the output of an equivalence checking tool or that of the synthesis/place and route tool. For example, in Synopsys Design Compiler, the saif_map command generates an output file which contains all of the flip-flop mappings.
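To make the idea of flip-flop mapping concrete, the Python sketch below guesses the RTL signal name behind a gate-level flip-flop instance. It assumes the common synthesis naming convention in which an RTL register `count[3]` becomes an instance named `count_reg_3_` or `count_reg[3]`, and it assumes that hierarchy is preserved. It is not the CSR mapping script, which relies on tool-generated mapping output rather than name guessing.

```python
import re

# Assumed naming convention: <rtl_name>_reg, <rtl_name>_reg_<i>_ or
# <rtl_name>_reg[<i>]. Real netlists may follow different rules.
_REG_BIT = re.compile(r"^(?P<base>.+)_reg(?:_(?P<i1>\d+)_|\[(?P<i2>\d+)\])?$")


def rtl_name_for_flop(gate_path, sep="/"):
    """Map a gate-level flop instance path to a dotted RTL signal path.

    Returns None when the leaf name does not look like a register.
    """
    *scope, leaf = gate_path.split(sep)
    match = _REG_BIT.match(leaf)
    if match is None:
        return None
    index = match.group("i1") or match.group("i2")
    name = match.group("base")
    if index is not None:
        name = f"{name}[{index}]"
    return ".".join(scope + [name])
```

A name-based guess like this breaks down when registers are renamed, merged or retimed, which is one reason to take the mapping from the equivalence checking or synthesis tool instead.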
Input mapping at the gate level is simple when gate-level hierarchies are maintained. When they are not maintained, it is necessary to map the inputs at the top level of the hierarchy, which is maintained just like the top level of a physical design macro. In this case, it is sufficient to map the clocks, design-for-test (DFT) signals such as built-in self-test (BIST) and scan, and specific control signals. This mapping drives all of the appropriate clock and control inputs to the desired scope. It also loses some data at the interfaces, such as the output from memories until the first sample, and toggling information between an input and the first sampling flop. However, this loss has a negligible effect on the total power estimation.
Execution flow
The execution flow has three stages: extract, compile and simulate.
In the extract phase, the tool uses the input files and user settings to generate two main components:
- New design-under-test (DUT) source files by performing design extraction for the selected scope. This is the DUT for the re-simulation.
- New testbench files which use the RTL simulation result (FSDB) as input stimulus to drive the extracted DUT in re-simulation.
The design is then ready for compilation and re-simulation. The tool automatically drives the compilation and then launches the simulation task. The simulation runs the compiled design on a common simulator such as Synopsys VCS or Cadence Incisive Unified Simulator (IUS) and generates the gate-level waveform.
After generation of the gate-level waveform within the scope and time range of interest, it can be processed by any popular power estimation tool. If a given power estimation tool cannot accept FSDB directly as the input, there are utilities that can convert FSDB outputs to other formats such as VCD and SAIF files, which can be used in any power estimation tool.
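Activity formats such as SAIF summarize a waveform per signal as the time spent at 0 (T0), at 1 (T1) and unknown (TX), plus a toggle count (TC). The Python sketch below computes those four numbers for one signal over a time window. It is a simplified illustration on plain (time, value) pairs; it does not read FSDB, does not write a SAIF file, and is not one of the conversion utilities mentioned above.

```python
def _bucket(value):
    """Pick the duration counter that matches a signal value."""
    if value == "0":
        return "T0"
    if value == "1":
        return "T1"
    return "TX"


def saif_stats(changes, t_start, t_end):
    """Summarize one signal over [t_start, t_end] as T0, T1, TX and TC.

    `changes` is a list of (time, value) pairs with values such as
    "0", "1" or "x" (a hypothetical layout assumed for this sketch).
    """
    stats = {"T0": 0, "T1": 0, "TX": 0, "TC": 0}
    current = "x"  # unknown until the first value change is seen
    cursor = t_start
    for time, value in sorted(changes, key=lambda c: c[0]):
        if time <= t_start:
            current = value  # only sets the value at the window start
            continue
        if time >= t_end:
            break
        stats[_bucket(current)] += time - cursor
        # Count only transitions between the known values 0 and 1.
        if current in ("0", "1") and value in ("0", "1") and current != value:
            stats["TC"] += 1
        current, cursor = value, time
    stats[_bucket(current)] += t_end - cursor
    return stats
```

The three durations always add up to the window length, which gives a quick consistency check on converted activity data.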


WiLess
11/26/2012 2:08 PM EST
An interesting idea and really helpful. What would be the differences between this flow and the flow in which an RTL VCD along with the gate-level netlist is loaded into PrimeTime PX and the tool propagates RTL activities into the gate nodes?
OphirT
11/28/2012 3:00 AM EST
The first big difference is that at the end of the suggested flow you have a waveform file that you can use for other needs too (and other tools too)…
Also, when working in PTPX with an RTL VCD, you must make sure that only flip-flops are mapped between the RTL and the netlist. Usually more than flip-flops are mapped, and since RTL simulation is zero-delay simulation, the propagation is not done well (and the results are not accurate). This is most critical and problematic around adders and multipliers, but not only there.
The suggested flow can handle a partial design, that is, RTL which contains stubs or other testbench code that replaces part of the RTL. You can’t overcome this in PTPX.
In general, you must be a PTPX expert to be able to run with an RTL VCD, and it is not accurate. In the SpringSoft flow anyone can generate a netlist VCD/SAIF, and there is no need to be an expert to run it on PTPX.
Another issue is that PTPX runs in this mode take too much time. While generating the GTL waveform in the new flow took about 1 hour, and the total power estimation took 2 hours, doing this in PTPX (when it can be done) took more than 24 hours.