With the rapid move to ultradeep submicron designs and feature size processes of 0.13 micron and below, ensuring the integrity of signals as they traverse conductors on a chip is becoming a challenge. Crosstalk between signals, caused by increased capacitive coupling, is the most severe problem that affects the timing of signals on a chip, causing functional failures and performance degradation. Due to limited ability to analyze signal integrity, designs end up failing to meet optimal performance specifications.
Traditional approaches used for analyzing crosstalk effects all suffer from severe capacity, performance and efficiency limitations. Therefore, a new approach is required: a static crosstalk analysis methodology that combines static timing analysis (STA) with crosstalk analysis. Texas Instruments Inc. is applying this methodology in its 'C64x DSP generation design flow to pinpoint and correct crosstalk-related timing problems before tapeout, resulting in improved quality of silicon and first-pass silicon success.
Today, designers turn to either Spice-like simulators or static timing analyzers to gauge the effect of crosstalk on their designs. Though Spice-like simulators provide golden accuracy, they suffer from severe capacity limitations and slow performance. A large netlist may cause a simulation run to take days to compute the results for a single path under a single set of operating conditions for a single input vector. This slow performance renders it impractical for minimum or maximum delay measurements across a variety of operating conditions, a key requirement for timing analysis.
Traditional STA is the most popular approach. It is fast, exhaustive and efficient, does not require vectors and does not need multiple runs for different operating conditions. To get a crude estimate of the effect of coupling capacitance on timing, empirical multipliers of 2x for maximum delay and 0x for minimum delay are used. The coupling capacitance between two nets is simply multiplied by the multiplier and added to both nets as capacitance to ground. However, this approach may impose overly pessimistic constraints; as a result, the chip may not be designed for optimal performance. Another drawback is that in some cases 2x is not an upper bound on the multiplier, nor is 0x a lower bound.
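The multiplier approach described above can be sketched in a few lines. This is only an illustration: the function name, the capacitance values and the femtofarad units are assumptions made for the example, not part of any tool's interface.

```python
def effective_capacitance(ground_cap, coupling_caps, multiplier):
    """Lump every coupling capacitance to ground after scaling it by a static multiplier."""
    return ground_cap + multiplier * sum(coupling_caps)


# Hypothetical net: 10 fF to ground, coupled to two neighbors by 4 fF and 6 fF.
ground_cap_ff = 10.0
coupling_caps_ff = [4.0, 6.0]

# 2x multiplier for the maximum-delay (setup) analysis.
cap_for_max_delay = effective_capacitance(ground_cap_ff, coupling_caps_ff, 2.0)
# 0x multiplier for the minimum-delay (hold) analysis.
cap_for_min_delay = effective_capacitance(ground_cap_ff, coupling_caps_ff, 0.0)

print(cap_for_max_delay, cap_for_min_delay)
```

The same two numbers are applied regardless of when the neighbors actually switch, which is where the pessimism comes from.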
The limitations of current approaches make it clear that the best way to analyze crosstalk is to embed the analysis in a static timing-enabled flow. Static crosstalk analysis provides two key benefits. First, it allows crosstalk effects to be analyzed statically. Second, it extends established gate-level STA sign-off to include the effects of crosstalk.
The static crosstalk analysis flow includes setup, aggressor filtering, delay calculation with worst-case timing windows, net reselection and delay recalculation and the generation of timing reports.
For crosstalk analysis, the first and foremost requirement is the availability of accurate layout-extracted parasitic data. This data should be in an accepted industry-wide standard format such as SPEF because this allows for integration with all parasitic extraction tools adhering to strict standards. The next requirement is the availability of standard cell libraries. For easy integration into traditional synthesis flows, the methodology should include Synopsys' widely accepted Liberty or .lib format libraries.
One major characteristic of crosstalk is that it is the result of a highly coupled system. Aggressors may turn out to be victims of other aggressors in a different path and so on. This leads to an iterative process in which all the possible timing relationships between all the aggressors and victims must be considered to achieve an accurate analysis of the impact of crosstalk on a particular timing path. Since an enormous amount of data must be analyzed, an efficient methodology should be defined for selecting victim nets that show a substantial change in timing due to crosstalk, or filtering aggressors that have a negligible impact on timing.
Filtering an aggressor simply means splitting the mutual coupling capacitance to ground and adding it to both the coupled nets. The filtering can be based on parasitic or electrical parameters. Here, the user can employ his or her knowledge of the design specifics to set certain values, such as a coupling capacitance threshold or a ratio of coupling capacitance to the total net capacitance, to limit the number of nets that need to be considered. The selection mechanism must be flexible enough so that the designer can specify parameters in the context of victims and aggressors.
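A minimal sketch of parasitic-based filtering, assuming a simple in-memory model of nets: the `Net` class, the `filter_aggressors` function, its parameter names and all capacitance values are hypothetical. An aggressor is kept only if its coupling capacitance meets both an absolute threshold and a ratio to the victim's total capacitance; otherwise its coupling capacitance is grounded on both nets.

```python
from dataclasses import dataclass


@dataclass
class Net:
    name: str
    ground_cap: float  # capacitance to ground, in fF


def filter_aggressors(victim, aggressors, couplings, min_coupling, min_ratio):
    """Return the couplings that survive filtering.

    couplings maps an aggressor name to its coupling capacitance to the victim.
    A filtered coupling is added to the ground capacitance of both nets.
    """
    # Total victim capacitance, measured before any coupling is grounded.
    total_cap = victim.ground_cap + sum(couplings.values())
    kept = {}
    for name, cc in couplings.items():
        if cc >= min_coupling and cc / total_cap >= min_ratio:
            kept[name] = cc
        else:
            victim.ground_cap += cc
            aggressors[name].ground_cap += cc
    return kept


victim = Net("v", 10.0)
aggressors = {"a1": Net("a1", 7.0), "a2": Net("a2", 8.0), "a3": Net("a3", 6.0)}
couplings = {"a1": 5.0, "a2": 0.5, "a3": 2.0}

# a2 fails the absolute threshold; a3 fails the ratio; only a1 is kept.
kept = filter_aggressors(victim, aggressors, couplings, min_coupling=1.0, min_ratio=0.15)
print(kept, victim.ground_cap)
```

Separate thresholds could be passed per victim or per aggressor to give the designer the flexibility the text calls for.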
To further reduce the number of nets to be considered, aggressors that have a negligible electrical impact on their victims should also be filtered out. Each aggressor must therefore be analyzed to determine the worst-case voltage bump it induces on the victim. If both the accumulated voltage bump of all the aggressors and the voltage bump caused by the individual aggressor are less than the user-specified thresholds, the aggressor may then be filtered.
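The electrical filter can be illustrated as follows. Two loud assumptions: the bump estimate is a crude capacitive-divider formula chosen only for the example (a real analysis would account for drivers and slews), and the accumulation rule is one possible reading of the text, in which the smallest bumps are filtered first until the accumulated bump of the filtered aggressors would reach its limit. All names and values are hypothetical.

```python
def estimate_bump(vdd, coupling_cap, total_victim_cap):
    """Crude capacitive-divider estimate of the bump on a victim (assumption)."""
    return vdd * coupling_cap / total_victim_cap


def filter_by_bump(vdd, victim_ground_cap, couplings, per_aggressor_limit, accumulated_limit):
    """Split aggressor names into (filtered, kept) using two voltage thresholds."""
    total_cap = victim_ground_cap + sum(couplings.values())
    bumps = {name: estimate_bump(vdd, cc, total_cap) for name, cc in couplings.items()}
    filtered, kept = [], []
    accumulated = 0.0
    # Visit the smallest bumps first so the weakest aggressors are filtered first.
    for name in sorted(bumps, key=bumps.get):
        bump = bumps[name]
        if bump < per_aggressor_limit and accumulated + bump < accumulated_limit:
            filtered.append(name)
            accumulated += bump
        else:
            kept.append(name)
    return filtered, kept


# Hypothetical victim: 10 fF to ground, 1.2 V supply, three aggressors (fF).
filtered, kept = filter_by_bump(
    vdd=1.2,
    victim_ground_cap=10.0,
    couplings={"a": 2.0, "b": 0.3, "c": 0.1},
    per_aggressor_limit=0.05,   # volts
    accumulated_limit=0.03,     # volts
)
print(filtered, kept)
```

In this example "b" is small enough on its own but is kept because filtering it would push the accumulated bump past the limit.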
Another factor to keep in mind is that slowdowns or speedups on victim nets are a function of the temporal relationship between the two nets. Therefore, to be able to locate all possible crosstalk violations in the design, the first critical iteration (Pass 1) requires that worst-case timing windows be considered. Pessimism in Pass 1 is sequentially reduced in subsequent iterations.
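The role of timing windows can be shown with a small sketch. It assumes each net's switching activity is summarized as an (earliest, latest) arrival pair in nanoseconds; the function names and the window values are made up for illustration.

```python
# Pass 1 assumption: every net may switch at any time.
WORST_CASE_WINDOW = (float("-inf"), float("inf"))


def windows_overlap(first, second):
    """True if two (earliest, latest) arrival windows share any instant."""
    return first[0] <= second[1] and second[0] <= first[1]


def active_aggressors(victim_window, aggressor_windows):
    """Names of aggressors that can switch while the victim is switching."""
    return [
        name
        for name, window in aggressor_windows.items()
        if windows_overlap(victim_window, window)
    ]


victim_window = (2.0, 3.0)
aggressor_windows = {"x": (2.5, 4.0), "y": (5.0, 6.0)}

# Pass 1: worst-case windows, so every aggressor is treated as active.
pass1 = active_aggressors(
    WORST_CASE_WINDOW, {name: WORST_CASE_WINDOW for name in aggressor_windows}
)
# Later pass: real windows, so only temporally aligned aggressors remain.
pass2 = active_aggressors(victim_window, aggressor_windows)
print(pass1, pass2)
```

Starting from the widest windows guarantees that no real violation is missed; later passes can only remove aggressors, never add them.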
In the next step, victims that show significant crosstalk effects on timing after Pass 1 are reselected for further analysis. Delays on victims are computed using arrival windows from Pass 1. These delays are then used to generate updated timing windows that are used for the subsequent iterations until the delays converge. The logical correlation between aggressors on the same net should also be considered because the effect of one aggressor may cancel the effect of another aggressor. For this, the methodology should include tracing simple logical relationships between aggressors and it should be able to account for the effects.
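The iteration toward converged windows can be sketched with a toy model. Everything here is an assumption made for illustration: each aggressor contributes a fixed delay change when its window overlaps the victim's, the victim window is derived from an input arrival window plus a base delay, and logical correlation between aggressors is ignored.

```python
def overlaps(first, second):
    """True if two (earliest, latest) windows share any instant."""
    return first[0] <= second[1] and second[0] <= first[1]


def refine_victim_window(input_window, base_delay, aggressors, max_passes=10):
    """Iterate until the victim's arrival window stops changing.

    aggressors is a list of (window, delta_delay) pairs.
    Returns (converged_window, number_of_passes).
    """
    active = list(aggressors)  # Pass 1: assume every aggressor can align.
    window = None
    for pass_number in range(1, max_passes + 1):
        delta = sum(delay for _, delay in active)
        # Speedup widens the early edge, slowdown widens the late edge.
        new_window = (
            input_window[0] + base_delay - delta,
            input_window[1] + base_delay + delta,
        )
        if new_window == window:
            return window, pass_number
        window = new_window
        # Keep only aggressors that can still switch inside the new window.
        active = [(w, d) for w, d in aggressors if overlaps(w, window)]
    return window, max_passes


window, passes = refine_victim_window(
    input_window=(1.0, 1.5),
    base_delay=0.5,
    aggressors=[((1.0, 1.75), 0.25), ((3.0, 3.5), 0.125)],
)
print(window, passes)
```

In this example the second aggressor drops out after Pass 1 and the window settles on the third pass, which only confirms that nothing changed.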
Because iterations use up a significant amount of computational resources, the algorithms for convergence must be very efficient so that the solution converges within two to three iterations. To make these iterations even more efficient, the designer may limit the number of nets on which convergence has to be achieved. To do this, the following criteria may be used: analyze only nets on the critical path; consider a net's absolute or percentage change in delay caused by crosstalk effects; and/or select nets based on the net slack.
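The reselection criteria listed above might be combined as in the sketch below. The `NetTiming` record, the `reselect` function and its parameters are hypothetical; here every criterion the designer enables must be satisfied, although an "or" combination would be equally valid.

```python
from dataclasses import dataclass


@dataclass
class NetTiming:
    name: str
    base_delay: float   # ns, without crosstalk (assumed positive)
    xtalk_delay: float  # ns, with crosstalk from the previous pass
    slack: float        # ns
    on_critical_path: bool


def reselect(nets, min_abs_delta=None, min_pct_delta=None, max_slack=None,
             critical_only=False):
    """Return names of nets that pass every enabled criterion."""
    selected = []
    for net in nets:
        delta = abs(net.xtalk_delay - net.base_delay)
        if critical_only and not net.on_critical_path:
            continue
        if min_abs_delta is not None and delta < min_abs_delta:
            continue
        if min_pct_delta is not None and 100.0 * delta / net.base_delay < min_pct_delta:
            continue
        if max_slack is not None and net.slack > max_slack:
            continue
        selected.append(net.name)
    return selected


nets = [
    NetTiming("n1", base_delay=1.0, xtalk_delay=1.25, slack=0.1, on_critical_path=True),
    NetTiming("n2", base_delay=2.0, xtalk_delay=2.02, slack=-0.05, on_critical_path=True),
    NetTiming("n3", base_delay=0.5, xtalk_delay=0.75, slack=1.5, on_critical_path=False),
]

# Nets whose delay moved by at least 10 percent and whose slack is at most 0.5 ns.
by_delta_and_slack = reselect(nets, min_pct_delta=10.0, max_slack=0.5)
# Nets on the critical path only.
by_critical_path = reselect(nets, critical_only=True)
print(by_delta_and_slack, by_critical_path)
```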
After the delays have been accurately computed, intuitive timing reports are generated. The user can gauge the severity of crosstalk in the design with one glance at the timing reports. Complete hierarchical names of the nets and instances are reported, accurately pinpointing the location of the victim net and the aggressor nets so that the designer can fix the violations quickly.
The TI approach
At 600 MHz, the Texas Instruments TMS320C6416 DSP is the world's fastest fixed-point DSP. It is fabricated in a 0.13-micron, six-metal-layer copper technology. TI sought to enhance its static timing and crosstalk analysis flow to help pinpoint and repair crosstalk-related timing problems in this high-speed, ultradeep submicron design with greater accuracy.
Previously, TI had used parasitic scaling with its standard static timing analysis tool to analyze crosstalk delay and had found it to be inadequate in accuracy and cumbersome to integrate into the design flow. Therefore, TI was interested in evaluating new approaches to crosstalk analysis that could deliver more realistic results. TI's primary goals for the evaluation were to achieve Spice-comparable accuracy, to maintain consistency with its existing STA solution and to derive a more accurate report of real (vs. potential) crosstalk-related timing violations. Improved runtime, low maintenance overhead and ease of use were considered nice-to-have benefits.
TI's standard static timing analysis solution, Synopsys' PrimeTime, offered an integrated component for crosstalk analysis, PrimeTime SI, which TI began to evaluate. To do that, TI set up the design flow by simply turning on crosstalk analysis switches in existing PrimeTime scripts. PrimeTime SI used Synopsys-standard Liberty format (.lib) libraries and shared the timing setup with PrimeTime. To determine whether the Synopsys solution could calculate delays with Spice-level correlation, TI compared results on small test circuits against detailed Spice simulations. Results correlated to within 10 percent of Spice.
Next, larger test cases were run to benchmark performance. Runtimes were within 2x to 3x of non-SI-enabled PrimeTime. These results met TI requirements for baseline accuracy and run-time. In addition, integration into the PrimeTime environment eliminated many problems in traditional static-multiplier coupling compensation flows, such as the interaction of multifrequency asynchronous/synchronous clocked signals and the consequent aliasing problems that arise.
Finally, TI used Synopsys' PrimeTime SI on its multimillion-gate 'C6416 design to analyze for crosstalk-related hold-time problems and discovered approximately 500 paths with potential crosstalk problems. Initially, the tool generated pessimistic results for some topologies because of complex aggressor signal interactions. This pessimism is undesirable as it increases design cycle time, area and power. Based on the results, TI is working with Synopsys to further optimize its use of the PrimeTime SI delay calculation algorithm. By developing custom scripts to include or exclude victim-aggressor pairs based on user-defined temporal and logical correlation, TI and Synopsys were able to reduce pessimism and achieve more realistic results.
To further enhance its flow to reduce crosstalk-related timing violations and speed turnaround time, TI is now concentrating on adding crosstalk prevention to its front-end implementation flow and has enhanced its libraries to increase immunity to crosstalk problems. The integration of crosstalk avoidance will reduce design cycle time in future products.