# Statistical timing steps into spotlight at ICCAD

Post a comment

Statistical timing steps into spotlight at ICCAD

San Jose, Calif. - Statistical static-timing analysis stood out at last week's International Conference on Computer Aided Design, as experts warned that the days of timing analysis based on nominal delay values were drawing to a close.

Two full paper sessions provided a progress report on academe's struggle with what remains, in the eyes of many, an unsolved problem. But discussions of design-for-yield, statistical algorithms and interpretations of results helped to clarify the major issues.

Puneet Gupta of the University of California, San Diego, stated the problem concisely. In an embedded tutorial on physical-aware design-for-manufacturing, Gupta pointed out that in today's new world of yield analysis, parametric yield-the ability to get parts that meet their specifications-has become at least as important as catastrophic-failure yield: the ability to get parts that are not disabled by dead circuits. While catastrophic failure is primarily based on defects, he said, parametric yield is primarily determined by the designer's ability to understand and control process variations and their effect on the electrical behavior of circuits.

Gupta reviewed a number of mechanisms leading to variations, including those in etch stages, in feature formation due to slight errors in stepper focus and in metal geometry due to chemical-mechanical polishing. Each mechanism, and a number of others, leads to variations in critical electrical parameters that in turn lead to changes in circuit performance-in particular, changes in path delay.

In a statement that could almost have served as a dire warning for two other paper sessions, Gupta said, "Many of these variations are in fact systematic, but they are hard to model. So we tend to treat them as if they were random."

Process variations have always been there, of course. But in the past, process integrators controlled critical parameters closely enough to guardband the variations, and give designers fixed nominal values for figures like transistor speed and interconnect impedance. This allowed designers to model node delays as fixed quantities, making static-timing analysis possible.

That is no longer the case. A paper by Anirudh Devgan and Chandramouli Kashyap from IBM Research and IBM Microelectronics, respectively, compared worst- case timing numbers-the kind conventional static analysis would produce-against the actual performance of a number of circuits as modeled by Monte Carlo analysis. They found that the worst-case timing numbers necessary to guardband for process variations were between 21 percent and 27 percent more pessimistic than numbers that would actually have given excellent parametric yield. It was pointed out that designers, driven to advanced processes by their performance needs in the first place, could not leave those kinds of margins on the table.

So it is essential, many authors argued, to abandon static-timing analysis based on nominal delay values that attempt to cover up variations. Instead, they insisted, techniques are needed that capture the actual effect of process variations on the circuit, and analyze timing based on statistical predictions of circuit performance, not worst-case scenarios.

In theory, doing this is easy. Just do a conventional static-timing analysis, but instead of assigning a nominal delay to each node, assign a probability distribution. Instead of saying the arrival time of a signal at some point is 2.9 nanoseconds after the previous clock edge, say that the arrival time is normally distributed around a mean of 2.2 ns, with a standard deviation of 0.2 ns. The distribution is determined from the known variations in the structures through which the signal has passed.

As a number of papers pointed out, only two statistical operations are necessary in this kind of analysis. You need the equivalent of addition, to determine the distribution of delays in a signal after it has passed through a number of delay elements. And you need the statistical equivalent of a find-maximum function to determine the timing distribution of a signal that is the result of combining two or more signals, each of which has its own timing distribution.

By walking through the circuit, then, you can build up the timing distribution for each output of a stage, right before the signals go into registers. Then you can simply pick a probability value-say, 98 percent-that is sufficiently high that other yield issues will mask it in production, and read the delay time for the signal right off the timing distribution. You might find that an output from a combinatorial network, for example, has a mean delay of 17 ns from the previous clock, and that 98 percent of the time the signal will have arrived within 18.3 ns. So you assign the stage delay as 18.3 ns.

This does not compute

But the problem comes in computing those two simple functions. As several papers observed, it is possible to work directly with the probability distribution curves. But the computational problem grows exponentially with circuit size, and quickly becomes uncomputable. The two simple sum and max operators have a way of turning into convolution integrals. It is also possible to do Monte Carlo analysis on the problem, but again, circuit size quickly conquers any available computing resources.

So, as Gupta pointed out, two main branches of exploration have tried to overcome this problem. One group uses what Gupta called smart Monte Carlo techniques-Monte Carlo analyses that attempted to do only statistically important sets of variations. The other approach is to attack the problem of analyzing the timing-distribution functions themselves, but replacing the actual probability distributions with "bounding functions:" functions that are easier to compute but are guaranteed to be always slightly more conservative than the probability distribution functions they replace.

This latter approach was explored in a number of papers in two sessions on statistical timing techniques. Boundary functions were proposed, and shown to substantially reduce computing time by allowing simple estimates to the sum and max functions. Methods of pruning the network were described that eliminated the need to take the timing of some signals into account at all. And all papers reported results that were very close to Monte Carlo analyses and significantly better than worst-case numbers.

But audience questions raised an important objection. All of the papers, in one way or another, made simple assumptions about the degree to which the variations in individual node delays were related to each other. Most assumed the delays were entirely random-that is, there was no correlation between the variations in the delays at different nodes.

A single paper, by Hongliang Chang and the renowned researcher Sachin Sapatnekar, both of the University of Minnesota, described an analysis in which the delay variations were assumed to have spatial correlation: devices physically near each other on the die were assumed to have similar variations from their mean delay values.

The importance of correlation is overwhelming to these techniques. Most of the choices of functions to simplify the sum and max operations assumed-mathematically-that the timing variations are uncorrelated. If there is some correlation, as Chang and Sapatnekar showed, the operations and resulting timing distributions at the output can be quite different.

But correlation is reality, as Gupta remarked in his design-for-manufacturing talk. In fact, yield analysis models delay as a sum of a number of variations, each with their own distribution. Some, like variations in etch, will show a similar pattern based on die location on the wafer, across an entire wafer lot. Others will show variation from wafer to wafer within a lot, and still others will show variation between dice on a given wafer, but will be closely correlated within a small region on the die, as in the case Chang and Sapatnekar investigated.

So the question of just how to reduce the computing load for statistical timing analysis in the presence of correlated variations remains unsolved. Major steps are being made at the theoretical end in dealing with the mathematical problem of partially correlated variations. Now it is necessary to bring a deep understanding of how variations are correlated on actual wafers together with the mathematical work to make sure the researchers are solving the real-world problem.