These days, there is a lot of noise about signal integrity and design integrity in deep-submicron system-on-chip designs. And there is a good reason: Shrinking processes and higher frequencies mean even second-order physical effects can create problems. Designers may be unable to achieve timing closure, or may see their chips fail on the tester, because signal-integrity effects create logic and timing problems. Worse yet, designers risk field failure of an initially good part if they don't consider design integrity.
Crosstalk-induced errors are the most common signal-integrity problem. They are induced when two or more nets (wire routes) run parallel to one another for some distance, creating a capacitor between them. This capacitor can transmit a pulse from one net to another. If an "aggressor" net switches, this may cause the adjoining "victim" net to temporarily assume a different logic value, resulting in unintended logic transitions. The outcome is repeatable failures of certain logic operations.
Even if a signal has the correct logic value, crosstalk can affect the timing of transitions. This can cause failure to perform at speed, even when the chip is new. Timing dependence on crosstalk is a subtle and complex matter because the timing on victim nets depends on the delay across the first gate, interconnect delay and the behavior of other adjacent nets. Instead of a single delay value for interconnect we now have minimum and maximum delays, which can vary from one cycle to the next. This additional delay variation can preclude timing closure.
A third signal-integrity problem is voltage drop on power supply lines. On every chip, a network of wires distributes power from the pads to the circuitry inside. Since these wires have resistance, the voltage that gets to the internal circuitry is less than that applied to the chip. If this voltage drop, often called IR drop, is too severe, the circuits will not get enough voltage, resulting in malfunction or timing failure.
Design-integrity problems are reliability problems that occur in the field, not during test. Three common problems are electromigration, hot-electron degradation and wire self-heat. Electromigration occurs when the dc current density in a power supply line is too high, causing the wire's metal grains to be pushed aside by the constant electron wind. Over time, this can keep the specified voltage from reaching the rest of the circuitry, or even result in a completely broken wire.
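As a rough illustration of the current-density check that underlies electromigration analysis, the sketch below divides a wire's dc current by its cross-section and compares the result with a limit. The function names and every numeric value are assumptions chosen for illustration; real limits come from the foundry's reliability rules.

```python
def current_density(i_dc_a, width_um, thickness_um):
    """Return the dc current density in A/um^2 for a rectangular wire."""
    if width_um <= 0 or thickness_um <= 0:
        raise ValueError("wire dimensions must be positive")
    return i_dc_a / (width_um * thickness_um)


def min_width_um(i_dc_a, thickness_um, j_max_a_per_um2):
    """Smallest width that keeps the dc current density at or below the limit."""
    if thickness_um <= 0 or j_max_a_per_um2 <= 0:
        raise ValueError("thickness and limit must be positive")
    return i_dc_a / (thickness_um * j_max_a_per_um2)


# Illustrative numbers only (assumed, not taken from a real process).
I_DC_A = 0.004          # average current in one power strap
THICKNESS_UM = 0.5      # metal thickness
J_MAX_A_PER_UM2 = 0.002 # assumed dc current-density limit

# Does a 2 um strap violate the assumed limit, and how wide must it be?
violates = current_density(I_DC_A, 2.0, THICKNESS_UM) > J_MAX_A_PER_UM2
needed_width_um = min_width_um(I_DC_A, THICKNESS_UM, J_MAX_A_PER_UM2)
```

Under these assumed numbers the 2 micron strap is too narrow, which is the kind of result that leads to widening power wires.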
The hot-electron or short-channel effect results from high electric fields between the source and drain of a device, causing electrons to speed up in the channel. The fastest ("hottest") electrons damage the oxide and interface near the drain, and the transistor threshold and mobility change over the life of the part. Since most transistors always have the same applied polarity, the shift adds up as the device continues to run. Eventually, the threshold shifts so far the device no longer meets specifications.
Wire self-heat, sometimes called signal line electromigration, is a mechanical failure in the wire caused by frequently varying thermal conditions. As pulses go through the wire, the power it dissipates heats it above the temperature of the surrounding oxide. The difference in the thermal constants between the oxide and the wire causes mechanical stress, and the wire may eventually fail. Wire self-heat problems are exacerbated by low-k dielectrics, which are poorer thermal conductors and mechanically weaker.
These problems have always existed, but at 0.25 micron and below, an intelligent design solution must be adopted or chips will fail. Traditional approaches analyze after design: signal- and design-integrity analyses are conducted after layout and extraction. These tools can identify problems, but offer no automated fixes or guidance to the designer on how to resolve them. In deep-submicron design, crosstalk analysis and parasitic extraction are no longer post-layout tasks; they must be performed concurrently with design. The tools should also be flexible, since the amount of effort expended on signal integrity will depend on the application. A few transient errors and a short chip life may not be a big problem for a video game, but could mean life or death for the wearer of a pacemaker.
We've reached an inflection point in technology innovation where post-layout signal and design integrity analyses no longer suffice. Some new solutions promote signal integrity awareness to all the phases of the design cycle, from physically knowledgeable synthesis to detailed routing. The most effective solution is holistic and gives a broad perspective of signal integrity, timing, power and area concerns as the designer navigates different tools. In cases where prevention is impractical, the tools determine automated ways of correcting signal integrity problems without disturbing other design parameters.
The severity of many problems, including crosstalk, can be reduced by using a placement tool enhanced with signal integrity and design-integrity features. During initial placement, net adjacencies, layer assignments and exact routing topologies are unknown. An advanced placer uses driver strength, estimated routing and signal RCs to estimate signal slews and identify potential victim nets. The placer then decides to shorten the affected wires, use a stronger driver or insert buffers to break long wires into shorter segments. This decision is based on full knowledge of design constraints and accurate prediction of downstream technologies.
Also during concurrent placement and optimization, the placer detects wire self-heat problems by analyzing net length, and estimating wire RCs, signal slews and maximum operating frequencies. Average, peak and RMS currents are compared with ac current density limits for layers, and the results analyzed to determine the best solutions. Increasing wire width is an obvious way to fix a current-density problem, but if the current is caused by the capacitance of the wire itself, increasing its width may not solve the problem. Alternatives include decreasing the length of the wire, driving smaller gates and inserting buffers so one driver need not drive the entire problematic net. Again, only an advanced placer can make these decisions because placement and sizing technology are built into its algorithms.
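The point about wire width can be made concrete with a small sketch. It assumes the average current is the charge delivered once per cycle (I = C x V x f) and that wire capacitance scales with wire area only, ignoring fringe and coupling capacitance; the function name and parameters are hypothetical.

```python
def avg_current_density(width_um, length_um, c_load_f, vdd_v, freq_hz,
                        c_area_f_per_um2, thickness_um):
    """Average charging-current density (A/um^2) in a signal wire.

    Assumptions (not from any tool): one full charge of the total
    capacitance per cycle, and an area-only wire capacitance model.
    """
    if width_um <= 0 or thickness_um <= 0:
        raise ValueError("wire dimensions must be positive")
    c_wire_f = c_area_f_per_um2 * width_um * length_um
    i_avg_a = (c_wire_f + c_load_f) * vdd_v * freq_hz
    return i_avg_a / (width_um * thickness_um)


# Illustrative values only.
COMMON = dict(vdd_v=2.5, freq_hz=200e6, c_area_f_per_um2=0.04e-15,
              thickness_um=0.5)

# Wire-dominated net: doubling the width doubles both the capacitance and
# the cross-section, so the density does not change.
narrow = avg_current_density(1.0, 2000.0, 0.0, **COMMON)
wide = avg_current_density(2.0, 2000.0, 0.0, **COMMON)

# Shortening the same wire does reduce the density.
short = avg_current_density(1.0, 1000.0, 0.0, **COMMON)
```

When a gate load dominates instead, widening the wire does help, because only the cross-section grows.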
As with wire self-heat, hot-electron problems are detected through analysis during placement and concurrent optimization. Each gate has a damage per transition computed from input slope and output load, which, multiplied by the maximum frequency and chip lifetime, determines the potential total damage. The placer compares this with the maximum allowable damage for each gate, and if the damage limit is exceeded the placer decides how best to fix the problem.
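The damage bookkeeping described above can be sketched in a few lines. The damage per transition is taken as a given input here, since the model that derives it from input slope and output load is process-specific; the function names, units and numbers are all assumptions for illustration.

```python
SECONDS_PER_YEAR = 365 * 24 * 3600


def lifetime_damage(damage_per_transition, max_toggle_hz, lifetime_years):
    """Total damage = damage per transition x transitions over the lifetime."""
    transitions = max_toggle_hz * lifetime_years * SECONDS_PER_YEAR
    return damage_per_transition * transitions


def gates_over_limit(gates, lifetime_years):
    """Return the names of gates whose lifetime damage exceeds their limit.

    gates maps a gate name to a tuple of
    (damage_per_transition, max_toggle_hz, damage_limit).
    """
    return sorted(
        name
        for name, (damage, toggle_hz, limit) in gates.items()
        if lifetime_damage(damage, toggle_hz, lifetime_years) > limit
    )


# Hypothetical gates in arbitrary damage units.
example_gates = {
    "u1": (1e-18, 1e9, 1.0),  # lightly loaded, fast input slope
    "u2": (1e-17, 1e9, 1.0),  # heavily loaded, slow input slope
}
flagged = gates_over_limit(example_gates, lifetime_years=10)
```

A placer would then resize, buffer or move the flagged gates; that decision logic is outside the scope of this sketch.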
Unlike power calculations, the reliability calculations for hot electron and wire self-heat are based on the maximum possible switching frequency. This frequency is derived by propagating the clocks forward through the circuit, because the worst case may not be present in the logic simulation vectors. For example, imagine a signal that toggles on each NOP (no-operation) instruction. This is not a very interesting case to simulate and will account for a very small portion of the input test vectors. However, in the end application of the chip the NOP instruction may account for the vast majority of all cycles executed. Thus, for reliability measures that depend on transition counts, such as hot electron, it is important to use the statically derived maximum frequencies rather than those observed in simulation.
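One simple way to picture forward propagation of clock frequencies is a pass over the netlist in topological order. The sketch below is an assumption-laden simplification, not a description of any particular tool: each net inherits the highest frequency found among its drivers, glitches are ignored, and nets with no clocked driver get zero. It uses graphlib from the Python standard library (3.9 or later).

```python
from graphlib import TopologicalSorter


def propagate_max_freq(fanin, source_freq_hz):
    """Return an upper bound on switching frequency for every net.

    fanin maps a net to the list of nets that drive it.
    source_freq_hz maps clocked source nets to their frequency.
    """
    freq = dict(source_freq_hz)
    # static_order() yields drivers before the nets they drive.
    for net in TopologicalSorter(fanin).static_order():
        if net in freq:
            continue
        drivers = fanin.get(net, [])
        freq[net] = max((freq[d] for d in drivers), default=0.0)
    return freq


# Hypothetical netlist: two register outputs feeding combinational logic.
example_fanin = {
    "q1": ["clk_a"],
    "q2": ["clk_b"],
    "n1": ["q1", "q2"],
    "n2": ["n1"],
}
static_bound = propagate_max_freq(example_fanin,
                                  {"clk_a": 200e6, "clk_b": 50e6})

# A toggle rate seen in simulation (assumed value) can be far below the bound.
simulated_hz = {"n2": 1e6}
underestimated = [n for n, f in simulated_hz.items() if f < static_bound[n]]
```

In the NOP example, the simulated rate for such a net would be low, while the static bound still reflects what the net can do in the field.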
Clock trees are particularly vulnerable to design integrity problems. As the highest-frequency nets in a design, the clocks must be carefully crafted to avoid problems such as wire self-heat and hot electron, which must be predicted and prevented as the clock tree is being developed. If left until later in the design process, these problems can be extremely costly to solve as entire portions of the chip may have to be ripped up and rerouted.
A powerful clock tree generator provides an elegant solution. During topology generation and buffer placement, this tool treats hot-electron and wire-self-heat constraints just as it does other constraints such as skew and insertion delay. The clock tree generator should be able to insert repeaters, change drive strengths and change placement or levels of clock trees to satisfy these constraints.
To detect and correct IR drop problems, the designer either predicts the peak value of the voltage drop directly, or predicts the average value with a dc analysis and infers the peak value from it. Using the average cell currents is simpler, and those currents are also required for electromigration analysis. Therefore, many designers perform dc analysis, then multiply by a factor that incorporates the peak-to-average ratio. Since redesigning the chip to use less power is often impractical, the main way to solve an IR drop problem is to make sure the power supply wires are big enough, but not oversized.
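The dc-then-scale approach can be sketched for a single power rail fed from one pad. The rail model (a chain of resistive segments with one current tap per segment), the function names and all values are assumptions for illustration; a real power grid is a two-dimensional mesh.

```python
def rail_ir_drop(segment_res_ohm, tap_current_a):
    """Return the dc voltage drop at each tap of a rail fed from one end.

    Segment k connects tap k-1 (or the pad, for k = 0) to tap k, so it
    carries the current of tap k and of every tap further from the pad.
    """
    if len(segment_res_ohm) != len(tap_current_a):
        raise ValueError("need one segment resistance per tap")
    drops = []
    downstream_a = sum(tap_current_a)
    drop_v = 0.0
    for r_ohm, i_a in zip(segment_res_ohm, tap_current_a):
        drop_v += r_ohm * downstream_a
        drops.append(drop_v)
        downstream_a -= i_a
    return drops


def estimated_peak_drop(dc_drops_v, peak_to_average):
    """Scale the worst dc drop by an assumed peak-to-average ratio."""
    return max(dc_drops_v) * peak_to_average


# Illustrative rail: three 0.1 ohm segments, 10 mA average per tap.
dc_drops = rail_ir_drop([0.1, 0.1, 0.1], [0.01, 0.01, 0.01])
peak_drop = estimated_peak_drop(dc_drops, peak_to_average=3.0)
```

The tap farthest from the pad sees the largest drop, and widening the segments nearest the pad helps most because they carry the most current.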
Power analysis in a holistic design flow is done early and often, starting in the planning stage when the symbolic power grid is created. After a quick power and static rail analysis, the power grid is converted from a symbolic to a design-rule-correct one. The power consumption is derived from timing models, stimulus (vectors), voltage, capacitive load and parasitics. The rail analysis is based on static parasitics, such as power grid resistance, and on dynamic parasitics, which add power grid capacitance, bonding wire parasitics and IBIS package models. As with IR drop, the normal solution for electromigration is to make the wires and vias big enough, which becomes a drastic fix if left until later in the design flow.
Some signal- and design-integrity problems are impossible to find and fix from placement only, so the designer should use a router that detects and fixes problems during global and detailed route. These fixes must include making wires wider, providing multiple vias and creating shielded routes. In addition, the router must be usable in the Engineering Change Order (ECO) mode after detailed routing, extraction and analysis.
Crosstalk-induced timing shifts cannot be eliminated by any practical technique short of shielding all nets, which doubles the routing resource requirements. However, increasing timing margins to account for the extra delay can control the problem. This approach has been used for many years in the guise of multiplying the cross-coupling capacitances by a constant greater than one to account for the Miller effect. However, instead of this global overestimate, a more detailed analysis can correctly account for the timing shift on each net, resulting in a less pessimistic timing analysis. If the timing shift for a given net is excessive, it can be minimized by employing post-routing optimization techniques, such as using extra spacing for nets, and can be virtually eliminated by using shielding. Since both those solutions are costly in terms of routing resources, they are used only for critical nets such as clocks.
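The difference between a global multiplier and a per-net analysis can be illustrated with a lumped RC stage. The sketch assumes the common idealization that a coupling capacitance looks like zero when the aggressor switches in the same direction as the victim, its nominal value when the aggressor is quiet, and twice its value when the aggressor switches the opposite way. The function names, the global factor of 2 and the component values are assumptions.

```python
import math


def rc_delay_s(r_ohm, c_ground_f, c_couple_f, miller_factor):
    """50 percent delay of a lumped RC stage with scaled coupling capacitance."""
    return math.log(2) * r_ohm * (c_ground_f + miller_factor * c_couple_f)


def delay_window_s(r_ohm, c_ground_f, c_couple_f, aggressor_can_switch):
    """Return (min, max) delay for one net under the idealized Miller model."""
    low, high = (0.0, 2.0) if aggressor_can_switch else (1.0, 1.0)
    return (rc_delay_s(r_ohm, c_ground_f, c_couple_f, low),
            rc_delay_s(r_ohm, c_ground_f, c_couple_f, high))


# Illustrative net: 1 kohm driver, 20 fF to ground, 10 fF to a neighbor.
R_OHM, C_GROUND_F, C_COUPLE_F = 1000.0, 20e-15, 10e-15
GLOBAL_FACTOR = 2.0  # assumed blanket multiplier applied to every net

global_delay = rc_delay_s(R_OHM, C_GROUND_F, C_COUPLE_F, GLOBAL_FACTOR)
quiet_window = delay_window_s(R_OHM, C_GROUND_F, C_COUPLE_F, False)
active_window = delay_window_s(R_OHM, C_GROUND_F, C_COUPLE_F, True)
```

For the net whose neighbor cannot switch at the same time, the per-net maximum is below the global estimate, which is the pessimism a detailed analysis recovers.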
In an integrated holistic design flow, accurate RCs and cross-coupling capacitances are extracted after routing. Next, a timing analyzer generates timing windows to identify nets that can switch at the same time. The timing analyzer then combines the cross-coupled parasitics and the timing windows to compute the incremental delay shift for each qualified affected net. After this analysis is complete, the nets with timing violations due to new, incremental delay shifts are prepared for fixes. The router should fix the problems by buffer insertion, extra spacing or shielded routing.
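The combination of timing windows and coupling parasitics might look like the following sketch. It assumes that an aggressor matters only if its switching window overlaps the victim's, and that each active aggressor adds one extra unit of its coupling capacitance to a lumped RC delay; the class, function names and values are hypothetical.

```python
import math
from typing import NamedTuple, Tuple


class Aggressor(NamedTuple):
    name: str
    window_ns: Tuple[float, float]  # (earliest, latest) switching time
    c_couple_f: float               # coupling capacitance to the victim


def windows_overlap(a, b):
    """True if two closed intervals (start, end) share any point in time."""
    return a[0] <= b[1] and b[0] <= a[1]


def incremental_delay_s(victim_window_ns, r_drive_ohm, aggressors):
    """Extra victim delay caused by aggressors that can switch with it."""
    c_active_f = sum(a.c_couple_f for a in aggressors
                     if windows_overlap(victim_window_ns, a.window_ns))
    return math.log(2) * r_drive_ohm * c_active_f


def needs_fix(slack_s, delta_s):
    """A net is queued for a fix when the shift consumes all of its slack."""
    return slack_s - delta_s < 0


# Illustrative victim with two neighbors; only the first overlaps in time.
neighbors = [
    Aggressor("a1", (1.5, 2.5), 10e-15),
    Aggressor("a2", (3.0, 4.0), 20e-15),
]
delta = incremental_delay_s((1.0, 2.0), 1000.0, neighbors)
queued = needs_fix(slack_s=5e-12, delta_s=delta)
```

Nets that come out queued would then go to the router for buffer insertion, extra spacing or shielding.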
Any post-routing correction must be done very carefully or it may cause more problems than it fixes. The newest place and route tools support post-route buffer insertion and other fixes while minimizing impact on other nets. Without these features, attempts to fix signal-integrity problems during routing can cause additional problems with timing or signal integrity.
The ideal solution for detecting and preventing signal-integrity and design-integrity problems should be integrated, holistic and flow-based, so problems can be identified and fixed as they occur during design.