When chip implementation commences today, engineers have a pretty good handle on the clock speed the chip must run at. They use these clock requirements to constrain synthesis tools and then use static timing to see if a design meets constraints. Inevitably, after a long synthesis run, they find that the chip does not meet timing and from there on things start going downhill.
The implementation team needs to identify if a timing problem is real. If it is, then something about the design needs to change to fix the problem. If the timing problem is not real, then a timing exception (false or multi-cycle path) needs to be added to the constraint file. Either way, considerable time is spent wading through timing reports, making sense of the timing problems and then performing new timing-closure iterations.
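To make the distinction concrete, here is a minimal Python sketch of how a timing exception changes the verdict on a path. The `Path` class, its field names, the path names and all the numbers are hypothetical, chosen only for illustration; setup time, clock skew and hold checks are ignored.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Path:
    name: str
    delay_ns: float           # delay from launching to capturing register
    false_path: bool = False  # assumed never exercised functionally
    multicycle: int = 1       # clock cycles the path is allowed to take


def setup_slack(path: Path, period_ns: float) -> Optional[float]:
    """Return setup slack in ns, or None if the path is not timed at all."""
    if path.false_path:
        return None
    # A multi-cycle path gets `multicycle` clock periods instead of one.
    return path.multicycle * period_ns - path.delay_ns


period_ns = 2.0  # hypothetical 500 MHz clock
paths = [
    Path("a_to_b", delay_ns=1.5),
    Path("cfg_to_core", delay_ns=3.0, false_path=True),
    Path("mul_to_acc", delay_ns=3.0, multicycle=2),
    Path("x_to_y", delay_ns=3.0),
]

report = {p.name: setup_slack(p, period_ns) for p in paths}
for name, slack in report.items():
    if slack is None:
        print(f"{name}: not timed (false path)")
    elif slack < 0:
        print(f"{name}: VIOLATION, slack {slack} ns")
    else:
        print(f"{name}: met, slack {slack} ns")
```

Three of the four paths have the same 3.0 ns delay, yet only `x_to_y` is reported as a violation. Without the two exceptions, all three would show up in the timing report and someone would have to work out which ones are real.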
If the implementation team does not know much about the design (in an ASIC flow, for example), then you really are in for a long, patience-testing, time-zone-spanning grind. If test logic was already inserted and new timing exceptions are "discovered", then the test logic needs to be modified through an ECO process.
When networking chips have multiple design modes, the process of iteratively finding, verifying and adding timing exceptions needs to be performed for each of the design modes. There is considerable hand-wringing about which mode to optimize (one or all) and how each of these modes needs to be constrained.
Then, when the design is ready to be taped out, these huge constraint files with competing exceptions (is it false or multi-cycle?) are used during static-timing signoff. No one's willing to bet that they got the constraint file right, so it's off to the simulation farm and several days of gate-level timing-annotated simulation to make sure that the design is appropriately constrained.
Finally, out comes a chip, but because timing exceptions have been added only when absolutely necessary, it consumes way more area and power than it needs to. After all, nearly every timing path on the chip was required to run at clock speed when it was synthesized.
The crux of the problem is that the relaxations to clock requirements (the fact that not every path on the chip needs to run at clock speed) are not in place at the start of the chip-implementation flow. Instead, timing exceptions are added as chip implementation progresses, as a means to "get around" timing problems.
Consider, instead, a chip-implementation flow where, right from the start, the golden timing constraints for a design are frozen. These golden constraints would specify not only the clocks on a design, but also a complete set of exceptions to clocking. So, at about the same time that the RTL has been deemed to be functionally correct, chip implementation commences with complete constraints.
If a design has too many timing exceptions, engineers are able to revisit and "clean up" the RTL. With complete timing exceptions at the start of the chip-implementation flow, design budgeting tools are able to make good decisions when computing timing budgets for blocks. Instead of simply breaking a clock cycle into pieces and assigning one piece to each block, they are able to compute timing budgets that take timing relaxations into account.
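The budgeting arithmetic can be sketched in a few lines of Python. This is an illustration under simple assumptions, not a description of how any particular budgeting tool works: the function `block_budgets`, the block names and the weights are all hypothetical, and the path is assumed to cross each block exactly once.

```python
def block_budgets(period_ns, weights, multicycle=1):
    """Split the time available to a path across the blocks it crosses.

    weights maps a block name to its relative share of the path.
    multicycle is the number of clock cycles the path may take.
    """
    available_ns = multicycle * period_ns
    total_weight = sum(weights.values())
    return {
        block: available_ns * weight / total_weight
        for block, weight in weights.items()
    }


period_ns = 2.0                       # hypothetical clock period
weights = {"blk_a": 1, "blk_b": 1}    # path crosses two blocks equally

# Naive budgeting: one clock cycle is split between the blocks.
naive = block_budgets(period_ns, weights)

# Exception-aware budgeting: the path is known to be a 2-cycle path.
relaxed = block_budgets(period_ns, weights, multicycle=2)

print("naive:  ", naive)
print("relaxed:", relaxed)
```

With the multi-cycle exception known up front, each block in this example gets twice the budget for the same path, which is time the block-level tools no longer have to buy with larger cells and extra buffers.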
Good block timing budgets, coupled with golden constraints for each block, allow physical synthesis to optimize logic only when absolutely necessary. This reduces gate count, reduces buffer insertion and results in cells that consume less power. Placement tools have more flexibility because there are fewer critical paths, and cells that are not in the critical path can be placed farther away from each other, reducing congestion.
When test logic is ready to be inserted, all the timing exceptions are in place so the appropriate test logic can be inserted without an ECO process. When the implementation is ready for static-timing signoff, any timing problems that are reported simply need to be fixed.
No time needs to be spent trying to figure out whether the problems are real or not. Gone are the timing-closure iterations that result from the piecemeal addition of timing exceptions. Finally, if the timing exceptions were automatically generated and verified by a third-party tool (much as synthesized gate-level netlists are verified using equivalence checkers), there is no need for gate-level simulation.
Something to consider, then: is the surest way to a push-button chip-implementation flow through a design methodology that requires the upfront automated generation and verification of timing exceptions, coupled with chip-implementation tools that are able to leverage these exceptions to reduce the area and power consumption on chips?
Ajay Daga is CEO of startup FishTail Design Automation.