Gigahertz design is a very exclusive neighborhood that requires skilled hand-crafting by teams of engineers during all phases of a design.
How can this costly, time-consuming requirement be reconciled with the realities of mainstream markets, which cannot afford to invest as much money or sacrifice as much time to market? A potential solution is emerging that has already delivered 400 MHz in 0.18-micron CMOS, in less time and with greater ease than it typically takes to attain 200 MHz.
The objective is to deliver the benefits of expertly hand-crafted, dedicated circuitry in a cost-effective manner that does not sacrifice any time to market, and potentially decreases it. On a practical note, for designer acceptance it must also be merely a displacement technology within current design flows rather than a disruptive and costly methodology change.
Currently, the only way designers can acquire the benefits of hand-crafting without doing the work themselves is to license pre-designed circuitry that has already been heavily optimized as hardened intellectual property (IP) cores. The benefit of hardened circuitry is that it comes with guarantees. When properly implemented, hardened IP, by its very nature, operates at a guaranteed performance level across a robust set of conditions. With the variables introduced by automated tools eliminated, all characteristics are known before a circuit is used, which allows more accurate simulation and earlier design completion.
However, an often-heard complaint about large cores of hardened circuitry is that "it doesn't do exactly what I need it to do". This usually results in "just a minor modification" that inevitably disturbs the carefully crafted balance of variables in the design, creating a sudden unpredictability and loss of operational or performance guarantees. If prevented from making "just one small change," designers will declare a methodology "too inflexible to be capable of implementing everything I need in my designs."
The subsequent cost of debugging and re-verifying the IP core also defeats the (theoretical) concept of amortized development costs across numerous customers by an IP vendor.
High speed blocks
The requirement, therefore, is to create a wide selection of hardened building blocks with a granularity small enough to ensure that nearly all digital functions can be constructed, but still architecturally large enough to eliminate performance issues. An example of this style of hardened IP is the set of speed-optimized groups that Telairity Semiconductor has begun to deliver.
Telairity's engineers defined a varied gallery of architectural building blocks, then expertly crafted them to operate at high speeds within a unique set of design rules. Those rules ensure rapid portability across foundries and processes, and enable a methodology for assembling the blocks that maintains their performance. The front end of a chip design using the groups follows the familiar hierarchical HDL methodology, with the exception that the lowest level in the hierarchy infers these high-performance groups of hardened IP.
As a way to ensure that all possible circuit combinations could be constructed, traditional synthesis tools were initially designed to utilize a very small building block, typically a single gate of logic. When it was determined that small, regular circuit structures were being repetitively constructed throughout a design's implementation, most tools evolved to also utilize small, pre-defined clusters of gates.
In effect, the granularity of a design's building blocks was increased so that only optimal implementations of these "standard cells" were used, improving the results of the final design. Unfortunately, the massive amount of interconnect wiring between all these gates and cells has steadily become the source of most timing problems as chip lithographies have shrunk.
Enormous amounts of money are being spent trying to develop tools that can optimize the wiring that has been automatically generated by other tools. The placement of circuits, wire lengths, and crosstalk between wires are all significant challenges. The most difficult issue for optimizing these synthesized designs is that there is no way to effectively communicate an engineer's intention for each design element to a tool suite.
Telairity's approach was to define groups of circuitry that average one thousand gates each, one hundred times the size of standard cell library elements. Research has shown this to be the optimal circuit size for IP reuse, marking the best trade-off between circuit size and design flexibility. Hand-crafting groups of this gate count ensures signal integrity for a significant portion of a design's wiring.
In addition to all the physical hand-crafting, gigahertz designers tend to heavily pipeline their designs, using only five or six levels of logic between flip-flops to reach 1 GHz operation. Each group in Telairity's current gallery provides three times the logic levels (fifteen logic levels) to remain compatible with the design practices of today's mainstream designers, yet still easily reach 400 MHz.
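The relationship between pipeline depth and clock rate described above can be checked with back-of-the-envelope arithmetic. The Python sketch below is a simplification, not a timing model: it assumes the whole clock period is shared evenly by the logic levels and ignores flip-flop setup time, clock-to-output delay, and clock skew. The function name is purely illustrative.

```python
def delay_budget_per_level_ps(clock_mhz: float, logic_levels: int) -> float:
    """Average delay available to each logic level, in picoseconds.

    Simplifying assumption: the full clock period is divided evenly among
    the logic levels; flip-flop overhead and clock skew are ignored.
    """
    if clock_mhz <= 0 or logic_levels <= 0:
        raise ValueError("clock_mhz and logic_levels must be positive")
    period_ps = 1_000_000.0 / clock_mhz  # a 1 MHz clock has a 1,000,000 ps period
    return period_ps / logic_levels


if __name__ == "__main__":
    # Figures taken from the text: 5 or 6 levels at 1 GHz, 15 levels at 400 MHz.
    cases = [
        ("gigahertz style, 5 levels", 1000, 5),
        ("gigahertz style, 6 levels", 1000, 6),
        ("group style, 15 levels", 400, 15),
    ]
    for label, mhz, levels in cases:
        budget = delay_budget_per_level_ps(mhz, levels)
        print(f"{label}: {budget:.0f} ps per level")
```

Under these assumptions, fifteen levels at 400 MHz leave about 167 ps per level, roughly the same budget as six levels at 1 GHz. That suggests the difference between the two styles lies mainly in pipeline depth rather than in per-level speed.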
To maintain the integrity of the groups throughout their assembly into larger chips, Telairity decided to reserve the first three layers of metal for internal group wiring only. Inter-group wiring is never allowed to route through any hand-crafted layers, and must be implemented in layers four and five only. This becomes a relatively simple routing exercise, since the number of wires is approximately two orders of magnitude smaller than the number a synthesized design needs to route. Proprietary techniques are also defined to ensure that clock skew and power sag are not a problem, even with the high currents of deep-submicron CMOS.
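The layer reservation rule lends itself to a simple automated check. The Python sketch below is hypothetical: the `WireSegment` record and the function name are invented for illustration and do not correspond to any real tool format or to Telairity's own flow. It encodes only the rule stated above, namely that inter-group wiring must stay on metal layers four and five.

```python
from dataclasses import dataclass

# Rule from the text: wiring between groups may use metal layers 4 and 5 only.
INTER_GROUP_LAYERS = frozenset({4, 5})


@dataclass(frozen=True)
class WireSegment:
    """Hypothetical record for one routed wire segment (not a real tool format)."""
    net: str
    layer: int
    inter_group: bool  # True if the net connects two different groups


def inter_group_violations(segments):
    """Return the inter-group segments routed outside metal layers 4 and 5."""
    return [
        seg for seg in segments
        if seg.inter_group and seg.layer not in INTER_GROUP_LAYERS
    ]


if __name__ == "__main__":
    # Made-up routing data, for illustration only.
    routed = [
        WireSegment("alu_carry", layer=2, inter_group=False),  # inside a group
        WireSegment("bus_a", layer=4, inter_group=True),       # allowed
        WireSegment("bus_b", layer=3, inter_group=True),       # breaks the rule
    ]
    for seg in inter_group_violations(routed):
        print(f"net {seg.net} uses reserved layer {seg.layer}")
```

The check deliberately says nothing about which layers intra-group wiring may use beyond the first three, since the text does not state whether a group's internal wiring can also reach the upper layers.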
It is important to note that the methodology does allow the inclusion of IP blocks from other sources, such as third-party vendors, legacy circuits from previous designs, or esoteric circuits for corner-case applications that cannot be constructed from the company's gallery of groups. These circuit blocks need only be hardened before they are inserted into the standard flow.