Design Tools
Toshiba produces chips used in end-user applications such as multimedia, networking, communications, graphics, and PCs. It serves internal and external customers whose designs are implemented in 0.4- to 0.25-µm processes and range from 100,000 to 800,000 gates, with 2 to 20 clocks running at 66 to 120 MHz. Because of increasing performance requirements, clock design has become a critical step in the company's overall design flow.
The quality of the clock network--skew, in particular--plays an important role in design performance. Clock skew has a direct impact on a design's maximum operating frequency because it reduces the time available for signals to propagate between storage elements. Managing clock skew and delay can make the difference between a high-performance product and one that barely meets market requirements. Indeed, the ability to generate a clock tree that meets customers' requirements for skew, transition time, and insertion delay is critical to the success of most of the designs Toshiba builds today. The inability to generate a good clock tree can lead to poor processing yield, which in turn leads to higher costs and the inability to meet customers' volume needs.

Because older approaches to physical clock implementation pose serious problems, especially for deep-submicron (DSM) design, Toshiba developed its own clock tree synthesizer a few years ago. Recently, though, it looked for another clock tree solution that could be made available to its customers as part of its design flow. The tool also had to be easy to use and constraint-driven. Toshiba turned to CTgen from Cadence Design Systems. It found, in initial projects, that CTgen worked well as a point tool and, in a broader context, fit well in a timing-driven design flow that bridged the gap between logical and physical design. By optimizing clock trees based on physical placement information, CTgen will help Toshiba avoid timing failures and obtain the highest performance a technology can provide without missing a design's time-to-market window.

In the past, Toshiba employed clock schemes--looping, mesh/grid, trunk with buffer--that were hard and time-consuming to implement. In addition, they didn't provide tight control over delay and skew. As design complexity grew, Toshiba found that it required not only a better method of handling clocks but also an automated one.
It also found that older approaches to physical clock implementation ran into difficulties in deep-submicron technologies. Clock looping and mesh/grid schemes (see Figure 1) add an excessive amount of wiring to the clock network and therefore slow down the clock. A clock trunk with buffers has load-balancing and partitioning problems. In addition, these techniques often require many hours of manual work to create low-skew clock trees.
The obvious solution was a logic synthesis tool that could be used to synthesize both the chip logic and the clock trees. However, in deep-submicron (DSM) design the success of synthesis-generated clock trees depends on the actual physical placement of the clock network. Since the most commonly used synthesizer today operates without knowledge of the design topology (placement), its clock tree results are unreliable. Looking forward, with systems on a chip and the inevitable embedded clocks, the problem only gets worse with traditional synthesis technologies.
Underlying problems with the approaches above were large clock delays and skews that made the trees unusable. As a result, Toshiba decided to employ a more versatile clock tree design methodology and introduced its own clock tree synthesizer in 1994. It was used successfully for generating clock trees for clock nets with more than 6,000 flip-flops, producing an insertion delay of about 2 ns and a skew of less than 100 ps.

Although Toshiba continues to maintain both its clock tree tool and its R&D efforts, it recently began looking for another solution that was easy to use, constraint-driven, and could be made available to its customers as a component of its design planning solution. Because of tight constraints placed on clock trees in the high-speed designs it receives today, Toshiba needed a high-capacity, high-performance technology for clock design. It evaluated CTgen and found that it had proved capable of building clock trees for high-density designs--one manufacturer used it to generate a clock tree for a structured-custom system-on-a-chip design containing more than 500,000 gates. So Toshiba decided to use it in its deep-submicron design flow.

Goals

Toshiba's primary challenge in clock tree design is minimizing skew, although controlling insertion delay is still important. Minimizing insertion delay can improve performance, but designs can tolerate some delay as long as skew is kept within an acceptable window. In synchronous designs, setup and hold time errors resulting from excessive clock skew can prove devastating.

In addition to skew issues, Toshiba's designers wanted to apply generated clock trees to nets other than the clock nets (set/reset, for example) to reduce loading and delay. Even though skew and balancing aren't critical for those nets, the company wanted a clock tree generator to support this design style.
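To make the link between skew and setup/hold errors concrete, here is a minimal sketch of the standard textbook slack checks for a single register-to-register path. It is not Toshiba's or CTgen's analysis. The `Path` fields, the function names, and every number used with them are illustrative assumptions, with all times in picoseconds.

```python
from dataclasses import dataclass


@dataclass
class Path:
    """One flip-flop-to-flip-flop path (hypothetical model, times in ps)."""
    clk_to_q: int         # clock-to-Q delay of the launching flip-flop
    logic_max: int        # slowest combinational delay along the path
    logic_min: int        # fastest combinational delay along the path
    setup: int            # setup time of the capturing flip-flop
    hold: int             # hold time of the capturing flip-flop
    launch_arrival: int   # clock insertion delay at the launching flip-flop
    capture_arrival: int  # clock insertion delay at the capturing flip-flop


def path_skew(p: Path) -> int:
    """Skew seen by this path: capture clock arrival minus launch arrival."""
    return p.capture_arrival - p.launch_arrival


def setup_slack(p: Path, period: int) -> int:
    """Negative when late data misses the next capture edge."""
    return period + path_skew(p) - (p.clk_to_q + p.logic_max + p.setup)


def hold_slack(p: Path) -> int:
    """Negative when early data races through on the same clock edge."""
    return p.clk_to_q + p.logic_min - (p.hold + path_skew(p))


if __name__ == "__main__":
    period = 10_000  # 100 MHz clock, an assumed value
    balanced = Path(500, 8000, 200, 400, 300, 2000, 2000)
    late_capture = Path(500, 8000, 200, 400, 300, 2000, 2500)
    print(setup_slack(balanced, period), hold_slack(balanced))
    print(setup_slack(late_capture, period), hold_slack(late_capture))
```

In this model, a capture clock that arrives early eats into setup slack and therefore into the achievable clock frequency. A capture clock that arrives late eats into hold slack, and a hold failure cannot be fixed by slowing the clock down.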
Toshiba uses CTgen to build the clock tree with the fewest buffers, based on constraints--min/max insertion delay and max skew--that the designer provides. Although the preferred flow is to have the tool automatically select the best topology (the number of levels and the number and type of buffers at each level), sometimes it's necessary for the designer to specify the topology--that is, that special buffers be used at certain levels of the clock tree--in order to meet highly demanding clock constraints.

Given a clock root pin, CTgen automatically traces through the design and finds the extent of the clock tree (the set of clock pins to which the constraints apply). To minimize skew, it constructs a tree like the one shown in Figure 2. Multiple levels of buffers (and/or inverters) support the required clock loading, as well as work to meet delay and skew requirements. As shown, the tool uses multiple branches of the tree to connect clusters of flip-flops that are physically near each other in the placed design.

To determine loading values, Toshiba uses conventional pin capacitance and estimated interconnect capacitance. CTgen calculates the delays using the Elmore delay model for interconnect and table delay models in the technology (TLF) file for gate delays.

While tracing through a design in Toshiba's initial projects, CTgen also recognized any gating components and removed any preexisting buffers and inverters inserted as placeholders during the traditional synthesis process. CTgen kept track of polarity to ensure that any inverting paths in the original tree remained inverting paths in the final tree. The ability to recognize gating components and to minimize skew across gated subtrees is an important feature, since some customers have strict low-power requirements.
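The Elmore interconnect delay mentioned above can be illustrated with a short sketch. This is the generic textbook formulation for an RC tree, not CTgen's implementation. The function name, the array-based tree encoding, and the example values are assumptions chosen for illustration. Resistances are in ohms and capacitances in femtofarads, so the delays come out in femtoseconds.

```python
def elmore_delays(parent, res, cap):
    """Elmore delay from the driver to every node of an RC tree.

    parent[i]: index of node i's parent, or -1 for the driver node.
    res[i]:    resistance of the segment feeding node i (for the driver
               node, its output resistance).
    cap[i]:    capacitance lumped at node i (wire plus pin load).
    """
    n = len(parent)
    children = [[] for _ in range(n)]
    root = -1
    for node, p in enumerate(parent):
        if p < 0:
            root = node
        else:
            children[p].append(node)

    # Breadth-first order, so every parent appears before its children.
    order = [root]
    i = 0
    while i < len(order):
        order.extend(children[order[i]])
        i += 1

    # Total capacitance at or below each node, accumulated leaves-first.
    downstream = list(cap)
    for node in reversed(order):
        if parent[node] >= 0:
            downstream[parent[node]] += downstream[node]

    # Each segment adds (its resistance) x (all capacitance it must charge).
    delay = [0] * n
    for node in order:
        upstream = delay[parent[node]] if parent[node] >= 0 else 0
        delay[node] = upstream + res[node] * downstream[node]
    return delay


if __name__ == "__main__":
    # Driver (node 0) feeds a branch point (node 1) with two sinks (2 and 3).
    parent = [-1, 0, 1, 1]
    res = [100, 50, 20, 40]   # ohms, illustrative values
    cap = [0, 10, 30, 30]     # fF, illustrative values
    delays = elmore_delays(parent, res, cap)
    print(delays)                   # [7000, 10500, 11100, 11700] (fs)
    print(delays[3] - delays[2])    # 600 fs of skew between the two sinks
```

The two sinks carry identical loads, yet the longer, more resistive branch arrives later. A clock tree generator has to balance out exactly this kind of imbalance.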
The original clock tree net in Figure 3 has been broken into several smaller nets driven by buffers that are placed near their respective pin clusters, and the regularity of the structure means that skew can be closely controlled. To satisfy the minimum insertion delay constraint, extra buffers are inserted (see Figure 4).

Making the physical connection

With DSM designs, semiconductor vendors are feeling the pain of correlation and convergence. Physical information is mandatory in the logic design phase, and most semiconductor vendors are finding that their customers have the best path to convergence when logic design uses the same physical algorithms as the place-and-route tools. It was therefore important that Toshiba's clock tree technology work from the same placement information as its place-and-route tools. Since the company used Cadence's Qplace, and CTgen worked with Qplace, CTgen was ideal in that regard.

CTgen uses placement information in several ways. For example, because placement information is available but routing hasn't yet been done, it has to estimate the routing using a technique such as the Manhattan minimum spanning tree (Manhattan MST). Based on the estimated routing, it calculates the Elmore RC along every output-pin-to-input-pin routing path and uses that value to determine the interconnect delay. In addition, CTgen's built-in routing estimator automatically detects large obstructions present in the routing area--fixed macros and reserved areas for power lines, for example--and routes around them before calculating the interconnect delay, ensuring good correlation between pre- and post-routing results.

Because clock tree generators insert buffers to create the tree, Toshiba wanted to ensure that the buffers are put in legal locations. CTgen handles the task in two steps. On the first pass, it ensures that no buffers are placed on top of large fixed macros such as RAMs or ROMs. This pass may leave some overlap between the clock tree buffers and other cells. Then the tool interacts transparently with Qplace to remove any remaining overlap, leaving all inserted buffers in legal places. Without this step, the designer would have to rerun the placer to validate the legality, a potentially tedious trial-and-error process.

Toshiba sees continuing advances in process technologies as a strong impetus for bringing physical design information into the logic design process. To that end, it's already recommending that customers begin employing floorplanning to predict delays and counteract the iterative nature of DSM design. Currently, Toshiba recommends that customers have access to and use CTgen at the design planning stage. In addition, it recently introduced the Design Planning System (based on Cadence's Logic Design Planner/Physical Design Planner, both designed to support CTgen), believing that without clock tree synthesis, no meaningful back annotation can be done for resynthesis, re-optimization, or timing analysis.
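The Manhattan MST routing estimate described under "Making the physical connection" can be sketched with Prim's algorithm over pin coordinates. This is a generic illustration and not CTgen's estimator: it ignores obstructions, and the function name and the coordinates are assumptions.

```python
def manhattan_mst(pins):
    """Minimum spanning tree over pins under Manhattan distance.

    pins: list of (x, y) placement coordinates.
    Returns (total_wire_length, edges), where edges holds (from, to)
    index pairs into pins.
    """
    n = len(pins)
    if n == 0:
        return 0, []

    in_tree = [False] * n
    best = [float("inf")] * n  # cheapest known connection into the tree
    link = [-1] * n            # tree node that provides that connection
    best[0] = 0                # grow the tree from the first pin
    total = 0
    edges = []

    for _ in range(n):
        # Prim's step: attach the cheapest pin not yet in the tree.
        u = min((i for i in range(n) if not in_tree[i]), key=lambda i: best[i])
        in_tree[u] = True
        total += best[u]
        if link[u] >= 0:
            edges.append((link[u], u))

        # Update the connection cost of every remaining pin via u.
        for v in range(n):
            if not in_tree[v]:
                d = abs(pins[u][0] - pins[v][0]) + abs(pins[u][1] - pins[v][1])
                if d < best[v]:
                    best[v] = d
                    link[v] = u
    return total, edges


if __name__ == "__main__":
    # Four pins at the corners of a 4 x 3 rectangle (illustrative units).
    length, edges = manhattan_mst([(0, 0), (4, 0), (4, 3), (0, 3)])
    print(length)      # 10
    print(len(edges))  # 3
```

The estimated segment lengths can then be converted to wire resistance and capacitance and fed into an Elmore delay calculation. A production estimator would also detour around fixed macros and reserved power-routing areas, as the article notes.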
Mohammad Mohsin is a senior EDA engineer in the advanced methodology group at Toshiba Electronic Components, Inc. in San Jose. His current work includes defining and implementing design methodologies for deep-submicron technologies and links to layout. He also has several years' experience in physical design.

Richard Feaver is a product engineering manager at Cadence Design Systems, Inc. in San Jose. He has worked as a test engineer and designer at AMD, Control Data, Data General, and National Semiconductor. He has been with ECAD/Cadence for about 13 years and has supported many software tools.

To voice an opinion on this or any Integrated System Design article, please e-mail your message to miker@isdmag.com.

Integrated System Design, March 1998

Copyright © 2000 Integrated System Design