United Business Media EE Times



Design Automation

Logic Synthesis for the Next Generation

Chromatic Research examines whether a new logic synthesis tool can handle the growing complexity of its full-custom media processors.

by Michael Klein


Keeping up with the hectic pace set by the PC multimedia industry isn't easy. At Chromatic Research, Inc. (Sunnyvale, CA), we're constantly pushing for more performance from our multimedia acceleration products. As we begin to develop our next generation of media processors, we need a logic synthesis tool that can cope with the products' rapidly increasing gate counts. A recent benchmark of the BuildGates software from Ambit Design Systems Inc. (Santa Clara, CA) gave us promising results--the software synthesized a large piece of logic without much user intervention and significantly improved the design's timing.

Chromatic's Mpact media processors are full-custom designs. At least two-thirds of our next-generation IC is random logic that we will synthesize and place and route ourselves. Our current media processor contains roughly 150 synthesized kgates. But we're quickly moving toward a much higher complexity level--over 500 synthesized kgates.

Our existing synthesis-based design flow is showing signs of strain under the pressure of these larger designs. Most of our emerging problems stem from the fact that our current software can't efficiently synthesize large pieces of logic, so each design must be broken down into many hierarchical blocks and sub-blocks that are each synthesized separately. This methodology often leads to unpredictable timing results. In addition, the time needed to prepare this disjointed design for synthesis is becoming prohibitive.

Currently, the synthesis process doesn't affect partitioning. Block partitioning is driven mainly by functional definition. The chip's architecture is defined by first-order physical constraints, such as the locality requirements of major blocks (gross floorplan) and bussing (minimizing top-level interconnect). Once this high-level partitioning is completed, the detailed physical characteristics of the chip dictate any further re-partitioning that is performed.

Benchmark

We recently benchmarked a beta version of the BuildGates software using an actual 21-kgate block from our next-generation processor. (The complete processor is currently being designed using our existing software and methodology.) It's a very complex block containing flip-flops, latches, datapath control logic, and multi-cycle paths, along with some very difficult timing paths. For this particular processor, 21 kgates constitutes a small to medium-sized block. There are two blocks that each have about 100 kgates.

For the 21-kgate block to run through synthesis in our existing methodology, we created a hierarchy of 38 sub-blocks. The sub-blocks were defined as individual modules in Verilog and synthesized separately. They were then combined into the full block.

Basically, the benchmark was intended to see if BuildGates could improve one piece of an existing design without using tool features or design libraries that were different from those used in the original design (see "Beyond the benchmark"). The tool's compatibility with our design environment allowed us to leverage the existing libraries and tool flow. However, the BuildGates environment can synthesize much larger pieces of logic, so much less design partitioning is required.

The original design was entered in RTL Verilog and run through synthesis to create a netlist. Our existing flow generates both Verilog and EDIF netlist formats, but BuildGates currently supports only the Verilog representation. We do all of our own standard cell layout using Aquarius-XO from Avant! (Sunnyvale, CA). After parasitic extraction, we feed the results into Cadence's Pearl static timing analyzer. We perform timing analysis at the block level first, and then finish with full-chip analysis.

Before synthesis, the designer of this block had to delve into the lower levels of RTL hierarchy and manually create the budget for each of the 38 sub-blocks. It's a very time-consuming process, because the designer has to tinker with scripts and derive constraint files that will get the desired speed results from the synthesis tool (see Figure 1).

The first approach that we took for our Ambit benchmark was to duplicate that exact flow with its 38 sub-blocks and the pre-determined timing budgets. We simply translated the existing timing-budget scripts into the Ambit script format. Because we were working with pre-released software, this wasn't an error-free process, but we worked the kinks out and also provided Ambit with feedback for the product. The pre-layout results were very good: 16 percent more speed for the entire block.

While pre-layout timing and gate counts generally point us in the right direction, what goes on silicon is really what counts. After placement and routing, we found that the block area was actually slightly smaller, total wire length was also somewhat less, and timing was slightly better for the BuildGates result, meeting all timing constraints without further iterations.

Then, as a second attack, we ran the block through the Ambit tool in one chunk, with no manual timing budgets for the sub-blocks. And we got good results: the timing was roughly 10 percent faster than the speed obtained from our existing methodology. We were encouraged not just by the resultant speed, but also by the run time and the reduced amount of human labor required prior to the actual synthesis run. We discovered that we can throw away all of the time-consuming hierarchy and script development, and we still get better timing results (see Figure 2).

Post-route results for the one-shot approach showed a reduction of more than 10 percent in routed block area compared with our existing flow. Like the hierarchical approach, the one-shot run met all timing constraints with extracted data, although with slightly less slack.

Developing the hierarchy and writing the scripts required at least two man-months of preparation. Our chips consist of, maybe, ten such blocks, so roughly one man-year is needed to prepare the entire chip for synthesis.

Another key point in our benchmark was to ensure that the results from every timing analyzer were identical. We were using several timing tools: the one built into our current synthesis tool, the one built into the Ambit synthesis tool, and Pearl from Cadence Design Systems Inc. (San Jose, CA). We want all of these tools to agree down to the picosecond when they are given the same data. We had a couple of snags up front with BuildGates' timing analyzer, but they were due to the preliminary nature of the software. We've resolved those issues and, since then, have had exact agreement on all the timing analyses.
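As a rough illustration of what "agree down to the picosecond" means in practice, the sketch below compares endpoint arrival times reported by several tools and lists every path on which they differ. The tool names, path names, and numbers are hypothetical placeholders rather than data from this benchmark, and the report format (a simple mapping from path to arrival time in integer picoseconds) is an assumption made for the example, not the output format of any of the tools named above.

```python
from typing import Dict, Optional

# Each report maps a timing endpoint to its arrival time in picoseconds.
# Integer picoseconds are used so that "exact agreement" is a plain equality test.
Report = Dict[str, int]


def find_disagreements(
    reports: Dict[str, Report],
) -> Dict[str, Dict[str, Optional[int]]]:
    """Return {path: {tool: arrival_ps}} for every path the tools disagree on.

    A path missing from one tool's report is recorded as None and therefore
    also counts as a disagreement.
    """
    all_paths = set()
    for arrivals in reports.values():
        all_paths.update(arrivals)

    mismatches: Dict[str, Dict[str, Optional[int]]] = {}
    for path in sorted(all_paths):
        per_tool = {tool: arrivals.get(path) for tool, arrivals in reports.items()}
        if len(set(per_tool.values())) > 1:
            mismatches[path] = per_tool
    return mismatches


if __name__ == "__main__":
    # Hypothetical numbers, for illustration only.
    example_reports = {
        "synthesis_tool_a": {"ctl/state_reg[0]/D": 4210, "dp/sum_reg[7]/D": 5875},
        "synthesis_tool_b": {"ctl/state_reg[0]/D": 4210, "dp/sum_reg[7]/D": 5875},
        "signoff_analyzer": {"ctl/state_reg[0]/D": 4210, "dp/sum_reg[7]/D": 5880},
    }
    for path, per_tool in find_disagreements(example_reports).items():
        print(path, per_tool)
```

A check of this kind is only meaningful when every tool has been given the same netlist, constraints, and parasitic data.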


Figure 1. The flows show the differences in complexity for our normal synthesis flow and the single-block flow.

While we didn't actually use BuildGates' scripting features, it's clear that the tool's open scripting language will be of help to our design flow. A scripting interface with full programmatic support is something very valuable that's missing from competing tools. Ambit's scripting language can give us access to many internal functions, which is something that we've needed for a long time. The flexibility of the tool to perform very specific optimizations of a logic block looks to be very powerful.

Beyond the benchmark
Because the Chromatic Research benchmark involved just one functional block of the media processor, it sidestepped the chip-level benefits offered by Ambit's BuildGates synthesis software. In addition, other features were overlooked because Chromatic Research's benchmark process simply ran an existing design through the synthesis tool and didn't require creating a design from scratch.

BuildGates was specifically designed for hierarchical synthesis, a necessary ingredient for creating complex deep-submicron devices. Its features help users design very large ICs with minimal intervention. For example, hierarchical constraint management means that users don't need to repeatedly apply the same constraints at each level of design hierarchy.

Automatic time budgeting is a feature that is beneficial to designers working at the chip-level. Rather than manually creating time budgets, BuildGates can automatically assign the correct amount of delay to each hierarchical block. This saves designers countless hours of work. To use the time-budgeting technology, designers create a first-pass netlist using a default budget. The time budgeter then automatically distributes the delays correctly throughout the hierarchy.
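To make the idea of distributing delay across a hierarchy concrete, here is a minimal sketch of one simple budgeting policy: scale each block's first-pass delay so that the blocks along a path together fit the clock period. This proportional policy is an assumption chosen for illustration. The sidebar does not describe the algorithm BuildGates actually uses, and the block names and delay values in the example are hypothetical.

```python
from typing import Dict


def budget_delays(
    first_pass_delays_ns: Dict[str, float], clock_period_ns: float
) -> Dict[str, float]:
    """Assign each block a delay budget proportional to its first-pass delay.

    The budgets sum to the clock period, so a block that was slow in the
    first-pass netlist receives a proportionally larger share of the cycle.
    """
    total = sum(first_pass_delays_ns.values())
    if total <= 0:
        raise ValueError("first-pass delays must sum to a positive value")
    scale = clock_period_ns / total
    return {block: delay * scale for block, delay in first_pass_delays_ns.items()}


if __name__ == "__main__":
    # Hypothetical path crossing three blocks; the first-pass total is 10 ns
    # and the target clock period is 8 ns.
    first_pass = {"decode": 2.0, "alu": 5.0, "writeback": 3.0}
    for block, budget in budget_delays(first_pass, clock_period_ns=8.0).items():
        print(f"{block}: {budget:.2f} ns")
```

A production budgeter would also have to handle many paths through each block, multi-cycle paths, and input and output delays at the chip boundary, none of which this sketch attempts.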

Users can customize BuildGates through its Tcl programming language interface, which enables users to quickly develop design-independent optimization techniques. Also, Ambit's open environment makes the design netlist accessible for debugging.

The tool's technology mapper uses a parallel algorithmic-based, rather than rules-based, approach. It uses a Boolean algorithm instead of a tree-based algorithm. The result is faster run times for very large designs and better quality of results. In addition, the algorithm-based approach can map to bigger cells with special switches. It improves productivity by synthesizing larger logic blocks, reducing user intervention and producing predictable timing results to reduce iterations.

Glen Anderson is an applications engineer for Ambit Design Systems Inc. (Santa Clara, CA).

The run time for this benchmark was good, and it should improve as the beta software evolves into a final product. We obtained the same result as with our existing synthesis flow but in less time. Actually, the first approach (duplicating the same methodology with 38 separate sub-block compiles) took about 24 hours, twice as long as the 12 hours needed by our current flow. However, the timing results were substantially better. The run time for Ambit's push-button approach was only five hours. So when we synthesized the whole block in one shot, we beat the existing run time by more than a factor of two.
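The run-time comparison can be worked through in a few lines. The hour figures below are the approximate ones quoted in the text; the dictionary labels and the function name are only illustrative.

```python
from typing import Dict

# Approximate synthesis run times, in hours, as quoted in the article.
RUN_TIME_HOURS: Dict[str, float] = {
    "existing flow": 12,
    "BuildGates, 38 sub-blocks": 24,
    "BuildGates, one shot": 5,
}


def speedup_vs_baseline(run_times: Dict[str, float], baseline: str) -> Dict[str, float]:
    """Return baseline_time / time for each flow (values above 1 are faster)."""
    base = run_times[baseline]
    return {name: base / hours for name, hours in run_times.items()}


if __name__ == "__main__":
    for name, ratio in speedup_vs_baseline(RUN_TIME_HOURS, "existing flow").items():
        print(f"{name}: {ratio:.1f}x")
    # existing flow: 1.0x
    # BuildGates, 38 sub-blocks: 0.5x
    # BuildGates, one shot: 2.4x
```

These ratios cover compile time only. They do not include the hierarchy and script preparation that the one-shot flow avoids.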

It's important to look beyond comparing the synthesis compile times, though. We really are looking at the improvements possible in the overall process. Spending less time on budgeting and scripting means we can spend more time developing better architectures or getting products to market sooner.


Figure 2. The new tool flow consumes far less script development time than the older tool flow.

We tried feeding the 21-kgate block through our existing synthesis flow without the hierarchy and time budgeting, and it gave us substantially worse results. With both tools, a better result was achieved when a large amount of human labor was put into developing the manual timing budgets. However, even without the timing budgets, Ambit produced better timing results than our current flow.

These initial results are promising. We have tried running our largest and most timing-critical block through BuildGates in one shot, and after some initial parser bugs were fixed, we successfully produced a netlist that was very close to meeting timing constraints, and with a significantly lower gate count than the existing block. Due to a netlist format problem, we have not yet been able to place and route this block, but we expect to do so soon. Other VLSI projects at Chromatic are also now benchmarking BuildGates. Through these efforts, we will get a better idea of which types of blocks, and which conditions, allow the tool to provide the most benefit.

Michael Klein is the director of VLSI Design for Chromatic Research, Inc. (Sunnyvale, CA).

To voice an opinion on this or any Integrated System Design article, please e-mail your message to michael@isdmag.com.


Integrated System Design, June 1997






Copyright © 1997 Integrated System Design Magazine
