United Business Media EE Times




How Magma makes timing

EEdesign.com


Since I remember the days before synthesis tools, I constantly marvel at how clever these little rascals are. However, even the recent generation of physically aware synthesis tools is running into problems meeting timing for today's humongous deep-submicron (DSM) designs. One company -- Magma Design Automation -- appears to have come up with a rather cunning solution.

Timing closure (or lack thereof)
One of the biggest considerations with today's multi-million-gate digital IC designs is achieving timing closure. A key problem is the way in which traditional synthesis tools perform their evaluations, which limits the number of gates they can work with at any one time.

For the purposes of a really quick example, let's assume we're working with a simple cell library that contains eight different size/drive-strength versions of an inverter, and four each for the AND, NAND, OR, and NOR gates. Now assume that the RTL source code for the design contains a statement like "w = !( (!x + y) & z )". This statement can be implemented using a variety of logic gate topologies, such as the four shown in Figure 1. Achieving the optimal timing solution is tricky, because substituting a larger, faster gate may actually slow that path down. This is because the new gate's larger input parasitics will slow whatever upstream gate is driving it.

Figure 1 - Alternative topology examples

If the synthesis tool considers every possible combination of the different-sized logic gates, this equates to 8 x 4 x 4 = 128 candidates for topology (a) and 8 x 4 x 8 x 4 = 1024 for each of (b), (c), and (d). In reality, of course, our example function would represent only a small portion of a much longer path. In attempting to meet the timing constraints for the entire path, the synthesis tool would have to perform numerous time-consuming evaluations.
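To see how quickly these numbers get out of hand, here is a minimal Python sketch of the counting. Only the variant counts (eight inverters, four of everything else) come from the example library above; the gate lists are my own guesses at what such topologies might contain, since all we know for certain is the 8 x 4 x 4 and 8 x 4 x 8 x 4 products.

```python
from math import prod

# Size/drive-strength variants per cell type in the example library.
VARIANTS = {"INV": 8, "AND": 4, "NAND": 4, "OR": 4, "NOR": 4}

def sizing_combinations(gates):
    """Count the ways to pick one size variant for every gate in a topology."""
    return prod(VARIANTS[g] for g in gates)

# Hypothetical gate lists chosen only to reproduce the products in the text.
topology_a = ["INV", "OR", "NAND"]           # 8 x 4 x 4
topology_b = ["INV", "NOR", "INV", "NAND"]   # 8 x 4 x 8 x 4

print(sizing_combinations(topology_a))
print(sizing_combinations(topology_b))

# A path that chains several such functions multiplies the counts together.
functions_on_path = 5  # assumed path length, for illustration only
print(sizing_combinations(topology_b) ** functions_on_path)
```

The last line is the real villain of the piece: the search space grows exponentially with the number of functions on the path, which is why exhaustive evaluation runs out of steam.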

In order to manage this type of problem, the design has to be artificially partitioned into blocks of about 100,000 gates or smaller. Timing budgets are then assigned to each block in a somewhat arbitrary fashion, the synthesis tool works on a block-by-block basis, and everything is "stitched back together" at the end.

Interconnect delays dominate over logic delays in DSM designs, which makes device timing and performance extremely dependent upon layout. Another core problem is that delay effects that used to be relatively insignificant -- such as those caused by crosstalk and voltage drop -- are now extremely significant in today's DSM IC technologies. Synthesis technology was developed during a time when both interconnect delays and these other effects could be ignored for all practical purposes. Unfortunately, the over-simplistic timing approximations that continue to be used by conventional synthesis tools mean that post-synthesis delay estimations can significantly diverge from the actual results generated by layout.

In an attempt to combat these problems, the latest generation of "physically aware" synthesis tools brings some level of physical knowledge as early as possible into the front-end design flow. These tools are "placement-savvy" in that they use design-dependent wireload estimations derived from the initial placement and global routing of the design.

However, even with these physically aware synthesis tools, significant discrepancies between post-synthesis and post-layout timings remain. For example, the router may choose different routing layers to those assumed during the synthesis step, which can substantially affect the ensuing interconnect delays. This means that engineers don't know if timing closure is even achievable until time-consuming layout, with typically multiple iterations, has been performed.

Gain-based synthesis
This is where things become difficult to describe in a short column, so the following presents a very simplistic view of a complex topic. In 1999, Bob Sproull, David Harris, and the internationally renowned Ivan Sutherland explored this topic in a book called "Logical Effort: Designing Fast CMOS Circuits."

Boiled down to a nutshell, it's possible to determine a logical effort (LE) value for each cell in a library, where LE represents the effect of a gate's internal topology on its ability to produce output current, and the individual LEs for the whole library are normalized to that of a simple inverter.

Based on underlying LE concepts, it's also possible to derive intrinsic delay (P) and gain (G) values associated with each cell (G values describe how the cell's performance is affected by its electrical environment - that is, what it's connected to). The trick is that, based on the way in which LE, P, and G are defined, they are largely independent of the size of the transistors used to form each cell.
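For readers who like to see the arithmetic, the textbook logical-effort model expresses the delay of a single stage as LE x G + P, where G is the ratio of the load capacitance to the cell's input capacitance. The Python sketch below uses that textbook form with the usual normalized values for an inverter and a two-input NAND; whether Magma's tools use exactly this expression is an assumption on my part, and the names `Cell` and `stage_delay` are purely illustrative.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Cell:
    name: str
    le: float  # logical effort, normalized to an inverter (LE = 1)
    p: float   # intrinsic (parasitic) delay, in normalized delay units

def stage_delay(cell, c_load, c_in):
    """Textbook logical-effort stage delay: LE * G + P, with G = c_load / c_in."""
    gain = c_load / c_in
    return cell.le * gain + cell.p

# Commonly quoted normalized values from the logical-effort literature.
INV = Cell("INV", le=1.0, p=1.0)
NAND2 = Cell("NAND2", le=4 / 3, p=2.0)

# Same gain (4) at two very different absolute sizes gives the same delay.
small = stage_delay(NAND2, c_load=12.0, c_in=3.0)
large = stage_delay(NAND2, c_load=48.0, c_in=12.0)
print(small, large)
```

The point of the two calls at the end is that the delay depends on the ratio of load to input capacitance rather than on the absolute transistor sizes, which is the size-independence described above.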

To cut a long story short, Magma has developed a patent-pending technique to take an existing cell library comprising multiple size/drive-strength versions of each type of cell, and use this to automatically generate special library abstracts called SuperCells. These SuperCells have associated LE, P, and G values, and they can be dynamically sized to accommodate their environment and loading. Thus, rather than continually analyzing multiple drive-strength cells for a given function, the SuperCell concept allows each logic function to be accurately modeled using just one cell (Figure 2).

Figure 2 - SuperCells (spring symbols indicate dynamic sizing capability)

"But what does all of this mean?" you cry. Well, it allows you (actually, Magma in this case) to create a synthesis tool that inputs the RTL for the design, the design constraints, and the standard library (*.lib) associated with the targeted implementation technology. It can then automatically generate the SuperCells based on the target library, and use these to predict and fix the path delays (note the word "fix", because this is where things start to get really cunning).

The first beauty of gain-based synthesis is its simplicity. Remember the statement "w = !( (!x + y) & z )" that we discussed before. Well, in order to determine which topology provides the best timing solution, the gain-based synthesis tool need only evaluate the SuperCell gain-based delays associated with each circuit configuration (Figure 3).

Figure 3 - Gain-based synthesis makes it easy to select the optimum topology
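As a rough illustration of why this comparison is so cheap, the following Python sketch sums LE x G + P along the worst path of two candidate implementations of "w = !( (!x + y) & z )", holding the gain of every stage at the same value. The two gate-level structures are my own De Morgan rewrites rather than the topologies in the figure, the LE and P numbers are the usual textbook values, and the fixed gain of 4 is an assumption for the sake of the example.

```python
# (logical effort, intrinsic delay) per cell type, in normalized delay units.
CELLS = {"INV": (1.0, 1.0), "NAND2": (4 / 3, 2.0), "NOR2": (5 / 3, 2.0)}

def path_delay(stages, gain):
    """Sum LE * gain + P over every stage on the path."""
    return sum(CELLS[s][0] * gain + CELLS[s][1] for s in stages)

# Worst-case paths through two hypothetical implementations of the function:
#   inv-nor-inv-nand : !x -> NOR2(!x, y) -> INV -> NAND2(.., z)
#   inv-nand-nand    : !y -> NAND2(x, !y) -> NAND2(.., z)
CANDIDATES = {
    "inv-nor-inv-nand": ["INV", "NOR2", "INV", "NAND2"],
    "inv-nand-nand": ["INV", "NAND2", "NAND2"],
}

def best_topology(candidates, gain):
    """Return the name of the candidate with the smallest path delay."""
    return min(candidates, key=lambda name: path_delay(candidates[name], gain))

GAIN = 4.0  # assumed fixed gain per stage
for name, stages in CANDIDATES.items():
    print(name, round(path_delay(stages, GAIN), 3))
print("best:", best_topology(CANDIDATES, GAIN))
```

Each candidate costs one short sum to evaluate, with no enumeration of drive strengths at all, which is the contrast with the 128 and 1024 combinations counted earlier.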

It's important to note that these evaluations are performed without exhaustive analysis of a complex search space and without the artificial estimation of track parasitics (yes, I know it sounds weird, but read on). More importantly, the evaluations are performed without guessing at or fixing cell sizes before the actual routing. All of this means that:

  • Synthesis times are dramatically reduced compared to traditional techniques.
  • The relative simplicity of gain-based calculations means this form of synthesis has a far greater capacity than other synthesis solutions. In turn, this means that a gain-based synthesis tool has the capacity to handle multi-million gate designs without resorting to artificial partitioning.
  • All timing optimizations are completed and all circuit delays are determined and frozen by the end of the synthesis step. This means that the synthesis tool can immediately detect if the design cannot be physically implemented, instead of having to perform multiple time-consuming layout iterations to learn the same thing.

Gain-based physical synthesis
But wait, there's more! Magma's logic synthesis technology is completely embedded into their physical synthesis environment, which includes two unique capabilities for holding timing constant (Figure 4).

Figure 4 - Gain-based synthesis defines a fixed timing plane

The first is size-driven placement, which is where SuperCell technology really comes into play. As the placement engine performs its task, all of the cells are dynamically sized to meet their timing budgets based on the actual loads they see. The key point here is that the smallest possible sizes are used for each gate so as to just meet the timing budget. This means the chip occupies the smallest amount of silicon real estate, which dramatically reduces congestion, power consumption, and noise-related problems.

Following placement, the SuperCells are reverse-mapped into the appropriate cells from the real cell library, and a load-driven routing engine is used to tune the width and spacing of interconnect so as to maintain the original timing budgets and to ensure signal integrity. Thus, both size-driven placement and load-driven routing operate within the fixed timing plane where they adjust cell sizes and loads in order to hold delays constant.
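The sizing and reverse-mapping steps can be sketched in a few lines of Python. This is emphatically not Magma's algorithm, just a toy version of the idea under the textbook delay model (delay = LE x G + P): given a fixed delay budget and the actual load, work out the largest gain the cell can tolerate, convert that into the smallest input capacitance that still meets the budget, and then pick the smallest real library cell at or above that size. The function names and the library of input capacitances are hypothetical.

```python
def required_input_cap(le, p, budget, c_load):
    """Smallest input capacitance that meets the delay budget for this load."""
    if budget <= p:
        raise ValueError("budget is at or below the cell's intrinsic delay")
    max_gain = (budget - p) / le   # from budget = le * gain + p
    return c_load / max_gain       # gain = c_load / c_in

def map_to_library(c_in_required, library_sizes):
    """Pick the smallest real cell whose input capacitance is large enough.

    Returns None when even the largest cell is too small.
    """
    for c_in in sorted(library_sizes):
        if c_in >= c_in_required:
            return c_in
    return None

# Hypothetical two-input NAND with four drive strengths (input caps in
# arbitrary normalized units) and a fixed delay budget of 6 units.
NAND2_SIZES = [1.0, 2.0, 4.0, 8.0]
LE, P, BUDGET = 4 / 3, 2.0, 6.0

for load in (5.0, 10.0, 30.0):
    need = required_input_cap(LE, P, BUDGET, load)
    print(load, round(need, 3), map_to_library(need, NAND2_SIZES))
```

Note how the delay budget never changes in the loop; only the chosen size moves as the load changes. When no library cell is big enough, the sketch simply returns None, whereas a real flow would presumably have other tricks up its sleeve, such as buffering or adjusting the interconnect.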

The Mind Boggles
As I said earlier, this column presents a very simplistic view of a very complex topic (you can learn a lot more from the white papers on Magma's web site). Every synthesis vendor is prone to singing the praises of their particular solution - often in four-part harmony - and all synthesis tools have their own advantages. (I'm fairly sure I'll be hearing from other vendors and possibly writing about their offerings in the not-so-distant future.)

Of course nothing is perfect (although my mother thinks I come close). For example, gain-based synthesis is not as effective when designing small blocks or working with 0.25 micron technologies or above in which interconnect delay effects are not as significant. Gain-based techniques are also less useful when working with microprocessor-type designs using custom cell libraries that don't have predictable output/input ratios. And a gain-based synthesis approach may be less effectual for designs that have dedicated placement requirements, such as high-end networking switches that have weird and wonderful electrical requirements that are outside the scope of the logical and timing implementation.

And of course nothing is going to make up for bad RTL. Sometimes the current incarnation of a design is never going to meet its timing constraints, no matter how good the synthesis technology is that you're using. Having said all of this, however, it seems to me that Magma has come up with something that's really rather interesting.

They've ended up with a fully integrated physical synthesis system with built-in logic synthesis, both of which use Magma's "FixedTiming" approach to totally eliminate timing closure iterations. That is, Magma has a single executable that takes you from RTL through synthesis and layout -- including all of the associated steps like clock and power design, test insertion, and signal integrity -- all the way to GDSII. This unique technology has the capacity to support multi-million gate designs with blazingly fast synthesis runtimes.

Timing is frozen at the end of the synthesis, at which point you are immediately made aware if your design has no chance of reaching timing closure, thereby saving you a lot of messing around performing time-consuming layout to discover the same thing. Subsequently, all downstream tools like place-and-route function in such a way as to maintain the timing established by synthesis, which means that post-layout timing constraints are attained without having to iterate back into synthesis.

When I first became aware of Magma's technology, my knee-jerk reaction, based on my background, was to consider it primarily in terms of front-end ASIC design engineers. However, while penning this column, I had a chat with Stuart Hamilton from NEC Electronics' System LSI division. Stuart told me that when a customer presents him with an RTL-level or gate-level netlist, he immediately uses Magma's SuperCell and FixedTiming technologies to determine whether or not the design can be physically implemented at the required speed. This allows him to quickly report any potential problems back to the customer without having to perform time-consuming and expensive layout. All in all, it sounds like Cool Beans to me.

Until next time, have a good one!

Clive (Max) Maxfield is president of Techbites Interactive, a marketing consultancy firm specializing in high-tech. Author of Bebop to the Boolean Boogie (An Unconventional Guide to Electronics) and co-author of EDA: Where Electronics Begins, Max was once referred to as a "semiconductor design expert" by someone famous who wasn't prompted, coerced, or remunerated in any way.





The views and opinions expressed in this column are strictly those of the author and should not be taken as an editorial position of EE Times or any of its other editors, publications or Web sites.


All materials on this site Copyright © 2009 TechInsights, a Division of United Business Media LLC All rights reserved.