The SoC design process is becoming more and more like playing in a casino.
The chip performance outcome is like betting on red or black in roulette:
50% of the designs at the 90nm node will fail to meet performance
specifications, according to Handel Jones of International Business Strategies (Los Gatos, Calif.).
Betting that your project will complete on schedule is like
betting on rolling a seven in craps. Only 15% of all IC design projects
complete on time, according to Ron Collett of Numetrics (Cupertino, Calif.), who
has benchmarked over 1,000 IC design projects. The bottom line is that the
design process for a complex SoC is no longer an engineered process; it has
become a game of statistical chance. What's going on here?
Let's look at the usual suspects. Is it deep submicron effects? While new
nanometer effects do add new problems, new tools are being brought to market
to deal with these problems. High-functioning design teams continue to close
the silicon-capacity-vs.-tool-capacity "design gap," just as they always
have.
Is it design abstraction level? Doesn't appear to be. Over the years we've
moved from masks to polygons to gates to RTL, reuse, and beyond. The total
amount of time spent developing design descriptions is decreasing as a
percentage of the overall design effort.
Let's look at what's happening to the design process itself. As design
intent abstraction levels have risen and deep submicron effects have
increased, the process by which we reduce descriptions into chips has grown
dramatically. Design flows are now extremely complex: many tools, many
steps.
Yet the manner in which we specify, manage and maintain these flows
has remained largely unchanged. We still use scripts and makefiles to
"automate" our implementation and verification design flows, just as we have
for the past 20 years.
Taken as a whole, the design process description is a mess. The number of
lines of script-ware is staggering. A large SoC may require more than
100,000 lines of scripting, which is no trivial amount of software
development!
Scripts are hard to debug and extremely fragile, which makes
them costly to operate and maintain. Scripts are hard to read and often
understood only by the engineer who wrote them; they offer no reuse of best
practices. And script management is expensive: design managers report that
at least 50% of their engineering resources are devoted to managing the push
of design data through the tools in their flows.
We are all attempting to write million-line software systems in assembly language. At some point, the system's complexity gets to be too high to handle. Based on poorly performing project outcomes, that time appears to have arrived.
Before we look at solving these problems, perhaps we can learn something by
looking at the slightly different field of software development. The
parallels are very clear. In the early stages of software design, projects
would most often be worked on by a single developer or a very small team.
As software has grown to be omnipresent, being used in everything from our
home computers to digital phones to the sophisticated computer technology
used in modern cars, the techniques used to manage software projects have
grown. We've graduated from using make, vi or emacs, and gcc to using advanced tool suites with integrated source-code control, project partitioning, and distributed project management.
What can we learn from the software experience to help us solve the chip
implementation problems? We need to introduce flow automation technology
that raises the abstraction level for the design process itself. Such a
shift abstracts away the lower-level details that are guaranteed to change.
The description is less verbose, so it is easier to understand, improve, and
maintain, and is truly reusable. True flow automation technology enables
reliable, fast, and tool-expert-independent design iterations, so that
engineers can focus on solving design problems and not be consumed with tool
data and operation problems.
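
To make the idea of a higher-level flow description concrete, here is a minimal sketch in Python. It is an illustration only, not any vendor's product or format: the step names, the FLOW table, and the two helper functions are all hypothetical. Each step declares only what it depends on; the run order, and the set of steps to repeat after a change, are derived from that description rather than hand-coded in scripts.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical flow description: each step lists the steps it depends on.
# The step names are assumptions chosen for illustration.
FLOW = {
    "synthesis": [],
    "floorplan": ["synthesis"],
    "placement": ["floorplan"],
    "clock_tree": ["placement"],
    "routing": ["clock_tree"],
    "timing_signoff": ["routing"],
    "physical_verification": ["routing"],
}


def execution_order(flow):
    """Return one valid order in which every step runs after its dependencies."""
    return list(TopologicalSorter(flow).static_order())


def steps_to_rerun(flow, changed):
    """Return the changed step and everything downstream of it, in run order."""
    affected = {changed}
    order = execution_order(flow)
    # Walking in dependency order guarantees a step's dependencies
    # have already been classified before the step itself is checked.
    for step in order:
        if any(dep in affected for dep in flow[step]):
            affected.add(step)
    return [step for step in order if step in affected]


if __name__ == "__main__":
    print(execution_order(FLOW))
    print(steps_to_rerun(FLOW, "routing"))
```

In this sketch, a late change that touches only routing means repeating routing and the two checks that follow it, while synthesis through clock tree are left alone. The flow description itself does not change when a tool or its command line does, which is the kind of detail a higher abstraction level is meant to hide.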
How do we know when flow automation is real?
When we have fast,
predictable, and repeatable netlist-to-layout turnaround times. When we
have a system that can manage the implementation of a chip as a group of
blocks designed by smaller teams dispersed throughout the globe, operating
on a 24/5 schedule (yes, we still need time off). When the system is built around a real-world ECO technology that understands how often design changes occur, often daily at first, and many times even after tapeout. The real test is reuse of design flow best practices across the enterprise along with the habitual reuse of IP.
The design process has become the wild card. The time has come to regain
control over design schedule and performance outcomes with new flow
automation technology. Do it, or continue to play against staggering odds
that are not in your favor.
Mark Bales is a fellow at ReShape (Mt. View, Calif.). He can be reached at mark@reshape.com.