MOUNTAIN VIEW, Calif. - Synopsys Inc. has undertaken a pioneering research effort aimed at C-language compilation for reconfigurable programmable architectures. The project could prove a turning point for the EDA industry, which has focused on design tools for custom silicon.
If programmable, platform-based systems-on-chip cut deeply into the custom silicon market, as expected, EDA companies that derive their revenue from ASIC design could be threatened. Synopsys is taking the bull by the horns with the Nimble Compiler project, which presents both a new hardware architecture and a software-based design methodology that bears virtually no resemblance to the existing EDA tool chain.
Funded partially by the Defense Advanced Research Projects Agency (Darpa), the project is a joint effort among Synopsys' Advanced Technology Group (ATG), Lockheed-Martin, the University of California at Berkeley and Germany's Technical University of Braunschweig. The compiler under development works with what the project terms the Agileware architecture, which includes a general-purpose CPU, a dynamically reconfigurable data path coprocessor and a memory hierarchy.
The compiler takes in pure ANSI C code and automatically selects computationally intensive loops that can be greatly accelerated in hardware. The code is turned into data-flow graphs and compiled into the data path, which in the research prototype is represented by a Xilinx Virtex FPGA, although researchers expect that other reconfigurable data path arrays will offer better cost and performance in the future.
Few real-world applications use dynamically reconfigurable architectures, and the group has no immediate plans to commercialize Nimble Compiler. But Synopsys ATG believes that architectures similar to Agileware could serve a broad range of computationally intensive, embedded DSP tasks. Such architectures could be implemented in system-on-chip (SoC) designs as purely programmable platforms that require no custom silicon and are differentiated solely on the basis of embedded software and programmable or reconfigurable hardware.
It's therefore not surprising that Nimble Compiler is utterly unlike anything Synopsys has commercialized thus far. It has no relationship to the company's synthesis products and doesn't even use the SystemC class library. And commercialization isn't likely anytime soon, said Randy Harr, director of research for Synopsys ATG.
"I started this project to think out of the box," Harr said. "I'm enabling people to avoid building custom chips. This may not be in the mind-set of a company that derives 100 percent of its revenue from people building custom chips."
Ahead of silicon
Harr acknowledged that Nimble Compiler is "ahead of the silicon" but said reconfigurable architectures such as Agileware are surfacing. The Nimble Compiler project has concluded that such architectures can offer better performance than very-long-instruction-word (VLIW) or superscalar architectures for a wide range of compute-intensive applications. In particular, Harr said, Agileware could prove useful for asymmetric digital subscriber line, asynchronous transfer mode, voice-over-Internet Protocol, image-processing and set-top box applications.
But will dynamically reconfigurable SoC architectures replace ASICs for some of those applications? "They will have to," Harr said. "More and more standards are designed for lots of variance. ASICs won't win in the long run when set-top boxes have to support 20 different formats."
Synopsys has never formally announced the Nimble Compiler project, but it was described in a Design Automation Conference 2000 paper authored by Harr and several collaborators. Synopsys ATG has also been offering occasional demonstrations for selected customers and members of the academic community, including one at Synopsys headquarters on Aug. 17.
The idea of a purely programmable system-on-chip platform, of which Agileware could be one variety, is not new. The seeds of such platforms are already being sown today, as FPGA vendors mix programmable logic blocks with processor cores on the same chip. As the trend continues, many observers believe that an increasing number of SoC designs will involve little or no custom hardware design.
The multi-university Gigascale Silicon Research Center (GSRC) has a group dedicated to "fully programmable systems" that's headed by Kurt Keutzer, professor of electrical engineering and computer science at the University of California at Berkeley. Perhaps not coincidentally, Keutzer is a former chief technology officer for Synopsys and, during his tenure there, hired Harr to launch the project that turned into Nimble Compiler.
Keutzer has predicted that software-programmable platforms will replace ASICs in many applications and will give birth to a new EDA industry that will provide such tools as compilers, estimators, schedulers, performance visualizers and debuggers. Nimble Compiler may be an early sign that such an industry is indeed emerging.
Keutzer's current work with the Gigascale Silicon Research Center has a somewhat different focus, however. For one thing, he said, GSRC is targeting networking applications, and its Mescal architecture puts a heavy emphasis on multiple processing elements. But the main difference is that GSRC is focusing on programmable rather than reconfigurable logic, although its scope could include some special-purpose execution units that are reconfigurable.
What's significant about Agileware is that its hardware is dynamically reconfigurable in the field. It thus differs from programmable architectures that are "configurable" only at the mask level, prior to fabrication.
There has been a surge of interest in reconfigurable architectures, most recently demonstrated by revelations of the Silicon Spice architecture. Earlier this month, Quicksilver Technology disclosed plans in this area. Other reconfigurable architectures announced this year include Malleable Technologies' Meca and Chameleon Systems' Reconfigurable Communication Processor.
While advocates of reconfigurable architectures point to their immense flexibility, not everyone agrees they're the best answer. "The reconfigurable approach presents significant challenges in creating efficient and predictable silicon solutions," said Cary Ussery, president and chief executive officer of Improv Systems, which provides a programmable VLIW architecture. "The Nimble approach is similar to Chameleon, Silicon Spice and Malleable; each of these has significant issues in providing scalability that can address the complexity of real-world applications and increasing performance."
The Nimble Compiler project is a long way from having a real-world application that would enable comparisons with other kinds of programmable systems. Harr acknowledged that the current AceV research prototype used by Synopsys is impractical for general-purpose use because of the slow clock speed and the cost of reconfiguring the Xilinx Virtex 1000.
Still, Harr said, early benchmarks run by the UC-Berkeley Garp project show that an Agileware-like architecture can extract more instruction-level parallelism out of C-language applications than a VLIW or superscalar processor can exploit. That presumably translates into higher performance for less area and power. The Garp project developed many of the concepts used in the Nimble Compiler effort.
The key to making any of the new programmable architectures work is the compiler, said Gary Smith, chief EDA analyst at Dataquest Inc. (San Jose, Calif.). "The problem we're having with VLIW right now is that the compilers aren't very good," he said.
But there is "no question" that programmable platforms will replace many ASICs if compilers can be made to work, Smith said.
From C to silicon
In its effort to come up with a compiler that works, the Nimble project has made extensive use of work done by Tim Callahan, a graduate student at UC-Berkeley. Callahan's research with the Garp project resulted in a prototype compiler that included vector/loop transformations, profiling, analysis and visualization, as well as a VLIW-like kernel technique that transforms loop bodies into hardware data flow graphs.
Callahan transferred his prototype compiler to the Nimble Compiler group, which has continued to expand it. The project is now targeting Agileware, which features a data path with tens to hundreds of configurable ALUs, configurable interconnects between ALUs and registers, and a configurable memory hierarchy.
The current implementation of Agileware is the AceV card. Based on a PCI processor card from TSI Telsys, it includes a 100-MHz Microsparc IIep processor and a reconfigurable logic array that includes a Xilinx Virtex 1000-4 with four 1-Mbyte SRAM banks.
The idea of the compiler is to exploit instruction-level parallelism in loop-centric code. It looks for a relatively small number of loops that offer the greatest opportunity for acceleration through parallelism. Those loops are converted into data flow graphs and compiled into the reconfigurable data path. The rest of the program executes as compiled C code on the processor.
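As a rough illustration of that selection step, the sketch below ranks profiled loops by estimated net gain and keeps only those that still come out ahead after paying for reconfiguration. The `LoopProfile` fields, the cycle counts and the per-entry reconfiguration cost are all hypothetical; the article does not describe Nimble's actual cost model.

```python
from dataclasses import dataclass


@dataclass
class LoopProfile:
    name: str
    entries: int    # how many times the loop is entered
    sw_cycles: int  # total cycles if the loop runs on the CPU
    hw_cycles: int  # estimated total cycles on the data path


def select_loops(profiles, reconfig_cycles, max_loops):
    """Return names of up to max_loops loops with positive net gain, best first."""
    def gain(p):
        # Pessimistic assumption: the data path is reloaded on every entry.
        return p.sw_cycles - p.hw_cycles - p.entries * reconfig_cycles

    ranked = sorted(profiles, key=gain, reverse=True)
    return [p.name for p in ranked if gain(p) > 0][:max_loops]


# Hypothetical profile data, for illustration only.
profiles = [
    LoopProfile("fir_inner", entries=10, sw_cycles=900_000, hw_cycles=100_000),
    LoopProfile("init_table", entries=1, sw_cycles=5_000, hw_cycles=4_000),
    LoopProfile("dct_rows", entries=200, sw_cycles=400_000, hw_cycles=80_000),
]
selected = select_loops(profiles, reconfig_cycles=10_000, max_loops=2)
print(selected)
```

In this made-up data set, the loop that is entered 200 times loses its speedup to reconfiguration overhead, which is the kind of trade-off a profile-driven selector has to weigh.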
In the current Nimble incarnation, ANSI C code is first read by the Chai front-end compiler. Key to Chai is an innovative profiling capability that can determine which loops should be accelerated in hardware. "We can determine how many configurations will take place before compiling," said Harr. "Loop entry trace profiling tells us exactly what will happen."
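To make the idea of counting configurations from a loop entry trace concrete, here is a minimal sketch. It assumes the data path holds exactly one loop configuration at a time and that the trace is simply an ordered list of loop names; both are simplifying assumptions, and `count_configurations` is a hypothetical helper, not part of Chai.

```python
def count_configurations(trace, hw_loops):
    """Count data path loads implied by a loop entry trace.

    Assumes a single resident configuration: entering a hardware loop
    that is not currently loaded costs one reconfiguration.
    """
    loaded = None
    loads = 0
    for loop in trace:
        if loop not in hw_loops:
            continue  # loop runs on the CPU; the data path is untouched
        if loop != loaded:
            loads += 1
            loaded = loop
    return loads


# Hypothetical trace: order in which loops A, B and C are entered at run time.
trace = ["A", "A", "B", "C", "A", "B", "B"]
n = count_configurations(trace, hw_loops={"A", "B"})
print(n)
```

Running the same trace against different candidate sets of hardware loops shows how the choice of loops changes the reconfiguration count before anything is compiled to the FPGA.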
The Adapt data path compiler takes in data flow graphs at a high level of abstraction. It performs module mapping, floor planning, scheduling and sequencing, and outputs a preplaced FPGA netlist. All that's needed to program the Xilinx Virtex device is routing. The entire compilation process can be as short as 5 minutes, Harr said, and most of that time is taken up by the routing performed by Xilinx's tool.
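Scheduling a data flow graph can be illustrated with as-soon-as-possible (ASAP) scheduling, a textbook technique in which each operation is placed in the earliest step after all of its inputs are ready. The sketch below is not Adapt's algorithm, which the article does not detail; it assumes an acyclic graph, unit latency for every operation and unlimited ALUs, and the node names are invented.

```python
def asap_schedule(dfg):
    """Map each node to its earliest step, given node -> list of predecessors.

    Assumes the graph is acyclic and every operation takes one step.
    """
    step = {}

    def visit(node):
        if node not in step:
            preds = dfg[node]
            step[node] = 0 if not preds else 1 + max(visit(p) for p in preds)
        return step[node]

    for node in dfg:
        visit(node)
    return step


# Hypothetical loop body: y = (a * b) + (c * d), then store y.
dfg = {
    "mul1": [],
    "mul2": [],
    "add": ["mul1", "mul2"],
    "store": ["add"],
}
schedule = asap_schedule(dfg)
print(schedule)
```

The two multiplies land in the same step, which is the instruction-level parallelism a data path with many ALUs can exploit.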
"All this is still being explored, but we're pretty pleased with what we've been able to do," said Harr. "I'm convinced you can have a configurable data path that can work with a wide range of applications without giving up too much."
Berkeley's Keutzer, however, believes that the Nimble project will ultimately be driven in a more application-specific direction and that it may be most amenable to application-specific reconfigurable architectures such as those from startups Silicon Spice, Malleable and Chameleon.
The Nimble Compiler approach differs significantly from Improv Systems' compilation technique. "In the Nimble approach, loops are offloaded into the reconfigurable coprocessor," said Improv's Ussery. "Our approach is to use microscheduling and software pipelining to optimize execution of the inner loop." Having a separate coprocessor, he said, will create overhead as loops are moved between registers.
Nonetheless, Ussery said, the project foreshadows a shift in the EDA industry. "There is a general trend away from large blocks of custom logic toward software-programmable processors," he said. "For EDA, this means a shift from tools that support block-based design of logic to configurable processor architectures and advanced compilation tools.
"This is a methodology, and a mind-set, that flips around the traditional thinking in the EDA industry."