High-level synthesis (HLS) is a key technology that links electronic system-level (ESL) design to register transfer-level (RTL) implementation. In addition to automating the ESL-to-RTL design flow, HLS enables efficient design space exploration that helps designers quickly arrive at a micro-architecture that meets their goals. However, traditional HLS technologies were applicable mainly to datapath-dominated designs and were not effective for control-intensive designs. They also required specific design styles and use models to achieve good quality of results (QoR).
In this article, we describe how we applied a commercial HLS tool (Cadence C-to-Silicon Compiler) to a NAND flash controller with an error correction code (ECC) block. The initial ECC design was based on an ECC software program, which led to a large area because of two large arrays. We then used our domain knowledge of ECC coding theory to restructure the code for hardware implementation. The implementation results show that (1) the HLS tool can achieve QoR comparable to handwritten RTL for a control-intensive design; (2) a design flow that properly considers the hardware implementation is a key factor in achieving good QoR in an HLS flow; and (3) an HLS flow delivers a factor-of-two gain in design productivity compared to an RTL flow.
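The article does not show the ECC source code or say what the two large arrays held, so the following is only an illustrative sketch of the kind of restructuring described above. It assumes, hypothetically, that the software model multiplied Galois-field elements through log and antilog lookup tables, a common software idiom for BCH and Reed-Solomon codes, and contrasts that with a shift-and-XOR multiplier that needs no memories. The field GF(2^4), the polynomial x^4 + x + 1, and every name in the code are assumptions chosen to keep the example small.

```cpp
#include <array>
#include <cstdint>

// Hypothetical small field for illustration: GF(2^4), primitive
// polynomial x^4 + x + 1 (0x13). A real NAND ECC uses a larger field.
constexpr unsigned kM = 4;
constexpr unsigned kFieldSize = 1u << kM;  // 16 elements
constexpr unsigned kPoly = 0x13;

// Software style: two lookup tables. In an HLS flow, arrays like these
// typically map to memories or large register banks.
struct GfTables {
    std::array<std::uint8_t, kFieldSize> exp{};  // exp[i] = alpha^i
    std::array<std::uint8_t, kFieldSize> log{};  // log[alpha^i] = i
    GfTables() {
        unsigned x = 1;
        for (unsigned i = 0; i < kFieldSize - 1; ++i) {
            exp[i] = static_cast<std::uint8_t>(x);
            log[x] = static_cast<std::uint8_t>(i);
            x <<= 1;                          // multiply by alpha
            if (x & kFieldSize) x ^= kPoly;   // reduce modulo the polynomial
        }
        exp[kFieldSize - 1] = exp[0];         // alpha^15 wraps to alpha^0
    }
};

// Table-based multiply: add the logarithms, then look up the antilog.
std::uint8_t gf_mul_table(const GfTables& t, std::uint8_t a, std::uint8_t b) {
    if (a == 0 || b == 0) return 0;
    return t.exp[(t.log[a] + t.log[b]) % (kFieldSize - 1)];
}

// Hardware-oriented multiply: shift-and-XOR with a fixed loop bound,
// so a synthesis tool can unroll it into pure combinational logic.
std::uint8_t gf_mul_logic(std::uint8_t a, std::uint8_t b) {
    unsigned acc = 0;
    unsigned aa = a;
    for (unsigned i = 0; i < kM; ++i) {
        if (b & (1u << i)) acc ^= aa;         // add a * x^i when bit i is set
        aa <<= 1;                             // next power of x
        if (aa & kFieldSize) aa ^= kPoly;     // keep aa inside the field
    }
    return static_cast<std::uint8_t>(acc);
}
```

Both functions compute the same product; the difference is in what they imply for hardware. The table version carries two arrays whose size grows with the field, while the logic version trades them for a handful of XOR gates per bit. Whether this matches the restructuring the authors actually performed is not stated in the article.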
Requirements of electronic system-level design and high-level synthesis
To address the challenges of today’s comprehensive system-on-chip (SoC) designs, raising the level of design abstraction is becoming mandatory. Similar to the shift from gate-level design to RTL in the mid-1990s, a transition from RTL to ESL is now emerging. Compared to RTL, the ESL methodology allows designers to spend less effort on design implementation, debugging, and verification. Furthermore, ESL provides sufficient flexibility and visibility so that hardware and software designers can co-design and co-optimize the system architecture through a unified platform. To implement the ESL hardware description in silicon, HLS must automate the ESL-to-RTL transformation while optimizing the design for implementation.
In the past, many academic and commercial HLS tools were developed and announced, but most of them were not widely adopted by the industry. Those early products had many limitations: they required a proprietary language or use model; they supported only datapath-dominated designs; and their performance, area, and power QoR were unsatisfactory. Since RTL had already been accepted by the industry, designers did not have much motivation to move to HLS at that time.
Today, the demand for mobile products that support myriad applications has increased dramatically. These devices often need specific hardware accelerators to improve system performance and must achieve low power consumption to prolong battery life. In addition, intensifying competition among companies is forcing engineers to shorten product development cycles to meet their time-to-market windows. These factors, combined with the fact that HLS enables design space exploration to help designers quickly optimize the RTL implementation, have motivated the industry to reconsider HLS and adopt it into the design flow.
Choosing a compiler
In our experiments, we chose the Cadence C-to-Silicon Compiler to implement the design. The compiler supports ESL standards including the SystemC language and transaction-level modeling (TLM). This support keeps the design consistent: the ESL design does not have to be transformed into another model for the HLS tool. The compiler is also capable of full-chip synthesis, in which designers can put datapath and control logic together. This overcomes the restrictions found in traditional HLS tools.
The compiler is also tightly coupled to a production implementation engine to estimate timing, area, and power during HLS. This ensures that the generated RTL is realistic and meets the original design goals when subsequently performing RTL synthesis.
To improve QoR, the compiler provides scheduling and resource binding algorithms, and it gives designers full control, through Tcl commands or interactive graphical exploration, to adjust micro-architectures and achieve different types of implementations. A set of analysis tools such as “Area Map Tree,” “Resource Binding Viewer,” and “Cycle Analysis Viewer” helps designers tune and refine designs.
In addition, the compiler can automatically generate fast behavioral Verilog simulation models after scheduling to support system performance analysis and validation. The verification wrapper supported by the compiler enables synthesized RTL circuits to be verified in the Cadence Incisive SystemC environment. With this support, verification effort can be greatly reduced.