Design teams commonly use system models for verification. System models have many advantages over register transfer level (RTL) code for verification, notably, because of their ease of development and runtime performance. The ability to leverage the system-level verification to create functionally correct RTL code has challenged many a design team until now. A methodology known as Sequential Logic Equivalence Checking (SLEC) has the unique capability to formally verify RTL implementations against a specification written in C/C++ or System C.
This article will describe the system-level design flow of a commercial graphics processing chip. In this flow, system models have been developed to validate the arithmetic computation of video instructions and then used to verify the RTL implementation using the SLEC methodology.
Defining a System-level Flow
As design complexity increases, system-level modeling becomes an inescapable approach to quickly simulate the entire system. The design complexity associated with functional partitioning and block interfaces, along with hardware/software co-design, expand exponentially making system verification mandatory.
C/C++ and SystemC are commonly used for system-level design and verification. In this example, C/C++ was used to model a video processing arithmetic block.
Once the system model was refined and verified, RTL designers wrote the Verilog code. While high-level synthesis tools can generate RTL code from system code, it is by far more common for design teams to manually code the RTL. Even though the RTL implementation was verified with numerous verification techniques, it was impossible to test all possible conditions with simulation-based methods.
Many verification tools and methods are in use today, including assertion-based verification, random stimulus generation and coverage-driven verification. None of them effectively leverage system models, though they should because system models are functionally trustworthy. A verification methodology such as SLEC can transfer the confidence of system models directly to the RTL implementation.
Verification of a Video Processing Arithmetic Block
The graphics processor market is driven by the quality of images, performance and consumer buying seasons. For designers creating the latest graphics processor chips, this mandates rapid development of new algorithms and designs. To meet this requirement, system models are used to close the gap between initial specification and tapeout. When this project started 18 months ago, directed random simulation was being used to verify the RTL code. This type of simulation ran for days, but left the design team uncertain and concerned that they had missed a design bug.
The block under test was the arithmetic block of an ongoing project that would create the next-generation video processing chip. This RTL block implemented both video and non-video instructions. The verification effort focused on 21 video instructions, ranging from "parallel shift" to "absolute difference with reduction" operations.
The goal for using the SLEC verification methodology was to improve RTL code verification prior to chip-level regressions. This methodology was used to catch RTL design bugs more efficiently, as well as find bugs that simulation missed, by using system models written in C++.
A system model of the arithmetic block was implemented in ~2,400 lines of C++ code. The first step in this project consisted of amending the C++ code to make it amenable to the SLEC methodology which requires that the constructs in the code must have a corresponding hardware representation. Since this model was not originally written for equivalence checking, some of the code constructs used to model the design did not fit this requirement. The design team used statement to filter constructs that did not have clear hardware concepts, such as reinterpret cast and static cast. These were modified in the C++ code.
Moving forward, the use of coding guidelines during C++ development will eliminate the need for recoding in future design blocks.
The design team then set up the verification environment. SLEC was controlled through a TCL interface that specified source files, reset states and sequential differences, such as timing and interface differences.
Sequential differences were specified as I/O mappings and design latency. For the system model, the designers created a SystemC wrapper that introduced a reset state and clock. By using a SystemC wrapper, they did not have to modify the C++ model. (Figure 1: The SystemC wrapper.)
1. The SystemC wrapper.
(Click this image to view a larger, more detailed version)
The RTL implementation of the video processor arithmetic block was ~4,600 lines of RTL code with a latency of seven clock cycles. The C++ system model had a latency of one cycle added by the System C wrapper. The design team then looked at how often a new set of input data was sent to each design, including a pipeline in the RTL code that accepted a new set of input data every clock cycle. Therefore, the throughput for both C++ and RTL was equal to one.
2. The input/output mapping and sequential parameters, such as throughput and latency, between the system and RTL models.
(Click this image to view a larger, more detailed version)
The SLEC methodology used sequential analysis and mathematical formal algorithms to verify that for the two models, all input combinations resulted in the same output over all time. Unlike simulation, SLEC verified all input conditions in parallel. For this project, that translated into simultaneously verifying all instructions under all possible input stimuli.
Because each video instruction implemented a specific arithmetic function, the designers decided to verify one video instruction at a time to improve debug efficiency. By knowing the instruction being tested, it was easier to identify the logic associated with the falsifications compared to debugging all instructions at the same time. Additionally, the SLEC runtime was reduced by verifying one instruction at time, further increasing debug efficiency.
While verifying the first instruction (VEC4ADD), nine design bugs were found in the RTL model and one in the system model, which showed the designer how to enhance the C++ code for future implementations. The following result indicated a design bug found by SLEC, causing a corner case condition missed by the clamping logic.
Notice that the port names, values and time of the bug are displayed.
Output-pair: spec.result ( 1, 1) 'h7f6e7f7f @ 533
impl.res_out ( 1, 7) 'h7f6e7f2f @ 653
The verification methodology identified design differences in short, 10 clock cycles or less, counter examples. For each counter example waveform, a VCD (or FSDB) file was generated to show the precise sequence of inputs that caused the design difference, making it easy to debug and locate errors.
For each of these instructions, SLEC found the difference and created the counter example waveforms in less than five minutes, SLEC also generated test benches representing the counter example that ran in simulation with the original C and RTL design. The testbenches contained monitors to expose the same design bug shown in the waveforms. In this project, the counter-example testbenches were then added to the local unit-level regression test suite.
After fixing the code for the VEC4ADD instruction, SLEC proved equivalency in 361 seconds using 52 MB. This can be compared to running 3.7 x 1034 test vectors to exhaustively simulate this instruction which would take more than a lifetime on an emulator processing one-million cycles per second.
The project to verify the first instruction (VEC4ADD) was completed in four days, including setup time, debugging multiple design bugs and achieving comprehensive proof. The second instruction used the same setup scripts as the first instruction, allowing the designers to immediately focus on debugging. They were able to find, fix and confirm nine bugs on the second instruction (VEC2ADD) in less than two days. By extrapolating these results, it could take five to seven weeks, depending on the number bugs found, to verify all 21 instructions. This compares to the four to six months the design team spent on previous designs using simulation-based methods.
The RTL verification methodology for graphics instructions using system models was a success, with SLEC finding 19 functional bugs. The methodology improved the quality of verification and reduced the debug cycle with short, concise counter examples. The bugs where caused by a variety of issues, including incorrect sign extension, missing clamping logic and initialization errors that would have led to reduced image quality, software workarounds or a re-spin of the chip.
Sequential Logic Equivalence Checking is a unique methodology that finds bugs missed by traditional verification methods. It can improve functional verification effectiveness without requiring additional testbenches or assertions, minimizing design risk by exhaustively verifying RTL implementations using system models.
About the Author:
Anmol Mathur is the CTO of Calypto Design Systems. He can be reached at email@example.com.