After years of discussing verification strategies with hundreds of ASIC designers, it finally hit me: We're at the point where designers are trying to manage billions of cycles of simulation.
Take the video chip business, where H.264 and high-definition TV are hot. These chip designers need to simulate hundreds of conformance streams to make sure that the chip is ready to ship. In the wireless handheld market, firmware is key. Ultimately, the device must boot Linux and run a Java application on its LCD screen. And designers of network routers need to stress their chips with pseudo-random traffic to benchmark key performance metrics, such as packet drop rate.
All these tasks have one thing in common: they require billions of cycles of simulation. Designers who try to manage billions of cycles like any other simulation hit a wall that I call the "Billion-Cycle Challenge."
The first challenge, of course, is how long it takes to simulate a billion cycles. For a design of average size, register transfer level (RTL) simulation would take on the order of 10 days. This is not practical when one line of RTL code changes and all the tests need to be rerun to make sure nothing broke. Minutes would be better, right?
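To put numbers on that gap, here is a rough back-of-the-envelope sketch in Python. The 10-day figure comes from the paragraph above; the 10-minute target is only an assumed stand-in for "minutes," and the function name is illustrative.

```python
SECONDS_PER_DAY = 86_400


def cycles_per_second(total_cycles: int, wall_seconds: float) -> float:
    """Average simulation throughput needed to finish within wall_seconds."""
    return total_cycles / wall_seconds


TOTAL_CYCLES = 1_000_000_000  # one billion cycles, as in the text

# RTL simulation taking on the order of 10 days (figure from the text).
rtl_rate = cycles_per_second(TOTAL_CYCLES, 10 * SECONDS_PER_DAY)

# Assumed goal: the same run finishing in 10 minutes.
target_rate = cycles_per_second(TOTAL_CYCLES, 10 * 60)

speedup = target_rate / rtl_rate

print(f"RTL simulation: {rtl_rate:,.0f} cycles/s")
print(f"10-minute goal: {target_rate:,.0f} cycles/s")
print(f"Speedup needed: {speedup:,.0f}x")
```

Under these assumptions, the simulator averages a little over a thousand cycles per second, and a 10-minute turnaround needs more than a thousand times that.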
The second challenge is the bandwidth required to keep a chip busy for so many cycles. Take the example of an HDTV chip that handles about 3 gigabits of data per second in real time. If design teams have 10 days to simulate those 1 billion cycles, the testbench only needs to provide about 10 kbits of data per second to the design, a reasonable number. That's why everyone tends to forget about bandwidth. If the goal is to have that simulation complete in minutes, that means feeding on the order of 200 megabits per second to the design. It takes a special kind of testbench to move that much data around.
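The scaling behind those bandwidth figures can be sketched as follows. This treats the 10-kbit figure as a per-second rate sustained over a 10-day run, which is an assumption, and the target run times are illustrative. The total amount of data is fixed, so the required rate grows in inverse proportion to the wall-clock time.

```python
def scaled_bandwidth(base_bits_per_s: float, base_seconds: float,
                     target_seconds: float) -> float:
    """Rate needed to deliver the same total data in target_seconds."""
    total_bits = base_bits_per_s * base_seconds
    return total_bits / target_seconds


BASE_RATE = 10_000        # about 10 kbit/s, figure from the text
TEN_DAYS = 10 * 86_400    # the 10-day run, in seconds

# Assumed targets standing in for "minutes".
for minutes in (1, 5, 10):
    rate = scaled_bandwidth(BASE_RATE, TEN_DAYS, minutes * 60)
    print(f"{minutes:>2} min run: {rate / 1e6:.1f} Mbit/s")
```

A one-minute run works out to 144 Mbit/s, the same order of magnitude as the figure quoted above.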
The third challenge is debugging. How can designers navigate through 1 billion cycles? Even if it were feasible, a full RTL trace in a compressed format such as fast signal database (FSDB) would occupy four terabytes on disk. Simply reading that file back from a fast disk would take several days. Full tracing is not the way to go, either.
As in the real estate business, the answer is "location, location, location." Designers want to be able to navigate between different levels of abstraction. Embedded software is the highest level, and questions addressed at that level include:
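A quick sketch shows what the four-terabyte figure implies per cycle. It assumes trace size grows linearly with cycle count, which a compressed format will only approximate, and the one-million-cycle window is a hypothetical example.

```python
FULL_TRACE_BYTES = 4 * 10**12   # four terabytes, figure from the text
TOTAL_CYCLES = 10**9            # one billion cycles

# Average trace cost per simulated cycle, assuming linear growth.
bytes_per_cycle = FULL_TRACE_BYTES // TOTAL_CYCLES


def window_trace_bytes(window_cycles: int) -> int:
    """Estimated trace size for a limited window of cycles."""
    return window_cycles * bytes_per_cycle


print(f"Average trace cost: {bytes_per_cycle} bytes per cycle")
print(f"1M-cycle window: {window_trace_bytes(1_000_000) / 1e9:.0f} GB")
```

Under that assumption, tracing a one-million-cycle window costs about 4 GB rather than 4 TB, which is why narrowing down the period of interest matters so much.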
Has the operating system booted yet?
Is the processor stuck in an interrupt handler?
Why is the device driver not handling the data properly?
Once the designer has localized, at a high level, where something is going wrong, he or she can start zooming in and lowering the level of abstraction. That second level is implemented by monitors, checkers and assertions that help narrow the problem down and trace its likely origin.
Only when designers have exhausted the information available at those two levels would they go down to the signal level and get an RTL waveform of the interesting period of time that's been identified. The ability to navigate between those different abstraction levels, from software all the way down to hardware, is the way to avoid getting lost in those huge simulations.
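The zoom-in process described above can be sketched in a few lines of Python. Everything here is hypothetical: the Event record, the level names, the cycle numbers and the margin are illustrative and do not describe any particular tool. The idea is to use the last healthy software-level event as a coarse bound, take the first monitor failure after it, and request a waveform only around that point.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Event:
    cycle: int     # simulation cycle at which the event was logged
    level: str     # "software" or "monitor" (hypothetical level names)
    ok: bool       # True for a healthy milestone, False for a failure
    message: str


def waveform_window(events: List[Event],
                    margin: int) -> Optional[Tuple[int, int]]:
    """Pick a narrow cycle range worth dumping at the signal level."""
    # Level 1: the last healthy software milestone bounds the search.
    good = [e.cycle for e in events if e.level == "software" and e.ok]
    last_good = max(good, default=0)

    # Level 2: first monitor/checker failure at or after that milestone.
    bad = [e.cycle for e in events
           if e.level == "monitor" and not e.ok and e.cycle >= last_good]
    if not bad:
        return None  # nothing to zoom in on yet
    first_bad = min(bad)

    # Level 3: trace only a small window around the first failure.
    return (max(last_good, first_bad - margin), first_bad + margin)


log = [
    Event(200_000_000, "software", True, "operating system booted"),
    Event(650_000_500, "monitor", False, "FIFO overflow"),
    Event(700_000_000, "software", False, "device driver timeout"),
]
window = waveform_window(log, margin=10_000)
print(window)
```

In this made-up log, the waveform request covers 20,000 cycles instead of the full run.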
Only by solving those three challenges simultaneously can design teams really overcome the "Billion-Cycle Challenge."
Alain Raynaud (alain@eve-team.com) is technical director at EVE (Santa Clara, Calif.), a hardware/software coverification company.