In a previous blog, Brian talked about what makes a prototype unique, and said he would come back to the issue of emulators and accelerators... so here he is...
In a previous blog, I talked about the confusion that seems to exist about the similarities and differences between emulators, accelerators and prototypes. In that blog I talked about what makes a prototype unique, and said I would come back to the issue of emulators and accelerators, so here we are.
First off, most of the time, as a user, you don’t really care. The differences lie in how they are implemented internally, and this may affect the features they make available, but for the most part the confusion is created by the manufacturers of these devices so that each can claim to be #1, or unique, or to have the highest capacity or speed or whatever. Let’s cut through all of that and look at what is fundamentally different. The umbrella term for all of them is hardware-assisted verification.
The starting point is a model. Today, most of those models will be at the RTL level, but over time we can expect higher-abstraction models to be accepted. It is all a matter of the sophistication of the available synthesis technology. Today we are starting to see ESL synthesis sold as a standalone product, so we can expect some of this technology to move into hardware-assisted verification products before long. The ultimate goal is to make the model execute faster than it could in a software simulator.
A second issue is that software simulation slows down significantly when the model exceeds the physical memory of the computer, so there is a capacity issue at play here as well. There are two general ways to solve this problem, and I will refer to them as direct and indirect implementations.
When we compile a model into an FPGA, we have created an actual implementation of that model in hardware. It may not be the same implementation as would be used in an SoC, but it is an implementation nonetheless. This is what we mean by direct. Devices that use this technique are generally referred to as emulators, because the mapping emulates the function of the intended hardware. We are directly executing an implementation of the model.
Let’s contrast that to a simulator. Here we execute the model artificially by keeping track of changes that would propagate through it. Mechanisms are devised that allow the effects of concurrency to be evaluated even though the simulator is actually incapable of doing more than one thing at a time. So a simulator is an example of an indirect implementation.
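To make the indirect idea concrete, here is a toy sketch of an event-driven simulator: changes to signals are queued as events, and a gate is re-evaluated only when one of its inputs changes. All names and structure here are illustrative, not any real simulator's internals.

```python
from collections import deque

class Simulator:
    """Toy event-driven simulator: gates are re-evaluated only when an
    input changes, giving the effect of concurrency on a machine that
    can really only do one thing at a time."""

    def __init__(self):
        self.values = {}      # signal name -> current value
        self.fanout = {}      # signal name -> gates that read it
        self.queue = deque()  # pending signal-change events

    def add_gate(self, output, inputs, func):
        gate = (output, inputs, func)
        for sig in inputs:
            self.fanout.setdefault(sig, []).append(gate)
            self.values.setdefault(sig, 0)
        self.values.setdefault(output, 0)

    def set_signal(self, sig, value):
        # Only a real change generates an event.
        if self.values.get(sig) != value:
            self.values[sig] = value
            self.queue.append(sig)

    def run(self):
        # Propagate changes until the design settles.
        while self.queue:
            sig = self.queue.popleft()
            for output, inputs, func in self.fanout.get(sig, []):
                self.set_signal(output, func(*(self.values[s] for s in inputs)))

sim = Simulator()
sim.add_gate("n1", ["a", "b"], lambda a, b: a & b)  # AND gate
sim.add_gate("y",  ["n1"],     lambda n1: 1 - n1)   # NOT gate
sim.set_signal("a", 1)
sim.set_signal("b", 1)
sim.run()
print(sim.values["y"])  # NOT(1 AND 1) = 0
```

The key point is the queue: nothing in the model executes unless an event forces it to, which is precisely why this is an indirect execution of the design rather than an implementation of it.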
There are some simulation accelerators that contain a large number of simple processors, each of which simulates a small portion of the design and passes results to the others. Each of these processors runs more slowly than the processor on your desktop, but the accelerator may possess thousands or millions of these smaller processors, and the net result is significantly higher execution performance. They can of course deal with parallelism directly, since all of the processors run in parallel. An example of this type of hardware-assisted solution is the Palladium product line from Cadence. Each of the processors could have arbitrary capabilities, such as dealing with visibility, debugging and so on.
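A minimal sketch of that accelerator idea, run sequentially here for clarity: the design is partitioned among simple "processors" that each evaluate their slice in lock step and then exchange boundary signals before the next cycle. The partitioning and names are hypothetical, not how Palladium actually works internally.

```python
# Toy sketch of a massively parallel accelerator. In real hardware every
# partition would run at the same time on its own processor; here we
# loop over them and merge results at a cycle boundary.

def make_partitions():
    # Partition 0 computes n = a AND b; partition 1 computes y = NOT n.
    p0 = lambda sigs: {"n": sigs["a"] & sigs["b"]}
    p1 = lambda sigs: {"y": 1 - sigs["n"]}
    return [p0, p1]

def run_cycles(partitions, signals, cycles):
    for _ in range(cycles):
        updates = {}
        # Each partition evaluates using only last cycle's signal values.
        for evaluate in partitions:
            updates.update(evaluate(signals))
        # Exchange results before the next lock-step cycle.
        signals.update(updates)
    return signals

signals = {"a": 1, "b": 1, "n": 0, "y": 0}
signals = run_cycles(make_partitions(), signals, cycles=2)
print(signals["y"])  # after two cycles: NOT(1 AND 1) = 0
```

Because each partition reads only the previous cycle's values, all of them could evaluate simultaneously without interfering, which is what lets a sea of slow processors beat one fast one.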
Within the direct implementation solutions there are again two main types, based on custom or off-the-shelf components. We will start with the custom solutions. With these there is going to be an FPGA-like structure somewhere in the device, although in general they employ very different types of interconnect than would be seen in an FPGA. The custom chip could also contain debug circuitry, visibility mechanisms and a host of other capabilities. Each chip is capable of emulating a small piece of a design, and larger designs are handled by interconnecting many of the chips together, again with sophisticated interconnect capabilities. An example of this type of emulator is Veloce from Mentor Graphics.
The other way to implement an emulator is by using off-the-shelf components such as FPGAs. Here we not only map the design into the FPGA, but also implement the visibility, debug and other such capabilities into the FPGA as well. As with the custom chip case, multiple FPGAs can be put together to handle arbitrary design sizes. An example of this type of emulator is ZeBu from EVE.
The next level of confusion comes when we talk about how hardware-assisted verification solutions are used. First there is in-circuit emulation. This is where an emulator or accelerator is connected into a real world application. For example, you may be designing a USB device. In this case, you would connect the emulator to a physical layer for USB and then plug it into a computer, or other device that forms the other end of the USB connection. Now you operate it as if it were the real device.
There is one issue here, which is that emulators and accelerators are generally not able to run nearly as fast as the real world. Most emulators can only muster a few MHz of clock speed, especially when full visibility is made available. So it is often necessary to insert a speed bridge that can handle the difference in execution rates on each side of the bridge. This may involve buffering data, or manipulating the protocol to artificially slow the real world down to a rate the emulator can handle.
The next major way they are used is standalone. This means that the entire model fits into the emulator or accelerator, along with a set of stimulus to exercise the model. They can run as fast as the emulator is capable of, stopping only when additional stimulus is required, or when captured data has to be flushed out of the device. If the design contains a processor, it is also likely that a version of the processor will exist for the emulator.
Emulator vendors provide special boards that make many of the popular processors available. But if parts of the design or testbench cannot be mapped into the emulator, then it has to be coupled to a software execution environment. This is usually called co-simulation, as it inherently involves two “simulation” engines cooperating to solve the problem. This solution suffers from the same problem as most software co-simulators: communication slows it down considerably. The emulator can now run only as fast as the simulator, or in fact even slower, because the communication adds further overhead.
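A back-of-envelope model shows why the coupled pair ends up slower than its slowest engine when they must synchronize every cycle. The rates and overhead below are invented for illustration; real numbers vary widely by product and setup.

```python
def effective_rate(emulator_hz, simulator_hz, comm_overhead_s):
    """Effective cycles/second when every cycle pays the emulator's
    time, the simulator's time, and a fixed communication cost.
    Illustrative model only."""
    time_per_cycle = (1.0 / emulator_hz) + (1.0 / simulator_hz) + comm_overhead_s
    return 1.0 / time_per_cycle

# A 1 MHz emulator coupled to a 10 kHz software simulator with 50 us of
# communication per cycle: the pair runs slower than the simulator alone.
rate = effective_rate(1e6, 10e3, 50e-6)
print(round(rate))  # ~6623 cycles/second
```

The emulator's speed barely matters in that sum; the simulator and the link dominate, which is exactly the bottleneck described above.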
A more modern alternative is what is called co-emulation. The primary difference is that communication is raised to the transaction level rather than being at the implementation level, but a full description of this and the way it is done will have to wait for another blog.
Brian Bailey – keeping you covered