A prediction ahead of its time?
Brian Bailey
6/15/2011 1:30 PM EDT
Weeks before DAC this year, I made the bold prediction that this would be the year in which FPGA-based prototyping woke up from its sleepy position. Historically it has been something to help the software guys along during the back end of the process, while they wait for the silicon to become ready.
Of course, my prediction was not made in a void. I have seen continuous and growing rumblings about how badly the software simulator is keeping up with the demands placed upon it. The free lunch for the simulators stopped when processors stopped becoming faster and the simulation guys, in general, have not managed to work out how to make it faster on multi-processors.
One exception to that at this year’s DAC was a company called Rocketick, who have brought together RTL simulation and the nVidia GPU. They claim that it provides about a 10X speedup in many cases, and more than that on designs that are well suited to their approach. Others have tried this and failed in the past, so I hope to learn more about how they have done this and to hear more success stories from this company. This could significantly extend the life of the RTL simulator in a verification flow using reasonably priced hardware.
So, back to the FPGA-prototypes. What I had expected to see were offerings that provided:
1. An improvement in getting a design onto a prototype
2. Better tools for verifying the design once mapped onto the prototype
3. Debugging the prototype
4. Tools that enabled HW/SW co-simulation capabilities using the prototypes
5. Hybrid prototypes combining FPGA and virtual prototypes
I was a little disappointed with what I saw. Cadence recently announced that they have been attacking Problem #1, and assure us that if a design runs on the Palladium emulator it will also run on their prototype. I haven’t heard recently how difficult it is to get a design running on Palladium, so I am not too sure how good that is. Synopsys also published their book that tells you how to prepare a design for prototyping, and they of course have the industry-leading partitioning software.
Nobody seemed to really think that it would be a good idea to automatically hook the prototype up to the simulation testbenches so that a set of regressions could be run, or, even better, to provide a formal verification approach to tell you that the design as modified for the FPGA was functionally identical to the RTL. It is possible that Calypto could do this with their sequential equivalence checking, but I’m not sure if it has been packaged that way. This to me is a no-brainer and shouldn’t be difficult to do in a high-speed fashion.
I also heard very few announcements about improvements in the connectivity of the prototypes into either a complete HW/SW debug environment or the creation of hybrid prototypes. Synopsys has talked about this for a while, and Cadence says it has the same connectivity as their emulator, except that Cadence has not yet released their virtual prototype. Cadence also says that the same speed adapters can be used between the emulator and the prototype. I still want to see a standard interface for this that will enable easier creation of third-party modules. Maybe I should propose this within Accellera.
All of the action was on Problem #3: debugging the FPGA-prototype. Up until now, many have relied on the debug capabilities supplied by the FPGA vendors. These include ChipScope from Xilinx and SignalTap from Altera. The problem is that these are single-FPGA solutions and most FPGA-prototypes use multiple FPGAs, so how do you go about debugging the complete design?
There are several approaches to doing this. One way is to do it all in software. This involves putting a debug engine into each FPGA that can capture data, coordinate triggering between multiple FPGAs, and get the data out into a traditional RTL debugger. An example of this was shown by Veridae with their Certus tool.
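The idea of coordinated triggering can be illustrated with a toy model. This is only a sketch under assumptions, not how Certus or any other product works: the `DebugEngine` class, the `run` helper, and the signal values are all hypothetical. Each engine keeps a small circular buffer, and when any one engine's trigger condition fires, capture stops everywhere on the same cycle so that the windows from the different FPGAs line up.

```python
from collections import deque


class DebugEngine:
    """Hypothetical per-FPGA capture engine with a circular trace buffer."""

    def __init__(self, name, depth, trigger):
        self.name = name
        self.buffer = deque(maxlen=depth)  # oldest samples fall off the front
        self.trigger = trigger             # predicate on the sampled value

    def sample(self, cycle, value):
        """Record one sample and report whether the local trigger fired."""
        self.buffer.append((cycle, value))
        return self.trigger(value)


def run(engines, traces):
    """Feed per-cycle values to every engine.

    Capture stops for all engines on the first cycle in which any engine's
    trigger fires. Returns that cycle, or None if nothing triggered.
    """
    for cycle, values in enumerate(traces):
        # Build a list first so every engine samples this cycle before we stop.
        fired = [engine.sample(cycle, values[engine.name]) for engine in engines]
        if any(fired):
            return cycle
    return None


# Made-up stimulus: fpga_a triggers on the value 0xFF, fpga_b never triggers.
fpga_a = DebugEngine("fpga_a", depth=2, trigger=lambda v: v == 0xFF)
fpga_b = DebugEngine("fpga_b", depth=2, trigger=lambda v: False)
stimulus = [
    {"fpga_a": 1, "fpga_b": 10},
    {"fpga_a": 2, "fpga_b": 20},
    {"fpga_a": 3, "fpga_b": 30},
    {"fpga_a": 0xFF, "fpga_b": 40},
    {"fpga_a": 5, "fpga_b": 50},
]
stop_cycle = run([fpga_a, fpga_b], stimulus)
print(stop_cycle, list(fpga_a.buffer), list(fpga_b.buffer))
```

Both buffers end on the same cycle even though only one FPGA saw the trigger condition, which is what makes the combined trace usable in a single RTL debugger view.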
Then there are the hardware solutions as shown by Springsoft and S2C. The big advantage of added hardware is that you don’t have to use the memory inside the FPGA for capture, instead streaming the data out of the chip and utilizing larger memories on a special purpose board to hold the data. This enables either wider or deeper traces to be captured, which is important for long runs.
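The width-versus-depth trade-off is simple arithmetic: for a fixed amount of capture memory, every extra probed bit shortens the trace. The sketch below uses made-up capacities purely for illustration; they are assumptions, not figures from Springsoft, S2C, or any FPGA datasheet.

```python
def trace_depth(memory_bits: int, probe_width: int) -> int:
    """Number of samples that fit when each sample stores probe_width bits."""
    if probe_width <= 0:
        raise ValueError("probe_width must be positive")
    return memory_bits // probe_width


# Illustrative capacities only (assumed, not taken from any vendor).
ON_CHIP_BITS = 2 * 1024 * 1024            # 2 Mbit of FPGA block RAM set aside for capture
EXTERNAL_BITS = 8 * 1024 * 1024 * 1024    # 8 Gbit of memory on a separate debug board

for width in (64, 512):
    print(
        f"{width:>4} probes: "
        f"on-chip {trace_depth(ON_CHIP_BITS, width):>10} samples, "
        f"external {trace_depth(EXTERNAL_BITS, width):>10} samples"
    )
```

With these assumed numbers, widening the probe set from 64 to 512 signals cuts the on-chip trace from 32,768 samples to 4,096, while the external memory still holds millions of samples at either width.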
Sitting somewhat in between is InPA who have partnered with the Dini Group. The Dini Group has been a provider of multi-FPGA boards for a long time and this will provide better debug solutions for their boards. Basically an additional smaller FPGA is placed on the board that houses a debug controller that InPA utilizes. This is similar to the approach that EVE uses in their emulators where a separate FPGA deals with host connectivity and other such non-design issues.
All of them claim to do this fully at the RTL level and with a minimum amount of recompilation necessary. This generally means that inside the FPGA, probing is over-specified and then the debug engine can do the final selection dynamically.
So this is all a smaller step than I had expected, but nonetheless it is an important step. I hope none of them expect to sit back and relax now: they have a lot of work ahead of them.
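A rough way to picture over-specified probing is a multiplexer: a superset of candidate signals is wired to the debug engine when the FPGA is compiled, and the subset that is actually captured is chosen at run time. The `ProbeMux` class and the signal names below are hypothetical, a sketch of the concept rather than any vendor's implementation.

```python
class ProbeMux:
    """Hypothetical model of compile-time candidates with run-time selection."""

    def __init__(self, candidates, capture_width):
        self.candidates = list(candidates)   # fixed when the FPGA image is built
        self.capture_width = capture_width   # how many signals fit in one sample
        self.selected = []

    def select(self, names):
        """Choose which instrumented signals to capture; no recompile needed."""
        names = list(names)
        missing = [n for n in names if n not in self.candidates]
        if missing:
            # Anything outside the superset would need a new FPGA compile.
            raise KeyError(f"not instrumented, recompile needed: {missing}")
        if len(names) > self.capture_width:
            raise ValueError("selection exceeds capture width")
        self.selected = names

    def capture(self, cycle_values):
        """Return the selected signals' values for one cycle."""
        return tuple(cycle_values[n] for n in self.selected)


mux = ProbeMux(["bus_valid", "bus_addr", "fifo_full"], capture_width=2)
mux.select(["bus_valid", "fifo_full"])
print(mux.capture({"bus_valid": 1, "bus_addr": 0x40, "fifo_full": 0}))

mux.select(["bus_addr"])  # change the view without rebuilding the FPGA
print(mux.capture({"bus_valid": 1, "bus_addr": 0x40, "fifo_full": 0}))
```

The cost of this flexibility is the extra routing and logic for signals that may never be looked at, which is why the size of the candidate set matters.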
Brian Bailey – keeping you covered.
Max the Magnificent
6/15/2011 1:49 PM EDT
Hi Brian -- as you say there was a lot of stuff to see at DAC -- I have SO many things to write about -- but I must admit that the Rocketick stuff sounded very interesting... More on this later -- Max
Mick.Posner
6/16/2011 1:54 PM EDT
Hi Brian -- I also noted that FPGA-based prototyping debug was a focus at this year's DAC. I wanted to mention that Synopsys offers a co-simulation mode which is designed just as you described. They were demo'ing it in their suite. It enables a DUT running in the HAPS FPGA-based prototyping hardware to be validated against its original simulation testbench. Doing this block by block will obviously reduce the number of surprises later as you integrate the blocks with each other in the system-level prototype. It is easier to debug as well, since you have already individually validated each block's operation before the integration.
Joe Gianelli
6/16/2011 8:16 PM EDT
Hello Brian,
Good blog content. FPGA-based prototyping has become mainstream, but using these prototypes is still very difficult. I'm sorry that you were disappointed with what you saw at DAC, but there is much to improve in this area. Unfortunately, not enough information gets out to the engineering community about who's got what. As Mick posts above, Synopsys has had co-simulation capability from their Chip-It (HAPS 6000) platform for a few years...at least.
At InPA we too have co-simulation capability, where the user's RTL test bench drives the design in the FPGAs. In fact, we interface to all popular RTL simulators. In our flow, this is used to help verify that your design running in the FPGAs functions as the test bench expects, addressing your items #2 and #4. In our methodology co-simulation is important in that it transfers checkpoints from the simulation test bench to our Embedded Micro Machines (EMMs), giving the in-circuit debug flow a more reusable and qualified test plan.
What engineers tell us they'd really like is a debug capability that looks at the system view of the design, and not just the individual logic states. What they mean by system view is debug technology that can track overall datapath activity, stimulated by I/O and controlled and monitored by the firmware.