I've been gripping my seat in anticipation for ages, but now the wait is over, because the guys and gals at Achronix have started shipping the first member of their Speedster 1.5 GHz FPGA family – the SPD60 (pronounced "Speedy-60").
These little scamps run faster than some ASICs (the Speedster FPGAs, not the folks at Achronix). So how do they do this? Well, although the chip appears synchronous to the outside world, the FPGA fabric inside is implemented using self-timed asynchronous logic.
Let's see how this works. First, let's consider a traditional FPGA based on globally-clocked logic. In this case, a data value being presented to a flip-flop when a clock edge occurs can be considered to be a "Data Token". Only valid data (data at a clock edge) is propagated, and each register outputs a new Data Token (value) at every clock edge as shown below:
Of course there would typically be some combinational logic (in the form of look-up-tables and multiplexers and suchlike) between the register stages, but this has been omitted from the above diagram for the sake of simplicity.
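For anyone who likes to see this sort of thing in code, the globally clocked register chain described above can be sketched as a toy Python model. This is only an illustration under simplifying assumptions: the three-stage chain, the `clock_edge` helper, and the token names are all made up for the example, and the combinational logic between stages is omitted just as it is in the diagram.

```python
def clock_edge(registers, new_input):
    """Model one global clock edge: every register captures its input at once.

    Each register takes the value its predecessor was outputting, and the
    first register takes the new input value.
    """
    return [new_input] + registers[:-1]


# Three register stages, initially holding no valid data (hypothetical setup).
registers = [None, None, None]
history = []

for token in ["A", "B", "C", "D"]:
    registers = clock_edge(registers, token)
    history.append(list(registers))

for cycle, state in enumerate(history, start=1):
    print(f"after clock edge {cycle}: {state}")
```

Every register updates on the same edge, so each Data Token advances exactly one stage per clock cycle.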
By comparison, the Speedster fabric doesn't contain any registers (except at the chip's primary I/O pins). Instead, it comprises what Achronix call picoPIPE pipeline stages as illustrated below.
Once again, combinational logic would appear between each of these picoPIPE stages, but this has been omitted in order to keep things simple.
We can regard each picoPIPE stage as containing and propagating Data Tokens; in this case, however, each token uses two signals instead of one (these two signals are decoded to provide states that we might consider to represent "I'm not ready yet," "I now have a valid 0," and "I now have a valid 1"). Data validation (which provides clock-like functionality) is performed by each stage on an individual basis using local acknowledge signals instead of a global clock.
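To make the two-signals-per-token idea a little more concrete, here is a minimal Python sketch. Please note that this is not Achronix's actual implementation: the rail assignments below follow the conventional dual-rail encoding used in asynchronous design (an assumption on my part), and the `encode`, `decode`, and `step` helpers are hypothetical names invented for this illustration. The local acknowledge is modeled very crudely as "a stage only passes its token on if the next stage is empty."

```python
NOT_READY = "not ready"
EMPTY = (0, 0)  # assumed encoding: neither rail asserted means no valid data


def encode(bit):
    """Encode a bit as a (rail0, rail1) pair (assumed dual-rail convention)."""
    return (1, 0) if bit == 0 else (0, 1)


def decode(rail0, rail1):
    """Decode a pair of rails into one of the three states in the text."""
    if (rail0, rail1) == (0, 0):
        return NOT_READY          # "I'm not ready yet"
    if (rail0, rail1) == (1, 0):
        return 0                  # "I now have a valid 0"
    if (rail0, rail1) == (0, 1):
        return 1                  # "I now have a valid 1"
    raise ValueError("both rails asserted: not a legal token in this encoding")


def step(stages):
    """Sweep the pipeline once, moving tokens forward into empty stages.

    There is no global clock here: each stage looks only at its neighbor,
    which stands in for the local acknowledge signals.
    """
    stages = list(stages)
    for i in range(len(stages) - 2, -1, -1):
        has_token = decode(*stages[i]) != NOT_READY
        next_is_empty = decode(*stages[i + 1]) == NOT_READY
        if has_token and next_is_empty:
            stages[i + 1] = stages[i]
            stages[i] = EMPTY
    return stages


pipeline = [encode(1), EMPTY, encode(0), EMPTY]
pipeline = step(pipeline)
print([decode(*s) for s in pipeline])
```

In this toy model two tokens occupy the pipeline at once, and each one moves forward as soon as the stage ahead of it is free.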
In the case of a traditional FPGA, its globally clocked logic is not balanced. By this we mean that the clock rate must allow for the slowest path in the entire clock domain; every other path (which is, by definition, faster) finishes early and then has to wait for the slowest one to complete. Another way to view this is that there is only one data value "in flight" between any pair of register stages at a time, as illustrated below (and again, remember that this is a gross simplification):
By comparison, the Speedster fabric and its picoPIPE technology support incredibly fine-grain pipelining. This allows multiple data values to be in flight simultaneously, so the data rate can be much higher, which equates to dramatically faster throughput as illustrated below:
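A quick back-of-the-envelope calculation shows why this matters. The Python sketch below uses delay values that are entirely made up for illustration (they are not Achronix figures), and it ignores handshake overhead and the delay of the pipeline stages themselves.

```python
import math

# Hypothetical combinational path delays in one clock domain, in nanoseconds.
path_delays_ns = [2.0, 3.5, 8.0]

# Hypothetical delay of one fine-grain pipeline stage, in nanoseconds.
stage_delay_ns = 1.0

# Globally clocked: the clock period must cover the slowest path,
# so every path delivers one new value per slowest-path delay.
clock_period_ns = max(path_delays_ns)
clocked_rate_mhz = 1000.0 / clock_period_ns

# Fine-grain pipelined: the slow path is chopped into short stages, so
# several values are in flight and a new one can enter every stage delay.
tokens_in_flight = math.ceil(max(path_delays_ns) / stage_delay_ns)
pipelined_rate_mhz = 1000.0 / stage_delay_ns

print(f"globally clocked: {clocked_rate_mhz} MHz, 1 value in flight")
print(f"fine-grain pipelined: {pipelined_rate_mhz} MHz, "
      f"{tokens_in_flight} values in flight")
```

With these invented numbers, the clocked design is limited to 125 MHz by its 8 ns path, while the pipelined version accepts a new value every 1 ns with eight values in flight along that same path.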
The really cool thing about all of this is that, with the exception of the picoPIPE pipeline stages and the underlying asynchronous implementation, the rest of the Speedster fabric looks like a regular FPGA in that it comprises "stuff" like 4-input look-up tables (LUTs), blocks of RAM, multiplier cores, and suchlike. What this means is that you can use industry-standard front-end tools, such as Synplify Pro from Synopsys (formerly Synplicity) and Precision Synthesis from Mentor Graphics for the RTL synthesis.
Even better, you don't have to learn asynchronous design techniques. You can take existing VHDL or Verilog code that contains things like multiple clock domains with gated and/or un-gated clocks ... and synthesize it down for use in the Speedster fabric (the synthesis tools make the clocks "evaporate" and replace them with asynchronous equivalents while you aren't looking).
I'm really very impressed with everything I'm hearing about this technology, but "the proof of the pudding is in the eating," as the old saying goes, so I'm very much looking forward to seeing some "How To" design articles written by designers in the trenches who are actually using this technology... watch this space...
Questions? Comments? Feel free to email me, Clive "Max" Maxfield, at email@example.com. And, of course, if you haven't already done so, don't forget to Sign Up for our weekly Programmable Logic DesignLine Newsletter.