PALO ALTO, Calif. - Despite daunting technical hurdles, minuscule market acceptance and rampant vendor mortality, reconfigurable computing has demonstrated significant advantages in some applications. And that fact keeps luring design teams to try their hands.
The latest entrant to make public disclosure is a faculty-student team from Carnegie Mellon University (Pittsburgh) that was represented here this week at the Hot Chips conference by authors Benjamin Levine and Herman Schmit. The team's architecture is PipeRench.
The authors' description of PipeRench began with a significant departure from all other announced reconfigurable computing devices. Rather than being based on a specific piece of hardware, PipeRench is a virtual machine. Specifically, it is a virtual, reconfigurable pipelined data path. Each stage in the pipeline includes a stripe of computing elements, storage registers and interconnect designed to implement nearly any particular case of a computing data path with high efficiency.
In the Carnegie Mellon approach, data path requirements are described in a high-level language called DIL and then compiled into the virtual architecture. The virtual architecture can be actualized in any number of ways, including software simulation, conventional FPGAs or, as described at Hot Chips, a specialized programmable logic fabric.
The chip in question is architecturally very similar to the virtual machine, comprising a pipeline of stripes, each stripe holding computing elements and registers. The primary difference between the virtual and actual data path is that the virtual machine contains an arbitrary number of pipeline stages, while the chip contains precisely 16.
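The article does not say how a design with more virtual stages than the chip's 16 stripes is made to fit. One simple possibility, shown here purely as an assumption, is to reuse the physical stripes in round-robin order so that virtual stage v lands on physical stripe v mod 16. The function name `map_stages` and the modulo rule are hypothetical and are not taken from the presentation.

```python
from collections import defaultdict
from typing import Dict, List

PHYSICAL_STRIPES = 16  # stripe count reported for the chip


def map_stages(num_virtual: int,
               physical: int = PHYSICAL_STRIPES) -> Dict[int, List[int]]:
    """Assign each virtual pipeline stage to a physical stripe.

    Assumption (not from the article): stages are placed round-robin,
    so several virtual stages may share one physical stripe over time.
    """
    slots: Dict[int, List[int]] = defaultdict(list)
    for stage in range(num_virtual):
        slots[stage % physical].append(stage)
    return dict(slots)


if __name__ == "__main__":
    # A 20-stage virtual pipeline: stripes 0-3 each host two stages.
    for stripe, stages in sorted(map_stages(20).items()):
        print(stripe, stages)
```

Under this assumed mapping, a pipeline of 16 or fewer stages occupies each stripe at most once, while a longer pipeline forces some stripes to be reconfigured during the run.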
Each of these stages, or stripes, contains 16 computing elements, each of which in turn comprises an 8-bit ALU with preshifting and carry logic and a file of eight 8-bit registers. The carry and shift logic links the elements in a stripe so that they can be grouped to provide wider data paths or isolated to perform several independent operations in a stripe.
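To illustrate how carry logic can gang narrow elements into a wider data path, here is a minimal sketch of two or more 8-bit adders chained by their carries. It models only addition, and the names `pe_add` and `stripe_add` are hypothetical; the actual ALU operations and carry circuitry were not detailed in the article.

```python
from typing import Tuple


def pe_add(a: int, b: int, carry_in: int = 0) -> Tuple[int, int]:
    """One 8-bit element: return (8-bit sum, carry out)."""
    total = (a & 0xFF) + (b & 0xFF) + carry_in
    return total & 0xFF, total >> 8


def stripe_add(a: int, b: int, elements: int) -> Tuple[int, int]:
    """Add two operands using `elements` chained 8-bit adders.

    The carry out of each element feeds the carry in of the next,
    so `elements` adders form one (8 * elements)-bit adder.
    """
    result = 0
    carry = 0
    for i in range(elements):
        a_byte = (a >> (8 * i)) & 0xFF
        b_byte = (b >> (8 * i)) & 0xFF
        byte_sum, carry = pe_add(a_byte, b_byte, carry)
        result |= byte_sum << (8 * i)
    return result, carry


if __name__ == "__main__":
    # Two elements grouped into a 16-bit adder.
    print(hex(stripe_add(0x12FF, 0x0001, 2)[0]))
    # The same element used alone as an independent 8-bit adder.
    print(pe_add(0xFF, 0x01))
```

Calling `pe_add` on its own corresponds to an isolated element, while `stripe_add` corresponds to elements grouped into a wider path.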
A unique part of the design is that the hardware more or less automatically moves all of the register contents from each stripe to the next stripe in the pipeline on each cycle. So data flows automatically through the pipeline until it is replaced by ALU results or killed. Global buses span all the pipeline stages to provide for data insertion, bypassing and the like.
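The pass-through behavior can be pictured with a small behavioral model. In this sketch each stripe receives the previous stripe's registers on every clock and may overwrite some of them with an ALU result. The model uses two registers per stripe for brevity rather than the chip's eight, omits the global buses, and the names `clock` and `add_one` are hypothetical.

```python
from typing import Callable, List, Optional

Registers = List[int]
StripeOp = Optional[Callable[[Registers], Registers]]


def clock(pipeline: List[Registers], ops: List[StripeOp],
          new_input: Registers) -> List[Registers]:
    """Advance the pipeline by one cycle.

    Stripe 0 takes `new_input`; every other stripe takes the registers
    of the stripe before it. A stripe with an op replaces the incoming
    registers with the op's result; otherwise data passes unchanged.
    """
    incoming = [new_input] + pipeline[:-1]
    next_state = []
    for regs, op in zip(incoming, ops):
        regs = list(regs)
        next_state.append(op(regs) if op is not None else regs)
    return next_state


def add_one(regs: Registers) -> Registers:
    """Example ALU op: write r0 + 1 (8-bit) into r1, keep r0."""
    out = list(regs)
    out[1] = (regs[0] + 1) & 0xFF
    return out


if __name__ == "__main__":
    state = [[0, 0], [0, 0], [0, 0]]
    ops = [None, add_one, None]  # only the middle stripe computes
    for sample in ([5, 0], [7, 0], [0, 0]):
        state = clock(state, ops, sample)
        print(state)
```

In the example, the value 5 enters stripe 0, is carried into stripe 1 where the ALU result is written beside it, and both values then move on to stripe 2 without any explicit copy instruction.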
The chip has been implemented in partnership with inveterate experimenter STMicroelectronics in 0.18-micron, six-metal CMOS. The 16-element by 16-stripe array, with accompanying buses, control and configuration logic, adds up to 3.65 million transistors in just under 50 mm². The fabric cycles at 120 MHz and consumes under 3 watts.
The researchers stated that the entire fabric can be reconfigured in 133 ns and that it can switch between applications in 8 ns, presumably on the assumption that both configurations are already resident in the configuration registers.
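Those two timings line up closely with the reported 120-MHz clock, as a quick calculation shows. The snippet below is only arithmetic on the figures quoted in this article; reading 133 ns as roughly one clock per stripe is an inference, not something the presenters were reported to have said.

```python
CLOCK_MHZ = 120.0        # reported fabric clock
FULL_RECONFIG_NS = 133.0 # reported full-fabric reconfiguration time
APP_SWITCH_NS = 8.0      # reported application switch time


def clock_period_ns(freq_mhz: float) -> float:
    """Clock period in nanoseconds for a frequency in MHz."""
    return 1000.0 / freq_mhz


def cycles(duration_ns: float, freq_mhz: float) -> float:
    """Number of clock cycles that fit in `duration_ns`."""
    return duration_ns * freq_mhz / 1000.0


if __name__ == "__main__":
    print(f"clock period:   {clock_period_ns(CLOCK_MHZ):.2f} ns")
    print(f"full reconfig:  {cycles(FULL_RECONFIG_NS, CLOCK_MHZ):.2f} cycles")
    print(f"app switch:     {cycles(APP_SWITCH_NS, CLOCK_MHZ):.2f} cycles")
```

The full reconfiguration works out to about 16 cycles, the same as the number of stripes, and the application switch to about one cycle.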
The presenters said that the chip can perform a 40-tap, 16-bit FIR at 41.8 Msamples/s. As is characteristic of reconfigurable techniques, the chip avoids having to implement multiply-accumulators by compiling the tap coefficients into the hardware configuration, reducing a multiply-accumulate to a logic transform.
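One common way to fold a fixed coefficient into logic is to replace each multiplication with shifts and adds chosen from the coefficient's binary digits. The sketch below shows that general technique for non-negative integer coefficients. It is an assumption about how such a compilation could work, not a description of the DIL compiler, and the names `compile_constant`, `multiply_by_constant` and `fir` are hypothetical.

```python
from typing import List


def compile_constant(coeff: int) -> List[int]:
    """Return the shift amounts for each set bit of a non-negative
    coefficient. This stands in for 'compiling' the tap value."""
    if coeff < 0:
        raise ValueError("sketch handles non-negative coefficients only")
    return [bit for bit in range(coeff.bit_length()) if (coeff >> bit) & 1]


def multiply_by_constant(x: int, shifts: List[int]) -> int:
    """Multiply x by the compiled constant using only shifts and adds."""
    return sum(x << s for s in shifts)


def fir(samples: List[int], coeffs: List[int]) -> List[int]:
    """Direct-form FIR in which every tap is a fixed shift-add network."""
    compiled = [compile_constant(c) for c in coeffs]
    out = []
    for n in range(len(samples)):
        acc = 0
        for k, shifts in enumerate(compiled):
            if n - k >= 0:
                acc += multiply_by_constant(samples[n - k], shifts)
        out.append(acc)
    return out


if __name__ == "__main__":
    print(compile_constant(5))        # bits 0 and 2 are set
    print(fir([1, 2, 3], [3, 5]))
```

Because the shift amounts are fixed once the coefficients are known, no general-purpose multiplier is needed at run time; the cost is that changing a coefficient means generating a new configuration.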
Similarly, IDEA encryption was compiled into the virtual machine in less than 1 minute and implemented on the chip with 450-Mbit/s performance. That's about six times the throughput of a Pentium III, according to the authors.
The approach appears similar to that used by Adaptive Silicon, which also had a fabric based on an array of narrow ALUs but without such explicit pipelining. LSI Logic purchased that organization, experimented with the design and then abandoned it. Now founders of the company are reportedly trying to reconstitute it and to resurrect the architecture.
In the past, achieving adequate performance and power dissipation over a wide range of applications has been an issue for many reconfigurable architectures, which after all pay a price for their run-time configurability. In addition, programming has proved daunting.
But with the virtual-machine concept possibly offering an easier path from behavioral description to chip configuration, and with the noteworthy power figures reported, the PipeRench architecture has begun to set itself apart from its neighbors.