PITTSBURGH - Applying reconfigurable logic, researchers at Carnegie Mellon University have created a programmable data path architecture that can virtualize hardware through self-managed dynamic reconfiguration.
Unlike other run-time reconfigurable devices, the architecture manages its own reconfiguration without any host or user interaction. The hardware is virtualized by reconfiguring the programmable fabric at run-time.
A Carnegie Mellon research team headed by Herman Schmit and David Whelihan worked with STMicroelectronics to implement the architecture, called PipeRench. The chip is built in a six-metal-layer, 0.18-micron CMOS process, contains 3.65 million transistors and runs at 120 MHz. Details of the chip were presented in a paper at the Custom Integrated Circuits Conference earlier this year.
Designed with numerically intensive applications in mind, the architecture is meant to be portable and scalable without redesign. According to the paper, it uses a virtual hardware pipeline of six stages, or stripes. The system reconfigures itself using configuration bits held on chip that specify the virtual hardware; the configuration is moved from on-chip memory into the configurable pipeline.
Each stripe consists of 16 processing elements (PEs), each with its own logic and registers. All of the PEs within a stripe are interconnected. The first stripe in the pipeline takes its input from a global bus, and the last stripe drives a global output bus. The research team noted that each PE also connects to a dedicated line shared with the other PEs in the stripe. That output can be programmed to connect to the outputs of previous stripes or to the output of the functional unit, the researchers said. Configuration takes a single cycle, so the pipeline can be configured one cycle before its first data arrives.
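The idea of running a virtual pipeline on fewer physical stripes can be illustrated with a small simulation. This is only a sketch under simplifying assumptions that do not come from the paper: exactly one physical stripe is reconfigured per cycle, in round-robin order, and the stripe counts are hypothetical. The function name `schedule_stripes` is likewise made up for illustration.

```python
def schedule_stripes(num_virtual, num_physical, num_cycles):
    """Simulate which virtual stripe each physical stripe holds, cycle by cycle.

    Simplifying assumption (not from the paper): one physical stripe is
    reconfigured per cycle, round-robin, and configuration takes one cycle.
    """
    resident = [None] * num_physical  # None means "not yet configured"
    timeline = []
    for cycle in range(num_cycles):
        virtual = cycle % num_virtual    # next virtual stripe in program order
        physical = cycle % num_physical  # physical stripe that gets overwritten
        resident[physical] = virtual
        timeline.append((cycle, physical, virtual, list(resident)))
    return timeline


if __name__ == "__main__":
    # Hypothetical sizes: a 5-stripe virtual pipeline on 3 physical stripes.
    for cycle, physical, virtual, resident in schedule_stripes(5, 3, 6):
        print(f"cycle {cycle}: configure physical {physical} "
              f"with virtual {virtual}; resident = {resident}")
```

In this model, once the fabric is full, each new configuration overwrites the stripe that was loaded longest ago, which is how an application with more stages than physical stripes could still run.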
The I/Os run from a 3.3-V supply and the core from a 1.8-V supply. The core is divided into a switch-fabric area and a configuration and logic area.
According to the team, PipeRench's performance is competitive with high-end DSP architectures and more than five times that of a commercial microprocessor.
Specifically, at 120 MHz, PipeRench executes a 40-tap, 16-bit FIR filter at 41.8 Msamples per second. Performance on mainstream applications is in the same range as high-end DSPs such as the C64x family from Texas Instruments Inc.
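For readers unfamiliar with the benchmark, a direct-form FIR filter computes each output sample as a weighted sum of the most recent inputs, one multiply-accumulate per tap. The sketch below is a plain software reference, not the PipeRench mapping: it uses unbounded Python integers rather than 16-bit arithmetic, and the name `fir_filter` is illustrative. The last lines work out what the reported figures imply per clock cycle.

```python
def fir_filter(samples, taps):
    """Direct-form FIR: y[n] = sum over k of taps[k] * x[n - k]."""
    history = [0] * len(taps)  # most recent input first
    output = []
    for x in samples:
        history = [x] + history[:-1]  # shift in the new sample
        output.append(sum(c * h for c, h in zip(taps, history)))
    return output


# Figures reported in the article.
NUM_TAPS = 40
SAMPLE_RATE = 41.8e6  # samples per second
CLOCK_HZ = 120e6

# One multiply-accumulate per tap per output sample.
macs_per_cycle = NUM_TAPS * SAMPLE_RATE / CLOCK_HZ  # roughly 13.9

if __name__ == "__main__":
    print(fir_filter([1, 0, 0, 0], [1, 2, 3]))  # impulse response of a 3-tap filter
    print(f"{macs_per_cycle:.1f} multiply-accumulates per clock cycle")
```

Feeding a single 1 followed by zeros returns the tap values themselves, which is a quick way to check any FIR implementation.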
The researchers said that PipeRench performs this operation at a much lower clock frequency than the high-end DSPs, and does so without an actual multiplier in its fabric.
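The article does not say how PipeRench multiplies without a dedicated multiplier. One common general technique, shown here purely as an assumption about what such a fabric might do, is to build a product from shifts and adds, one partial product per set bit of the coefficient. The function name and the 16-bit default width are illustrative.

```python
def shift_add_multiply(x, coeff, width=16):
    """Multiply x by a non-negative coeff using only shifts and adds.

    Illustrative technique only; not confirmed as PipeRench's method.
    """
    acc = 0
    for bit in range(width):
        if (coeff >> bit) & 1:  # this bit of the coefficient is set
            acc += x << bit     # add x scaled by the matching power of two
    return acc


if __name__ == "__main__":
    print(shift_add_multiply(7, 13))  # 13 = 0b1101, so 7*1 + 7*4 + 7*8
```

When the coefficients are fixed, as in a FIR filter, only the set bits cost an adder, so the work per tap depends on the coefficient values.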
On the IDEA encryption algorithm, PipeRench performs encryption or decryption at 450 Mbits/s. The algorithm is compiled directly into the hardware. By comparison, an 800-MHz Pentium III processor executes the same algorithm at 75.4 Mbits/s.
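The two throughput figures can be compared directly and per clock cycle. The short calculation below uses only the numbers reported above; the variable names are illustrative.

```python
# Figures reported in the article.
PIPERENCH_MBITS = 450.0  # Mbits/s at 120 MHz
PENTIUM_MBITS = 75.4     # Mbits/s at 800 MHz
PIPERENCH_MHZ = 120.0
PENTIUM_MHZ = 800.0

# Raw throughput ratio: about 6.0.
speedup = PIPERENCH_MBITS / PENTIUM_MBITS

# Bits processed per clock cycle on each device.
piperench_bits_per_cycle = PIPERENCH_MBITS / PIPERENCH_MHZ  # 3.75
pentium_bits_per_cycle = PENTIUM_MBITS / PENTIUM_MHZ        # about 0.094

# Per-cycle advantage: about 40.
per_cycle_ratio = piperench_bits_per_cycle / pentium_bits_per_cycle

if __name__ == "__main__":
    print(f"throughput ratio: {speedup:.2f}")
    print(f"per-cycle ratio:  {per_cycle_ratio:.1f}")
```

The raw ratio of roughly six is consistent with the team's "more than five times" claim, and the gap widens to about 40 once the difference in clock frequency is factored out.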
The configurable data path approach might ease DSP vs. dedicated processor decisions for real-time critical designs such as media processors. While DSPs offer more flexibility by being programmable, they also devour more watts and will always take a performance hit compared with dedicated chips.