San Francisco -- Rambus Inc. and a group of leading academics will set a high-water mark this week in bringing fast chip-to-chip links into the era of low-power design. Their advance suggests the industry needs to rethink some fundamental assumptions in the way it defines, measures and implements power in interconnect designs.
At the International Solid-State Circuits Conference here, the Rambus group will describe a 6-Gbit/second serializer/deserializer (serdes) transceiver that draws just 2.2 milliwatts per gigabit per second in 90-nanometer process technology. Today's designs can consume as much as 10 times that amount of power.
By comparison, in the same Wednesday ISSCC session, Texas Instruments Inc. will disclose a 12.5-Gbit/s serdes made in 65-nm process technology that consumes 27.5 mW/Gbit/s. Sony Corp. will present a paper on a 10-Gbit/s transceiver in 90-nm technology that dissipates 25 mW/Gbit/s.
Even Rambus' own current products are power hogs by comparison. Both the Rambus XDR and FlexIO interconnects consume about 20 mW/Gbit/s.
Following the trend line, Intel Corp. engineers have prepared a paper for an upcoming IEEE circuit design conference showing ways to hit as little as 10 mW/Gbit/s. Today's mainstream PCs using PCI Express links typically deliver I/O at 15 to 30 mW/Gbit/s.
Rambus officials won't comment on just when or how the company will bring the technology to market. But they did say there are no major hurdles standing in the way.
"This test chip is amazingly good, and we are all very happy about that," said chief scientist Mark Horowitz, a Rambus founder who was one of the paper's co-authors. "For many papers you have to stand on your tippy-toes and pray to the demo gods that it will be just so. That's not the case here. There are some robustness and volume-manufacturing issues, but nothing that concerns me." Horowitz is also a professor of computer science and electrical engineering at Stanford University.
Indeed, the technology may represent a more mainstream opportunity than the high-end, speed-driven interconnects for which Rambus is usually known.
"This is intended for high-volume computer and consumer applications," said Kevin Donnelly, senior vice president of engineering at Rambus. A separate project in the works at Rambus aims to hit the same 2.2-mW/Gbit/s power level but will deliver as much as four times the performance using 65-nm technology. "The power budget for the next generation has to be the same as the old power budget," said Donnelly.
In fact, the company hopes to establish the metric of 1 mW/Gbit/s as the new gold standard for measuring I/O power, just as Mips per watt has largely replaced megahertz as the main figure of merit for microprocessors.
Where did the power go?
A rough analysis of a typical serdes transceiver shows power dissipation is widely distributed.
Besides resetting the bar in low-power serdes design, the paper's message to the industry is that the 1-volt signaling written into a host of interconnect standards is no longer acceptable. The Rambus team used signal swings below 200 mV to drive a new level in low-power interconnects.
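Part of the reason sub-200-mV swings matter is that driver dissipation scales roughly with the square of the signal swing. The sketch below is a first-order estimate only; the V²/Z model and the 50-ohm termination are illustrative assumptions, not figures from the paper:

```python
def driver_power_mw(swing_v, z_ohms=50.0):
    """Rough voltage-mode driver dissipation into a terminated line.

    First-order V^2/Z estimate; real termination schemes and driver
    topologies change the constant, but the quadratic dependence on
    swing is the point.
    """
    return swing_v ** 2 / z_ohms * 1000.0

# 1-V signaling vs. a sub-200-mV swing: roughly a 25x difference
legacy = driver_power_mw(1.0)   # ~20 mW
low    = driver_power_mw(0.2)   # ~0.8 mW
```

By this estimate, dropping from 1-V to 200-mV swings cuts driver power by about 25x before any other technique is applied.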
Engineers will need to sharpen their pencils on a broad range of techniques the team used to build receivers that can handle the low-voltage signaling. Those techniques include aggressively moving functions from dedicated logic into software and shifting much of the adaptive signal equalization from the transmitter to the receiver.
"A demo like this can change people's views about what's possible," said Horowitz. "If standards specify 1-V signaling, they are not doing low-power I/O."
"The reason people require 1-V swings is, they can't figure out how to make better receivers. It requires use of sophisticated and uncommon techniques," said Bill Dally, a co-author of the paper and chairman of the computer science department at Stanford.
Low-power chip-to-chip links are fast becoming a requirement as high-end microprocessors begin demanding as much as a terabit per second of I/O. "We have to do this work or I/O power will dominate systems as we move to chips with hundreds and even thousands of pins in the future," said a senior engineer who asked not to be named. "Most high-end chips being developed today are power-limited," added Horowitz.
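The arithmetic behind that concern is straightforward. Using the per-bit efficiency figures quoted in the article (the helper function itself is just illustrative):

```python
def io_power_watts(aggregate_gbps, mw_per_gbps):
    # Total I/O power = aggregate bandwidth x per-bit efficiency.
    return aggregate_gbps * mw_per_gbps / 1000.0

# A terabit-per-second processor interface at efficiencies from the article:
xdr_class = io_power_watts(1000, 20.0)  # ~20 W at today's ~20 mW/Gbit/s
test_chip = io_power_watts(1000, 2.2)   # ~2.2 W at 2.2 mW/Gbit/s
```

At 20 mW/Gbit/s, a terabit of aggregate I/O burns about 20 W on interconnect alone; at the test chip's 2.2 mW/Gbit/s, the same bandwidth costs roughly 2.2 W.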
Although ISSCC keeps a tight lid on the details of its papers until they are presented, some researchers are already critiquing the Rambus work. "This might be a breakthrough for Rambus, but it's hardly anything state-of-the-art," read an anonymous posting on the EE Times interconnect blog. The poster pointed to a 2004 paper from researchers at Samsung Electronics and UCLA that described a gigabit/second signaling system using on-chip capacitive coupling. It dissipated just 1.92 mW in a 100-nm process technology.
Indeed, in the same ISSCC session where Rambus will present this week, other researchers from UCLA will describe a 10-Gbit/s-per-pin interconnect that uses two RF techniques along with capacitive coupling to create an ultrawideband link that dissipates as little as 2.7 mW/pin in 180-nm technology. The design aims to link dice in a 3-D stack of chips.
Sun Microsystems Inc. has been developing a form of capacitive-coupling technology for chip-to-chip links for at least three years. Sun's Proximity technology was closely tied to its bid for a government supercomputing contract that Sun lost late last year. At the time, Sun executives said development of the technology would continue. Proximity promises 100 times more links than today's ball grid array packages.
How they did it
The Rambus work used a combination of techniques to achieve its 2.2-mW/Gbit/s milestone, including a voltage-mode signaling technique developed by UCLA researcher Ken Yang. "There isn't any one thing you can do to reduce the overall power," said Horowitz. "You need innovations in logic, clocking, signaling, as well as in the basic transmitter and receiver. We had some synergies in some areas and in others we just had to be clever."
Two big areas of innovation were in shifting the focus of adaptive equalization from the transmitter to the receiver and driving more of the equalization work from dedicated hardware to software on a basic microcontroller. "If you start at the far end of the receiver and work your way back after you deliver a receiver that is as sensitive as possible, you get a cycle of goodness in power savings all across the system," said co-author John Poulton, a computer science professor at the University of North Carolina at Chapel Hill.
Researchers found a way to use existing information from a clock-data recovery (CDR) unit to measure the loss of high frequencies in the signal and compensate by effectively turning up what Dally called "a treble knob" in the receiver.
CDR data on when the system clock is early and late "gives you exactly what you need to know about whether the [signal] eye is closed or open," said lead author Robert Palmer, who will present the ISSCC paper. Having that data, along with some clever use of other existing blocks, eliminates the need for oscillator and interpolator logic used in typical receivers.
"There was no extra hardware to handle the adaptive equalization," said Palmer. "In a typical design, the interpolation logic uses as much power as our whole link," said Dally. The resulting sensitivity of the receiver allowed engineers to eliminate the signal pre-emphasis and multitap equalization with hardware FIR filters typically employed at the transmitter stage of high-speed serdes. "It takes a lot of energy to use precompensation," said Horowitz. "Many people will be surprised at how aggressive we have been in pushing more functions from logic into software."
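The adaptation loop the authors describe can be caricatured in a few lines. The model below is a deliberately simplified sketch, not the paper's algorithm: it assumes residual post-cursor ISI shrinks linearly with receiver boost, and it feeds back only the sign of that residue, the way a sign-sign loop running in software on a microcontroller might use a CDR's early/late votes:

```python
def residual_isi(boost, channel_loss=0.5):
    # Hypothetical linear model: under-boosted -> positive residue
    # (eye closing), over-boosted -> negative residue (peaking).
    return channel_loss - boost

def adapt_boost(steps=200, mu=0.01):
    """Sign-sign adaptation of the receiver's high-frequency boost
    (the "treble knob"), driven only by the sign of the residue --
    the one-bit information a CDR's early/late decisions provide."""
    boost = 0.0
    for _ in range(steps):
        boost += mu if residual_isi(boost) > 0 else -mu
    return boost
```

The loop walks the boost up until the residue changes sign, then dithers around the optimum, which is how sign-sign loops typically behave in hardware as well.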