SAN FRANCISCO – Closer cooperation between chip and app developers is needed to scale the rising energy-efficiency wall that's making it hard to fulfill expectations of smaller, cheaper, faster systems, said the opening keynoter at the International Solid-State Circuits Conference (ISSCC). Stanford professor Mark Horowitz called for a combination of specialized silicon and better algorithms to combat stalled clock frequencies and rising power consumption.
"In the mid-2000s, we really hit a power limit and were not able to scale up power because of various thermal issues," Horowitz said in the opening plenary session. "In the desktop/server community this happens around 100 watts, in laptops it's 30 watts, and in cell phones 1-3 watts, which means that all computing systems are power limited."
Horowitz said that, with power capped, the only way to increase performance is to decrease the energy per operation in new ways. The veteran researcher said he isn't hopeful that engineers will find a new technology to replace power-limited CMOS and, instead, advocated for specialized processing across multiple cores.
"I can see why people get multi-core processors. Instead of one processor, put four processors so the performance curve is going to shift. By backing off on peak performance on a per processor basis, we can lower the energy per op which allows us to put more processors per die in the same energy budget," he said.
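The arithmetic behind that trade-off can be sketched with a toy model (the constants below are hypothetical illustrations, not Horowitz's figures): dynamic energy per operation scales roughly with the square of supply voltage while clock frequency scales roughly linearly with it, so backing off voltage and frequency cuts energy per op faster than it cuts per-core speed, letting more cores fit under a fixed power cap.

```python
def throughput_under_power_cap(v_scale, power_cap_w=100.0,
                               base_energy_nj=1.0, base_freq_ghz=3.0):
    """Toy model: energy/op scales with V^2, clock with V.

    Returns (cores, total giga-ops/sec) that fit within the power cap.
    All constants are hypothetical, chosen only to illustrate the trend.
    """
    energy_nj = base_energy_nj * v_scale ** 2   # E per op, scales as V^2
    freq_ghz = base_freq_ghz * v_scale          # clock, scales as V
    power_per_core_w = energy_nj * freq_ghz     # nJ/op x Gop/s = watts
    cores = int(power_cap_w // power_per_core_w)
    return cores, cores * freq_ghz

full = throughput_under_power_cap(1.0)   # few fast, high-voltage cores
slow = throughput_under_power_cap(0.7)   # more slower, lower-voltage cores
```

With these sample numbers, the lower-voltage configuration packs in more cores and delivers higher aggregate throughput from the same 100 W budget, which is the shift Horowitz describes.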
Dedicated hardware for specialized application processing can be 1,000 times more energy efficient than a general-purpose processor, Horowitz said. He advocated for stencil applications, in which the outputs of one operation are forwarded directly into the next compute-intensive stage.
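The forwarding pattern can be illustrated in software with a minimal 1-D, 3-point stencil sketch (my construction, not from the talk): each pass's output feeds straight into the next pass, the way specialized stencil hardware chains stages without a round trip through memory.

```python
def fused_stencil_pipeline(data, stages=2):
    """Run `stages` passes of a 3-point averaging stencil on integer data,
    feeding each stage's output directly into the next stage (the
    'forwarding' pattern) instead of storing intermediates.
    Edges are clamped; a toy 1-D analogue of fused stencil hardware.
    """
    n = len(data)
    for _ in range(stages):
        data = [
            (data[max(i - 1, 0)] + data[i] + data[min(i + 1, n - 1)]) // 3
            for i in range(n)
        ]
    return data
```

For example, one pass over `[0, 3, 6, 9]` yields `[1, 3, 6, 8]`; in hardware the second pass would consume those values as they are produced, never touching DRAM.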
Achieving the highest energy efficiency levels requires a very specific combination of very low energy operations and extreme locality, according to Horowitz's paper. These levels are possible only if the application works on short integer data (8 to 16 bits), completes tens of operations for each local memory fetch, and completes roughly 1,000 operations for every DRAM fetch.
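Simple arithmetic shows why those ratios matter. The sketch below amortizes fetch energy over operations; the picojoule figures are rough, illustrative magnitudes for a short-integer op, an on-chip SRAM read, and a DRAM access, not numbers taken from the article.

```python
def energy_per_op_pj(ops_per_local_fetch, ops_per_dram_fetch,
                     op_pj=0.1, local_pj=5.0, dram_pj=640.0):
    """Amortized energy per operation in picojoules.

    Assumed costs (illustrative only): ~0.1 pJ per short-integer op,
    ~5 pJ per local SRAM fetch, ~640 pJ per DRAM fetch.
    """
    return (op_pj
            + local_pj / ops_per_local_fetch    # SRAM cost spread over ops
            + dram_pj / ops_per_dram_fetch)     # DRAM cost spread over ops

fetch_heavy = energy_per_op_pj(1, 10)      # one op per fetch: memory dominates
target = energy_per_op_pj(50, 1000)        # the regime Horowitz's paper targets
```

With these assumptions, hitting tens of ops per local fetch and ~1,000 per DRAM fetch cuts the amortized energy per operation by nearly two orders of magnitude, which is why locality is as important as the cheap operations themselves.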
"If you think about applications that don't access memory very much, you can see why specialization can help," Horowitz continued. "Specialization is not so much about hardware, but you have to move algorithms to a much more restricted space."
By restricting algorithms to specific kinds of processing, Horowitz believes developers will be able to build a general engine that can handle those tasks more efficiently. To get to this point, Horowitz said the industry will have to change the way it does algorithm design.
"If we want algorithm designers to play and create better computing devices, we have to minimize the cost to them to do exploration. We have to give them a much higher level development platform in which to play," he said, adding that it's possible to make an app store for hardware. "I don't think it's inconceivable that we could do the same thing [as the Apple store]. We can take sets of hardware and build a strong environment so designers can write code."
Not all problems require the "bleeding edge" of efficiency, however, and many applications will continue to work with current processors. In his paper, Horowitz said the people who know how to use current, adaptable parts are "likely a distinct group from the people who have applications that they want to implement."
"If technology is scaling more slowly, there's not going to be a killer microprocessor that outdoes you in two years when your product first comes out," Horowitz said. "The techniques that we need to do design are stabilizing, so let's codify those and make them easier to do."
— Jessica Lipsky, Associate Editor, EE Times