There has been a great deal of discussion of late about power dissipation in advanced ICs. Sorry, but power is the wrong question. Except at genuinely enormous levels, instantaneous power is of little interest in a circuit. The issue, for both reliability and battery life, is energy, not power.
The primary misconception, wielded like a bludgeon by marketing people everywhere, is that power in a circuit is proportional to CV²f. In logic and memory circuits, where large portions of the circuit are typically inactive during many operations, the generalization is nonsense.
But more important, the statement says nothing about total energy. For a given technology, the energy required to switch a node from one logic state to another is a function of the technology and the temperature, not the frequency. So for a given process technology at a given level of cooling, the energy required to complete execution of an algorithm is a function of how many total node transitions are needed. This has little to do with clock frequency, but a great deal to do with architecture.
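To make the arithmetic concrete, here is a minimal sketch of the argument above. Everything in it is an illustrative assumption rather than measured data: the function names, the 1 fF node capacitance, the 1 V supply, the transition counts, and the simplification that every node has the same capacitance and draws C·V² from the supply per charging transition.

```python
def energy_per_transition(c_farads, v_volts):
    # Assumed model: charging one node through a full swing draws C * V^2
    # from the supply. Note that clock frequency does not appear here.
    return c_farads * v_volts ** 2


def run_algorithm(node_transitions, clock_cycles, clock_hz,
                  c_farads=1e-15, v_volts=1.0):
    """Return (total energy in joules, average power in watts).

    All parameters are hypothetical; identical nodes are assumed.
    """
    energy = node_transitions * energy_per_transition(c_farads, v_volts)
    runtime = clock_cycles / clock_hz  # seconds to finish the algorithm
    return energy, energy / runtime


# Same architecture, same algorithm, two clock rates.
slow = run_algorithm(node_transitions=10**9, clock_cycles=10**6, clock_hz=1e9)
fast = run_algorithm(node_transitions=10**9, clock_cycles=10**6, clock_hz=2e9)

# A hypothetical leaner architecture: one tenth the transitions, slow clock.
lean = run_algorithm(node_transitions=10**8, clock_cycles=10**6, clock_hz=1e9)

print("slow clock:", slow)
print("fast clock:", fast)
print("lean arch: ", lean)
```

Under these assumptions, doubling the clock doubles the average power but leaves the total energy unchanged, while cutting the number of node transitions cuts the energy directly.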
At the simplest level, the closer an architecture approaches a hardwired state machine, the more energy-efficient it is likely to be. This is because such an architecture spends most of its node-switching activity on selecting the next state, and computational work is captured in the state transitions. Once you start separating state transition from computation (that is, once you have both data flow and control flow), you are likely to require a lot more node switching. Add high-capacitance nodes and everything gets complicated.
Work at IMEC has shown that as architectures get more complex, memory references dominate energy consumption. The amount of energy actually expended on the algorithm vanishes in the noise. In a paper last year, MIT's Krste Asanovic estimated that in a simple RISC microprocessor performing an integer add operation, only a fiftieth of the energy is used by the adder.
The idea that you can make this better by piling on an additional task to traverse, map and pre-decode the instruction stream needs no further comment. Nor, really, does the idea that by running a half-dozen processors in parallel you can somehow reduce the energy. It's not about clock frequency.
There are important areas of energy conservation to be studied. Asanovic's paper explored one. Many others exist, including self-timed circuit design and adiabatic logic. All are hard. All lack design tools. But all are more meaningful than yet another fruitless marketing battle over clock frequency.