As a DSP guy, I'm used to treating every clock cycle as a precious commodity. But changing hardware technology makes this thinking obsolete.
As a DSP guy, I hate to see resources wasted. I'm used to treating every clock cycle as a precious commodity. And I'm not alone: many signal processing applications combine heavy processing workloads with tight cost constraints and low power budgets. Traditionally, the only way to meet all these goals was to run your hardware as close to 100% capacity as you could.
But changes in hardware technology are making this strategy obsolete. For example, high-end DSPs like TI's 'C64x can execute eight instructions per cycle. For most applications, there is no way to keep the DSP fully loaded. Even in the inner loop of a filter or an FFT, you might not average more than six instructions per cycle, i.e., 75% utilization.
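To make that utilization gap concrete, here's a bare-bones FIR inner loop in plain C. This is an illustrative sketch, not hand-optimized 'C64x code, and the operation counts in the comments are rough assumptions rather than measured numbers.

    #include <stdint.h>

    /* Bare-bones FIR inner loop (illustrative sketch, not optimized 'C64x code).
     * Each iteration needs roughly: two loads (data and coefficient), one
     * multiply, one accumulating add, and one loop-counter update/branch --
     * call it five or six useful operations per iteration, on a machine that
     * can issue eight instructions every cycle. */
    int32_t fir(const int16_t *x, const int16_t *h, int ntaps)
    {
        int32_t acc = 0;
        for (int i = 0; i < ntaps; i++) {
            acc += (int32_t)x[i] * h[i];   /* multiply-accumulate */
        }
        return acc;
    }

Even with aggressive software pipelining, there simply isn't enough independent work in a loop like this to keep all eight functional units busy on every cycle.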
And things are only getting worse. For example, TI doubled the multiply-accumulate (MAC) capabilities of its 'C64x architecture last year, but this yielded only about a 20% speedup (I sketch the arithmetic below). Clearly, the new MAC hardware is sitting unused most of the time. As another example, FPGA users have told me that the routing tools often can't use more than 50-60% of an FPGA's gates.
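Here is that back-of-the-envelope calculation, written out as a tiny C program. The one-third MAC fraction is my own illustrative assumption, not a figure from TI; the point is only that doubling one piece of the hardware speeds up the whole workload by far less than 2x.

    #include <stdio.h>

    /* Amdahl's-law sketch: overall speedup when only the MAC portion of the
     * workload gets faster. The 1/3 fraction is an assumed, illustrative number. */
    int main(void)
    {
        double mac_fraction = 1.0 / 3.0;  /* share of cycles spent on MACs (assumed) */
        double mac_speedup  = 2.0;        /* MAC throughput doubled */

        double overall = 1.0 / ((1.0 - mac_fraction) + mac_fraction / mac_speedup);
        printf("overall speedup: %.2fx\n", overall);  /* prints roughly 1.20x */
        return 0;
    }

If MACs account for only a third of the cycles, halving their cost buys you about a 1.2x speedup overall; the rest of the time, the new MAC hardware sits idle.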
To me, this suggests that we DSP engineers need to re-think our design strategies. We have to accept that large parts of our systems will be lightly loaded or even completely unused. Instead of focusing only on using hardware efficiently, we need to think carefully about what that hardware should be in the first place. For example, it's OK to use only 50% of an FPGA if that gives you a better solution than using 75% of a DSP.
At the end of the day, it's all about making the customers happy. If the best way to do that is to "waste" hardware, so be it.
(Disagree? Stop by the forum and tell me what you think.)