Design Article
Designing low-energy embedded systems from silicon to software
Keith Odland
11/28/2012 5:03 PM EST
Software decisions
Performance scaling. Implementing energy-efficient embedded applications relies on software design that uses hardware resources in the most appropriate way. What is appropriate depends not only on the application but also on the hardware implementation. Likewise, the more flexible the hardware in terms of CPU, clock, voltage, and memory usage, the greater the potential energy savings the developer can achieve. Hardware-aware software tools provide the embedded-systems engineer with greater awareness of what further savings are achievable.
One option is to employ dynamic voltage scaling, as shown in figures 3 and 4. On-chip dc/dc converters and performance-monitoring circuits enable this technique by providing the ability to reduce the supply voltage when the application does not need to execute instructions at the highest speed. Under those conditions, the system operates with reduced power consumption. The resultant benefits are a function of input voltage and can vary over the life of a product. The figures show the relative differences between no voltage scaling (VDD fixed), SVS (static voltage scaling), and AVS (active voltage scaling).
Figure 3 Effects of voltage scaling are shown with VBAT=3.6V.
Figure 4 Effects of voltage scaling are shown with VBAT=2.4V.
An interesting property of AVS is that the best strategy can change depending on the input voltage to the system. In this example, when the input voltage is 3.6V, it is more efficient to power the internal logic as well as the flash memory from a high-efficiency internal dc/dc converter. As the input voltage falls as a result of battery discharge over the product life cycle, however, it becomes more efficient to power the flash-memory subsystem from the input voltage directly because the internal logic can operate at lower voltages than the memory. The SiM3L1xx MCU family from Silicon Labs, for example, has a flexible power architecture with six separate and variable power domains to enable this kind of dynamic optimization.
Typically, CMOS logic circuits will operate more slowly as their voltage is reduced. If the application can tolerate lower performance—as is often the case when dealing with communications protocols that demand data be delivered no faster than a certain standardized frequency—then the quadratic reduction in energy consumption with lower voltage can provide large energy savings. Leakage provides a lower limit on voltage scaling. If each operation takes too long, leakage will begin to dominate the energy equation and increase overall energy consumption. For that reason, it can make sense to execute a function as quickly as possible and then put the processor into sleep mode to minimize the leakage component.
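The trade-off between quadratic dynamic savings and growing leakage can be sketched with a first-order model. The sketch below is illustrative only: the linear speed-versus-voltage relation, the constant leakage current, and every numeric constant are assumptions chosen for the example, not figures for any real MCU.

```python
# Illustrative first-order model only. Every constant below is an assumption
# chosen for the example, not a figure for any real MCU.
V_THRESHOLD = 0.5    # volts; assumed voltage below which the logic cannot switch
HZ_PER_VOLT = 40e6   # assumed clock gain per volt of headroom above threshold
C_EFF = 100e-12      # farads; assumed effective switched capacitance per cycle
I_LEAK = 200e-6      # amps; assumed leakage current while awake (held constant)


def max_clock_hz(vdd):
    """Assumed speed model: the usable clock grows linearly with voltage headroom."""
    if vdd <= V_THRESHOLD:
        raise ValueError("supply voltage must exceed the threshold voltage")
    return HZ_PER_VOLT * (vdd - V_THRESHOLD)


def task_energy_j(vdd, cycles):
    """Return (dynamic, leakage) energy in joules for a task of `cycles` cycles."""
    dynamic = cycles * C_EFF * vdd ** 2      # quadratic in supply voltage
    run_time_s = cycles / max_clock_hz(vdd)  # lower voltage means a longer run
    leakage = I_LEAK * vdd * run_time_s      # leakage accrues for the whole run
    return dynamic, leakage


def lowest_energy_voltage(cycles, candidates):
    """Pick the candidate voltage with the smallest total energy for the task."""
    return min(candidates, key=lambda v: sum(task_energy_j(v, cycles)))


# Print the two energy components at a few supply voltages.
for vdd in (1.8, 1.0, 0.65, 0.55):
    dyn, leak = task_energy_j(vdd, 1_000_000)
    print(f"{vdd:.2f} V: dynamic {dyn * 1e6:.1f} uJ, leakage {leak * 1e6:.1f} uJ")
```

With these assumed constants, total energy for a one-million-cycle task falls as the supply drops from 1.8V toward roughly 0.65V and then rises again at 0.55V, where the longer run time lets leakage dominate. Real devices will put that turning point elsewhere.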
Consider a wireless sensor application that needs to perform a significant amount of digital signal processing, such as a glass-breakage detector. In this example, the application uses a fast Fourier transform to analyze the vibrations picked up by an audio sensor for the characteristic frequencies generated by glass shattering. The FFT is relatively complex, so executing it at a lower frequency governed by a reduced voltage is likely to increase the energy lost to leakage substantially, even in older process technologies. The best approach, in this case, is to execute at near maximum frequency and then return to sleep until the time comes to report any findings to a host node.
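A rough way to see why running fast and then sleeping can win is to add up the energy over one reporting period. The sketch below makes a simplifying assumption that the article does not: the supply voltage stays fixed and only the clock changes, so the energy per cycle is the same in both cases. The function name and all numbers are hypothetical.

```python
def window_energy_j(cycles, clock_hz, period_s,
                    e_cycle_j, p_awake_leak_w, p_sleep_w):
    """Energy over one reporting period: run the task, then sleep for the rest.

    Simplified model (an assumption for illustration): fixed supply voltage, so
    each cycle costs e_cycle_j regardless of clock rate; leakage while awake is
    a constant power; sleep mode draws a much smaller constant power.
    """
    run_time_s = cycles / clock_hz
    if run_time_s > period_s:
        raise ValueError("task does not fit in the reporting period")
    active = cycles * e_cycle_j + p_awake_leak_w * run_time_s
    asleep = p_sleep_w * (period_s - run_time_s)
    return active + asleep


# Hypothetical workload: a 2-million-cycle FFT pass once per second.
params = dict(cycles=2_000_000, period_s=1.0,
              e_cycle_j=300e-12, p_awake_leak_w=500e-6, p_sleep_w=1e-6)
fast = window_energy_j(clock_hz=50e6, **params)
slow = window_energy_j(clock_hz=4e6, **params)
print(f"fast then sleep: {fast * 1e6:.1f} uJ, slow: {slow * 1e6:.1f} uJ")
```

Under these assumptions the switching energy is identical in both runs, so the difference comes entirely from how long the core stays awake and leaking. Once voltage scaling is also in play, the comparison depends on the device's actual leakage and sleep currents.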
The wireless-protocol code, however, imposes different requirements. Radio protocols have fixed timings for events, so finishing the protocol code early brings little benefit. In some cases, the protocols can be handled entirely in hardware. Where software remains involved, it makes more sense to reduce the processor core’s voltage so that the code needed for packet assembly and transmission runs at a speed appropriate to the wireless protocol.
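Matching the core speed to a protocol deadline amounts to picking the lowest-voltage operating point that still finishes in time. The following is a minimal sketch of that selection; the operating-point table, the cycle count, and the deadline are hypothetical values, not taken from any datasheet or radio standard.

```python
from typing import NamedTuple


class OperatingPoint(NamedTuple):
    vdd: float       # core supply voltage in volts
    clock_hz: float  # highest clock assumed usable at that voltage


# Hypothetical table of voltage/clock pairs for illustration only.
POINTS = [
    OperatingPoint(1.0, 5e6),
    OperatingPoint(1.2, 10e6),
    OperatingPoint(1.5, 25e6),
    OperatingPoint(1.8, 50e6),
]


def slowest_point_meeting_deadline(cycles, deadline_s, points=POINTS):
    """Return the lowest-voltage point that finishes `cycles` within the deadline."""
    feasible = [p for p in points if cycles / p.clock_hz <= deadline_s]
    if not feasible:
        raise ValueError("no operating point meets the protocol deadline")
    return min(feasible, key=lambda p: p.vdd)


# Assumed example: packet assembly takes 20,000 cycles and the protocol
# leaves 2.5 ms before the transmit slot.
print(slowest_point_meeting_deadline(20_000, 2.5e-3))
```

Here the 5 MHz point would take 4 ms and miss the slot, so the selection settles on 1.2V at 10 MHz rather than jumping to the fastest setting.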
The addition of hardware blocks such as intelligent DMA (direct memory access) can further change the energy trade-offs. Many DMA controllers, such as the one provided by the native ARM Cortex-M3 processor, require frequent intervention from the processor. More intelligent DMA controllers that support a combination of sequencing and chaining, however, let the processor compute packet headers, encrypt data, assemble packets, and then hand over the work of passing the packets at appropriate intervals to the memory buffers used by the radio front end. For much of the time that the radio link is active, the processor can sleep, saving significant energy.
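The effect of descriptor chaining on processor wake-ups can be illustrated with a small host-side simulation. This is not the register interface of any real DMA controller; the `Descriptor` class and both send functions are hypothetical stand-ins that only count how often the processor would have to intervene.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Descriptor:
    """One queued transfer; `next` links to the following transfer, if any."""
    payload: bytes
    next: Optional["Descriptor"] = None


def build_chain(packets):
    """CPU work done once, up front: link every assembled packet into a chain."""
    head = None
    for payload in reversed(packets):
        head = Descriptor(payload, head)
    return head


def send_with_basic_dma(packets, interval_ticks):
    """Simulated basic DMA: the CPU wakes to program each transfer in turn."""
    schedule, cpu_wakeups = [], 0
    for index, payload in enumerate(packets):
        cpu_wakeups += 1  # CPU must reprogram the controller every time
        schedule.append((index * interval_ticks, payload))
    return schedule, cpu_wakeups


def send_with_chained_dma(head, interval_ticks):
    """Simulated chained DMA: the controller follows the links by itself."""
    schedule = []
    cpu_wakeups = 1 if head is not None else 0  # one wake-up to start the chain
    tick, node = 0, head
    while node is not None:
        schedule.append((tick, node.payload))
        tick += interval_ticks  # controller paces the transfers on its own
        node = node.next
    return schedule, cpu_wakeups


packets = [b"beacon", b"sample-1", b"sample-2"]
print(send_with_basic_dma(packets, 10)[1], "wake-ups without chaining")
print(send_with_chained_dma(build_chain(packets), 10)[1], "wake-up with chaining")
```

Both paths deliver the same packets at the same intervals; the difference in this simulation is only that the chained version needs a single processor wake-up, leaving the core free to sleep for the rest of the transmission.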