Recently, many methodologies have been introduced for reducing dynamic power for systems-on-chip (SoCs). These methodologies, however, impose restrictive physical constraints which have schedule impact or which are heavily dependent on logic functions such as clock gating.
This article presents an elegant methodology using pulsed latch instead of flip-flop without altering the existing design style. It reduces the dynamic power of the clock network, which can consume half of a chip's dynamic power. Real designs have shown approximately a 20 percent reduction in dynamic power using the methodology described below.
Dynamic power is consumed across all elements of a chip. The clock network is one of the large consumers of dynamic power. According to a recent IBM study , half of dynamic power is dissipated in the clock network.
Therefore, reducing power in the clock network can impact the overall dynamic power significantly. Designers already use a variety of techniques to reduce the clock power using smaller clock buffers, reducing the overall wiring capacitance, employing clock gating to reduce the dynamic power , and de-cloning to move the clock buffers at higher levels of hierarchy.
Even with these techniques, the dynamic power of clock network can be large since registers are used as state elements in the design. In general, a flip-flop is used as the register. A conventional flip-flop is composed of two latches (master and slave) triggered by a clock signal.
Flip-flop synchronization with the clock edge is widely used because it is matched with static timing analysis (STA). Timing optimization based on STA is must for SoCs. On the other hand, designers may choose to use a latch for storing the state. A latch is simple and consumes much less power than that of the flip-flop. However, it is difficult to apply static timing analysis with latch design because of the data transparent behavior.
A methodology has been developed which uses latches triggered with pulse clock waveforms. With this methodology, designers can apply static timing analysis and timing optimization to a latch design while reducing the dynamic power of the clock networks. The following describes this pulsed latch design methodology in detail and gives some guidelines as to how designers can apply this methodology in their designs.
Pulsed latch concept
A latch can capture data during the sensitive time determined by the width of clock waveform. If the pulse clock waveform triggers a latch, the latch is synchronized with the clock similarly to edge-triggered flip-flop because the rising and falling edges of the pulse clock are almost identical in terms of timing.
With this approach, the characterization of the setup times of pulsed latch are expressed with respect to the rising edge of the pulse clock, and hold times are expressed with respect to the falling edge of the pulse clock. This means that the representation of timing models of pulsed latches is similar to that of the edge-triggered flip-flop.
The pulsed latch requires pulse generators that generate pulse clock waveforms with a source clock. The pulse width is chosen such that it facilitates the transition. The following diagram represents a simple pulse generator and the associated pulse waveform.
Figure 1 Pulse generator and waveform
In this methodology, the pulse generators are automatically inserted to satisfy several rules during clock-tree synthesis. Along with pulse generators, this approach also uses a number of matching delay cells to allow for match clock insertion delays with or without pulse generators.