Clock gating is a well-understood power optimization technique employed in both ASIC and FPGA designs to eliminate unnecessary switching activity. This method usually requires designers to add a small amount of logic to their RTL code to disable or deselect unnecessarily active sequential elements (registers, for example).
Despite the obvious value of reduced dynamic power afforded by this method, the designer faces significant challenges when attempting to perform these optimizations manually:
Truly reducing activity in the design requires intimate knowledge of the design itself and typically requires numerous changes to the RTL.
Most ASIC and FPGA designs today comprise some combination of new, legacy, and third-party IP circuit designs, but typically only the new designs are candidates for clock-gating optimizations. Designers rarely if ever attempt these optimizations on legacy and IP designs: they usually do not have sufficient depth of knowledge about the design and operation of the legacy RTL code, and manually developing meaningful clock-gating logic requires too much time.
Applying clock-gating optimizations usually requires the addition of more tools and more steps to the design flow and can precipitate the creation of an intricate set of new clocks requiring complex timing analyses (as is often the case for ASIC optimization). Unless the gains in power efficiency are sufficient and essential to the success of the design, the additional complexity and time can be prohibitive and add risk.
With the release of ISE Design Suite v12, Xilinx has introduced an automated capability linked to the place-and-route portion of the standard FPGA design flow that uses a set of innovative algorithms to perform an analysis on all portions of the design (including legacy and third-party IP blocks). Having analyzed the logic equations to detect sourcing registers that do not contribute to the result for each clock cycle, the software utilizes the abundant supply of clock enables (CEs) available in the logic of Virtex-6 and Spartan-6 FPGAs to create fine-grain clock-gating or logic-gating signals that neutralize superfluous switching activity, as shown in figure 1.
Fig 1: Intelligent clock gating dramatically reduces switching power consumption.
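The idea of gating a sourcing register whose value does not contribute to the result can be illustrated with a small behavioral model. The sketch below is not the Xilinx algorithm; it is a minimal simulation under stated assumptions: two 8-bit source registers feed a 2:1 multiplexer, the select signal is itself registered, and the hypothetical names `Reg`, `simulate`, and `random_stimulus` exist only for this illustration. Because the select value for the next cycle is already known at the clock edge, the source register that will not be selected can have its CE deasserted without changing what the output register captures.

```python
import random
from dataclasses import dataclass

WIDTH = 8  # assumption: one CE drives an 8-bit register bank, as in one slice


@dataclass
class Reg:
    """Hypothetical model of a register bank sharing a single clock enable."""
    q: int = 0        # current output value
    toggles: int = 0  # total number of output bit flips so far

    def clock(self, d: int, ce: bool = True) -> None:
        # On a clock edge the bank loads d only when its CE is asserted.
        if ce:
            self.toggles += bin(self.q ^ d).count("1")
            self.q = d


def simulate(stimulus, gated):
    """Run the mux circuit; return (output trace, toggles in source registers)."""
    a, b, sel, out = Reg(), Reg(), Reg(), Reg()
    outputs = []
    for a_in, b_in, sel_in in stimulus:
        # D input of the output register, computed from pre-edge values.
        out_d = b.q if sel.q else a.q
        # After this edge sel.q equals sel_in, so only the source register
        # the mux will select next cycle needs to load new data.
        ce_a = (not gated) or sel_in == 0
        ce_b = (not gated) or sel_in == 1
        # All registers see the same clock edge.
        out.clock(out_d)
        a.clock(a_in, ce_a)
        b.clock(b_in, ce_b)
        sel.clock(sel_in)
        outputs.append(out.q)
    return outputs, a.toggles + b.toggles


def random_stimulus(cycles, seed=0):
    """Random (a_in, b_in, sel_in) tuples; purely illustrative input data."""
    rng = random.Random(seed)
    return [
        (rng.randrange(1 << WIDTH), rng.randrange(1 << WIDTH), rng.randrange(2))
        for _ in range(cycles)
    ]


if __name__ == "__main__":
    stim = random_stimulus(1000)
    ref_out, ref_toggles = simulate(stim, gated=False)
    gated_out, gated_toggles = simulate(stim, gated=True)
    print("outputs identical:", ref_out == gated_out)
    print("source-register toggles without gating:", ref_toggles)
    print("source-register toggles with gating:   ", gated_toggles)
```

In this model the gated and ungated runs produce the same output trace, while the source registers toggle less often when gated. The enable conditions are derived only from signals already present in the circuit, which mirrors the article's point that no new clocks are created.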
Each CE is ideally suited for power optimization because it connects to the basic cluster of the FPGA logic (the slice). It controls a small number of registers (only eight; see figure 2), providing a level of granularity that matches the minimum size of buses used by the vast majority of designs. The smallest member of the Virtex-6 FPGA family (the XC6VLX75T) offers more than 10,000 CEs, while the largest (the XC6VLX760) offers more than 100,000.
Fig 2: Clock Enables in the Virtex-6 FPGA slice.
It is important to note that these optimizations do not alter the pre-existing logic or clock placement, nor do they create new clocks. The resulting design is logically equivalent to the original and the additional logic created is separate from previous logic, adding only 2 percent more LUTs (on average) to the original design. As a result, the optimization does not affect timing in the vast majority of cases because it does not add levels of logic to the original logic paths.