Design Article

How Different Silicon Platforms Are Handling the Power Crisis

Jim Lipman

11/18/2004 12:00 AM EST

Ask a digital chip designer to name the most important design consideration for the next chip and, more often than not, the answer will be "power dissipation." Whether a chip is battery or line operated, the amount of power it uses is becoming a critical issue for many designs fabricated in 130nm CMOS processes or below.

This power crisis has not sprung up overnight—it has been brewing for several years. Faster clocks and process node shrinks have resulted in many more transistors on a piece of silicon, moving data at much faster rates. This has led to a large increase in watts dissipated per device, meaning batteries discharge faster and chips get hotter. The heat problem leads to more expensive packaging and system cooling to dissipate the heat, which also adds weight and size to a system. Battery limitations mean larger batteries, adding more weight and bulk.

Fueling the power problem is the exponential increase in transistor leakage currents as processes migrate from 130nm to 90nm and beyond, leakage that occurs even when devices are "turned off." All of these issues have led to a big effort on the part of semiconductor foundries, chip vendors, and EDA tool vendors to estimate, measure, and minimize chip power dissipation for many applications.
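The two contributors described above, switching power and leakage power, can be put in rough numerical terms with the standard first-order CMOS model. The sketch below is illustrative only: the function names and every numeric value are assumptions chosen for the example, not figures from any vendor mentioned in this article.

```python
def dynamic_power_w(activity, capacitance_f, vdd_v, freq_hz):
    """First-order CMOS switching power: P = activity * C * Vdd^2 * f."""
    return activity * capacitance_f * vdd_v ** 2 * freq_hz


def static_power_w(vdd_v, leakage_a):
    """Leakage power, drawn even when the logic is not switching."""
    return vdd_v * leakage_a


# Hypothetical chip: 10 nF total switched capacitance, 15% switching
# activity, 1.2 V core supply, 200 MHz clock, 50 mA total leakage current.
p_dyn = dynamic_power_w(0.15, 10e-9, 1.2, 200e6)
p_leak = static_power_w(1.2, 50e-3)
print(f"dynamic: {p_dyn:.3f} W, leakage: {p_leak:.3f} W")
```

In this model, more transistors and faster clocks raise the capacitance and frequency terms, while the leakage term is independent of clock rate, which is why it persists when devices are "turned off."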

A Platform for Every Purpose
Initially, ASIC and ASSP chips dominated high-performance target applications, the former designed for a single customer and the latter for several customers developing products within the same application domain. ASICs were "the" path to system-on-a-chip (SoC) designs, since they offered the highest performance and lowest per-unit cost, ideal for high-volume applications. Unfortunately, they also were marked by high NRE costs and long design cycles, both of which continued to grow as process nodes shrank.

The introduction of FPGAs gave designers a way to develop a chip at substantially lower cost and with a shorter design time compared to an ASIC. Unfortunately, the tradeoff in going to an FPGA was higher unit cost and lower performance. Just as ASICs and ASSPs target high-volume applications, particularly those with very high performance needs, FPGAs fit low-to-medium volume applications without "bleeding-edge" speed requirements.

A couple of years ago, structured and platform ASIC (SA/PA) architectures began to make an appearance. With up-front cost and design times somewhere between those of an FPGA and an ASIC, SA/PA chips target mid-to-high volume applications with medium-to-high performance and logic-density requirements. SA/PAs are not a single type of architecture, but a variety of architectures developed by several vendors. Up to now, SA/PA acceptance has been limited by two factors:

  1. The wide range of capabilities, cost, and user design effort represented by the dozen or so SA/PA offerings available to the designer.
  2. Upward pressure from newer FPGA architectures and technologies, and downward pressure from non-traditional ASIC vendors, such as those who provide an aggregation of design and manufacturing activities and services.

Each type of platform—ASIC/ASSP, FPGA, and SA/PA—handles the power crisis differently. Furthermore, different vendors in each of these platform "domains" have different techniques for minimizing the power dissipation of their chips. A look at some representative vendors and products in each domain will provide some insight into the ways chip manufacturers are dealing with the power crisis.

FPGA
Xilinx
Basing their products on SRAM technology for programming capability, Xilinx is one of the two major FPGA suppliers (the other is Altera). SRAM-based products dominate the FPGA market, giving system designers the capability to configure their products during system development and reconfigure them after deployment of the final systems.

The disadvantage of SRAM-based FPGAs is that they require several times more transistors than cell-based mask-programmable chips, ASICs and ASSPs, to implement equivalent functionality. The extra transistors equate to larger chip area—along with higher per-unit cost—and additional power dissipation over ASICs and ASSPs. Potentially, the extra power can become critical at 90nm and below, since even turned-off transistors will have significant leakage currents.

Xilinx is addressing the power issue on several levels with their recently introduced Virtex-4 FPGA family. Making use of the company's "domain-optimized" ASMBL architecture, Virtex-4 provides different configurable platforms for programmable SoC designs, targeting three classes of applications: those that need high logic density, those that require high DSP capability, or ones that need embedded processor or high-speed serial communications resources.

Xilinx also lowers device power by using hard silicon IP to implement functions that were traditionally done in software. For example, their DSP core, the XtremeDSP Slice, contains an 18 x 18 2's-complement optionally pipelined multiplier and integrated adder/subtractor, with optionally registered inputs and outputs, that can operate at up to a 500 MHz clock rate. Because it is a hard core, the XtremeDSP Slice uses 1/7 the power that previous-generation FPGAs need to perform the same DSP or arithmetic functions with soft DSP IP. Xilinx implements an EMAC (Ethernet Media Access Control block) hard core on the Virtex-4 using 1/10 the area and 1/100 the number of transistors of a previous version of the EMAC done as a soft core. Further power reductions are accomplished through innovative use of algorithms running on the various processing cores.

Xilinx claims that their new Spartan-3L FPGAs have the lowest total power consumption among the reconfigurable FPGA competition. The devices sport a new hibernate mode with two levels of power reduction—a simple stand-by mode allowing up to 68% lower quiescent power, and an active power management mode for up to 98% lower quiescent power (below 6mA for the XC3S1000L and below 8mA for the XC3S1500L) than comparable Spartan-3 devices. The latter requires power to be shut off to the FPGA core; this means that reactivating the device requires reconfiguring the logic, with the accompanying in-rush current spike.
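The tradeoff between the two hibernate levels can be framed as a break-even calculation: the deeper mode pays off only when the device stays idle long enough for the saved quiescent energy to exceed the energy spent reconfiguring. The sketch below assumes a simple energy model; the function name and all numeric values are hypothetical, since Xilinx's reconfiguration energy is not given here.

```python
def break_even_idle_s(wake_energy_j, standby_power_w, deep_power_w):
    """Idle time above which the deeper mode uses less total energy.

    Standby energy over T seconds:   standby_power_w * T
    Deep-mode energy over T seconds: deep_power_w * T + wake_energy_j
    """
    saved_per_second = standby_power_w - deep_power_w
    if saved_per_second <= 0:
        raise ValueError("deep mode must draw less power than standby")
    return wake_energy_j / saved_per_second


# Hypothetical values: 50 mJ to reconfigure on wake-up, 40 mW in the
# simple stand-by mode, 2 mW with the core supply shut off.
t = break_even_idle_s(0.050, 0.040, 0.002)
print(f"deep mode wins for idle periods longer than {t:.2f} s")
```

Under these assumed numbers, idle periods shorter than the break-even time favor the simple stand-by mode, even though its quiescent power is higher.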

QuickLogic
Instead of using SRAM technology to configure their FPGAs, QuickLogic uses a patented metal-to-metal interconnect structure, ViaLink, that offers non-volatility and results in far fewer transistors and smaller chips for a given number of logic gates. In addition, ViaLink-based FPGAs have a power advantage over their SRAM-based counterparts in that they do not require a high in-rush current to configure the device from an external source on power-up. The tradeoff with ViaLink-based FPGAs is that they are one-time programmable (not reconfigurable); however, they also provide a level of programming security that is unobtainable with SRAM-based FPGAs.

QuickLogic also attacks the power problem at several different points. ViaLink interconnects, when programmed "on," offer a very low resistance—around 50 ohms—and low capacitance. This results in lower power dissipation through the links than through an SRAM-based FPGA interconnect. QuickLogic's chips go through a single-threshold, high-speed-logic, transistor fabrication process, since the company feels that multi-threshold transistors do not buy much power saving for their device architectures.

Besides power-saving circuitry techniques on their FPGAs, such as clock gating and turning off I/O banks that are not used, QuickLogic also has some tricks for saving power on the board level, such as gating the system clock from the board to the chip with a MUX on the chip's master clock input pin. For FPGA design, the company puts several design tools—the PowerAware Placer, Power Calculator, and Power Simulator—into the hands of the chip designer as part of their FPGA design-tool suite.

PowerAware Placer, used during logic placement on the programmable chip, gives priority to power consumption when placing logic. The tool uses special placement algorithms to minimize the number of logic columns in the design, which reduces the number of clock column buffers and therefore power. According to QuickLogic, large designs see better than a 14% reduction in dynamic power with PowerAware Placer. Power Calculator estimates dynamic power consumption after logic placement and routing but before the part is actually programmed. The tool lets you see power consumption early in a design, so you can make adjustments as needed to keep the design within a specified power budget. Power Simulator is a power-analysis tool that you use during Verilog or VHDL simulation of the design. A description of the programmable logic design, along with testbench information, lets you see which operations and functions of a design consume the most power. A GUI shows dynamic current flow along with logic levels of the design's clock and data nodes, displaying the power consumption of the design as it exercises individual functions (Figure 1).


Figure 1:  QuickLogic's Power Simulator window lets you see your design's power consumption on a cycle-by-cycle basis, showing where power spikes occur.

QuickLogic also has a Reference Design Kit for the company's Eclipse II low-power FPGAs. The kit's hardware and software tools let you directly measure actual power consumption of Eclipse II designs, and also let you calculate, analyze, and simulate power dissipation for Eclipse II designs under development.

Structured ASIC/Platform ASIC
LSI Logic
An established vendor in the SA/PA arena, LSI Logic's RapidChip Platform ASIC architecture embeds hard IP cores (processors and processor subsystems, memory, and communication and network-interface blocks) on a chip. Interspersed between the IP is user-customizable logic in which the designer adds additional functions by defining a subset of the chip's total metal layers. LSI optimizes the embedded IP by defining a subsystem around a processor (MIPS or ARM) and considering the subsystem as an IP system-level core. For example, the ARM926 Processor Subsystem comprises the ARM926 core along with bridges, an Ethernet MAC, and other blocks (Figure 2). Optimizing processor IP on a subsystem rather than a processor-core basis lets LSI optimize subsystem performance for speed and power dissipation more effectively than optimizing the subsystem as a group of separate IP blocks.


Figure 2:  By using an entire ARM926 subsystem as a drop-in IP core, LSI Logic gives the designer a more optimized subsystem design than can be achieved by integrating separate IP blocks for each of the functional blocks.

Since RapidChip families are currently implemented in 130nm technology, LSI does not feel that transistor leakage is a problem and uses a single-Vt transistor process. The company plans, however, to use a triple-oxide (three different Vt devices) process when it migrates RapidChip to 90nm. LSI also expects a lot of customer interest in moving proven RapidChip designs to ASICs, during which additional design effort can lower power further.

AMI Semiconductor (AMIS)
The XpressArray families are AMI Semiconductor's Structured ASIC products. XpressArray targets the FPGA-to-ASIC conversion market, offering cost and power reductions compared to the original FPGA chips. The company's main objective with its new XpressArray-II architecture is cost reduction, but the SA implementation offers substantial power savings as well.

Implemented in a 0.15-micron process, XpressArray-II runs at a 1.5V core voltage, with I/Os operating from 1.8V to 3.3V. The base fabrication process goes through two layers of metal—AMIS customizes the chip by personalizing the next five metal layers (plus a sixth if the part will be in a flip-chip package). The chips have embedded memory blocks, DLLs, and PLLs. Other silicon IP can be implemented, but only as soft cores.

AMIS claims significant power savings compared to Xilinx and Altera FPGAs (Table 1), more than an order of magnitude for logic gates, with comparable power dissipation for embedded memories.

Parameter                    XpressArray II   Xilinx Virtex II Pro   Altera Stratix
Logic power (nW/MHz/gate)    55               1000                   700
Memory power (nW/MHz/bit)    13               10                     23

Table 1: Comparison of logic-gate and memory dynamic power dissipation. (Source: AMI Semiconductor)
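The Table 1 coefficients can be turned into a quick comparison. The sketch below computes each FPGA's coefficient relative to XpressArray II and a naive logic-power estimate for a hypothetical design. The helper names, the 100,000-gate size, and the 100 MHz clock are assumptions for illustration; the source does not say how the vendor counts active gates, so the absolute numbers should be treated as rough.

```python
# Dynamic power coefficients copied from Table 1 (source: AMI Semiconductor).
LOGIC_NW_PER_MHZ_PER_GATE = {
    "XpressArray II": 55,
    "Xilinx Virtex II Pro": 1000,
    "Altera Stratix": 700,
}
MEMORY_NW_PER_MHZ_PER_BIT = {
    "XpressArray II": 13,
    "Xilinx Virtex II Pro": 10,
    "Altera Stratix": 23,
}


def logic_power_mw(device, gates, freq_mhz):
    """Naive logic dynamic power: coefficient * gates * MHz, converted to mW."""
    return LOGIC_NW_PER_MHZ_PER_GATE[device] * gates * freq_mhz / 1e6


def ratio_to_xpressarray(table, device):
    """How many times larger a device's coefficient is than XpressArray II's."""
    return table[device] / table["XpressArray II"]


# Hypothetical design: 100,000 gates clocked at 100 MHz.
for device in LOGIC_NW_PER_MHZ_PER_GATE:
    mw = logic_power_mw(device, 100_000, 100)
    logic_x = ratio_to_xpressarray(LOGIC_NW_PER_MHZ_PER_GATE, device)
    mem_x = ratio_to_xpressarray(MEMORY_NW_PER_MHZ_PER_BIT, device)
    print(f"{device}: {mw:.0f} mW logic, {logic_x:.1f}x logic, {mem_x:.1f}x memory")
```

The logic ratios come out at roughly 18x and 13x, consistent with the "more than an order of magnitude" claim, while the memory ratios stay within about a factor of two in either direction.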

The significant power savings per gate for XpressArray II derives from the elimination of the FPGA's configuration transistors, which also reduces the size and cost of a core-limited chip. Static power dissipation is also lower for XpressArray II devices, since configuration transistors contribute to the chip's total leakage power.

ASIC/ASSP
STMicroelectronics
ST designed its Nomadik multimedia silicon platform using several power-saving features. Nomadik is based on a low-power ARM 9 32-bit RISC processor along with two hardware accelerators, one for video and the other for audio processing (Figure 3).


Figure 3:  The multimedia Nomadik platform uses an ARM 9 RISC core with smart accelerators for audio and video signal processing.

The accelerators operate independently or concurrently with the CPU, depending on task requirements. By handling all audio and video functions, including pre- and post-processing, these engines free the ARM CPU for control and program-flow tasks. This allows the ARM processor to spend substantial amounts of time in power-saving modes, including Idle (standby), Doze (slow clocking), and Sleep (near static operation). Both smart accelerators include embedded multimedia DSP (MMDSP+) cores, which reduce memory accesses and power consumption. The MMDSP+ processor also saves power by executing VLIW instructions, thus handling multiple instructions in a single operation.
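The benefit of letting the CPU spend most of its time in low-power modes can be illustrated with a time-weighted average. The sketch below is a simplified model: the per-mode power values and time fractions are hypothetical, not ST figures, and the accelerators' own power is deliberately left out.

```python
def average_power_mw(mode_power_mw, time_fraction):
    """Time-weighted average power over a set of operating modes."""
    total = sum(time_fraction.values())
    if abs(total - 1.0) > 1e-9:
        raise ValueError("time fractions must sum to 1")
    return sum(mode_power_mw[m] * f for m, f in time_fraction.items())


# Hypothetical per-mode CPU power in mW (illustrative values only).
MODE_POWER_MW = {"run": 100.0, "idle": 20.0, "doze": 5.0, "sleep": 0.5}

# CPU handles the media workload itself versus offloading it to accelerators.
cpu_does_everything = {"run": 0.9, "idle": 0.1, "doze": 0.0, "sleep": 0.0}
offloaded = {"run": 0.1, "idle": 0.2, "doze": 0.2, "sleep": 0.5}

busy = average_power_mw(MODE_POWER_MW, cpu_does_everything)
relaxed = average_power_mw(MODE_POWER_MW, offloaded)
print(f"CPU average: {busy:.2f} mW busy, {relaxed:.2f} mW with offload")
```

A complete comparison would add the power the accelerators draw while doing the offloaded work; the point here is only how strongly the mode mix affects the CPU's average.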

Another system-level power-saving feature is the optimization of memory partitioning on Nomadik, which minimizes power-hungry off-chip memory accesses. Implementing power savings at the system and architectural levels provides very good results: potentially a 10x or more reduction in total chip power.

On a circuit level, ST uses clock-gating and operand-isolation techniques to turn off inactive parts of Nomadik. To reduce static power dissipation, most of the chip can be powered off and reawakened in under 3 ms. The use of on-chip frequency and power scaling, whereby different parts of the chip run at different clock rates and different operating voltages, saves power as well. Voltage scaling has its limitations, however: it doesn't scale well with reductions in process nodes. Smaller process nodes support lower operating voltages, so for future devices at 65nm and 1V, there is not a lot of margin for scaling the voltage lower. Circuit-level "tricks" can provide a few tens of percent of power savings.
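The shrinking margin for voltage scaling can be seen with the usual first-order approximation that dynamic power varies with the square of the supply voltage times the clock frequency. The sketch below is an illustration under that assumption; the operating points, including the 0.9 V floor, are hypothetical and not taken from ST.

```python
def relative_dynamic_power(v_new, v_nom, f_new, f_nom):
    """Dynamic power relative to nominal, assuming P is proportional to V^2 * f."""
    return (v_new / v_nom) ** 2 * (f_new / f_nom)


# Hypothetical block that can drop from 1.2 V / 200 MHz to 1.0 V / 100 MHz.
scaled = relative_dynamic_power(1.0, 1.2, 100e6, 200e6)

# Hypothetical 1.0 V part whose logic only keeps working down to 0.9 V,
# with the clock rate left unchanged.
limited = relative_dynamic_power(0.9, 1.0, 200e6, 200e6)

print(f"1.2 V part scaled: {scaled:.2f} of nominal")
print(f"1.0 V part scaled: {limited:.2f} of nominal")
```

With these assumed numbers, the higher-voltage part drops to roughly a third of its nominal dynamic power, while the 1 V part saves only about a fifth from voltage alone, which is the limitation the paragraph above describes.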

Nomadik is implemented with a combination of low-Vt (high speed, high leakage) and high-Vt (low speed, low leakage) transistors. ST designed the chip so that low-Vt devices are used only where needed for performance. High-Vt transistors are used elsewhere to minimize leakage for 'off' devices. Back biasing of the chip, along with the use of optimized cell libraries, also results in lower power.

Packaging is another area in which power can be saved. Nomadik can be packaged with stacked Flash memory, minimizing memory-processor interconnects and further reducing power for the packaged chip.

ASICs
Since the ASIC vendor develops a fixed-function chip for one customer, usually targeting a single application, they can do everything necessary at all design levels (system, architectural, circuit, and device) to reduce power. Some popular system- and circuit-level techniques for ASIC power reduction include multiple clock sources (running different sections of the chip from different clocks), multiple voltage regions (beyond the usual one voltage for the chip's entire core and one for the I/Os), and operating modes that shut down, or run at very low speed, sections of a chip not in use. If certain chip domains are powered by power lines that are separate from the main power grid, these domains can have their power totally cut off when not in use, eliminating leakage current for these modules. Optimizing on-chip memory configurations reduces power-hungry off-chip memory accesses, which can be a significant part of the chip's dynamic power consumption.
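The effect of cutting power to unused domains can be sketched as simple bookkeeping: a domain contributes leakage unless it sits on its own switchable supply and is currently idle. The model below is a hypothetical illustration; the `Domain` structure, the domain names, and the leakage values are all assumptions.

```python
from dataclasses import dataclass


@dataclass
class Domain:
    name: str
    leakage_mw: float
    switchable: bool  # fed by its own power lines, separate from the main grid
    in_use: bool


def total_leakage_mw(domains):
    """Sum leakage over every domain that is still powered."""
    return sum(d.leakage_mw for d in domains if d.in_use or not d.switchable)


# Hypothetical chip with four power domains (illustrative values only).
chip = [
    Domain("cpu", 4.0, switchable=False, in_use=True),
    Domain("video", 6.0, switchable=True, in_use=False),
    Domain("audio", 2.0, switchable=True, in_use=True),
    Domain("always_on", 0.5, switchable=False, in_use=False),
]

ungated = sum(d.leakage_mw for d in chip)
gated = total_leakage_mw(chip)
print(f"leakage without gating: {ungated} mW, with gating: {gated} mW")
```

In this example only the idle video domain is cut off; the idle always-on block keeps leaking because it shares the main grid and cannot be switched.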

Physical layout for ASICs targeted for ultra-low-power applications benefits from squeezing chip functions into a minimum area, reducing interconnect wire lengths for both clock and data lines. This has a significant impact on dynamic power consumption, since shorter signal and data paths translate to lower power dissipation. For power-critical applications, ASIC designers may also custom-design individual logic blocks, or even individual transistors, to minimize power commensurate with chip timing specifications.

The Verdict
As you can see, power-saving techniques for silicon chips do not fit into a "one size fits all" designation. Different platforms and architectures all require choosing from a wide range of techniques to minimize power, spanning several different design levels and having varying impact on design complexity, design time, chip cost, and chip performance. Only one thing is certain—every chip vendor is aware of the power crisis and, unlike the weather, everyone is talking about and trying to do something about it.


About the Author
Jim Lipman is currently Vice President, Client Services for Cain Communications, specializing in the development and implementation of communication and marketing services programs for companies serving the semiconductor, silicon-IP, EDA, and other high-tech electronics-industry segments. Jim's experience includes chip-design R&D, marketing, marcom, consulting, technical editing, technology training, and on-line publishing of technical content for engineers. His email address is jlipman@caincom.com.




