This article reviews the relative strengths and weaknesses of microcontroller (MCU), digital signal processor (DSP), field programmable gate array (FPGA) and application-specific integrated circuit (ASIC) technologies for embedded applications, and proposes a customizable microcontroller as a cost-, performance- and power-effective tradeoff between them.
Four challenges, four technologies
Any embedded application of integrated circuits seeks to minimize, simultaneously, four factors:
- The number of transistors employed, which impacts die and package size, unit cost and power consumption. Advances in process technology continuously reduce transistor area, but both static and dynamic power consumption depend on the transistor count. The transistor count remains an important metric of system efficiency.
- The number of clock cycles required, which impacts performance and power consumption. Increasing clock frequencies associated with smaller process geometries permit more clock cycles in a given time interval, but at the expense of increased power consumption. Fewer clock cycles mean less power consumption.
- The time taken to develop the application, which strongly influences its market acceptance. A product that misses its market window is a total waste of development effort. In many cases software development takes more time and costs more than hardware development.
- Nonrecurring engineering (NRE) costs such as mask manufacturing and the cost of hardware and software development. The increased NRE costs associated with leading-edge process technologies are putting these out of reach for many applications.
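The link between the first two factors and power consumption can be sketched with the standard first-order CMOS model, in which dynamic power scales with switching activity, switched capacitance, the square of the supply voltage and the clock frequency, while static power is treated as a constant leakage term. The function names and every numeric value below are illustrative assumptions, not figures from any particular device.

```python
def dynamic_power_w(activity, switched_cap_f, vdd_v, freq_hz):
    """First-order CMOS dynamic power: P = activity * C * Vdd^2 * f."""
    return activity * switched_cap_f * vdd_v ** 2 * freq_hz


def task_energy_j(cycles, freq_hz, dynamic_w, static_w):
    """Energy needed to finish a task that takes `cycles` clock cycles."""
    run_time_s = cycles / freq_hz
    return (dynamic_w + static_w) * run_time_s


# Hypothetical operating point (assumed values, for illustration only).
FREQ_HZ = 100e6
P_STATIC_W = 1e-3
p_dyn = dynamic_power_w(activity=0.1, switched_cap_f=1e-9,
                        vdd_v=1.2, freq_hz=FREQ_HZ)

# Same hardware, same clock; the optimized code needs half the cycles.
baseline = task_energy_j(1_000_000, FREQ_HZ, p_dyn, P_STATIC_W)
optimized = task_energy_j(500_000, FREQ_HZ, p_dyn, P_STATIC_W)

print(f"baseline energy:  {baseline:.3e} J")
print(f"optimized energy: {optimized:.3e} J")
```

In this simple model, halving the cycle count at a fixed clock frequency halves the energy per task, while raising the clock frequency raises dynamic power in proportion.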
Today four technologies exist to address these requirements:
- Microcontrollers (MCUs) are general-purpose devices for information processing and control that can be adapted to a wide variety of applications by software. Application development effort is limited to software development and validation, and NRE costs are amortized amongst all the users of a particular MCU architecture. Clock cycle optimization is determined by code optimization, and the code footprint influences the number of transistors required for memories. Compact code that makes the most efficient use of the MCU architecture is essential. MCUs generally use transistors and clock cycles efficiently, but not optimally.
- Digital signal processors (DSPs) hard-wire the basic functions of many signal-processing algorithms. This optimizes transistor use and clock cycles for the required operations, at the expense of flexibility. Code is simpler than that required for MCUs. In many cases a DSP is an optimal solution for some but not all of the functions required of an application. Many MCUs include basic DSP operations in their instruction set, which enables them to do simple signal processing without the need for a dedicated DSP.
- Field programmable gate arrays (FPGAs) limit development effort to the coding required to configure them, and share NRE costs amongst a very large population of users, at the expense of a high level of transistor redundancy (and therefore high unit costs) and a limited optimization of clock cycles. Power consumption is far from optimal.
- The above three technologies are delivered as standard products, generally with a wide range of options, but nevertheless for any application the closest match always contains some redundant transistors and input/outputs. Application-specific functions, in particular analog operations, must often be implemented off-chip. Die size, package size, pinout and power consumption are less than optimal compared with what can be achieved by the fourth technology, namely ASICs. An ASIC is custom-designed for a particular application, possibly embedding one or more MCU or DSP cores, with as much as possible of the total system functionality implemented on a single die. This optimizes the number of transistors and clock cycles (and therefore unit cost and power consumption), at the expense of development time and NRE cost that are generally an order of magnitude higher than those for MCUs, DSPs or FPGAs.
The four technologies represent different tradeoffs towards achieving the four optimizations. The choice for any particular application is an engineering compromise. In most cases, the choice depends on a complex combination of factors, and no single technology is ideal. Different technology mixes are often most appropriate at different stages of the lifecycle of the end-user product. During prototyping and production ramp-up an FPGA or MCU/DSP-plus-FPGA solution may be preferable, in order to reduce development time and cost. When the product goes into high volume, its functionality can be re-mapped into an ASIC that embeds the MCU or DSP core from the standard product, and absorbs the logic from the FPGA, thereby optimizing die size, unit cost, clock cycles and power consumption without the need to rewrite the software. The high NRE costs associated with ASIC development are amortized over the high production volume.
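The amortization argument can be made concrete with a minimal cost model in which the total cost of an option is its NRE plus its unit cost multiplied by the production volume. The sketch below finds the volume at which a high-NRE, low-unit-cost option catches up with a low-NRE, high-unit-cost one. The function names and the cost figures are hypothetical and chosen only to show the shape of the calculation.

```python
import math


def total_cost(nre, unit_cost, volume):
    """Total cost of building `volume` units: one-off NRE plus per-unit cost."""
    return nre + unit_cost * volume


def break_even_volume(nre_a, unit_a, nre_b, unit_b):
    """Smallest volume at which option B costs no more than option A.

    Returns None when B's unit cost is not lower than A's, because B can
    then never recover a higher NRE.
    """
    if unit_b >= unit_a:
        return None
    return max(0, math.ceil((nre_b - nre_a) / (unit_a - unit_b)))


# Hypothetical figures (assumptions, not taken from the article):
# a multi-chip standard-product solution versus a single-chip ASIC.
STANDARD = {"nre": 0, "unit_cost": 30.0}
ASIC = {"nre": 1_000_000, "unit_cost": 5.0}

volume = break_even_volume(STANDARD["nre"], STANDARD["unit_cost"],
                           ASIC["nre"], ASIC["unit_cost"])
print("break-even volume:", volume)
```

Below the break-even volume the standard-product solution is cheaper overall; above it, the lower unit cost of the ASIC outweighs its NRE.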
A customizable microcontroller positioned between MCU, DSP, FPGA and ASIC technologies
An alternative technology that exploits the strengths of MCUs, DSPs and FPGAs, and can provide an intermediate step towards an ASIC, is a customizable microcontroller. A customizable microcontroller consists of a fixed portion comprising an MCU (processor, memories, peripherals and interfaces) together with a metal-programmable (MP) block of digital logic that can be customized to implement a DSP or an additional MCU together with application-specific logic. Figure 1 illustrates how a customizable microcontroller is positioned relative to MCU, FPGA, DSP and ASIC technologies.
Figure 1: A Customizable Microcontroller Positioned between an MCU&FPGA&DSP Combination and an ASIC
Consider a typical application where overall system control, networking, data management and user interface are handled by an MCU, signal processing is taken care of by a DSP, and application-specific logic is implemented on an FPGA. An initial configuration of the application is a three-chip solution, where most of the development effort is the programming of the MCU, DSP and FPGA. No hardware NRE is incurred; however, the unit costs are relatively high, notably for the FPGA. System performance is not optimal because of the inter-chip data transfers between the MCU, the DSP and the FPGA, which often runs at a clock frequency below that of the other two components. Power consumption is also relatively high, with the FPGA making the major contribution. The three ICs and their interconnections occupy considerable board space. This implementation is ideal for system prototyping and volume ramp-up, but not for high-volume production.
This three-chip configuration can be transformed into a customizable microcontroller (Figure 2) with minimal re-writing of the MCU or DSP software. The industry-standard processor embedded in the customizable microcontroller is likely to be code compatible with the standard-product MCU. This limits the effort required for software transformation. The DSP is mapped onto the MP block of the customizable microcontroller using the HDL code of its architecture. The FPGA logic is mapped onto the MP block using the same FPGA tools as were used to develop it. Apart from an increase in clock speed, the logic in the MP block is functionally identical to that in the FPGA.
Figure 2: Customizable Microcontroller Architecture
The DSP and FPGA implementations in the MP block can be optimized by exploiting the multiple embedded RAM and Dual Port RAM (DPRAM) blocks that are distributed within the MP block. Also, the DMA Controller and the parallel ports that link the MP block to the high-speed multi-layer bus matrix can be exploited to transfer data between the MP block, internal and external memories, peripherals and interfaces without processor intervention. This represents a major saving of clock cycles for any application that requires simultaneous data transfer and processing.
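The clock-cycle saving from DMA can be illustrated with a deliberately idealized model. When the processor moves each data word itself, the copy and processing costs add up; when a DMA controller moves the data in parallel, the slower of the two activities sets the pace. The function names, the per-word cycle counts and the assumption of perfect overlap with no bus contention are all hypothetical simplifications.

```python
def cycles_cpu_copy(words, copy_cycles_per_word, process_cycles_per_word):
    """Processor copies every word itself, then processes it: costs add up."""
    return words * (copy_cycles_per_word + process_cycles_per_word)


def cycles_with_dma(words, dma_cycles_per_word, process_cycles_per_word,
                    setup_cycles):
    """DMA moves data while the processor works.

    Assumes perfect overlap and no bus contention, so the slower of the
    two activities dominates, plus a one-off cost to program the DMA.
    """
    per_word = max(dma_cycles_per_word, process_cycles_per_word)
    return setup_cycles + words * per_word


# Hypothetical workload: 1024 words, 4 cycles to copy a word by software,
# 10 cycles to process it, 1 cycle per word for DMA, 50 cycles of DMA setup.
WORDS = 1024
without_dma = cycles_cpu_copy(WORDS, copy_cycles_per_word=4,
                              process_cycles_per_word=10)
with_dma = cycles_with_dma(WORDS, dma_cycles_per_word=1,
                           process_cycles_per_word=10, setup_cycles=50)

print("cycles without DMA:", without_dma)
print("cycles with DMA:   ", with_dma)
```

A real system would fall somewhere between the two figures, depending on how often the processor and the DMA controller compete for the same bus or memory.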
In addition to the savings of board space and the reduction in bill-of-materials for the application, the mapping onto a customizable microcontroller scores favorably in terms of all of the four factors cited at the start of this article:
- The transistor count is significantly reduced, notably by the replacement of the logic in the FPGA by that in the MP block.
- The number of clock cycles to perform a given function is reduced, in particular if the DMA architecture of the customizable microcontroller is fully exploited.
- The time required for the customizable microcontroller implementation is kept to a minimum by the code re-use between the MCU-plus-FPGA-plus-DSP solution and the customizable microcontroller. Design time and risk are further reduced by an FPGA-based emulation platform that is frequently supplied as part of the design flow for a customizable microcontroller. See Figure 3 for an example.
- NRE costs are limited to the placement & routing of the MP block, and the metal masks that are required to personalize it.
Figure 3: FPGA-based Emulation Platform for Customizable Microcontroller
The implementation of the application on a customizable microcontroller is optimal for medium-to-high production volumes of the end-user product. However, if the product goes into extremely high volume, a further optimization becomes cost-effective, by re-mapping the customizable microcontroller onto a standard-cell ASIC, and eliminating all the unused peripherals, interfaces and memory blocks and any unused logic in the MP block.
The re-mapping is carried out using the HDL code for the customized microcontroller, with minimal modifications. Exhaustive simulations ensure equivalent functionality of the two versions before committing the ASIC to placement & routing and mask fabrication. Mask costs are not negligible, but are recovered through the unit-cost savings in high-volume production.
A customizable microcontroller represents a cost-, performance- and power-effective tradeoff that exploits the advantages of MCU, DSP and FPGA technologies, and can be a transitional solution for medium- to high-volume fabrication, leading to a standard-cell ASIC for very high volume manufacturing.
The CAP family of customizable microcontrollers from Atmel Corporation implements the architecture and features described in this article.
Peter Bishop is with Atmel.
This story appeared in the February 2009 print edition of EE Times Europe.