Common memory solutions in MCU designs
Among the memory solutions in use today, embedded flash is found in most existing MCU offerings. It provides the performance and capacity that an IoT device's MCU demands. However, incorporating flash into a standard logic CMOS process adds a 30% to 40% cost premium to the die because of the additional process and manufacturing steps. In addition, the extra thermal cycles needed during manufacture degrade logic performance. Finally, few foundries offer an embedded flash process, the process does not scale with standard logic processes, and embedded flash is not currently available below 65 nm.
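To put the cost premium in concrete terms, the short sketch below applies the 30% to 40% range to a die cost. The function name and the $0.50 base cost are hypothetical values chosen for illustration, not figures from any foundry.

```python
def embedded_flash_die_cost_range(base_die_cost, premium_low=0.30, premium_high=0.40):
    """Return the (low, high) die cost after adding embedded flash.

    base_die_cost is the cost of the same die in a plain logic CMOS process.
    The default premiums are the 30% and 40% figures quoted in the text.
    """
    if base_die_cost < 0:
        raise ValueError("base_die_cost must be non-negative")
    return (base_die_cost * (1.0 + premium_low),
            base_die_cost * (1.0 + premium_high))


# Hypothetical example: a logic-only die that costs $0.50.
low, high = embedded_flash_die_cost_range(0.50)
print(f"With embedded flash: ${low:.2f} to ${high:.2f}")
```

For the assumed $0.50 die, the premium works out to roughly 15 to 20 cents per unit.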
A second method for storing microcontroller program code is to use an external serial EEPROM and load its contents into an on-chip shadow SRAM at power-on or when restarting from sleep mode. This alternative also provides the performance and capacity that an MCU for an IoT device demands. There are drawbacks, however: the static power drain of the on-chip shadow SRAM; the form factor, added pins, and supply chain and bill of materials (BOM) cost of the external serial EEPROM; and slow start-up. In addition, this solution is the least secure, since the MCU program content is easily accessible via the pins between the serial EEPROM and the MCU.
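The start-up penalty of the shadow-SRAM approach can be estimated from the image size and the serial link speed. The sketch below is a rough model only: the function name, the 256 KiB image, and the 20 MHz clock are assumptions chosen for illustration, and command, address, and wake-up overhead are ignored.

```python
def shadow_load_time_ms(image_bytes, serial_clock_hz, data_lines=1):
    """Estimate the time to copy a code image from serial EEPROM to shadow SRAM.

    Assumes one bit is transferred per data line per clock cycle and
    ignores command/address bytes and any EEPROM wake-up delay.
    """
    if image_bytes < 0 or serial_clock_hz <= 0 or data_lines < 1:
        raise ValueError("invalid image size, clock, or data line count")
    total_bits = image_bytes * 8
    seconds = total_bits / (serial_clock_hz * data_lines)
    return seconds * 1000.0


# Hypothetical example: 256 KiB of program code over a 20 MHz serial link.
image = 256 * 1024
print(f"1 data line:  {shadow_load_time_ms(image, 20e6):.1f} ms")     # about 104.9 ms
print(f"4 data lines: {shadow_load_time_ms(image, 20e6, 4):.1f} ms")  # about 26.2 ms
```

Under these assumptions, every power-on or wake from sleep costs on the order of 100 ms on a single data line before the first instruction can execute from SRAM.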
The third memory solution commonly found in MCU designs is ROM. This solution meets all of the requirements of an IoT device: performance, power, and capacity. The greatest disadvantage of ROM is its inflexibility. The memory contents are defined before the chip is fabricated, so fixing a bug or changing the program requires a new mask set and a full manufacturing cycle. Meanwhile, revenue lost while waiting for software implementation, silicon manufacturing, and chip and product qualification will never be recovered. Maintaining many ROM versions of the same base design is costly and presents operational challenges such as supply forecasting, inventory management, and the risk of not having the right product mix at the right time.
Vertical crosspoint memory
Developed nearly a decade ago, antifuse memory stores data by producing gate oxide breakdown (BVox) in a transistor, thus converting an open circuit into an irreversible low-resistance path. A typical antifuse memory bit cell consists of 1.5T/2T (transistor) and 1TH (horizontal). The drawback is that the technology requires more chip-level real estate for applications that need large storage capacity, such as program storage in a microcontroller. These applications currently use costly flash even when the memory contents seldom, if ever, change.
To address the problem, we redesigned the existing antifuse memory bit cell to greatly boost bit density, improve memory access speed, and further reduce power consumption. The redesigned bit cell, called vertical crosspoint memory (VCM), reduces the area needed to store a bit of data by a factor of four compared with the footprint of an embedded flash bit cell (see figure 1).
Figure 1: Bit cell comparison.
The small size of the VCM bit cell is enabled by a new transistor construction in silicon. Constructing a conventional transistor requires a P+ gate atop an N-well, which is deeper than the shallow trench isolation (STI) that isolates adjacent transistors (see figure 2). Building an antifuse transistor using this technique requires a large bit cell to accommodate the deep N-well beneath the gate and its associated spacing rules. The new VCM bit cell incorporates an N-well that is shallower than the STI and uses the P substrate to provide the isolation between adjacent bit-cell transistors.
Figure 2: A conventional transistor (left) requires a P+ gate atop an N-well that is deeper than the shallow trench isolation (STI) that isolates adjacent transistors, while a VCM bit cell incorporates an N-well that is shallower than the STI (right).
The design enables the use of a smaller VCM bit cell, and because of the shallower N-well, the spacing between transistors can be reduced. The result is a high-density, high-efficiency memory design that delivers a useful degree of programmability with only a modest increase in cost and complexity.