In a saturated handset market, OEMs are pushed by the carrier's desire to increase average revenue per user (ARPU) and by their own need to drive users to upgrade their handsets. As a result, all handset manufacturers carry new data-centric devices offering rich multimedia and productivity features aimed at savvy business customers and the trendy teenager segment. Many of these devices have complex operating systems (OS), large suites of applications and a large amount of memory for user data. More and more, this memory is NAND flash as opposed to NOR flash. But among the NAND solutions, there are several implementation alternatives offering different merits and challenges.
This is a two-part article that discusses these alternatives. The first part focuses on the differences between NOR and NAND flash, why NAND flash is becoming the memory of choice among mobile handset manufacturers requiring high-capacity memory, inherent NAND limitations, and advanced technologies that are being implemented to overcome these limitations and even add value to NAND. The second part of this article will address the most popular NAND design options that are available in the market, and their merits and challenges.
Why NAND Wins the Port
Two main technologies dominate the non-volatile flash memory market today: NOR and NAND (Figure 1). Until recently, hardware engineers were unfamiliar with the differences between these two technologies. But the trend to store an increasing amount of data on mobile handsets, in less space and at lower costs, has brought these differences into focus.
Figure 1: Diagram illustrating the NOR and NAND flash architectures.
While NOR offers eXecute In Place (XIP) capabilities and high read performance, it is expensive per megabyte (Mbyte) and is therefore cost-effective mainly at low capacities such as 1 to 4 Mbyte. Another disadvantage of NOR is its extremely slow write and erase performance.
The NAND architecture, on the other hand, offers high cell densities and high capacity, combined with fast write and erase rates. But because it is accessed in 512-byte units (called pages) rather than at random addresses, it cannot be used for XIP. NAND devices also naturally contain some bad blocks, and they are prone to lower reliability due to random errors generated by physical effects in the geometry of the NAND gates.
There are many other differences between these two technologies that will not be covered in this article, but the technical characteristics strongly differentiate the types of applications using them. Table 1 summarizes the differences between NOR, NAND, and a NAND-based DiskOnChip device.
Table 1: Major Differences Between NOR and NAND
NOR is typically used for executing small amounts of code, mainly in capacities up to 4 Mbyte, and is common in applications such as simple consumer appliances, low-end cell phones, and embedded applications. Raw NAND is mainly used for data storage in applications such as memory cards and MP3 players.
Since NAND flash does not have a standard memory interface and because it requires extensive handling to prevent errors in the stored data, it requires an additional controller (as in the case of removable memory cards).
From 16 Mbyte and upwards, NOR flash loses its cost-effectiveness. Furthermore, its sluggish write speed and very slow erase time do not provide suitable performance for a data storage device.
The high densities of NAND flash and its better cost per Mbyte make it the best media for data storage. However, NAND technology has some limitations that must be dealt with to ensure the reliability of the stored data and of the overall system and handset. Let's look deeper at these reliability issues.
NAND Reliability Issues
One of the main considerations when working with flash media is reliability. It is affected by three major factors: bit flipping, bad block handling, and life span (the number of erase cycles allowed). Let's see how these apply to NAND.
1. Bit Flipping
All flash architectures today suffer from a phenomenon known as "bit-flipping." On some occasions (usually rare, yet more common in NAND than in NOR), a bit is either reversed, or is reported reversed. This is the result of the following effects:
- Drifting Effects: A phenomenon that slowly changes a cell's voltage level from its initial value.
- Program-Disturb Errors: This is sometimes referred to as "over-program" effects. A programming operation on one page induces the flip of a bit on another, unrelated page.
- Read-Disturb Errors: This effect causes a page read operation to induce a permanent change of a bit value in one of the bits read.
A flip in one bit may seem insignificant. However, this "minor" glitch may hang your system completely if it corrupts a critical file. When the bit is only reported incorrectly, repeating the read operation may solve the problem. But if the bit was actually reversed, error detection/correction code (EDC/ECC) must be applied.
Since bit flipping is more common in NAND devices, all NAND vendors recommend using an EDC/ECC algorithm. When using NAND for multimedia information, this problem is not critical, but when using it as a local storage device to store the system OS, configuration files and other sensitive information, an EDC/ECC system must be implemented.
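To make the idea of correcting a flipped bit concrete, here is a minimal sketch of a Hamming(7,4) code, which protects four data bits with three parity bits and can repair any single-bit error. It is an illustration only: the article does not say which EDC/ECC algorithm a given NAND vendor recommends, production schemes cover far larger units than four bits, and the function names are hypothetical.

```python
def hamming74_encode(nibble):
    """Encode 4 data bits (an int 0..15) as a 7-bit codeword.

    The codeword is a list of bits in positions 1..7:
    p1, p2, d0, p4, d1, d2, d3 (parity bits sit at positions 1, 2 and 4).
    """
    d = [(nibble >> i) & 1 for i in range(4)]
    p1 = d[0] ^ d[1] ^ d[3]  # covers positions 1, 3, 5, 7
    p2 = d[0] ^ d[2] ^ d[3]  # covers positions 2, 3, 6, 7
    p4 = d[1] ^ d[2] ^ d[3]  # covers positions 4, 5, 6, 7
    return [p1, p2, d[0], p4, d[1], d[2], d[3]]


def hamming74_decode(codeword):
    """Return (nibble, error_position); error_position is 0 if no bit flipped."""
    c = list(codeword)
    s1 = c[0] ^ c[2] ^ c[4] ^ c[6]
    s2 = c[1] ^ c[2] ^ c[5] ^ c[6]
    s4 = c[3] ^ c[4] ^ c[5] ^ c[6]
    # The syndrome is the 1-based position of the flipped bit (0 means clean).
    syndrome = s1 | (s2 << 1) | (s4 << 2)
    if syndrome:
        c[syndrome - 1] ^= 1  # flip the faulty bit back
    nibble = c[2] | (c[4] << 1) | (c[5] << 2) | (c[6] << 3)
    return nibble, syndrome
```

Flipping any one of the seven stored bits and decoding returns the original four data bits together with the position that was repaired. A code of this kind cannot repair two flipped bits in the same codeword, which is one reason real designs choose the code strength to match the flash.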
2. Bad Block Handling
Due to yield considerations, NAND devices are shipped with bad blocks randomly scattered throughout them. Shipping NAND devices free of bad blocks comes with a very high price tag caused by the low production yield rate, and is therefore not a cost-effective option.
Working with NAND devices, especially for local storage, requires initially scanning the media for bad blocks, and then mapping them all out so they are never used. Failing to do so in a reliable manner may result in a high failure rate of the final device, and even a recall.
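The scan-and-map step can be sketched as follows. This is a simplified illustration, not a vendor procedure: it assumes each block carries a one-byte marker that reads 0xFF when the block is good (the actual marker location and value must come from the device datasheet), and read_marker, scan_bad_blocks and build_block_map are hypothetical names.

```python
def scan_bad_blocks(read_marker, num_blocks, good_marker=0xFF):
    """Return the physical block numbers whose marker is not good_marker.

    read_marker is a caller-supplied function (hypothetical here) that
    returns the marker byte of a given physical block.
    """
    return [b for b in range(num_blocks) if read_marker(b) != good_marker]


def build_block_map(num_blocks, bad_blocks):
    """Map logical block i to the i-th good physical block.

    Bad blocks never appear in the returned list, so they are never used.
    """
    bad = set(bad_blocks)
    return [p for p in range(num_blocks) if p not in bad]
```

For a six-block device whose markers read 0xFF, 0xFF, 0x00, 0xFF, 0x00, 0xFF, the scan reports blocks 2 and 4 as bad, and the map exposes four logical blocks backed by physical blocks 0, 1, 3 and 5. The usable capacity shrinks by the number of bad blocks, which is why the scan has to be reliable.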
3. Life Span/Endurance
Flash permits you to write, erase, and save information on it for at least ten years. However, like any good thing, if you use it too much, it will eventually wear out.
Each flash block can be erased some 100,000 times before you can no longer be sure if what you write is stored properly. Think of it, if you will, as a piece of paper on which you write using a pencil, then erase, then write, then erase...Eventually, you will dig a hole in the page.
Since the block size of a NAND device is usually about eight times smaller than that of a NOR device, each NOR block will be erased relatively more times over a given period of time (especially significant when working with small files) than each NAND block. This works in favor of NAND.
Endurance and reliability are closely linked. When reaching the maximum allowed erase cycles, the reliability of the flash deteriorates dramatically. Therefore, maximizing the endurance of the flash improves not only its life span but also its reliability throughout that life.
To overcome the erase cycle limitation, a simple solution is implemented: the same file is never written twice to the same place. Instead, the file is moved around the flash media. The file's location is managed by a table that translates the virtual file/sector address used by the file system to its current physical address on the flash. In this way, the flash lifetime is prolonged without changing its physical characteristics.
To illustrate this mechanism, often called a flash translation layer (FTL), assume that the file allocation table (FAT) must be updated about 100,000 times a month, and that 1,000 free blocks are available on the media. If the FAT is always erased and rewritten in the same block, that block will be worn out after one month. If the FAT is instead moved around the 1,000 free blocks, each capable of 100,000 erase cycles, the device life span can be extended to a whopping 1,000 months (83 years and 4 months).
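The arithmetic in this example, and the effect of moving a file around, can be checked with a short sketch. The round-robin placement below is an assumption chosen for simplicity; the article does not describe how a real FTL picks the next block, and the class and function names are hypothetical.

```python
ERASE_LIMIT = 100_000        # erase cycles per block, from the example above
UPDATES_PER_MONTH = 100_000  # FAT updates per month, from the example above


def lifetime_months(blocks, erase_limit=ERASE_LIMIT,
                    updates_per_month=UPDATES_PER_MONTH):
    """Months until every block used for the FAT reaches its erase limit."""
    return blocks * erase_limit / updates_per_month


class RoundRobinMapper:
    """Toy translation table: each rewrite lands on the next block in turn."""

    def __init__(self, num_blocks):
        self.erase_counts = [0] * num_blocks
        self.location = {}  # logical name -> current physical block
        self._next = 0

    def write(self, name):
        block = self._next
        self._next = (self._next + 1) % len(self.erase_counts)
        self.erase_counts[block] += 1  # the block is erased before reuse
        self.location[name] = block    # the table always tracks the latest copy
        return block
```

With one block, lifetime_months returns 1.0; with 1,000 blocks it returns 1000.0, and divmod(1000, 12) gives 83 years and 4 months. Writing the same name ten times through a four-block RoundRobinMapper leaves erase counts of 3, 3, 2 and 2, so no single block absorbs all the wear.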
Ease of Use
Using a NOR-based flash is a straightforward process. It is connected like any other memory device, and can run code directly. Using NAND, on the other hand, is a tricky issue. NAND flash has an I/O interface and uses a protocol that includes commands (read/write/erase), address and data. NAND also does not permit access to a random memory address.
Instead, NAND flash devices must be accessed by pages of 512 bytes, one at a time. This means that a software driver is required to perform any type of access to NAND flash. This simple, low-level software module, which performs read, write and erase operations, is often referred to as the memory technology driver (MTD). It also means that although raw NAND flash is very convenient as a disk replacement (mechanical hard drives are also accessed in blocks of 512 bytes, and are referred to as block devices), it cannot be used to run code.
Writing information to NAND is also tricky, since it is essential to verify that data is not being written to a bad block. This means that virtual mapping must be implemented on NAND devices at all times.
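A small sketch shows what page-only access means for software that wants an arbitrary byte range. The read_page callback stands in for a driver's page-read routine and is hypothetical, as are the function names; only the 512-byte page size comes from the text.

```python
PAGE_SIZE = 512  # bytes per page, as described above


def pages_for_range(offset, length, page_size=PAGE_SIZE):
    """Return the range of page numbers that cover bytes [offset, offset+length)."""
    if length <= 0:
        return range(0)
    first = offset // page_size
    last = (offset + length - 1) // page_size
    return range(first, last + 1)


def read_bytes(read_page, offset, length, page_size=PAGE_SIZE):
    """Read an arbitrary byte range by fetching whole pages and trimming.

    read_page(n) must return the full contents of page n as bytes.
    """
    pages = pages_for_range(offset, length, page_size)
    if not pages:
        return b""
    data = b"".join(read_page(p) for p in pages)
    start = offset - pages[0] * page_size
    return data[start:start + length]
```

Reading just four bytes starting at offset 510 touches pages 0 and 1, so two full page reads are needed to deliver four bytes. This overhead is part of why a processor cannot simply fetch instructions from raw NAND the way it does from NOR.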
A distinction must be made between two levels of software support: basic read/write/erase operations, and high-level software for disk emulation and flash management algorithms (including wear-leveling, performance optimizations, etc.).
Running code from NOR devices requires no special software support. Running code from NAND requires an MTD. Both NAND and NOR require MTDs for write and erase operations. While MTDs are basically all that are required for NOR write/erase, a NAND driver must also have bit error and bad block management code.
Higher-level software is available for NOR devices from many vendors. NAND devices, on the other hand, lack noticeable software support. However, their high capacity, low cost and fast performance make them an ideal candidate for data storage in general, and hard drive emulation (block management) specifically.
A typical flash management software stack is shown in Figure 2.
Figure 2: Three major blocks required for flash management software.
In Figure 2, the top block (the driver layer) provides block device services to the OS file system. This code must be ported for each new supported OS.
The middle block, called the flash translation layer (FTL) or simply TL, can be referred to as the "brains" of the package, and contains the block device emulation algorithms and the flash management algorithms. It is responsible for translating the file system's sectors into physical flash blocks. While translating the addresses, it transparently handles the wear-leveling process, the bad-block mapping and all related issues (such as folding or garbage collection).
The lowest block is the MTD layer. It provides the basic read/write/erase interface to a NAND-based chip. This unit is controlled by the flash management layer.
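The three layers can be sketched as three small classes stacked on top of each other. This is a toy model under stated assumptions, not any vendor's implementation: the flash is simulated in memory, one logical sector occupies one whole block, folding and garbage collection are omitted, and every class and method name is hypothetical.

```python
class SimulatedMTD:
    """Lowest layer: basic read/write/erase on physical blocks."""

    def __init__(self, num_blocks, bad_blocks=()):
        self.blocks = [None] * num_blocks  # None means erased
        self.bad = set(bad_blocks)
        self.erase_counts = [0] * num_blocks

    def erase(self, block):
        self.blocks[block] = None
        self.erase_counts[block] += 1

    def write(self, block, data):
        if self.blocks[block] is not None:
            raise ValueError("block must be erased before it is written")
        self.blocks[block] = data

    def read(self, block):
        return self.blocks[block]


class TranslationLayer:
    """Middle layer: maps logical sectors to good physical blocks."""

    def __init__(self, mtd):
        self.mtd = mtd
        self.map = {}  # logical sector -> physical block
        self.free = [b for b in range(len(mtd.blocks)) if b not in mtd.bad]

    def write_sector(self, sector, data):
        if not self.free:
            raise RuntimeError("no free blocks left")
        new_block = self.free.pop(0)
        self.mtd.erase(new_block)
        self.mtd.write(new_block, data)
        old_block = self.map.get(sector)
        self.map[sector] = new_block
        if old_block is not None:
            # Reclaimed blocks go to the back of the queue to spread wear.
            self.free.append(old_block)

    def read_sector(self, sector):
        return self.mtd.read(self.map[sector])


class BlockDriver:
    """Top layer: the block-device interface an OS file system would call."""

    def __init__(self, translation_layer):
        self.tl = translation_layer

    def write(self, sector, data):
        self.tl.write_sector(sector, data)

    def read(self, sector):
        return self.tl.read_sector(sector)
```

On a four-block device with block 1 marked bad, writing sector 0 three times through BlockDriver places the data on blocks 0, 2 and 3 in turn. The file system only ever sees sector 0, the bad block is never erased or written, and each good block has been erased once.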
Booting from NAND
Since NAND is not a random access device, it has no XIP functionality. However, there is a strong incentive to enable boot from NAND to achieve greater cost savings by enabling the designer to eliminate expensive NOR flash from the platform.
To boot from NAND, the designer must utilize a very small XIP device. This device would contain system initialization code as well as some code used to shadow the main boot code and OS image from the NAND media into the host RAM and execute from there.
This is further complicated by two conflicting requirements: performance and reliability. Copying the boot code and the OS image from NAND to the system RAM is a time-consuming task that affects the boot time of the device. But higher performance cannot be provided at the expense of reliability, since an error in copying boot code from the NAND to the system RAM will very likely make the platform inoperative.
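The tension between boot time and reliability shows up even in a minimal shadowing loop. The sketch below copies an image page by page into a RAM buffer and only accepts it if a checksum matches, retrying the whole copy otherwise. The CRC-32 check, the retry count and the function names are assumptions made for illustration; the article does not specify how a boot loader should verify the copy.

```python
import zlib


def shadow_image(read_page, num_pages, expected_crc, max_attempts=3):
    """Copy num_pages pages into RAM and verify them before use.

    read_page(n) is a hypothetical page-read routine returning bytes.
    Every retry adds a full copy to the boot time, which is the
    performance cost paid for reliability.
    """
    for _ in range(max_attempts):
        ram = bytearray()
        for page in range(num_pages):
            ram += read_page(page)
        if zlib.crc32(ram) == expected_crc:
            return bytes(ram)  # verified image, safe to execute from RAM
    raise RuntimeError("boot image failed verification")
```

If a single bit is misreported during the first pass, the checksum fails and the image is read again, doubling the copy time for that boot. Skipping the check would be faster, but a corrupted image would then be executed as is.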
On to Part 2
That wraps up Part 1 of our series on NAND flash alternatives. In Part 2, we'll further the discussion by examining the most popular NAND design options that are available in the market along with their merits and challenges.
About the Authors
Arie Tal is a technical support director at M-Systems. Mr. Tal is responsible for recognizing emerging market trends and defining the next generations of DiskOnChip and TrueFFS software features. Arie received a BA in economics from The Technion, an MBA from Haifa University, and an Electronic Engineering degree from Netanya College. He can be reached at firstname.lastname@example.org.
Ziv Paz is product manager at M-Systems. Ziv received a B.Sc. with honors in Computer and Electrical Engineering from Ben Gurion University, Israel in 1994. He can be reached at email@example.com.