Hardware design challenges (memory interfaces)
Advanced ECC algorithms, like Reed-Solomon or BCH, are computationally very expensive. Many solutions offer hardware (HW) support for ECC. However, these fixed solutions lag behind the growing ECC requirements. MLC NANDs may now require more than 16 bits of ECC per 512 bytes, yet HW support designed just a few years ago may not support 16-bit correction. In that case, the HW ECC support would become useless, and either the NAND couldn't be used or the ECC computation would need to be done in software (SW), diverting CPU cycles from other important tasks to ECC calculation.
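To see why software ECC is costly, consider even a simplified 1-bit Hamming-style parity code over a 256-byte sector (a hypothetical helper sketched below, not a production BCH or Reed-Solomon implementation): every byte must be touched, so the CPU cost grows linearly with data size, and multi-bit BCH correction is far more expensive still.

```python
def soft_ecc_256(data: bytes) -> int:
    """Simplified software ECC sketch: pack column parity and line
    parity for a 256-byte sector into one integer. Illustrative only;
    real NAND ECC schemes generate more parity bits than this."""
    assert len(data) == 256
    col_parity = 0    # XOR of all bytes: parity of each bit column
    line_parity = 0   # XOR of the addresses of odd-parity bytes
    for addr, byte in enumerate(data):
        col_parity ^= byte
        if bin(byte).count("1") & 1:   # this byte has odd parity
            line_parity ^= addr
    return (col_parity << 8) | line_parity
```

Even this toy code runs a per-byte loop over the whole sector; a real multi-bit BCH encoder performs Galois-field arithmetic per byte, which is exactly the work that dedicated ECC hardware offloads.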
Firmware and ROM Bootloaders
The dynamic raw NAND market (raw NANDs have relatively short lifecycles of about two years) and the initial lack of standardization have resulted in heterogeneous interfaces from different NAND manufacturers. Not only do these approaches vary between manufacturers, but, at times, they also differ between generations of NANDs from the same manufacturer! This poses significant challenges for firmware and middleware that might not be updated very often (if at all).
The main concern when designing ROM bootloaders for raw NAND booting is whether future NAND devices will work with them. The ONFI standard alleviates this somewhat, since it guarantees device identification commands that should not have to change in the future.
Another major concern related to the hardware design issue is what level of ECC is sufficient. Since the NAND parts that will be connected to the system cannot be known a priori, the safest solution is to leverage the maximum ECC possible with the memory interface or controller. Using more ECC than required for booting simply improves robustness, with the possible downsides being increased boot time and more complex factory programming procedures.
Since NAND manufacturers don’t guarantee that all blocks of the memory are good (nor will all blocks remain good over the device’s lifetime), another issue is how to handle bad blocks with unrecoverable errors if encountered during booting. Some strategies include placing multiple copies of the boot image and letting the boot loader locate and load the first good one or having the boot loader respect a bad block table stored somewhere else in the NAND. Another useful strategy is to have the system run-time software periodically check and correct any issues with the boot block.
Middleware / OS software issues
The middleware, or run-time software, suffers from similar issues to those faced by the ROM boot loader. Although it might be easier to adapt the middleware to handle newer devices, newer detection schemes and newer command sets offered by more recent devices, there is overhead every time a change has to be propagated through different support structures, from middleware teams to customers. For example, the memory technology device (MTD) layer of the Linux® OS kernel had issues when device sizes reached 4GB, since the size had originally been defined as a 32-bit value. In another case, there was no support for NAND devices with page sizes larger than 2KB, while modern NAND devices have 4KB or 8KB page sizes. Fixing these issues isn't necessarily trivial.
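The 4GB limit is simple arithmetic: a 4GB device size is exactly 2^32 bytes, so it wraps to zero in an unsigned 32-bit field. The helper below is a hypothetical model of such a field, not the actual MTD code:

```python
def device_size_32bit(size_bytes: int) -> int:
    """Model storing a device size in an unsigned 32-bit field,
    as the old MTD layer did. Sizes of 4GB and above wrap or
    truncate, so a 4GB part appears to have size zero."""
    return size_bytes & 0xFFFFFFFF

FOUR_GB = 4 * 1024**3   # 2**32 bytes
```

A 4GB - 1 device still fits (0xFFFFFFFF), which is why the bug only surfaced once parts actually reached the 4GB boundary.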
In addition, the run-time middleware must deal with activities such as wear-leveling and bad block management. Wear-leveling is a software mechanism to spread the write/erase cycles around the chip so that all blocks wear evenly; failure to do so will result in oft-used blocks failing very early. This is more important now than ever, as the cells of MLC devices have much lower endurance ratings (3,000-5,000 cycles, compared to 10,000-100,000 for SLC NANDs). The middleware must also track which blocks are bad and make sure they aren't used for any further reads and writes. The more stringent requirements of recent and future NAND devices may require even more complicated schemes to be enacted to manage wear-leveling and bad blocks.
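A minimal sketch of these two bookkeeping tasks, assuming the middleware tracks a per-block erase counter and a bad-block set (the data structures and function name here are illustrative, not any particular flash translation layer):

```python
def pick_block(erase_counts, bad_blocks, free_blocks):
    """Dynamic wear-leveling sketch: when a new block is needed,
    choose the free block with the fewest erase cycles, never
    handing out a block already marked bad."""
    candidates = [b for b in free_blocks if b not in bad_blocks]
    if not candidates:
        raise RuntimeError("no usable blocks left")
    return min(candidates, key=lambda b: erase_counts[b])
```

Real flash translation layers also move long-lived ("static") data out of low-wear blocks so those blocks rejoin the rotation, but the core idea is the same: steer erases toward the least-worn good blocks.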
Supporting different custom ECC hardware is another challenge as one ports the middleware from one generation of processor to the next and improves the ECC capabilities in new drivers. Additionally, there is no good solution if the ECC HW cannot meet the ECC requirement of the NAND, as software ECC has proven to be too slow and cumbersome for most embedded processors.
Potential solutions to NAND challenges
In our opinion, the main issue with using raw NAND devices in the embedded processor space is the skyrocketing ECC requirements. In five years, ECC requirements have gone from 1 or 4 bits per 512 bytes to more than 24 bits. Memory interface hardware designed six years ago is very likely incompatible with any new chips available on the market today. Though the issue of device identification and parameterization has been a problem, it isn't considered as critical as the fundamental incompatibility resulting from insufficient ECC hardware in the memory controller/interface of embedded processors. Given the lifetime of products based on such devices (10-15 years), the lack of NAND supply that fits the original design requirements could force major rework in both the hardware and the software partway through the product life cycle.
Fortunately, the memory manufacturers have realized the issues with rapidly increasing ECC requirements and have taken steps to address it. The solution is managed NAND.
Managed NANDs perform some or all of the three NAND management tasks (i.e. ECC, wear-leveling and bad block management) on the memory device instead of in a host controller, so that they are no longer a concern for the system developer. Perhaps the most compelling of these is the embedded multimedia card (e•MMC). e•MMC is a Joint Electron Devices Engineering Council (JEDEC) standard that combines electrical and physical chip specifications with the interface commands and protocols of the MMC 4.3 standard into a single entity.
Figure 3: e•MMC block diagram (courtesy Samsung)
There are also "partially-managed" NAND devices that maintain the NAND interface but move the ECC into the memory device. The solution, which Micron has branded as ClearNAND™, may be attractive to those looking to replace the NANDs in current designs with minimal changes to the system SW or HW. Toshiba has recently released a nearly identical solution, dubbed SmartNAND™. It seems certain that other NAND vendors will soon follow the same path.
Figure 4: Standard RAW NAND versus ClearNAND (courtesy Micron)
There are additional costs to the managed NAND solutions, but having seen the problems of a rapidly maturing NAND market as both a silicon producer and a silicon consumer, we believe some form of managed NAND is the only sensible choice for future design and development. Embedded processor vendors, such as Texas Instruments, will continue to support the existing ECC hardware in their memory interfaces, but there is little reason for them to spend design time and resources trying to keep pace with the skyrocketing ECC requirements. At this point in the evolution of NAND, it's best to leave the implementation of advanced ECC solutions to the memory vendors as they push their devices to the physical limits of CMOS technology.
About the Authors
Daniel Allred is a senior applications engineer for Texas Instruments. He works in the C6000™ DSP software architecture team, developing ways to make TI's DSP technology more accessible. He has been with TI for four years, working with the DaVinci™ digital media processor and catalog OMAP™ embedded processor products. He holds a Bachelor of Science degree from the University of Florida and a Master of Science degree from Georgia Tech, where he studied microphone array signal processing and novel computational methods for signal processing in hardware.
Gaurav Agarwal is an applications engineer for Texas Instruments. He works in the C6000™ DSP application team. Gaurav has been with TI for five years, working with the VoIP, DaVinci™ digital media processor and catalog OMAP™ embedded processor products. Since last year, he has been leading efforts to design generic and bug-free boot loaders. Before joining TI, he worked at Motorola, where he published several research papers in the field of video-transcoding systems. He holds a Bachelor of Technology degree from IIT Kanpur, India, and a Master of Science degree from the University of Maryland, College Park, where he developed a novel machine vision system for leaf recognition.
I'm not privy to their financials, but these guys are in business to make money, so I imagine it must be cost-effective :).
I would guess the additional complexity and area required for this logic is currently being offset by the die shrinks (i.e. revenue per wafer is still higher). I question whether that can remain true for very long, though. The die shrinks cause more errors, forcing more robust ECC techniques, then the implementation of those techniques requires more die area, power, etc. Seems like a bit of a downward spiral to me.
"Is there any model available to simulate ECC in our own flash controller? Thanks in advance."
This is something that should probably be investigated with the memory vendors (Micron, Samsung, etc.). I'm quite sure they have this, but I don't know if they would share it.
Very interesting and informative article.
1. In an "Errors and error correction" section there might be a typing mistake which is "the ECC requirements are 4 bits per 512 bytes". It should be "4 bytes per 512 bytes".
2. Is there any model available to simulate ECC in our own flash controller? Thanks in advance.
Very interesting article. In general it is very difficult to implement logic on a memory process, this explains the demise of "embedded memory". Is it really cost-effective to implement ECC on the die of a NAND chip?
Daniel and Gaurav presented this paper at ESC Silicon Valley this spring, and I asked them to put together a version for us on the designline. I think they did an outstanding job laying out the issues facing NVM technologies. Please add your comments or questions for the authors below.