Editor's note: This work was first presented at the 2013 International Memory Workshop and appears here courtesy of the IEEE.
The memory system is a fundamental performance and energy bottleneck in almost all computing systems. Recent system design, application, and technology trends that require more capacity, bandwidth, efficiency, and predictability out of the memory system make it an even more important system bottleneck. At the same time, DRAM technology is experiencing difficult technology scaling challenges that make the maintenance and enhancement of its capacity, energy-efficiency, and reliability significantly more costly with conventional techniques.
In this paper, after describing the demands and challenges faced by the memory system, we examine some promising research and design directions to overcome challenges posed by memory scaling. Specifically, we survey three key solution directions: 1) enabling new DRAM architectures, functions, interfaces, and better integration of the DRAM and the rest of the system, 2) designing a memory system that employs emerging memory technologies and takes advantage of multiple different technologies, and 3) providing predictable performance and QoS to applications sharing the memory system. We also briefly describe our ongoing related work in combating scaling challenges of NAND flash memory.
Main memory is a critical component of all computing systems, whether they be server, embedded, desktop, mobile, or sensor systems. Memory capacity, energy, cost, performance, and management algorithms must scale as we scale the size of the computing system in order to maintain performance growth and enable new applications. Unfortunately, such scaling has become difficult because recent trends in systems, applications, and technology exacerbate the memory system bottleneck.
Trends and requirements
In particular, on the systems/architecture front, energy and power consumption have become key design limiters as the memory system continues to be responsible for a significant fraction of overall system energy/power. More and increasingly heterogeneous [14, 68, 28] processing cores and agents/clients are sharing the memory system, leading to increasing demand for memory capacity and bandwidth along with a relatively new demand for predictable performance and QoS from the memory system [50, 55, 67]. On the applications front, important applications are usually very data intensive and are becoming increasingly so, requiring both real-time and offline manipulation of great amounts of data. For example, next-generation genome sequencing technologies produce massive amounts of sequence data that overwhelm the memory capacity and bandwidth of today's high-end desktop and laptop systems [69, 3, 72], yet researchers have the goal of enabling low-cost personalized medicine.
Creation of new killer applications and usage models for computers likely depends on how well the memory system can support the efficient storage and manipulation of data in such data-intensive applications. In addition, there is an increasing trend towards consolidation of applications on a chip, which leads to the sharing of the memory system across many heterogeneous applications with diverse performance requirements, exacerbating the aforementioned need for predictable performance guarantees from the memory system. On the technology front, two key trends profoundly affect memory systems. First, there is increasing difficulty scaling the well-established charge-based memory technologies, such as DRAM [47, 4, 37, 1] and flash memory [34, 46, 9, 10, 11], to smaller technology nodes. Such scaling has enabled memory systems with reasonable capacity and efficiency; lack of it will make it difficult to achieve high capacity and efficiency at low cost. Second, some emerging resistive memory technologies, such as phase change memory (PCM) [64, 71, 37, 38, 63] or spin-transfer torque magnetic memory (STT-MRAM) [13, 35], appear more scalable, have latency and bandwidth characteristics much closer to those of DRAM than to those of flash memory and hard disks, and are non-volatile with little idle power consumption.
Such emerging technologies can enable new opportunities in system design, including, for example, the unification of memory and storage subsystems. They have the potential to be employed as part of main memory, alongside or in place of less scalable and leaky DRAM, but they also have various shortcomings depending on the technology (e.g., some have cell endurance problems, some have very high write latency/power, some have low density) that need to be overcome.