Editor’s note: This work was first presented at the 2012 IEEE International Electron Devices Meeting (IEDM) and appears here courtesy of the IEEE. For more information about IEDM 2013 (Washington DC; December 9-11), click here.
We demonstrated lower power consumption in a mobile CPU by replacing high-performance (HP)-SRAMs with spin transfer torque (STT)-MRAMs using perpendicular (p)-MTJs. Two key points enable the low power consumption: adopting a run-time power-gating architecture (shown in figure 1), and achieving both fast and low-power writing, namely 3 ns and 0.09 pJ, in the p-MTJ cell (shown in figure 3). As shown in Table 1, only our p-MTJ has achieved 3 ns and 0.09 pJ. Thanks to this fast, low-power p-MTJ, the power consumption of cache memory could be reduced by over 80% without degrading performance.
Table 1: The comparison of write power consumption for the latest data.
Introduction
Spin transfer torque (STT)-MRAM is expected to be used in various memory applications [1-3]. One of the biggest challenges among these applications is replacing high-performance SRAM (HP-SRAM), such as that used for CPU cache memory, because its static leakage power has increased with CMOS scaling and accounts for a major portion of the power consumed in mobile CPUs and mobile SoCs. The “power gating” technique reduces the leakage power of HP-SRAM during long standby periods when the application is not running (see figure 1(1)). However, power gating cannot be used while the application is running, even though the SRAM frequently enters short standby states. By contrast, whereas SRAM has a leakage path, STT-MRAM with one transistor and one magnetic tunnel junction (1T-1MTJ) does not, and therefore STT-MRAM can eliminate the leakage power even while the application is running (see figure 1(2)). Note, however, that if the active energy of STT-MRAM, of which programming energy is the major component, is larger than the total energy of the SRAM standby leakage, the energy consumed by STT-MRAM is larger than that of SRAM (see figure 2(1)), and there is no benefit from employing STT-MRAM instead of SRAM in a CPU.
Figure 1 (1): The power gating technique reduces the leakage power of HP-SRAM during long standby periods when an application is not running. However, power gating cannot be used while the application is running, even though the SRAM frequently enters short standby states. (2): Whereas SRAM has a leakage path, STT-MRAM with one transistor and one magnetic tunnel junction (1T-1MTJ) does not, and therefore STT-MRAM can essentially eliminate the leakage power even while the application is running.
Figure 2 (1): In the case of a high-PE (programming energy) MTJ, if the programming energy of MRAM is larger than the total energy of the standby leakage power when the standby time is short, the total energy consumed by MRAM is larger than that of SRAM and, as a result, there is no benefit from employing MRAM instead of SRAM. In the case of a low-PE MTJ, total energy can be reduced even when the standby time is relatively short. As the PE is decreased, total energy consumption can be further reduced. (2): Relationship between total power and standby time for SRAM-based and STT-MRAM-based memories in a mobile CPU. The standby time is shortest for the register file, followed by the L1 cache and then the L2 cache, and is longest for the L3 cache. Based on our estimation, MTJs of previous works would not enable power reduction if they were to replace SRAM-based memories. Because the MTJ of this work has a much lower PE, it enables power reduction for L2- and L3-cache memories.
Therefore, achieving an active energy, or programming energy (PE), for STT-MRAM that is smaller than the standby energy of SRAM is a key factor in reducing the total power of memory in a CPU or SoC. The PE is expressed as follows:
PE = Write Voltage × Write Current × Write Time
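As a quick check of the PE expression, the following minimal Python sketch multiplies the three factors. The 3 ns write time and 50 µA write current are the values reported in this article; the 0.6 V write voltage is not stated in this section and is only an assumption, back-calculated so that the product matches the reported 0.09 pJ. The function name is hypothetical.

```python
def programming_energy_joules(write_voltage_v, write_current_a, write_time_s):
    """PE = write voltage x write current x write time, in joules."""
    return write_voltage_v * write_current_a * write_time_s


# Write current (50 uA) and write time (3 ns) are the article's reported values.
# ASSUMPTION: 0.6 V is back-calculated from the reported 0.09 pJ, not quoted.
pe_j = programming_energy_joules(0.6, 50e-6, 3e-9)
pe_pj = pe_j * 1e12  # convert joules to picojoules
print(f"PE = {pe_pj:.2f} pJ")
```

Because the three factors multiply, halving either the write time or the write current halves the PE, which is why fast and low-current writing must be achieved together.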
The energy reduction Ered obtained by replacing SRAM with STT-MRAM can be approximately expressed as:
Ered = Leakage Current (SRAM) × Short Standby Time (average) × Voltage (SRAM) + PE (SRAM) − PE (STT-MRAM)
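The Ered expression can be explored numerically. The sketch below, in Python, evaluates it and also solves for the break-even standby time at which Ered becomes zero. Apart from the 0.09 pJ STT-MRAM programming energy reported in this article, every number (SRAM leakage current, supply voltage, SRAM write energy, standby time) is a hypothetical placeholder chosen only for illustration, and the function names are likewise assumptions.

```python
def energy_reduction_joules(leak_current_a, standby_time_s, sram_voltage_v,
                            pe_sram_j, pe_mram_j):
    """Ered = I_leak(SRAM) x t_standby x V(SRAM) + PE(SRAM) - PE(STT-MRAM)."""
    return leak_current_a * standby_time_s * sram_voltage_v + pe_sram_j - pe_mram_j


def break_even_standby_s(leak_current_a, sram_voltage_v, pe_sram_j, pe_mram_j):
    """Standby time at which Ered = 0; longer standby favors STT-MRAM."""
    t = (pe_mram_j - pe_sram_j) / (leak_current_a * sram_voltage_v)
    return max(0.0, t)  # if MRAM writes already cost less, it always wins


# HYPOTHETICAL SRAM figures, for illustration only (not from the article).
leak_a = 10e-6        # SRAM leakage current attributed to the stored data
vdd_v = 1.0           # SRAM supply voltage
pe_sram_j = 0.01e-12  # SRAM write energy
pe_mram_j = 0.09e-12  # STT-MRAM programming energy reported in this article

t_be = break_even_standby_s(leak_a, vdd_v, pe_sram_j, pe_mram_j)
e_red = energy_reduction_joules(leak_a, 100e-9, vdd_v, pe_sram_j, pe_mram_j)
print(f"break-even standby time: {t_be * 1e9:.1f} ns")
print(f"Ered at 100 ns standby:  {e_red * 1e12:.2f} pJ")
```

Under these placeholder values, a lower PE moves the break-even point toward shorter standby times, which matches the qualitative trend described for figure 2.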
To enable energy reduction through this replacement, the MTJ write operation must be both fast and low-power at the same time, which greatly reduces the PE. However, although there are some reports [4-8] on either fast write speed or low write power in MTJs, no MTJ combining both has been presented. By our approximate estimation, these MTJs would not enable energy reduction in the memories of mobile CPUs if they replaced SRAM, and therefore the PE has to be decreased much further (see figure 2(2)). This is the first paper to report an MTJ with a PE of 0.09 pJ, the smallest ever reported, achieved by realizing both a short write time of 3 ns and a low write current of 50 µA. This paper also describes power and performance evaluations from the viewpoint of realistic mobile CPU applications.