PCM Progress Report No. 7: A view of Samsung's 8-Gb array
Ron Neale
4/30/2012 2:05 PM EDT
After Samsung’s presentation [1] of its 8-Gb PRAM at ISSCC 2012, and while phase-change memory (PCM) watchers wait for the other shoe to drop in the form of a possible associated PCM product announcement, a number of features of the array merit discussion.
The architecture of Samsung’s 8-Gb array, from the top down to the individual PCM cells, runs as follows: eight partitions of 1 Gb, each divided into 128 sub-arrays (tiles), with each tile organized as a matrix of 4096 word lines (WLs) by 2048 bit lines (BLs). WL strapping contacts are situated at 64-cell intervals, which the authors assert provides a 19% area gain over the earlier 1-Gb work [2], in addition to the advantages introduced by moving from 58-nm to 20-nm technology. The result is a 9.43 x 6.30 mm² chip. The device operates at 1.8 V and uses a low-power, double-data-rate non-volatile memory (LPDDR2-NVM) interface. No quantitative data on chip power dissipation was provided.
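The hierarchy quoted above can be checked with a few lines of arithmetic. The sketch below assumes one bit per WL/BL crossing and takes 1 Gb as 2^30 bits; the variable names are illustrative and do not come from the paper.

```python
# Capacity check for the array hierarchy described in the text.
# Assumptions: one bit per word-line/bit-line crossing, 1 Gb = 2**30 bits.
PARTITIONS = 8
TILES_PER_PARTITION = 128
WORD_LINES_PER_TILE = 4096
BIT_LINES_PER_TILE = 2048
GIBIBIT = 2**30

bits_per_tile = WORD_LINES_PER_TILE * BIT_LINES_PER_TILE
bits_per_partition = bits_per_tile * TILES_PER_PARTITION
total_bits = bits_per_partition * PARTITIONS

# Die area from the quoted 9.43 mm x 6.30 mm dimensions.
die_area_mm2 = 9.43 * 6.30

print(f"bits per tile:      {bits_per_tile / 2**20:.0f} Mb")
print(f"bits per partition: {bits_per_partition / GIBIBIT:.0f} Gb")
print(f"total capacity:     {total_bits / GIBIBIT:.0f} Gb")
print(f"die area:           {die_area_mm2:.2f} mm^2")
```

Under these assumptions each tile holds 8 Mb, each partition 1 Gb, and the eight partitions together 8 Gb, which agrees with the figures in the text. The die area works out to roughly 59.4 mm².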
Write bandwidth
Although the headline write bandwidth claimed for the 8-Gb array was 40 MB/s, under certain conditions this could be increased. The array architecture employs parallelism to increase the bandwidth over Samsung’s earlier 1-Gb demonstration array [2]: the parallel write operation is widened to 128 bits from the 32 bits of the earlier array. There was an indication that, when the device is optimally implemented, the bandwidth could be increased to 133 MB/s. To understand in part how that is possible, it is necessary to explore the form of the write/erase (w/e) pulse. Pulse shaping of the leading and trailing edges, and the use of multi-pulse trains, have long been features of PCM development, usually in order to optimize a particular device characteristic such as w/e lifetime, on/off resistance ratios or values, or elevated-temperature data retention.
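As a rough sanity check on these figures, the sketch below estimates the time available per 128-bit write at each quoted bandwidth. It assumes the bandwidth is simply one parallel write per cycle, sustained, with 1 MB taken as 10^6 bytes; the paper may define write bandwidth differently, and the function name is illustrative only.

```python
def write_cycle_ns(parallel_bits, bandwidth_mb_per_s):
    """Time per parallel write, assuming bandwidth = bytes per write / cycle time."""
    bytes_per_write = parallel_bits / 8
    bytes_per_ns = bandwidth_mb_per_s * 1e6 / 1e9  # 1 MB taken as 10**6 bytes
    return bytes_per_write / bytes_per_ns

for bandwidth in (40, 133):
    cycle = write_cycle_ns(128, bandwidth)
    print(f"{bandwidth} MB/s -> {cycle:.1f} ns per 128-bit write")

print(f"bandwidth ratio: {133 / 40:.2f}x")
```

Under these assumptions the 40 MB/s figure corresponds to about 400 ns per 128-bit write, and 133 MB/s to about 120 ns, which gives a feel for how much of the write cycle is overhead rather than the programming pulse itself.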
For their 8-Gb array, the Samsung team brings something new to the PCM table in the form of the write pulse presented to the PCM cell. By design, the pulse that is delivered to the cell is formed in part by the parasitic resistance and capacitance of the array. The write pulse uses what is described as a “pre-emphasis” technique (see figure 1). In this approach, the write generator produces a pulse output in which the initial period of the pulse is of greater amplitude than is actually required to reset or set the device. This pre-emphasis pulse is then integrated by the parasitic capacitance and resistance of the word and bit lines to produce, at the PCM cell, a leading edge with a rise time to the required current that is significantly shorter than would be the case if pre-emphasis were not used. The net effect is to reduce the overall write time.


Figure 1: Features of the programming pulses illustrate how pre-emphasis reduces rise time and contributes to reducing total pulse width.
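The rise-time benefit of pre-emphasis can be illustrated with a deliberately simple model. The sketch below treats the array parasitics as a single first-order RC low-pass with time constant `tau_ns` and drives it with a step of `overdrive` times the target amplitude. The single-pole model, the 20-ns time constant, and the overdrive ratios are all assumptions chosen for illustration; none of them are values from the Samsung paper.

```python
import math

def time_to_reach(fraction, overdrive, tau_ns):
    """Time for a first-order RC response to reach `fraction` of the target level.

    The drive is a step of amplitude overdrive * target, so the response is
    overdrive * target * (1 - exp(-t / tau)). Solving for the time at which
    it equals fraction * target gives t = -tau * ln(1 - fraction / overdrive).
    """
    if overdrive <= fraction:
        raise ValueError("drive amplitude never reaches the requested level")
    return -tau_ns * math.log(1.0 - fraction / overdrive)

TAU_NS = 20.0  # hypothetical parasitic time constant

for overdrive in (1.0, 1.5, 2.0):  # 1.0 means no pre-emphasis
    t = time_to_reach(0.9, overdrive, TAU_NS)
    print(f"overdrive {overdrive:.1f}x -> 90% of target after {t:.1f} ns")
```

In this model the drive would have to drop back to the target amplitude once the cell current arrives there, otherwise the cell would be over-driven. That corresponds to the end of the pre-emphasis period in the scheme described above.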
Other intervals that increase the write time, shown as TA and TB in figure 1, are there to allow the high-voltage programming charge pump to recover. They are not lumped together because, while charge pump recovery is the main purpose, the authors state that it is not the only role. I would suggest that other roles might be to allow for the thermal recovery of the PCM cell and to allow all of the parasitic capacitance to discharge. The latter is especially important when the passive parasitic parts of the array are used for pulse shaping.
The previously mentioned ability to increase the bandwidth from 40 MB/s to 133 MB/s is facilitated by providing the required high voltage from an external source, thereby removing any increase in program time related to charge pump recovery. The high voltage also exacts a chip area penalty.
To provide a constant programming current to all PCM cells, irrespective of their position and the resistance along the array interconnect, a cascode write-pulse generator design with an output impedance of several megohms is used. This means the current at any cell is less influenced by variable parasitic series resistance. As a result, the programmed cell resistance is more consistent, which adds to the width of the sense amplifier read window.
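The effect of a high output impedance can be seen from a Norton-equivalent model of the write driver: an ideal current source in parallel with the output resistance, feeding the interconnect and cell in series. The sketch below compares how much the cell current drops between the nearest and farthest cell for a megohm-class source and for a low-impedance one. The 5-MΩ, 5-kΩ, and 1-kΩ values are hypothetical and chosen only to show the trend.

```python
def load_current(i_source, r_out, r_load):
    """Current delivered to the load by a Norton source (current divider)."""
    return i_source * r_out / (r_out + r_load)

def relative_drop(r_out, r_cell, r_series_max):
    """Fractional loss of cell current between the nearest and farthest cell."""
    near = load_current(1.0, r_out, r_cell)
    far = load_current(1.0, r_out, r_cell + r_series_max)
    return (near - far) / near

R_CELL = 5e3        # hypothetical cell resistance during programming, ohms
R_SERIES_MAX = 5e3  # hypothetical worst-case interconnect resistance, ohms

for label, r_out in (("cascode, 5 Mohm", 5e6), ("low impedance, 1 kohm", 1e3)):
    drop = relative_drop(r_out, R_CELL, R_SERIES_MAX)
    print(f"{label}: current drop to far cell = {drop * 100:.2f}%")
```

With these illustrative numbers the megohm-class source loses about 0.1% of its current at the far cell, while the low-impedance driver loses over 40%, which is the behavior the high-impedance design is meant to avoid.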
The reductions in write pulse widths and the reduced write current create the opportunity for the parallelism that has provided the bandwidth. The authors indicate that the write current is on the order of 80 to 100 μA and show the form of the on/off (set/reset) resistance as a function of current for reset and set pulses of duration 100 ns and 150 ns, respectively. The device uses pulses of the same maximum current amplitude for both set and reset, with, for any pair, the longer set pulse having a sloping trailing edge.
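To put the parallelism in terms of current, the sketch below computes the peak current needed when all cells in a parallel write draw the quoted 80 to 100 μA at once, and models the two pulse shapes as simple piecewise functions. The simultaneous-draw assumption, the linear trailing edge, and the 50-ns ramp length are illustrative choices; the source says only that the set pulse has a sloping trailing edge.

```python
def peak_parallel_current_ma(parallel_bits, per_cell_ua):
    """Peak write current in mA if every cell in the parallel write conducts at once."""
    return parallel_bits * per_cell_ua / 1000.0

def pulse_current_ua(t_ns, kind, peak_ua=100.0, ramp_ns=50.0):
    """Idealized write pulse: 100-ns flat reset, 150-ns set with a linear tail.

    ramp_ns (length of the sloping trailing edge) is a hypothetical value.
    """
    if kind == "reset":
        return peak_ua if 0 <= t_ns < 100 else 0.0
    if kind == "set":
        flat_end = 150 - ramp_ns
        if 0 <= t_ns < flat_end:
            return peak_ua
        if flat_end <= t_ns < 150:
            return peak_ua * (150 - t_ns) / ramp_ns
        return 0.0
    raise ValueError("kind must be 'reset' or 'set'")

for bits in (32, 128):
    low = peak_parallel_current_ma(bits, 80)
    high = peak_parallel_current_ma(bits, 100)
    print(f"{bits}-bit parallel write: {low:.2f} to {high:.2f} mA peak")
```

On these assumptions a 128-bit parallel write needs roughly 10 to 13 mA of peak programming current, four times that of a 32-bit write at the same per-cell current, which is why the per-cell current matters so much for the achievable parallelism.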


conroe
5/3/2012 11:17 PM EDT
Neale, this was a very nice analysis of the Samsung paper. You mention that "Micron recently issued a statement that it has developed a new PCM process that it plans to deploy sometime this year". Is there more information on this anywhere?
R G.Neale
5/4/2012 5:22 AM EDT
Conroe: This was provided in the Micron Q2 2012 conference call for analysts to discuss earnings.
Part of the quote that I have reads “….as well as our 20-nanometer DRAM node. We also made progress scaling up to 300-millimeter substrates on our 45-nanometer NOR process and in developing a new 300-millimeter phase-change memory process. Moving forward, we look to deploy both of these new 300-millimeter NOR and phase-change processes in the manufacturing fabs over the next year….”
What would have been more interesting is to hear more on the type of PCM device (bit capacity etc.) they plan to use the process for, and the timing for a fully qualified device. As IBM state they are reviewing seven technologies for the SKA, perhaps a PCM device from Micron will be in that mix.
In my original piece as submitted I characterized this as Micron’s much-needed good news message, as the rest of their news at the time was not so good. The reviewers and editor decided it was not part of my technical brief and did their job.
bingdao
6/4/2012 3:38 AM EDT
I am a fan of yours and I like your article very much! This is really something that PCM people should focus on and solve. Recently, I read that Hynix also published a paper showing a 1GB PCM. Did you hear about that? What do you think of it?