Design Article
Addressing the challenges of transition to DDR4
Fred Rastgar and Tom Rossi, InStryde LLC
1/22/2013 1:00 PM EST
Managing DDR4 JEDEC specifications
DDR4 system designers face three key technical challenges:
- Designing for margin relative to JEDEC specifications
- Power optimization techniques
- Balanced system design considerations
The DDR4 pseudo-open-drain (POD) I/O structure, combined with a point-to-point topology in place of the traditional multi-drop bus interface, delivers better timing margins. Data transfer rates, however, are expected to more than double. Furthermore, initial DDR4 adoption is aimed at the more expensive server platforms, which puts the burden on those developers to design platforms with sufficient margin and headroom for the increased speeds and for variations between DIMMs.
The JEDEC specification defines the timing requirements between memory controllers and DRAMs, the majority of which are described as minimum timing requirements—timing specifications that establish a minimum time to elapse before subsequent events are allowed. A primary objective of the timing specifications is to avoid memory collisions caused by overlapping commands. Memory controllers and DRAMs must be designed and tested with automated test equipment (ATE) for adherence to the JEDEC specifications across process, voltage, and temperature variation. Additional variables introduced at the system level, such as DIMM design, socket, and motherboard design and layout, can contribute to system-level timing violations and must be taken into account.
Signal integrity engineers typically focus on detailed platform signal characteristics using a high-speed oscilloscope. We found that a dedicated protocol analyzer gave us a more comprehensive view of the timing marginalities of the system under test across a range of different traffic patterns that can be expected under normal system operation.
Consider a series of back-to-back accesses to the same DRAM bank group (see Figure 1). The DDR4 specification requires a minimum time, tRRD_L, of six clock periods between successive activate commands to the same bank group. The trace in the figure shows only five clock intervals, so this rule has been violated. A violation of this kind is likely to cause DRAM bus contention. The lower pane in Figure 1 shows the total number of timing violations in the captured traffic, while the violation tool tip highlights the difference between the timing specified in the standard and the timing actually measured. In this scenario, we found that a dedicated protocol analyzer provided greater visibility into the memory bus, with timing waveforms and real-time triggering on JEDEC ordering violations.

Figure 1: Protocol analyzer shows five clock intervals, whereas the DDR4 standard calls for six.
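The tRRD_L check described above can be sketched in a few lines of Python. This is only an illustration of the rule, not the analyzer's own implementation: the Activate record, the trace format, and the function name are hypothetical, and the six-clock limit is simply the value quoted in the text (the actual minimum depends on the device and speed grade).

```python
from dataclasses import dataclass

# Minimum ACT-to-ACT spacing within one bank group, as quoted in the text.
# Assumption: real designs should take this from the datasheet for their speed grade.
TRRD_L_MIN_CLOCKS = 6


@dataclass
class Activate:
    clock: int       # clock cycle at which the ACT command was captured
    bank_group: int  # bank group targeted by the command


def find_trrd_l_violations(activates, min_clocks=TRRD_L_MIN_CLOCKS):
    """Return (previous, current, spacing) for each same-bank-group ACT pair that is too close."""
    last_seen = {}   # bank group -> most recent ACT to that group
    violations = []
    for act in sorted(activates, key=lambda a: a.clock):
        prev = last_seen.get(act.bank_group)
        if prev is not None and act.clock - prev.clock < min_clocks:
            violations.append((prev, act, act.clock - prev.clock))
        last_seen[act.bank_group] = act
    return violations


# Hypothetical captured trace: two ACTs to bank group 0 only five clocks apart.
trace = [Activate(0, 0), Activate(4, 1), Activate(5, 0), Activate(12, 0)]
for prev, cur, spacing in find_trrd_l_violations(trace):
    print(f"tRRD_L violation in bank group {cur.bank_group}: "
          f"{spacing} clocks between ACTs at {prev.clock} and {cur.clock}")
```

Tracking only the most recent activate per bank group is enough here, because tRRD_L constrains consecutive activates within the same group.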
Some timing violations can be attributed to design issues in the memory controller, while others are caused by signal integrity marginalities introduced at the board level or by sequence- or pattern-specific failures. In the latter case, validation engineers can leverage the flexibility of a protocol analyzer’s trigger state machine and its deeper recording memory to set up more complex triggering scenarios, optionally pairing the analyzer with an externally triggered oscilloscope for deeper signal integrity evaluation and analysis.
The timing analysis methods discussed above allow designers to quickly identify system-level timing violations. Robust system designs, however, should also allow for platform, component, and DIMM variations. This requires a deeper characterization of critical timing specifications to ensure sufficient system design tolerances. In the course of our evaluation, we found that today’s instrumentation can provide system designers with greater flexibility to selectively sweep and measure critical timing parameters and identify the actual system design timing margin (see Figure 2).

Figure 2: Timing sensitivity analysis performed by protocol analyzer shows timing errors relative to DDR4 specification. In response, engineers can modify select parameters such as those shown in the red box, then recheck the margins.
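One simple way to express timing margin is the gap between the tightest spacing observed in captured traffic and the minimum the specification requires. The sketch below assumes that definition; the function name, the dictionary layout, and the sample values are hypothetical, and the tRRD_S limit of four clocks is an illustrative figure rather than one taken from this article.

```python
def timing_margins(spec_min_clocks, observed_clocks):
    """Margin per parameter, in clocks: tightest observed spacing minus the spec minimum.

    A negative margin means at least one captured event violated the minimum.
    Parameters with no captured samples are skipped.
    """
    margins = {}
    for name, spec_min in spec_min_clocks.items():
        samples = observed_clocks.get(name)
        if samples:
            margins[name] = min(samples) - spec_min
    return margins


# tRRD_L = 6 is the value quoted in the text; tRRD_S = 4 is an assumed example.
spec = {"tRRD_L": 6, "tRRD_S": 4}
# Hypothetical spacings (in clocks) extracted from a capture.
observed = {"tRRD_L": [7, 9, 5, 8], "tRRD_S": [4, 6, 5]}

for name, margin in timing_margins(spec, observed).items():
    status = "VIOLATION" if margin < 0 else "ok"
    print(f"{name}: margin {margin:+d} clocks ({status})")
```

A margin of zero means the system runs exactly at the specified limit, which leaves no headroom for DIMM or platform variation.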
DDR4 power optimization techniques
Memory power consumption historically has not been considered a major contributor to overall platform power usage. As processors, chipsets, and other major platform components become more aggressively power managed, however, the contribution of the memory subsystem to overall power consumption becomes increasingly important. In mobile platforms, this contribution directly affects battery life and user experience. In server platforms, increased power consumption affects power distribution, system cooling costs, and overall utility operating costs.
DDR4 specifications put special emphasis on new and improved power-reduction features while promising significant power savings over DDR3. Although DDR4 is being targeted in the near term as the memory technology of choice for server platforms, its power efficiency makes it a suitable solution for mobile and handheld platforms. In the long run, DDR4 is uniquely positioned to move down its cost curve and expand to cover many system design segments.

Table 2: DDR4 power management features and anticipated power savings.[1]
The majority of power savings in DDR4 is expected to come from its reduced 1.2-V supply voltage. Each of the features in Table 2, however, can contribute incrementally to overall system power savings.
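A back-of-the-envelope estimate shows why the supply voltage dominates. The sketch below assumes that dynamic power scales with the square of the supply voltage at a fixed frequency and switched capacitance, and it takes 1.5 V as the nominal DDR3 supply. It ignores static power, I/O termination, and the higher DDR4 data rates, so the result is only a rough upper bound on the voltage-related saving.

```python
def dynamic_power_ratio(v_new, v_old):
    """Ratio of dynamic power after a supply change, assuming P is proportional to V**2."""
    return (v_new / v_old) ** 2


DDR3_VDD = 1.5  # volts, nominal DDR3 supply (assumption, not stated in the article)
DDR4_VDD = 1.2  # volts, from the text

ratio = dynamic_power_ratio(DDR4_VDD, DDR3_VDD)
print(f"Relative dynamic power: {ratio:.2f}")
print(f"Estimated reduction from voltage alone: {1 - ratio:.0%}")
```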
Command address latency (CAL), a new feature introduced in DDR4, allows the Command (CMD) and Address (ADDR) receivers to be activated only when commands and addresses are about to be latched. The feature is programmable: tCAL, defined by Mode Register 4 (MR4 [A8:A6]), specifies the number of clock cycles between chip select (CS_n) and the corresponding CMD/ADDR. With CAL enabled, the DRAM can further reduce power consumption by turning off the receivers on the ADDR and CMD lines when they are not needed (see Figure 3). This feature provides incremental power savings with minimal impact on performance.

Figure 3: CAL is activated via an MR4 command, which is shown fully decoded along with tCAL command timing.
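To make the MR4 programming concrete, the sketch below packs a tCAL setting into bits A8:A6 of a mode register value. The field position comes from the text; the encoding table (0 for disabled, then 3, 4, 5, 6, and 8 clocks) reflects our reading of the JEDEC mode register definition and should be checked against the device datasheet. The function names are hypothetical.

```python
# Assumed encoding of MR4 A[8:6]: tCAL in clocks -> 3-bit field value.
# 0 means CAL disabled. Verify against the JEDEC spec or device datasheet.
CAL_ENCODING = {0: 0b000, 3: 0b001, 4: 0b010, 5: 0b011, 6: 0b100, 8: 0b101}
CAL_SHIFT = 6
CAL_MASK = 0b111 << CAL_SHIFT


def set_cal(mr4_value, tcal_clocks):
    """Return mr4_value with the CAL field replaced by the code for tcal_clocks."""
    if tcal_clocks not in CAL_ENCODING:
        raise ValueError(f"unsupported tCAL: {tcal_clocks} clocks")
    return (mr4_value & ~CAL_MASK) | (CAL_ENCODING[tcal_clocks] << CAL_SHIFT)


def get_cal(mr4_value):
    """Decode the CAL field; returns tCAL in clocks (0 if CAL is disabled)."""
    code = (mr4_value & CAL_MASK) >> CAL_SHIFT
    for clocks, encoded in CAL_ENCODING.items():
        if encoded == code:
            return clocks
    raise ValueError(f"reserved CAL code: {code:#05b}")


mr4 = set_cal(0, 4)  # enable CAL with tCAL = 4 clocks, other MR4 bits left at 0
print(f"MR4 = {mr4:#06x}, tCAL = {get_cal(mr4)} clocks")
```

Masking before OR-ing in the new code leaves the other MR4 fields untouched, which matters because a mode register write replaces the whole register.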
Low power auto self-refresh (LP ASR) represents another power-saving innovation in the DDR4 specification. ASR allows DRAMs to dynamically manage refresh rates based on operating temperature: refresh rates are reduced at lower temperatures and increased as the operating temperature rises. LP ASR mode is activated using MRS commands (MR2 [A7:A6]). Once it is activated, the DRAM manages self-refresh automatically. It is estimated that reducing refresh rates during periods of low activity can cut power by as much as 20%.
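The MR2 field can be handled the same way as any other mode register bit field. In the sketch below, the field position (A7:A6) comes from the text, while the four mode codes reflect our understanding of the JEDEC definition (three manual temperature ranges plus the automatic mode) and should be confirmed against the datasheet. The enum and function names are our own.

```python
from enum import IntEnum


class LpAsrMode(IntEnum):
    """Assumed MR2 A[7:6] codes; confirm against the JEDEC spec or datasheet."""
    MANUAL_NORMAL = 0b00    # manual mode, normal temperature range
    MANUAL_REDUCED = 0b01   # manual mode, reduced temperature range
    MANUAL_EXTENDED = 0b10  # manual mode, extended temperature range
    AUTO = 0b11             # DRAM adjusts self-refresh on its own


LP_ASR_SHIFT = 6
LP_ASR_MASK = 0b11 << LP_ASR_SHIFT


def set_lp_asr(mr2_value, mode):
    """Return mr2_value with the LP ASR field set to the given mode."""
    return (mr2_value & ~LP_ASR_MASK) | (LpAsrMode(mode) << LP_ASR_SHIFT)


def get_lp_asr(mr2_value):
    """Decode the LP ASR field of an MR2 value."""
    return LpAsrMode((mr2_value & LP_ASR_MASK) >> LP_ASR_SHIFT)


mr2 = set_lp_asr(0, LpAsrMode.AUTO)
print(f"MR2 = {mr2:#06x}, mode = {get_lp_asr(mr2).name}")
```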
DDR4 system developers can creatively use these programmable power-saving features to optimize the power profile of their designs.

