by Ed Clarke,
Product Marketing Engineer,
Double Data Rate (DDR) SDRAM is now the most popular memory type for designers of embedded applications needing large amounts of low-cost, high-performance memory. It provides a performance boost over Single Data Rate (SDR) SDRAM using extra interface logic, which doubles the raw bandwidth of the data path by clocking data on both edges. Thanks to widespread adoption by the PC industry, and improved long-term availability over SDR SDRAM, it also makes good commercial sense for use in today's applications.
Today's system-on-a-chip (SOC) designs with large external RAM requirements therefore need to support a DDR SDRAM interface. System-on-a-programmable Chip (SOPC) designs using FPGA technology are no different (apart from being accessible to all designers without NRE barriers), and consequently it is essential that FPGAs support DDR SDRAM.
Looking at the requirements for connecting to DDR SDRAM, we can break them down into two categories: electrical and timing. Electrically, we need to support SSTL-II single-ended I/O at 2.5 V for data and control signals and a 2.5 V differential clock signal, all of which are easily supported by a modern FPGA. However, timing presents more of a challenge, especially with data being transferred on both edges of the clock. This means I/O cells must be capable of handling data at twice the clock frequency. You can do this by doubling the number of registers so that the I/O cell is able to latch data on both clock edges. The alternative is to run the I/O cell at double the clock rate, and use general-purpose logic to separate out the data on the rising and falling edges of the clock. The Stratix family of FPGAs includes 6 registers per I/O cell and supports up to 200 MHz (400 Mbits/s) DDR SDRAM connection, and the Cyclone family includes 3 registers per I/O cell, but still supports up to 133 MHz (266 Mbits/s) DDR SDRAM operation.
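The relationship between clock frequency and per-pin data rate, and the idea of separating rising-edge and falling-edge data, can be illustrated with a small Python sketch. This is a software model for illustration only, not FPGA logic; the function names are hypothetical, and the sketch assumes the first sample in a stream was captured on a rising edge.

```python
def ddr_data_rate_mbps(clock_mhz):
    """Per-pin data rate in Mbit/s: one transfer on each clock edge."""
    return 2 * clock_mhz


def split_ddr_samples(samples):
    """Separate a double-rate sample stream into rising- and falling-edge data.

    Assumption: samples[0] was captured on a rising edge and the samples
    alternate rising, falling, rising, ... from there.
    """
    rising = samples[0::2]   # even positions: rising-edge data
    falling = samples[1::2]  # odd positions: falling-edge data
    return rising, falling


if __name__ == "__main__":
    for clock_mhz in (200, 133):
        print(clock_mhz, "MHz clock ->", ddr_data_rate_mbps(clock_mhz), "Mbit/s per pin")
    print(split_ddr_samples([0xA, 0xB, 0xC, 0xD]))
```

The two clock rates used here are the 200 MHz and 133 MHz figures quoted above, which give 400 Mbits/s and 266 Mbits/s per pin.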
Perhaps the most challenging timing requirement of connecting DDR SDRAM is that presented by the DQS pin. The DQS pin is a bidirectional strobe used to clock the data on the DQ lines. The problem is that both the strobe direction and its timing differ depending on whether the SDRAM is being read from or written to. When reading from a DDR SDRAM, the phase of the DQS signal should be shifted 90° to make sure the data is captured at the center of the data window. It is possible to insert an external fixed delay to achieve this, for example by extending the PCB track of the DQS lines, but this method has several problems. First, because the DQS pin is bidirectional, any delay included to ensure correct read operation must then be compensated for in the write phase. This means the write data clock may then also need shifting, necessitating an extra clock phase.
Depending on the flexibility of the on-chip clock managers, it may not be possible to provide this without an extra clock source, which uses further on-chip resources. Using extended PCB traces can also cause problems when PCB routing is limited, and in extreme cases it can force the addition of extra PCB layers. The amount of extra trace length that needs to be added to the DQS line depends on frequency; signals propagate at around 166 ps/in. on an FR4 PCB, and so an additional 7" or 15" of track length per DQS may be required for DDR SDRAM running at 200 MHz or 100 MHz, respectively!
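The trace-length figures follow from simple arithmetic: a 90° shift is a quarter of the clock period, and dividing that delay by the propagation delay per inch gives the extra length. The sketch below reproduces the calculation using the 166 ps/in. figure quoted above; the function names are illustrative only.

```python
FR4_DELAY_PS_PER_INCH = 166.0  # propagation figure quoted in the article


def quarter_period_ps(clock_mhz):
    """Delay in picoseconds that corresponds to a 90-degree phase shift."""
    period_ps = 1e6 / clock_mhz  # 1 MHz corresponds to a 1,000,000 ps period
    return period_ps / 4.0


def extra_trace_inches(clock_mhz, ps_per_inch=FR4_DELAY_PS_PER_INCH):
    """Extra trace length needed to delay a strobe by a quarter period."""
    return quarter_period_ps(clock_mhz) / ps_per_inch


if __name__ == "__main__":
    for clock_mhz in (200, 100):
        print(f"{clock_mhz} MHz: {quarter_period_ps(clock_mhz):.0f} ps, "
              f"{extra_trace_inches(clock_mhz):.1f} in.")
```

At 200 MHz the quarter period is 1250 ps, or about 7.5 in. of trace; at 100 MHz it is 2500 ps, or about 15 in.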
It is also often necessary to develop a system that can work at reduced clock frequencies, particularly during development. Fixed delay elements provide the correct phase shift at only a single frequency, making derating or prototyping at lower frequencies challenging.
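To see why a fixed delay is tied to one frequency, consider the phase shift that the same delay produces at different clock rates. The sketch below is illustrative: it takes a delay sized for 90° at 200 MHz (1250 ps) and evaluates it at lower clock rates chosen here as examples.

```python
def phase_shift_degrees(delay_ps, clock_mhz):
    """Phase shift in degrees produced by a fixed delay at a given clock rate."""
    period_ps = 1e6 / clock_mhz
    return 360.0 * delay_ps / period_ps


if __name__ == "__main__":
    fixed_delay_ps = 1250.0  # a quarter period at 200 MHz
    for clock_mhz in (200, 133, 100):
        shift = phase_shift_degrees(fixed_delay_ps, clock_mhz)
        print(f"{clock_mhz} MHz: {shift:.1f} degrees")
```

A delay that gives exactly 90° at 200 MHz gives only 45° at 100 MHz, so the strobe no longer lands at the center of the data window.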
To address this problem, Altera's Stratix and Cyclone families have built-in dedicated support for delayed DQS read sampling. This not only makes timing easy to meet over process, voltage, and temperature (PVT) variations, but also minimizes the use of general-purpose FPGA resources such as Logic Elements (LEs) and Phase-Locked Loops (PLLs). The key here is that the delayed DQS input signal directly clocks the input registers of the DQ pins (see Figure 1). The inclusion of this simple feature solves a huge headache for designers connecting not only DDR SDRAM, but many other high-speed memory types too.
Several clock sources are needed for a DDR SDRAM controller and the memory devices themselves. These include the differential SDRAM clock, the core system clock, the write data clock, and possibly a read capture clock, depending on the round-trip timing. Using simple Delay Locked Loops (DLLs) with limited outputs may mean instantiating up to 3 such blocks to meet the required timing. In most cases, a single Stratix or Cyclone PLL will provide all of these because of the large number of individually configurable outputs.
Once it has been established that the FPGA supports the correct functionality to connect to DDR SDRAM, it is necessary to prove that the memory controller and all external signals pass timing analysis. While today's high-performance FPGAs can support 200 MHz system speeds and synchronous I/O speeds approaching 1 Gbits/s, making sure your DDR memory controller meets the specification still requires appropriate placement within the FPGA logic and I/O banks, as well as careful PCB layout and timing analysis.
There are four categories of timing analysis to work through: write data timing, address and command timing, read capture using DQS, and the resynchronization of captured read data to the system clock domain. Using a known working reference design based around an IP core, and following the accompanying documentation, makes meeting these timing requirements straightforward.
In addition to the high-speed data path, the DDR SDRAM state machine must be implemented correctly, and care must be taken with regard to the proper initialization and refresh of the DRAM cells. Since DDR SDRAM is defined by a JEDEC standard, the memory controller must also be compliant with the JEDEC standard. This means further testing, especially if you would like the flexibility of specifying different DDR SDRAM configurations and sources.
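As one example of the bookkeeping a controller must get right, the refresh logic has to issue refresh commands often enough to cover every row within the device's retention period. The sketch below assumes a 64 ms retention period and 8192 rows, which are typical of many DDR SDRAM devices but are assumptions here, not figures from this article; always take the real values from the device data sheet. The function names are hypothetical.

```python
def refresh_interval_ns(retention_ms=64.0, rows=8192):
    """Average time allowed between refresh commands, in nanoseconds.

    The defaults are assumed typical values, not taken from a data sheet.
    """
    return retention_ms * 1e6 / rows


def refresh_interval_cycles(clock_mhz, retention_ms=64.0, rows=8192):
    """Refresh interval as a whole number of controller clock cycles.

    Rounded down so that refreshes are never issued too late.
    """
    interval_ns = refresh_interval_ns(retention_ms, rows)
    return int(interval_ns * clock_mhz / 1000.0)


if __name__ == "__main__":
    print(refresh_interval_ns(), "ns between refreshes")
    for clock_mhz in (200, 133):
        print(clock_mhz, "MHz:", refresh_interval_cycles(clock_mhz), "cycles")
```

Because the cycle count depends on both the device and the clock rate, a refresh counter with a hard-coded value needs to be revisited whenever either one changes.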
Focusing on the functionality of your product will often be most rewarding commercially. The fact that you are using DDR SDRAM in your application won't necessarily help differentiate your product from that of your competitor, unless memory bandwidth is a limiting factor. Indeed, many designers choose DDR SDRAM purely for commercial reasons, and don't particularly want to be concerned with the details of how it works.
When using an ASSP with DDR SDRAM support, once the memory has been initialized correctly, the application can simply treat the DDR SDRAM as a block of memory in the memory map. This will also be the case in an SOC or SOPC application with a well-designed DDR SDRAM controller, except that the chip designers must also consider how to connect the memory controller to the internal buses. The most straightforward way of doing this is by using an SRAM-type interface (address, data, and strobes) with arbitration signals. The Altera DDR SDRAM controller IP provides this, and solves all of the problems that have been discussed previously by providing an off-the-shelf, fully tested solution.
Each designer of a system including DDR SDRAM may have slightly differing requirements. One designer may wish to use a single low-cost 16-bit-wide discrete DDR SDRAM device, while another may favor a full 64-bit DIMM interface to support off-the-shelf DIMM modules and future upgrades. Depending on the memory device(s) selected, it may be necessary to support multiple chip selects and varying numbers of address lines. Other variables include CAS latency and refresh periods, hence any memory controller IP used must be fully parameterizable. The Altera DDR SDRAM IP is shipped as graphically parameterizable software that generates the VHDL source and a reference design for inclusion in your project.
Figure 2 - Screen Shot of DDR SDRAM Controller IP Parameterization Using a MegaWizard
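The kind of parameter set described above can be pictured as a small configuration record. The Python sketch below is purely illustrative: the class, its field names, and the list of accepted CAS latencies are assumptions made for this example and do not represent the actual parameters or output of the Altera IP.

```python
from dataclasses import dataclass

# Assumed set of CAS latencies for illustration; check the device data sheet.
VALID_CAS_LATENCIES = (2.0, 2.5, 3.0)


@dataclass(frozen=True)
class DdrControllerParams:
    """Hypothetical parameter set for a DDR SDRAM controller."""
    data_width_bits: int      # interface width, e.g. 16 or 64
    row_address_bits: int
    column_address_bits: int
    bank_address_bits: int
    chip_selects: int
    cas_latency: float

    def __post_init__(self):
        if self.cas_latency not in VALID_CAS_LATENCIES:
            raise ValueError(f"unsupported CAS latency: {self.cas_latency}")
        if self.chip_selects < 1:
            raise ValueError("at least one chip select is required")

    def capacity_bytes(self):
        """Total addressable memory across all chip selects."""
        address_bits = (self.row_address_bits
                        + self.column_address_bits
                        + self.bank_address_bits)
        words_per_chip_select = 2 ** address_bits
        return words_per_chip_select * self.data_width_bits // 8 * self.chip_selects


if __name__ == "__main__":
    # Example values only: a single 16-bit device and a two-rank 64-bit module.
    discrete = DdrControllerParams(16, 13, 9, 2, 1, 2.5)
    dimm = DdrControllerParams(64, 13, 10, 2, 2, 2.5)
    print(discrete.capacity_bytes() // 2**20, "MiB")
    print(dimm.capacity_bytes() // 2**20, "MiB")
```

Keeping all device-dependent values in one place like this mirrors the motivation for a parameterizable controller: the same design can be retargeted from a single discrete device to a DIMM by changing parameters rather than logic.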
Logic usage for a 32-bit DDR SDRAM interface is around 1000 Logic Elements (LEs) for Cyclone and around 800 LEs for Stratix. Using Stratix, even a full 64-bit interface consumes only 1000 LEs, so the overall system cost is kept extremely low.
The use of DDR SDRAM controller IP in conjunction with FPGAs designed with DDR SDRAM support features enables the designer to concentrate on the rest of the system. This not only saves time, but maximizes the chance of getting the system right the first time.
A free Open-Core version of the DDR SDRAM controller can be downloaded from www.altera.com. It allows you to perform a functional simulation of the IP, as well as place-and-route and static timing analysis.