Quick tip: Memory bandwidth metrics for video processing
Jonah Probell and Matthew Proujansky
4/25/2007 3:00 AM EDT
Engineers use a variety of metrics to describe memory bandwidth. Common metrics are discussed below. The term access is used to refer to a read or a write transaction.
Note that the bandwidth required for encoding, decoding, or otherwise processing image sequences varies dramatically depending on the image content and processing algorithms used. When comparing video processing designs, it is therefore imperative to make the comparison using identical inputs.

Theoretical bit rate required by an application
This is a calculation that assumes an ideal system. For example, if a system decodes 1080p60 4:2:0 video with 8 bits per sample (an average of 12 bits per pixel), then the theoretical bandwidth required (disregarding motion compensation reads) is 2.9 million 8x8 sample blocks per second, or 1.5 Gbps.
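The arithmetic behind these figures can be sketched as follows. This is a minimal illustration, not the authors' own calculation: it assumes a 1920x1080 frame, 8 bits per sample, and the usual 4:2:0 layout in which each chroma plane has half the luma resolution in both dimensions. The function name is hypothetical.

```python
def theoretical_rate(width, height, fps, bits_per_sample=8):
    """Ideal 4:2:0 data rate, ignoring motion compensation reads."""
    luma = width * height
    # Cb and Cr planes, each subsampled by 2 horizontally and vertically.
    chroma = 2 * (width // 2) * (height // 2)
    samples_per_second = (luma + chroma) * fps
    blocks_per_second = samples_per_second / 64  # 8x8 blocks of samples
    bits_per_second = samples_per_second * bits_per_sample
    return blocks_per_second, bits_per_second


blocks, bits = theoretical_rate(1920, 1080, 60)
print(f"{blocks / 1e6:.1f} million blocks/s, {bits / 1e9:.1f} Gbps")
# prints "2.9 million blocks/s, 1.5 Gbps"
```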
Real systems need much more bandwidth than this on the processor bus and DRAM bus. The metrics below reflect the main factors.
On-chip interconnect access rate
Shared buses can have a strong impact on performance, particularly for a heavily loaded system. Each access requires arbitration for the bus, and when a system is heavily loaded arbitration may add many delay cycles. For this reason, the maximum attainable number of on-chip interconnect accesses per second is a useful measure.
Consider the motion compensation block read of a 16x16 pixel macroblock of MPEG video data. Assuming 8 bits per pixel, this comprises 256 total bytes of data. Depending on system design, the 256 bytes could be accessed with sixty-four 32-bit accesses, thirty-two 64-bit accesses, or sixteen burst accesses of 16 bytes each. The larger the number of accesses, the more likely it is that arbitration will add delay cycles.
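The access counts above, and their effect on arbitration overhead, can be reproduced with a short sketch. The fixed per-access arbitration penalty (ARB_CYCLES) is a hypothetical value chosen only for illustration; real arbitration delay depends on bus load and is rarely constant.

```python
def accesses_needed(total_bytes, bytes_per_access):
    """Bus accesses needed to move total_bytes (ceiling division)."""
    return -(-total_bytes // bytes_per_access)


def worst_case_arbitration_cycles(accesses, arb_cycles_per_access):
    """Delay cycles if every access pays the same penalty (a simplification)."""
    return accesses * arb_cycles_per_access


MACROBLOCK_BYTES = 16 * 16  # 16x16 pixels at 8 bits per pixel
ARB_CYCLES = 4              # hypothetical arbitration penalty per access

for label, size in [("32-bit", 4), ("64-bit", 8), ("16-byte burst", 16)]:
    n = accesses_needed(MACROBLOCK_BYTES, size)
    delay = worst_case_arbitration_cycles(n, ARB_CYCLES)
    print(f"{label}: {n} accesses, up to {delay} arbitration cycles")
```

Under this assumption the burst design pays one quarter of the arbitration cost of the 32-bit design for the same macroblock.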
On-chip interconnect data transfer rate
The width of the data bus is important when determining whether hardware can meet the data bandwidth required by an application. A wider bus can handle large data transfers in fewer cycles but "wastes" bandwidth during accesses of less than the full bus width. A wide bus also wastes bandwidth at the beginnings and endings of unaligned transfers. (Note that the interconnect transfer rate is closely tied to the interconnect access rate. If arbitration delays lower the interconnect access rate, it will not be possible to achieve the expected data transfer rate.)
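A rough way to quantify this waste is to count how many bus-width beats a transfer touches and compare that with the bytes actually wanted. The sketch below assumes a simple bus in which every beat moves one full, aligned bus word; the function names and example addresses are illustrative.

```python
def bus_beats(start_addr, length, bus_bytes):
    """Number of aligned bus-width words touched by the transfer."""
    first = start_addr // bus_bytes
    last = (start_addr + length - 1) // bus_bytes
    return last - first + 1


def utilization(start_addr, length, bus_bytes):
    """Fraction of the transferred bus capacity that carries wanted data."""
    return length / (bus_beats(start_addr, length, bus_bytes) * bus_bytes)


print(utilization(0, 16, 8))   # aligned 16-byte transfer, 64-bit bus: 1.0
print(utilization(0, 4, 16))   # 4-byte access on a 128-bit bus: 0.25
print(utilization(3, 16, 16))  # unaligned 16 bytes on a 128-bit bus: 0.5
```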
DRAM data transfer rate
The DRAM interface, being off-chip, has a larger impact on performance, cost, and power consumption than on-chip buses. For this reason, DRAM data transfer rate is often the most useful measure of bandwidth.
Due to design constraints, the DRAM interface might have a narrower or wider data bus than on-chip buses. In double-data-rate (DDR) DRAM systems, data values can be transferred on both rising and falling clock edges. DRAM data transfer rates of systems may, therefore, vary considerably.
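The peak figure follows directly from bus width, clock frequency, and whether data moves on one or both clock edges. The two configurations below are hypothetical examples, not measurements of any particular part.

```python
def peak_dram_rate(bus_bits, clock_hz, ddr=True):
    """Peak DRAM data transfer rate in bits per second."""
    transfers_per_clock = 2 if ddr else 1  # DDR uses both clock edges
    return bus_bits * clock_hz * transfers_per_clock


# Hypothetical interfaces for comparison.
sdr_16 = peak_dram_rate(16, 133_000_000, ddr=False)
ddr_32 = peak_dram_rate(32, 200_000_000, ddr=True)
print(f"16-bit SDR at 133 MHz: {sdr_16 / 1e9:.3f} Gbps")
print(f"32-bit DDR at 200 MHz: {ddr_32 / 1e9:.3f} Gbps")
```

These are upper bounds only; they do not account for the row and bank switching delays discussed next.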
Total DRAM transfer rate (including RAS and bank switching delay)
Each DRAM access might require a bank switch or a precharge and row address strobe (RAS) in the DRAM chip. These operations add clock cycles to the time of each access and cannot be pipelined for higher throughput. For this reason, total DRAM transfer rate is an excellent measure of bandwidth. However, because the DRAM is typically shared by all processors within a chip and is used by multiple pieces of software, it is difficult to characterize the bank and row switching performance of a system under real operating conditions.
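A first-order model shows how quickly row switching erodes the peak rate. The sketch below assumes that a fixed fraction of accesses incur a fixed, non-overlapped penalty; the miss rate and cycle counts are hypothetical, and as noted above the real values are hard to characterize.

```python
def effective_rate(peak_bytes_per_cycle, burst_bytes,
                   row_miss_rate, row_switch_cycles):
    """Average bytes per cycle when some accesses pay a row-switch penalty."""
    data_cycles = burst_bytes / peak_bytes_per_cycle
    avg_cycles = data_cycles + row_miss_rate * row_switch_cycles
    return burst_bytes / avg_cycles


# Hypothetical interface: 8 bytes per cycle peak, 32-byte bursts,
# 8-cycle precharge-plus-activate penalty.
for miss_rate in (0.0, 0.25, 0.5, 1.0):
    rate = effective_rate(8, 32, miss_rate, 8)
    print(f"row miss rate {miss_rate:.2f}: {rate:.2f} bytes/cycle")
```

With these assumed numbers, a 50 percent row miss rate halves the achievable transfer rate.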
Though real systems' DRAM usage is difficult to characterize, there are things that can be done to minimize both the total number of DRAM accesses and the probability of a RAS delay. These include storing images in tiled rather than raster order, coalescing smaller accesses into larger bursts, and organizing data structures to improve temporal locality.
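The benefit of tiled storage can be illustrated by counting how many DRAM rows (pages) a 16x16 block read touches under each layout. Everything specific here is an assumption for illustration: a 1920-byte-wide 8-bit image, a 2048-byte DRAM page, a 64x32 tile sized to fill exactly one page, and a linear address-to-page mapping.

```python
WIDTH = 1920             # bytes per image row (8 bits per pixel), assumed
PAGE_BYTES = 2048        # hypothetical DRAM row (page) size
TILE_W, TILE_H = 64, 32  # hypothetical tile; 64 * 32 bytes = one page


def raster_addr(x, y):
    """Byte address of pixel (x, y) in raster order."""
    return y * WIDTH + x


def tiled_addr(x, y):
    """Byte address of pixel (x, y) when each tile is stored contiguously."""
    tiles_per_row = WIDTH // TILE_W
    tile_index = (y // TILE_H) * tiles_per_row + (x // TILE_W)
    offset = (y % TILE_H) * TILE_W + (x % TILE_W)
    return tile_index * TILE_W * TILE_H + offset


def pages_touched(addr_fn, x0, y0, size=16):
    """Distinct DRAM pages covered by a size x size block at (x0, y0)."""
    return len({addr_fn(x, y) // PAGE_BYTES
                for y in range(y0, y0 + size)
                for x in range(x0, x0 + size)})


print("raster:", pages_touched(raster_addr, 0, 0))              # 15 pages
print("tiled, aligned:", pages_touched(tiled_addr, 0, 0))       # 1 page
print("tiled, worst case:", pages_touched(tiled_addr, 56, 24))  # 4 pages
```

Under these assumptions, the raster layout spreads the block over many more pages than the tiled layout, and each additional page is a potential row switch.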
Jonah Probell is a microprocessor architect and digital video expert. He runs the Web site VideoBits.ORG. Matthew Proujansky is Vice President of Engineering at Rampage Systems, Inc. where he designs digital imaging systems for print media.



