Design Article
Networking memories for buffering applications—Part I
Michael Sporer, MoSys
8/31/2012 2:04 PM EDT
The density of the individual DRAM component is immaterial; the quantities of DRAM listed are needed to meet the bandwidth requirements. In the 100GE example, these 10 DRAMs, running at their limit and including I/O power, would consume approximately 10 watts and 960 square millimeters of board space. Moving forward, you can see that the capacity supported continues to grow while the capacity required, in terms of buffering time at higher network performance, is actually shrinking. As DRAM-architected devices continue to be optimized for computer main memory applications, they become less aligned with the high-performance buffer requirements of future networking applications, consuming power and board space for capacity that is not needed.
Table 3 compares alternative means of implementing high-performance, deterministic, general-purpose buffers for networking applications. DDR3 has an attractive component cost, but the pin count, power and board area result in a prohibitive implementation cost. In an oversubscribed system, additional buffering is used on ingress to absorb micro-bursts of traffic. These micro-bursts can arise from the statistical randomness of traffic, but they are more common in datacenter applications, where a single user request can be handled by tens or hundreds of servers. When the original requestor aggregates the responses, the result is a heavy burst of East-West or scatter-gather traffic concentrated through a small number of links. In this case, the MoSys Bandwidth Engine using Burst mode achieves the performance needed for oversubscription buffers with a single device, resulting in substantially lower power and one third the board area. Furthermore, in a caching implementation, the high access rate of the Bandwidth Engine also allows the host on-die SRAM to be reduced to less than 2Mb, saving resources on the host device.
CONCLUSION
Although DRAM has been the memory of choice for packet buffer applications, the target market and performance points for that device are not aligned with high-performance networking applications. Despite the low component cost, the total system implementation cost becomes prohibitive due to the pin count, power and board area required. Traditional high-performance networking memories are not much better, although these higher-performance devices do reduce the complexity and resources consumed on the host. Next-generation devices, using a high-efficiency serial interface and a high-performance, deterministic memory array architecture, can meet the requirements of future buffer applications. As network performance continues to scale up, it will be interesting to see how long the legacy requirement for 100ms packet buffering persists. The Bandwidth Engine Burst device, with its high-efficiency serial interface, has the potential to displace not only high-performance networking memories in limited buffer applications but also commodity DRAM solutions, as the needs of networking continue to diverge from those of general-purpose computing.
SIDEBAR
Packet buffers have traditionally had a capacity sufficient to buffer one ‘round trip’ at a given line rate. Over time, as line rates have increased, the round-trip time has fallen, yet this buffer rule of thumb has stayed consistently in the 100ms range, which in turn has become a de facto marketing requirement. Now that line rates have outstripped DRAM data rates, the number of DRAMs and the capacities of these buffers are growing when, in fact, there is a convincing body of work demonstrating that large buffers are not only unnecessary but even detrimental to network jitter and response time.
Under uncongested traffic these buffers are minimally utilized, but under burst traffic conditions oversized buffers can introduce pathological TCP timeouts and result in latencies and response times orders of magnitude worse than if the buffer were sized appropriately. There are high-throughput Layer 2 and Layer 3 switches that can demonstrate oversubscribed 10G line-rate lossless packet forwarding with under 40ms of buffer capacity per port. As end-to-end network latency drops with increasing line rates, the primary buffer capacity requirements will continue to diminish, with 40G buffers under 20ms and 100G buffers under 15ms.
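To put these time-based figures in terms of memory capacity, the rule of thumb is simply line rate multiplied by buffering time. The short Python sketch below works through the numbers quoted in this sidebar. The function names are illustrative only, megabytes are taken as decimal (10^6 bytes), and the "under N ms" figures are treated as upper bounds, so the results are ceilings rather than exact requirements.

```python
def buffer_bits(line_rate_gbps: int, buffer_ms: int) -> int:
    """Bits needed to hold buffer_ms of traffic arriving at line_rate_gbps."""
    # 1 Gb/s for 1 ms = 1e9 bit/s * 1e-3 s = 1e6 bits
    return line_rate_gbps * buffer_ms * 1_000_000


def buffer_megabytes(line_rate_gbps: int, buffer_ms: int) -> float:
    """Same capacity in decimal megabytes (assumes 1 MB = 10**6 bytes)."""
    return buffer_bits(line_rate_gbps, buffer_ms) / 8 / 1e6


# (label, line rate in Gb/s, buffering time in ms), figures from the text above
SCENARIOS = [
    ("100G, legacy 100ms rule", 100, 100),
    ("10G, under 40ms", 10, 40),
    ("40G, under 20ms", 40, 20),
    ("100G, under 15ms", 100, 15),
]

if __name__ == "__main__":
    for label, rate_gbps, ms in SCENARIOS:
        mb = buffer_megabytes(rate_gbps, ms)
        print(f"{label}: at most {mb:g} MB per port")
```

Under these assumptions, the legacy 100ms rule at 100G calls for 1250 MB per port, while a 15ms buffer at the same line rate needs no more than 187.5 MB.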
See YouTube video: Buffer Size requirements for non-burst traffic.
About the Author
Michael Sporer brings over 20 years of marketing, sales and engineering experience to MoSys. Prior to joining MoSys, he was a Technology Strategist for Micron Technology, an industry leader in semiconductor memory products. Previously, he was Director of Technical Marketing at LG Semicon and was with Hewlett Packard in the Memory Technology Center. Mr. Sporer holds a Master of Science in Engineering from Stanford University and a Bachelor’s degree from the University of Michigan.

