It’s no secret that SoC architects have always wanted more on-chip memory. In fact, it’s not uncommon for SoCs to include hundreds of integrated memory cores. To satisfy this historical demand, embedded memory vendors made design choices that favored memory capacity at the expense of memory performance. Over the years, their circuit designers have made memories denser by shrinking transistors and packing them ever closer together. In short, they pushed layout design rules to their limits in order to reduce bit cell area, and now we must deal with the performance implications.
Today, due to faster processor speeds, parallel architectures, and especially multi-core processing, on-chip memory performance requirements are skyrocketing. SoC architects now need even faster memories. However, embedded memories can no longer be clocked as fast as the processors or other logic on the same chip, and the resulting performance bottlenecks now pose one of the biggest challenges in new SoC designs. It’s an amalgamation of problems: SoC architects want faster memories along with a wider variety of memory configurations, including multiple read and write ports that allow parallel accesses from multiple processor cores. At the root of these problems, however, is a fundamental trade-off: the same memory circuit and layout design decisions that enable high memory capacity also make it difficult to speed up memory performance or to build multi-port memories.
There’s no argument that multi-port memories are an excellent design choice for multiprocessor systems, network processors, graphics chips, and other high-performance devices. Traditionally, some multi-port memories have been used to support data communication between different clock domains. Today, they are an ideal match for multi-core systems in which multiple cores can access memory simultaneously. SoC architects want to use multi-port memories, but in many cases they hesitate to do so because of the complexity, cost, and long lead times associated with silicon manufacturing processes.
Let's face it: conventional embedded memories have serious limitations. The question is how we can take advantage of denser bit cells to reduce memory area while also boosting memory performance. A completely new memory design approach is needed. As an analogy, increases in processor performance have come not only from advances in circuitry but also from architectural improvements such as pipelined execution and the exploitation of instruction-level parallelism. A new technology, algorithmic memories, similarly takes an architectural approach and can breathe new life into embedded memory.
Algorithmic memories use algorithms synthesized in hardware to increase the performance of embedded memory macros by up to ten times. The technology is implemented in soft RTL, and the resulting solutions appear to the rest of the design exactly like standard multi-port embedded memories. Using this approach requires no change to existing memory interfaces or ASIC design flows, and the technology is both process-node and foundry independent. Algorithmic memories open the door for system architects to rapidly and reliably create customized memory solutions that can be optimized for specific applications: in essence, making memory performance a configurable entity.
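The article does not disclose the specific algorithms involved, but a minimal behavioral sketch can illustrate the general idea of wrapping conventional single-port memory macros with logic so that they present a multi-port interface. The sketch below uses simple bank replication to provide one write port and two read ports per cycle; this is one well-known technique, chosen here purely for illustration, and is not necessarily the approach Memoir Systems uses. All class and method names (SinglePortRAM, AlgorithmicMemory1W2R, cycle) are hypothetical.

```python
# Behavioral sketch (not RTL): a 1-write/2-read memory built from
# single-port banks by replication. Illustrative only; hypothetical names.

class SinglePortRAM:
    """Models a conventional single-port embedded memory macro:
    one access (read or write) per cycle."""
    def __init__(self, depth):
        self.data = [0] * depth

    def write(self, addr, value):
        self.data[addr] = value

    def read(self, addr):
        return self.data[addr]


class AlgorithmicMemory1W2R:
    """Presents a 1-write / 2-read multi-port interface each cycle,
    internally using two replicated single-port banks. Every write goes
    to both copies; each read port is served by its own copy, so two
    reads and one write can complete in the same cycle (same-address
    read-after-write is resolved write-first here for simplicity)."""
    def __init__(self, depth):
        self.copies = [SinglePortRAM(depth), SinglePortRAM(depth)]

    def cycle(self, write=None, read_addrs=(None, None)):
        # 'write' is an optional (addr, value) pair; 'read_addrs' holds
        # the addresses on the two read ports (None means the port is idle).
        if write is not None:
            addr, value = write
            for copy in self.copies:
                copy.write(addr, value)
        results = []
        for port, addr in enumerate(read_addrs):
            results.append(None if addr is None else self.copies[port].read(addr))
        return results


if __name__ == "__main__":
    mem = AlgorithmicMemory1W2R(depth=16)
    mem.cycle(write=(3, 42))
    mem.cycle(write=(7, 99))
    # Two cores read different addresses in the same cycle.
    print(mem.cycle(read_addrs=(3, 7)))  # [42, 99]
```

Full replication doubles area, which is why practical algorithmic approaches rely on more sophisticated schemes; the point of the sketch is simply that added logic around unmodified single-port macros can expose a multi-port, higher-bandwidth interface to the system.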
About the Author
Adam Kablanian (BA Physics, Berkeley '83; MS EE, SCU '91; EMBA, Stanford '06) is CEO of Memoir Systems. Previously, he was co-founder (from '96), president and CEO (until March '07), and chairman (until March '08) of Virage Logic, a company building embedded memories, which he took public on NASDAQ in 2000. Later, Kablanian was co-founder and CEO of iCON Communications, a premier WiMAX broadband ISP in Armenia, which he successfully sold in '09. Adam has also served on the boards of EDA (Sequence Design) and application software (IconApps) companies, and is currently a board member of Ambature LLC, which develops technologies to significantly improve the efficiency of electrical energy consumption, distribution and usage.
I don't dispute the content of this article - statements here are quite learned and valid. The problem is that it's just asking questions. It's clear we architects need new solutions. What is not clear is how those multi-port memories will work, and there's nothing in this article to state anything other than the obvious need.