PORTLAND, Ore. -- Analysts have speculated wildly about the technical details of the massively parallel next-generation Knights Landing Xeon Phi, suggesting, for instance, that 3D memory would be stacked on top of its die. Today, however, Intel and its 3D memory partner, Micron Technology Inc., put that speculation to rest. (See Intel Massive Parallel Upgrades Due.)
The self-hosted many integrated core (MIC) Knights Landing version of the Xeon Phi will deliver more than 3 teraflops of double-precision peak performance per socket. The Micron 3D memory that makes that speed possible, however, will not be stacked atop the Knights Landing die. Instead, Intel and Micron have developed a super-high-bandwidth parallel-path interface that lets 3D Hybrid Memory Cubes (HMCs) surround the Knights Landing die in such close proximity that the HMCs behave as if they were on the Xeon Phi die.
"The interface circuitry that goes out to the Knights Landing Xeon Phi chip has been optimized with parallel paths for maximum bandwidth. Our HMCs will be packaged with a highly optimized interface, so they can be placed very close to the Xeon Phi using DDR4 channels," Mike Black, HMC technology strategist at Micron, told us. "And then all of that will be put into a common package that then drops into a single socket on the board."
Micron's 2-gigabyte and 4-gigabyte parts will ship to other customers this year with channel bandwidth of 120 and 160 gigabytes per second, respectively. For Intel, Micron is customizing a 16-gigabyte part to supply channels optimized to the massively parallel processors on the next-generation Knights Landing Xeon Phi.
This tactic works, Black said, because the application-specific integrated circuit (ASIC) at the base of the HMC houses the circuitry that manages the DRAM dice stacked above it and handles the communication of data on and off the chip. The interconnections between the ASIC and those DRAM dice use up to 2,000 through-silicon vias (TSVs); by taking over many of the logic functions from each DRAM, the ASIC frees up room on each die for the TSVs without enlarging it.
"To the programmer, the Micron memory will be transparent to the outside world -- almost like a level-three cache inside the package with the processor," Black said.
Micron has been working with Intel for several years to optimize the interface channels and maximize bandwidth to Intel's processors. At the 2011 Intel Developer Forum, Micron demonstrated a single interface channel with a bandwidth of more than 1 terabit per second (seven times that of DDR3). It also claimed the lowest-ever energy consumption, at approximately 8 picojoules per bit.
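Those two demo figures together imply a power budget that is easy to sanity-check: moving 1 terabit every second at roughly 8 picojoules per bit works out to about 8 watts for the link. A quick back-of-the-envelope calculation (the variable names are ours, not Micron's):

```python
# Sanity check of the figures Micron quoted for its 2011 demo:
# ~1 Tbit/s channel bandwidth at ~8 pJ/bit of transfer energy.
bandwidth_bits_per_s = 1e12   # 1 terabit per second
energy_per_bit_j = 8e-12      # 8 picojoules per bit

# Power (watts) = bits moved per second * energy per bit (joules)
power_watts = bandwidth_bits_per_s * energy_per_bit_j
print(f"Link power at full rate: {power_watts:.0f} W")  # prints 8 W
```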
Micron instigated and pioneered the HMC, but it will by no means be its sole supplier. "The Hybrid Memory Cube Consortium is 160 companies that are defining the interface specification, and it's an open platform that is available for use by anyone," Black said.
HMCs have 15 times the bandwidth of DDR3, he said. They consume 70% less energy than today's memory technologies, occupy 90% less board space than RDIMMs, and offer five times the bandwidth of traditional DDR4.
— R. Colin Johnson, Advanced Technology Editor, EE Times