The current demonstration, however, did not use the second stacked memory chip, but relied on the 5 kbytes of memory (2 kbytes for data and 3 for instructions) located inside each core.
"Each of our cores measured 3 mm², including its two independent 32-bit floating-point processors with single-cycle instruction execution," said Jerry Bautista, director in Intel's Tera-Scale research program. "A separate 2 Mbytes of SRAM for each core will be mounted on a second chip vertically above the Teraflop Research Chip, with one of the ports in the five-port router communicating with it vertically."
Although the chip consumed so much power (from 62 to 265 W) that it required a special bench-mounted liquid-cooling apparatus, Intel claims that the IC met its design goals and that final production versions will consume less power and require only fans for air cooling.
"We exceeded our design goal, which was 1 teraflops of performance for under 100 W, when in fact we got a teraflops at only 62 W," said Nitin Borkar, the engineering manager of the lab team for the Tera-scale Research Chip.
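To put Borkar's figures in perspective, the quoted numbers can be turned into a rough efficiency estimate. The short Python sketch below uses only values cited in this article (1 teraflops, 62 W, the 100 W goal and the 80-core count); the even split of performance and power across cores is a simplifying assumption, not something Intel reported.

```python
# Figures quoted in the article.
PEAK_FLOPS = 1.0e12    # 1 teraflops
POWER_W = 62.0         # power measured at 1 teraflops
GOAL_POWER_W = 100.0   # design goal: 1 teraflops for under 100 W
CORES = 80             # core count cited in the article


def gflops_per_watt(flops: float, watts: float) -> float:
    """Return energy efficiency in gigaflops per watt."""
    return flops / watts / 1e9


measured = gflops_per_watt(PEAK_FLOPS, POWER_W)    # about 16.1 GFLOPS/W
goal = gflops_per_watt(PEAK_FLOPS, GOAL_POWER_W)   # 10 GFLOPS/W

# Assumption: performance and power are spread evenly over all cores.
per_core_gflops = PEAK_FLOPS / CORES / 1e9         # 12.5 GFLOPS per core
per_core_watts = POWER_W / CORES                   # about 0.78 W per core

print(f"measured: {measured:.1f} GFLOPS/W, goal: {goal:.1f} GFLOPS/W")
print(f"per core: {per_core_gflops:.1f} GFLOPS at {per_core_watts:.2f} W")
```

On these numbers the chip delivered roughly 60 percent more performance per watt than the design goal required.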
According to Intel chief technology officer Justin Rattner, the architectural goal of the Teraflop Research Chip was to learn how to use hardware to manage multicore processors that exceed the management capabilities of software alone. Intel has already achieved some of those goals, discovering that mesh networks make it possible to relax some of the timing constraints, compared with conventional processors.
"We want to understand how to manage cores at these high counts," said Rattner. "And one thing we have already come to understand about high core counts is that timing does not have to be as uniform as we are used to. For instance, cores communicating to each other don't need all their clocks synched with 3-picosecond accuracy across the whole chip, as would be required if it was one big core."
Besides architectural issues, Intel also fabricated the Teraflop Research Chip to learn how to cope with the inevitable nonuniformities that will be inherent in future processors as they scale down to the atomic level. At 22 nm and beyond, "dopants are going to get down in the tens of atoms, where there is no way to get uniformity," said Bautista. "Some units are just going to fail, so we have to plan for it. And with our mesh network, some units can fail and other units can transparently pick up their workload."
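The article does not describe how Intel's mesh actually hands off work, so the following Python sketch is purely illustrative. It models the cores as a grid, marks some as failed and moves each failed core's tasks to the nearest working core by hop count. The 8 x 10 layout, the function names and the assumption that traffic can still pass through a failed core's router are all hypothetical.

```python
from collections import deque

# Assumed layout for 80 cores; the article does not state the grid shape.
ROWS, COLS = 8, 10


def neighbors(core):
    """Yield the mesh neighbors (up, down, left, right) of a core."""
    r, c = core
    for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
        nr, nc = r + dr, c + dc
        if 0 <= nr < ROWS and 0 <= nc < COLS:
            yield (nr, nc)


def nearest_healthy(start, failed):
    """Breadth-first search outward from start for the closest working core.

    Assumption: links through failed cores still forward traffic.
    Returns (core, hops), or None if every core has failed.
    """
    seen = {start}
    queue = deque([(start, 0)])
    while queue:
        core, hops = queue.popleft()
        if core not in failed:
            return core, hops
        for n in neighbors(core):
            if n not in seen:
                seen.add(n)
                queue.append((n, hops + 1))
    return None


def reassign(workload, failed):
    """Return a new core -> tasks mapping with failed cores' tasks moved."""
    result = {core: list(tasks) for core, tasks in workload.items()
              if core not in failed}
    for core in failed:
        found = nearest_healthy(core, failed)
        if found is None:
            continue  # no working core left to take the tasks
        healthy, _hops = found
        result.setdefault(healthy, []).extend(workload.get(core, []))
    return result


failed = {(0, 0), (0, 1)}
workload = {(0, 0): ["a"], (0, 1): ["b"], (1, 0): ["c"]}
new_workload = reassign(workload, failed)
```

Here the tasks of the two failed cores end up on the adjacent working cores, one hop away. A real design would also have to weigh load balance and the state held in each core's local memory, which this sketch ignores.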
For future test chips, Intel will try creating even larger core counts, as well as making application-specific models that aim at particular problems. "There's nothing that says all our cores have to be the same in the future," said Bautista. "There are all sorts of specialized cores we could add for specific applications."
Counting the fabrication team in Ireland, the Teraflop Research Chip team was spread across three continents and consumed more than a year of effort. "We had a team of 30 engineers, half of which were in Bangalore, India [and half in the United States], which made for some late-night work occasionally over our 18-month development effort," said Borkar.
Now Intel plans to enlist the expertise of its software engineers to begin creating specialized program development and management tools that can handle such high-core-count multiprocessors.
"The biggest hurdle for Intel will be software, because there is no operating system today that could take advantage of the power of 80 cores," said In-Stat's McGregor. "Even the smartest programmer on the planet today still could not take advantage of so many cores, except for specialized applications. So Intel will need to create a new generation of software tools and a new generation of software engineers trained to use them."
Intel has already started that effort and pledges to seed as many as 400 universities worldwide with multiprocessor development tools, as well as to assist them in creating curriculums that result in a new generation of multiprocessor software engineers.