SEATTLE—At the SUPERCOMPUTING 2011 show here Tuesday (Nov. 15), Intel Corp. showed off its 22-nm "Knight's Corner" co-processor compute accelerator, boasting over 50 cores and delivering a purported one teraflops of double precision floating point performance, outstripping Nvidia’s Tesla 2090 accelerator.
Showing off the very first silicon available, Intel’s Rajeeb Hazra, general manager of technical computing at the Intel datacenter and connected systems group, said the single 1-TFLOP/s chip was the equivalent of the entire ASCI Red system built back in 1997, consisting of 9,298 Pentium II Xeon processors.
That system filled 72 cabinets.
“That was a very proud day for us, but today we can get a teraflop of sustained double precision performance in one 3-D tri-gate 22-nm chip running Linux,” said Hazra, adding, “It’s not on PowerPoint, it’s a real chip, we have it in our labs and it’s working.”
Intel later showed off a makeshift system running the chip to select press, though very few additional specs were given out.
“We don’t know of any other mainstream architecture chip with this kind of performance,” said Hazra, explaining that the main benefits lay in the extreme programmability of the chip.
“The programming model story is clear to us,” he said, noting that all the software tools being used on Xeons today would be able to scale to Knight’s Corner with minimal effort. That, he said, gives Intel an advantage over rivals like Nvidia, which requires code to be adapted and ported before it can be accelerated on a GPU.
Intel’s MIC architecture also has the advantage of having been specifically designed to process highly parallel workloads, said Hazra.
“It’s a significant day. We are so excited about taking this architecture to market,” he said.
In addition, Hazra spoke briefly about Intel’s exascale efforts, saying the company had set itself a firm goal of reaching exascale performance by 2018, within a 20MW power envelope. Doing so, he said, would require a large amount of investment and partnership.
“It’s not just a question of money, it’s a question of getting the right brains and eyes looking at solving the issues,” he said.
The ratio of brand advocacy to informed commentary seems extraordinarily high in many of the sentiments above.
Knights Corner gets most of those flops from a very wide SIMD micro-architecture. I happen to really LIKE SIMD micro-arches, and have done quite a lot of programming for them, and from what I see of LRBni (that's what the instruction set was called when the device was "Larrabee") it appears to be a very well thought-out SIMD ISA, far better than SSE.
But the claim that ordinary "scalar" procedural programs written in C, Fortran, etc. are automatically going to be accelerated to Tflops ... simply isn't so.
If you can't exploit the SIMD width efficiently ... it's a 2-issue x86 core which isn't all that different from Atom. It's the SIMD extensions that make this design "powerful." Auto-vectorizing compilers haven't lived up to the hype so far (for any microarch ... GPGPUs included).
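To put rough numbers on that point, here is a back-of-the-envelope sketch of how peak throughput scales with usable SIMD width. Every figure in it is an illustrative assumption, not a published spec: Intel said only "over 50 cores," and the clock speed, the 8 double-precision lanes (a 512-bit vector) and the 2 flops per lane per cycle (a fused multiply-add) are stand-ins chosen to land near one teraflop. The function name `peak_gflops` is made up for this example.

```python
def peak_gflops(cores, clock_ghz, simd_lanes, flops_per_lane_per_cycle=2):
    """Theoretical peak in GFLOP/s: cores x clock x lanes x flops per lane per cycle."""
    return cores * clock_ghz * simd_lanes * flops_per_lane_per_cycle


# Hypothetical values for illustration only; none are from Intel's announcement.
CORES = 50          # "over 50 cores" is all that was stated
CLOCK_GHZ = 1.25    # assumed clock
DP_LANES = 8        # assumed: 512-bit vector / 64-bit doubles

# Code that keeps every SIMD lane busy.
vector_peak = peak_gflops(CORES, CLOCK_GHZ, DP_LANES)

# Code that never vectorizes and uses one lane per core.
scalar_peak = peak_gflops(CORES, CLOCK_GHZ, 1)

print(f"all lanes used: {vector_peak:.0f} GFLOP/s")
print(f"one lane used:  {scalar_peak:.0f} GFLOP/s")
print(f"ratio:          {vector_peak / scalar_peak:.0f}x")
```

Under those assumptions, code that fails to vectorize is capped at one eighth of the headline figure, before memory bandwidth or the in-order pipeline is even considered.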
AMD advocacy is misplaced here, because so far as I know, AMD isn't trying to compete in specialized HPC processors and/or adjunct accelerators. The competition is IBM with its spectrum of Cell/Power7/BlueGeneQ processors, and to some extent the nVidia Kepler+ARM initiative.
It's going to be an interesting competition ... I wouldn't make any predictions of success. Folks should remember that Cell and then Power7 each failed to "conquer the HPC world," and for those thinking that Intel has avoided such experiences ... remember Itanium? Or for that matter that Knight's Corner is an updated Larrabee?
My engineering career began in 1970 and I was using the 8086 in 1976. It's always somewhat of an odd feeling to still see the x86 label being referenced. I would never have imagined it back then. Kudos to Intel for sustaining the product line.