SEATTLE—At the Supercomputing 2011 (SC11) show here Tuesday (Nov. 15), Intel Corp. showed off its 22-nm "Knights Corner" compute accelerator, a co-processor with more than 50 cores that Intel says delivers one teraflops of double-precision floating-point performance, outstripping Nvidia's Tesla M2090 accelerator.
Showing off the very first silicon available, Intel's Rajeeb Hazra, general manager of technical computing in Intel's datacenter and connected systems group, said the single one-teraflops chip was the equivalent of the entire ASCI Red system built back in 1997, which consisted of 9,298 Pentium II Xeon processors.
That system made up 72 cabinets of computing power.
“That was a very proud day for us, but today we can get a teraflop of sustained double precision performance in one 3-D tri-gate 22-nm chip running Linux,” said Hazra, adding, “It’s not on PowerPoint, it’s a real chip. We have it in our labs and it’s working.”
Intel later showed off a makeshift system running the chip to select press, though very few additional specs were given out.
“We don’t know of any other mainstream architecture chip with this kind of performance,” said Hazra, explaining that the main benefits lay in the extreme programmability of the chip.
“The programming model story is clear to us,” he said, noting that all the software tools being used on Xeons today would be able to scale to Knights Corner with minimal effort, giving Intel an advantage over rivals like Nvidia, which requires code to be adapted and ported before being accelerated on a GPU.
Intel’s MIC architecture also has the advantage of having been specifically designed to process highly parallel workloads, said Hazra.
“It’s a significant day. We are so excited about taking this architecture to market,” he said.
In addition, Hazra spoke briefly about Intel’s exascale efforts, saying the company had set itself a firm goal of reaching the target by 2018 within a 20-MW power envelope, a target that implies roughly 50 gigaflops per watt. To do so, however, would require a large amount of investment and partnership, he said.
“It’s not just a question of money, it’s a question of getting the right brains and eyes looking at solving the issues,” he said.
So Intel finally made it to 1 Tflops. You've been able to buy this level of performance for 2 or 3 years in a single-slot PCIe card. What is the flops/watt, flops/$ installed (floor space / cooling ...), memory bandwidth, max concurrent threads?
Who cares anymore what instruction set is used in HPC? When programming in C / C++ / CUDA / OpenCL / Java / Perl / and a few hundred other programming languages ... all that is abstracted away. Code will have to be ported in either case (from Intel single-thread to Intel multi-thread, MIC or OpenCL/CUDA). If you are going to go through the effort of porting, ISA is not that important a factor. Install cost, operating cost, tools availability, feature support, perf/$$ are more important. Don't buy the Intel hype. MIC is still just a research project @ Intel. You can actually buy AMD and NVIDIA products with 3rd-party tool support.
For HPC, very little has been "abstracted away" since C was invented. Only the first four languages you list (plus Fortran) are actually used for HPC. Intel is promising the ability to use one (mature!) language/compiler/toolchain for both CPU and GPU/MIC code. This wouldn't be practical if the hardware wasn't x86(ish) ISA.
These are old Pentium-class (P54C-era) in-order cores: no out-of-order execution, etc. The only I/O interface is PCIe. The only advantage this has is the tool chain and that you can prototype your code on normal multicore x86 workstations and move to MIC later. Plus, the cores can work independently. GPU cores can't really work on separate processes. They are too interconnected.
Intel might face some competition, but in my opinion they have pulled too far ahead to catch in terms of technology. Now the question is how they will capture the imagination of consumers to gain control of the tablet/ultrabook segments.