SAN JOSE, Calif. — Nvidia and Micron announced new high-end co-processors at the opening of Supercomputing 2013 today. Meanwhile, an online analysis said the next generation of Intel's Xeon Phi, which competes with them both, will be an integrated device.
Supercomputers have become a key testing ground for massively parallel co-processors because they typically are made of clusters of the most powerful chips available. Nvidia's new Tesla K40 is likely to have the biggest impact in this space. The graphics chip vendor dominates among the Top 500 systems that employ accelerators, with its GPUs used in 38 of the 53 such systems, thanks in part to the maturity of its Cuda programming environment.
Nvidia claims the Tesla K40 GPU provides a 40 percent boost over its existing chip and supports 12 GBytes of GDDR5, twice as much memory as the prior GPU. The K40 packs 2,880 cores to deliver up to 4.29 teraflops single-precision and 1.43 teraflops double-precision peak floating-point performance. The chip also uses PCI Express Gen3, doubling the I/O performance of the PCIe Gen2 links on the previous part.
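The quoted peak figures can be sanity-checked with a little arithmetic. The sketch below is a back-of-envelope calculation, not Nvidia's own method: it assumes each core retires one fused multiply-add (two floating-point operations) per clock cycle, an assumption that does not come from the announcement, and works backward from the core count and the single-precision peak to the clock rate those numbers would imply. It also computes the ratio of the double-precision to single-precision peaks. The function and constant names are illustrative only.

```python
# Back-of-envelope check of the quoted Tesla K40 peak figures.
# ASSUMPTION (not from the article): each core completes one fused
# multiply-add, counted as 2 floating-point operations, per clock cycle.

CORES = 2880                   # from the article
SP_PEAK_TFLOPS = 4.29          # from the article, single precision
DP_PEAK_TFLOPS = 1.43          # from the article, double precision
FLOPS_PER_CORE_PER_CYCLE = 2   # assumption: one FMA per cycle


def implied_clock_mhz(peak_tflops, cores, flops_per_cycle):
    """Clock rate (MHz) that would produce the given peak throughput."""
    flops_per_second = peak_tflops * 1e12
    return flops_per_second / (cores * flops_per_cycle) / 1e6


def peak_tflops(cores, clock_mhz, flops_per_cycle):
    """Peak throughput (teraflops) for a given core count and clock."""
    return cores * flops_per_cycle * clock_mhz * 1e6 / 1e12


clock_mhz = implied_clock_mhz(SP_PEAK_TFLOPS, CORES, FLOPS_PER_CORE_PER_CYCLE)
dp_to_sp_ratio = DP_PEAK_TFLOPS / SP_PEAK_TFLOPS

print(f"Implied clock: {clock_mhz:.0f} MHz")
print(f"Double- to single-precision ratio: {dp_to_sp_ratio:.3f}")
```

Under that assumption the single-precision figure implies a clock of roughly 745 MHz, and the double-precision peak works out to one third of the single-precision peak.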
The K40 is available now and is expected to appear in high-performance servers from Appro, Asustek, Bull, Cray, Dell, Eurotech, Hewlett-Packard, IBM, SGI, Supermicro, and Tyan. Engineers can try out the chip for free on remotely hosted clusters.
Intel's Xeon Phi is quickly gaining ground in high-end systems, finding sockets in 13 Top 500 systems in the latest rankings, including Tianhe-2, the world's top-ranked supercomputer. The next generation of the chip, called Knights Landing, is expected to use a new custom core and to work as a standalone chip rather than a co-processor, according to an analysis published today.
Knights Landing is a 14nm successor to Intel's current Knights Corner version of Xeon Phi and is expected to ship in late 2014. The chip "will [be] a bootable device, in contrast to Knights Corner which must be attached to an x86 server CPU via the PCI-E slot," said David Kanter, principal of Real World Technologies, in his online post.
"The 14nm node should deliver a substantial increase in density and modest gains in power efficiency," said Kanter. "The instruction set is moving closer to the mainstream x86 CPUs," adopting the AVX3 instructions of Intel's next-generation Core architecture (called Skylake) rather than the current 512-bit vector instructions, he added.
Kanter said he expects Intel will use a new custom core inside Knights Landing. Intel has not yet disclosed "the microarchitecture, core count and fabric" of the chip, he said.
Separately, Micron announced a non-von Neumann architecture it calls the Automata Processor. It aims to compete on a wide range of high-performance tasks with co-processors such as GPUs and high-capacity FPGAs.
Automata uses a new approach to parallel programming that Micron claims suits applications such as bioinformatics, video/image analytics, and networking, which handle large amounts of complex, unstructured data. The chip uses "a computing fabric comprised of tens of thousands to millions of processing elements interconnected to create a task-specific processing engine," Micron said in a press release.
Several academics are working with Micron on Automata. The device "offers a refreshingly new way of solving problems that is very different from all other accelerator technologies," said Srinivas Aluru, professor of computational science and engineering at Georgia Institute of Technology, speaking in the release.
Next year, Micron will release graphic design and simulation tools as well as a software development kit for Automata. Many startups, including CogniMem Technologies, have launched novel parallel processing architectures, but many failed to gain market traction because the chips were difficult to program.
— Rick Merritt, Silicon Valley Bureau Chief, EE Times