CUPERTINO, Calif. – ARM has developed vector instructions to propel its 64-bit ARMv8 architecture into high-performance computing. Fujitsu helped develop the extensions for use in a follow-on to its K Computer, a Sparc-based system at Japan’s Riken Institute that hit 8 petaflops in 2011, making it the most powerful system in the world at that time.
The effort catapults the ARM processor core for the first time into the realm of supercomputers, a rarefied territory Intel’s x86 has come to dominate. ARM hopes it can expand its presence there the way Intel did, slowly replacing homegrown processors from the likes of IBM and Cray.
ARM’s strength is in its potential for relative power efficiency compared to the x86. The trait could serve supercomputer designers who can’t practically deliver the massive power to drive the exascale-class systems they want to build.
The Neon SIMD instructions ARM currently supports are limited to 128 bits and focus on imaging and video in client systems. The new Scalable Vector Extension (SVE) supports vector lengths from 128 to 2,048 bits in increments of 128 bits. Users can write vector code once and run it on any size vector design without recompilation, something ARM claims no other architecture can do.
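The write-once idea can be illustrated with a minimal sketch in portable C. It only emulates the concept and uses no real SVE instructions. The names `sve_vl_is_legal` and `vla_add_f32` are hypothetical, and the `vl_bits` parameter is an assumption standing in for a hardware vector length that real SVE code would discover at run time.

```c
#include <assert.h>
#include <stddef.h>

/* Vector lengths as described in the article:
 * 128 to 2,048 bits in 128-bit increments. */
static int sve_vl_is_legal(unsigned vl_bits)
{
    return vl_bits >= 128 && vl_bits <= 2048 && vl_bits % 128 == 0;
}

/* Hypothetical emulation of a vector-length-agnostic add:
 * out[i] = a[i] + b[i]. The loop never hard-codes a lane count;
 * it derives one from vl_bits, so the same code gives the same
 * result for every legal vector length. */
static void vla_add_f32(const float *a, const float *b, float *out,
                        size_t n, unsigned vl_bits)
{
    assert(sve_vl_is_legal(vl_bits));
    size_t lanes = vl_bits / 32;   /* 32-bit floats per vector */

    for (size_t i = 0; i < n; i += lanes) {
        /* One emulated vector step: lanes past the end of the
         * array are simply left inactive. */
        for (size_t lane = 0; lane < lanes && i + lane < n; lane++) {
            out[i + lane] = a[i + lane] + b[i + lane];
        }
    }
}
```

Calling `vla_add_f32` with `vl_bits` set to 128, 512 or 2,048 should produce identical output arrays; only the number of loop iterations changes, which is roughly the property ARM is claiming for SVE binaries.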
SVE is a new set of instructions aimed at scientific workloads, not DSP-based media acceleration. Fujitsu said it hopes to use them to deliver by 2020 a post-K Computer with 50x the capability and 15x the efficiency of the previous system.
SVE is a load/store architecture that uses up to 32 vector registers and 16 predicate registers plus control registers and a first-fault register. Predicates are used to manage a variety of decisions about loop control. ARM left room in its programming space for future extensions to SVE.
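How predicates can steer a loop may be easier to see in a small sketch. The portable C below emulates per-lane predicates with a `bool` array; it is not SVE code, and the names `pred_while_lt`, `clamp_negatives` and `MAX_LANES` are hypothetical. One predicate handles loop control (which lanes are still inside the array), and a second, data-dependent condition is combined with it to decide which lanes are actually written.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_LANES 64   /* assumption: 2,048 bits of 32-bit elements */

/* Build a loop-control predicate: lane k is active while i + k < n.
 * Returns true if any lane is active, i.e. the loop should continue. */
static bool pred_while_lt(bool pred[MAX_LANES], size_t lanes,
                          size_t i, size_t n)
{
    bool any = false;
    for (size_t k = 0; k < lanes; k++) {
        pred[k] = (i + k < n);
        any = any || pred[k];
    }
    return any;
}

/* Clamp negative values to zero, touching only lanes that both
 * predicates allow. Returns the number of elements changed. */
static size_t clamp_negatives(float *x, size_t n, size_t lanes)
{
    assert(lanes > 0 && lanes <= MAX_LANES);
    bool loop_pred[MAX_LANES];
    size_t changed = 0;

    for (size_t i = 0; pred_while_lt(loop_pred, lanes, i, n); i += lanes) {
        for (size_t k = 0; k < lanes; k++) {
            /* Inactive lanes are never read, so the tail of the
             * array needs no separate scalar clean-up loop. */
            bool active = loop_pred[k] && x[i + k] < 0.0f;
            if (active) {
                x[i + k] = 0.0f;
                changed++;
            }
        }
    }
    return changed;
}
```

The point of the design is that the same predicate mechanism covers both the ragged end of a loop and the `if` inside it, so neither case forces a fall back to scalar code.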
The SVE specification, being developed with numerous partners, will be available by early next year. ARM is just starting work on how it will make open source Linux patches for the extensions.
ARM showed significant scaling benefits with different vector lengths on SVE. Results are simulations based on compiled code with varied vector lengths. (Images: ARM)
All of ARM’s 64-bit licensees have access to the SVE technology. ARM had multiple partners involved in developing it but would not release any other names, Nigel Stephens, an ARM fellow and lead architect, said after a talk at the Hot Chips event here.
For Fujitsu the collaboration was an opportunity to forge a partnership at the start of ARM’s push into high-performance systems.
Sparc remains the preferred architecture for the company’s business servers. However, Fujitsu sees an opportunity for a new class of technical and scientific systems based on ARM chips, said Toshio Yoshida, lead architect for the processor in Fujitsu’s post-K computer targeting exaflop performance in 2020.
The Fujitsu system will use a 512-bit SIMD vector unit with a version of its Tofu interconnect and other accelerator cores for I/O, Yoshida said. He would not comment on what “leading edge” process node the chip currently targets.
Fujitsu chose the 512-bit vector length because it adequately doubled the 256-bit SIMD of its prior Sparc-based system. "We want to step slowly into this area," said Yoshida.
SVE fits in a 28-bit encoding region and is only available on 64-bit ARM cores.
— Rick Merritt, Silicon Valley Bureau Chief, EE Times