SAN JOSE, Calif. – A new startup will emerge from stealth mode and into the super-heated market for machine learning next week. Wave Computing will describe a chip that can train neural networks ten times faster than Nvidia’s top GPUs while consuming the same power.
Wave’s chip is one of a handful of announcements at next week’s Linley Processor Conference. Also at the event, Ceva will describe a new DSP core for computer vision, Adesto will detail a NOR chip that can both store data and execute code, and ARM will announce a new member of its family of on-chip interconnects.
The news comes at a time of growing specialization in microprocessor design, according to the event’s host, veteran chip analyst Linley Gwennap.
“In the 1970s and ’80s people created a lot of different architectures, but over the past 20 years work has consolidated more onto ARM and x86,” said Gwennap, principal of the Linley Group. “As Moore’s law slows, Intel can’t keep delivering twice the performance every two years, so people have to innovate at the architectural level again…and we will see more custom accelerators with special instruction sets to get a 10x boost in performance,” he said.
Wave is the poster child for that trend at this year’s event, showing a custom chip for accelerating TensorFlow, the machine learning framework that Google developed and for which it designed its own ASIC.
“We’ve looked at their architecture, and it looks very innovative. Their preliminary performance numbers are 10x better than traditional GPUs in the same power as a high-end Nvidia GPU,” Gwennap said. “It’s a hot space, and they have a similar business model to Nervana [recently acquired by Intel] of selling systems to accelerate machine learning,” he added.
The systems approach makes sense given the difficulty of attracting the $50 million to $100 million needed for a new chip startup and the higher revenue that comes from selling systems, Gwennap noted. Indeed, even Nvidia is trying to sell machine learning systems that use four of its high-end Pascal-class processors.
Among other newsmakers, Adesto will describe a NOR chip with a special interface geared to speed the output of programs executed on it. Typically, flash is too slow to execute programs directly, so engineers add SRAM, which raises costs to levels that are unacceptable in Adesto’s target markets of wearables and the Internet of Things.
For its part, ARM will detail a more feature-rich on-chip interconnect. However, it is not expected to support all the bells and whistles of offerings from companies such as Arteris and Netspeed.
Several presentations will look at server and networking chips that aim to attack the dominance of Intel’s Xeon processors. AMD, which is not presenting at the event, has the best chance of grabbing Xeon market share with its x86-based Zen chips, expected next year.
“Data center guys would love to see someone compete with Intel in a way they don’t have to move all their code,” Gwennap said.
Applied's X-Gene 3 could become the closest ARM server SoC to Intel's Xeon in performance. (Image: Linley Group)
ARM chips also have a shot in 2017, particularly Applied Micro’s X-Gene 3, which ranks highest in performance among the current crop. Cavium’s ThunderX has the most cores, but they are relatively small, low-performance cores that keep its overall performance below that of X-Gene 3.
“ThunderX 2 is significantly better, but it will probably be ThunderX 3 before they can really compete with Intel,” Gwennap said.
— Rick Merritt, Silicon Valley Bureau Chief, EE Times