7 Ideas for AI Silicon from ISSCC
2018-02-14
SAN FRANCISCO — A Google engineering manager called for new AI architectures including a distributed approach to protect data privacy. His talk was followed by more than half a dozen academic papers describing novel approaches to machine learning at the International Solid-State Circuits Conference here.
Several ISSCC papers merged computation into memory, a long-pursued research idea that some believe machine learning could finally bring to broad commercial use. For its part, Google is exploring a hybrid approach in which end users keep their data and send only neural network weights to parameter servers in the cloud for processing.
Ultimately Google and its peers need big leaps in compute power to fulfill the promise of AI in their data centers. Just one iteration of one task in a Google photo search powered by machine learning requires 11 billion operations/second, said Olivier Temam, who manages an unspecified AI program at the search giant.
Temam called for a distributed approach where edge devices and cloud services collaborate to train neural nets. Devices do some training locally using raw data, then send changes or neural network weights, which he called semantic data, to the cloud, where neural models are further trained and refined.
“For very understandable reasons, people or companies don’t want to send their data to the cloud, so we’ve shown it’s possible to create models with federated learning,” Temam said.
One observer noted that such an approach could attract hackers trying to infer raw data from semantic data.
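The split Temam described, local training on raw data followed by cloud-side refinement of the returned weights, can be illustrated with a toy version of federated averaging. The sketch below is not Google's implementation; the one-parameter model, the function names, the learning rate and the sample data are all assumptions chosen to keep the example small.

```python
from typing import List, Tuple

# Hypothetical on-device dataset: (x, y) pairs that never leave the device.
Dataset = List[Tuple[float, float]]


def local_update(weight: float, data: Dataset,
                 lr: float = 0.01, epochs: int = 5) -> float:
    """Train on the device with gradient descent on squared error for y ~ weight * x."""
    for _ in range(epochs):
        grad = sum(2 * (weight * x - y) * x for x, y in data) / len(data)
        weight -= lr * grad
    return weight


def federated_round(global_weight: float, devices: List[Dataset]) -> float:
    """One round: devices train locally, the server averages the weights they return.

    Only the updated weight and a sample count reach the server, not the raw pairs.
    """
    total = sum(len(data) for data in devices)
    updates = [(local_update(global_weight, data), len(data)) for data in devices]
    # Weight each device's result by how many samples it trained on.
    return sum(w * n for w, n in updates) / total


def train(devices: List[Dataset], rounds: int = 50, start: float = 0.0) -> float:
    """Repeat federated rounds, starting from an arbitrary global weight."""
    weight = start
    for _ in range(rounds):
        weight = federated_round(weight, devices)
    return weight


# Illustrative data only: three devices whose samples all follow y = 3x.
DEVICES: List[Dataset] = [
    [(1.0, 3.0), (2.0, 6.0)],
    [(3.0, 9.0)],
    [(0.5, 1.5), (1.5, 4.5), (2.5, 7.5)],
]

if __name__ == "__main__":
    print(train(DEVICES))
```

The observer's concern maps onto the `updates` list: the weights each device returns are a function of its raw data, so anyone who can read them has something to work backward from.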
Google agreed to speak to the audience of several hundred chip designers here in hopes of spawning fresh ideas for more powerful AI accelerators. One challenge in designing such chips is the bottleneck between processors and the large amounts of memory neural nets require.
The search giant needs memory bandwidth in the range of a hundred terabits/s. Today’s high-bandwidth memory stacks are two orders of magnitude too slow, and SRAM is too expensive and power hungry, Temam said.
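As a rough back-of-envelope check, the two figures from the talk imply a per-stack bandwidth on the order of one terabit/s. The short sketch below does only that arithmetic; it reads "two orders of magnitude" as a factor of exactly 100, which is an assumption, and the variable and function names are illustrative.

```python
TERA = 10**12

# Figures quoted in the talk.
REQUIRED_BITS_PER_S = 100 * TERA   # "a hundred terabits/s"
SHORTFALL_FACTOR = 100             # "two orders of magnitude", assumed to mean exactly 100x


def implied_stack_bandwidth(required_bits_per_s: float, shortfall: float) -> float:
    """Bandwidth (bits/s) of a memory stack that falls short by the given factor."""
    return required_bits_per_s / shortfall


def bits_to_gigabytes_per_s(bits_per_s: float) -> float:
    """Convert bits/s to gigabytes/s (8 bits per byte, 1e9 bytes per gigabyte)."""
    return bits_per_s / 8 / 1e9


stack_bits = implied_stack_bandwidth(REQUIRED_BITS_PER_S, SHORTFALL_FACTOR)
stack_gbytes = bits_to_gigabytes_per_s(stack_bits)

if __name__ == "__main__":
    print(f"Implied stack bandwidth: {stack_bits / TERA:.0f} Tb/s "
          f"({stack_gbytes:.0f} GB/s)")
```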
Several academics described approaches that embed computing in memory. The area is particularly hot due to the rise of a number of specialty memories including memristors, ReRAM and others as well as brain-inspired computer designs that sometimes use large memory or analog arrays.
“We found most energy processing for neural networks was in data movement. You have lots of data and weights to manage, so data movement dominates energy consumption more than compute,” said Vivienne Sze, an associate professor at MIT who co-authored a 2016 paper on the Eyeriss architecture to address the issue.
Her group at MIT is now working on flexible architectures that can run a growing variety of neural nets, including the many simplified variants that are starting to emerge. They are also exploring how much neural net acceleration can be done on a single watt of power for applications like robots and drone cameras.
Google’s Temam said the company is open to all new ideas as long as they are practical and low cost. “We want to keep bringing the cost down so we can deploy more massively and eventually at the edge,” said Temam.
The following pages offer a glance at six more ISSCC papers on AI accelerators. Most aimed to push energy consumption to new lows for inference jobs, with several also supporting some training work.
