Dongjoo Shin, a PhD student at KAIST, said that the deep-learning SoCs he calls DPUs will be the next big thing in microprocessors. He described a device that essentially combines two processors in one, hitting 8.1 TOPS/W on 4-bit operations.
An impressive video demo showed the chip powering a robot that could recognize the image of a man, his gestures, and objects such as a drill. Shin said that he is already working on two more SoCs: a general-purpose device and one dedicated to very large images.
A Harvard researcher described a 28-nm research chip, funded by ARM and DARPA, that consumed 568 nJ/prediction at 1.2 GHz. It processes groups of neurons at a time, exploiting data-level parallelism on highly sparse data while remaining resilient to errors.
“Most of the execution time is in MAC operations,” said Paul Whatmough, who presented the paper.
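As a rough illustration of both points, the sketch below (a generic example, not the Harvard chip's actual microarchitecture) shows why a neural-network layer reduces to multiply-accumulate (MAC) operations and how skipping zero activations exploits sparsity to cut the work performed:

```python
def dense_layer_macs(weights, activations):
    """Compute one layer's outputs and count the MACs actually performed.

    weights: list of rows, one row of weights per output neuron.
    activations: the layer's input vector.
    """
    macs = 0
    outputs = []
    for row in weights:                  # one output neuron per weight row
        acc = 0
        for w, a in zip(row, activations):
            if a != 0:                   # sparsity: skip zero activations
                acc += w * a             # the MAC operation itself
                macs += 1
        outputs.append(acc)
    return outputs, macs

# Highly sparse input: only one of four activations is nonzero,
# so only 2 of the 8 potential MACs are executed.
w = [[1, 2, 3, 4], [5, 6, 7, 8]]
x = [0, 3, 0, 0]
out, macs = dense_layer_macs(w, x)
```

Hardware accelerators apply the same idea in parallel across many MAC units rather than in a Python loop, but the proportion of work saved by sparsity is the same.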
ST's convolution accelerator was packed with MACs. (Image: ISSCC)
“IoT is a huge driver, devices are becoming sensor-rich, and apps need to turn data into information,” he said. Deep learning is increasingly seen as “a universal classifier to interpret noisy data, but it requires lots of memory and compute.”
Microprocessor veteran Marc Tremblay was among several industry luminaries at the session. He agreed with others who noted that many efforts to define embedded deep-learning processors are under way today, both at high-profile startups and at established companies, including his employer, Microsoft.