This is a long, long way from what Qualcomm claims it can fit in a handset, where 3-4W is the total budget for the complete smartphone: PA, baseband, WiFi/Bluetooth/GPS, display, the Android OS and, last but not least, an application budget of 600-700mW.
Very remarkable results from the Purdue University professor; it seems to be genuinely working technology. The article doesn't explain much about how the individual NPUs work as neurons, but this technique will have many roles to play beyond those described in the article and by Qualcomm.
"The brain is also very power efficient, he explained, consuming only about 20 watts at a cost of under a quarter of a cent per hour, whereas simulating the brain on a conventional von Neumann computer would take up to 50 times more power"
This statement is badly wrong, off by about 5-6 orders of magnitude in both FLOPS and watts!
This article reports an 83,000-processor supercomputer delivering only about 1% of the calculations performed by the brain, and such supercomputers typically dissipate megawatts!
According to the German supercomputer centre in Juelich, simulating the entire brain will take an exaFLOP machine, expected around 2020, with a power budget on the order of 20MW.
Given current supercomputers only manage about 10 GFLOPS/W, even this figure is in considerable doubt: an exaFLOP within 20MW means 50 GFLOPS/W, roughly a 5x improvement in FLOPS/W over the next 10 years, with Moore's law and supply voltages plateauing.
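Running the numbers (using only the figures quoted in this comment, which are themselves rough estimates):

```python
# Back-of-envelope check of the exascale brain-simulation figures above.
# All inputs are the comment's own assumed numbers, not measurements.
import math

target_flops = 1e18      # exaFLOP machine projected for ~2020
power_budget_w = 20e6    # ~20 MW power budget
current_eff = 10e9       # ~10 GFLOPS/W for today's best supercomputers

required_eff = target_flops / power_budget_w   # FLOPS/W needed
improvement = required_eff / current_eff       # factor over today

print(f"Required efficiency: {required_eff / 1e9:.0f} GFLOPS/W")
print(f"Improvement needed: {improvement:.0f}x "
      f"(~{math.log10(improvement):.1f} orders of magnitude)")
```

With these inputs it works out to 50 GFLOPS/W, a 5x gain; still a tall order with supply voltages no longer scaling, but well short of two orders of magnitude.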
More hype than substance I'd say at least when it comes to the claims about a neural network processing unit (NPU) in hardware.
Looking at the braincorp jobs page http://braincorporation.com/index.php/category/opportunities/ these guys are hosted inside Qualcomm and are focused on building robots, robotics algorithms and tools. They appear to be leveraging Qualcomm's GPUs using OpenGL and OpenCL, along with C/C++, Python and Matlab, to do the NN processing, rather than some kind of specialised NN processor.
I'd say most of the NN work is being done in Matlab, with some kind of back-end generation of C/C++ or OpenGL/OpenCL ES code. The smart approach would be to build low-level optimised libraries for the GPU and expose identical libraries in Matlab or Python for rapid development, rather than designing an esoteric compiler.
I'd guess this may lead to adding NPU support to the ISA of future GPUs at some point, if they really need it.
When I first saw this I thought, "Rats! Why can't all of the fantastic and very similar work that Eugenio Culurciello has been doing for years under Office of Naval Research (ONR) funding get this kind of press...?" This also shows why one should read figure captions. From said caption on the familiar-looking first figure I finally realized that this _is_ the next phase for Dr Culurciello's amazing chips! Since shutting down the government seems to be the theme-du-jour, it's worth pointing out just how huge a role federal funding plays in giving folks like Dr Culurciello a chance to move the ball on some wild new idea to the point where a large company like Qualcomm can see the potential and catch the pass. That's the kind of teaming where everyone benefits.
Next year Qualcomm will release its suite of software tools that work with an FPGA emulator for developers to use when creating applications for its NPU. Regarding automotive, Purdue University professor Eugenio Culurciello has already shown that recognition of roadside scenes can yield real-time classification into pedestrians, vehicles, buildings, etc., but I suspect it will take a year or two for developers to start making good use of this type of information in collision avoidance and similar automotive applications.