SAN JOSE, Calif. — A group of mainly chip vendors released a draft standard that aims to act as an interface between software frameworks for creating neural network models and the hardware accelerators that run them. It shares goals with a separate effort started as an open-source project earlier this year by Facebook and Microsoft.
The Khronos Group is seeking industry feedback on a preliminary version of its Neural Network Exchange Format (NNEF). NNEF initially aims to be a single file format that can describe any trained neural network model to any chip that runs inference tasks with it.
“We have dozens of training frameworks and potentially hundreds of inference engines on the way,” said Neil Trevett, president of Khronos. “That’s a horrible fragmentation.”
The working group that created the draft consists of more than 30 companies, mainly semiconductor vendors, including AMD, ARM, Intel, Imagination, Qualcomm, and Samsung. The chip vendors see NNEF as a way to share the effort of creating a single software target for their chips, something many are already doing internally.
Web giants such as Amazon, Google, and others each develop their own software frameworks for creating neural net models. They see them as strategic tools to get an edge in efficiency and attract developers.
NNEF aims to be a universal file format used by neural net frameworks and inference chips. (Images: Khronos Group)
To jump-start support for the format, Khronos created open-source versions of programs that can export NNEF files from Caffe and Google’s TensorFlow, two popular frameworks.
“We will need a bunch more exporters, but we have those two available now … we will do some paid RFQ-based projects with partners to develop more exporters,” said Trevett.
So far, the web giants seem to be coalescing around their own effort called the Open Neural Network Exchange format (ONNX). The open-source project had a version 1.0 release earlier this month and now has support from Amazon as well as a handful of hardware companies such as AMD, ARM, Huawei, IBM, Intel, and Qualcomm.
ONNX aims to translate models created with any of a dozen competing software frameworks into a graph representation. Trevett said that Khronos is open to collaborating with the effort but pointed out that NNEF is different in two key ways that are important to chip vendors.
Technically, ONNX is a flat representation of operations as a graph. NNEF can do that too, but the Khronos approach also supports compound operations that fuse nodes in a graph. Packing and unpacking operations in this way is one approach that chip vendors plan to use to execute operations efficiently, he said.
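The difference between a flat graph and one with compound operations can be illustrated with a small sketch. The Python below is not NNEF or ONNX syntax and uses neither project's tools; the `Node` class, the `linear_relu` compound, and the `flatten` and `lower` helpers are hypothetical names invented for illustration. It shows one compound node being unpacked into three primitive nodes, or kept whole when a chip is assumed to run it natively.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Node:
    op: str        # operation name
    inputs: tuple  # names of input tensors
    output: str    # name of the output tensor


# Hypothetical compound operation. Each step is (primitive op, indices into
# the compound node's inputs that the step needs besides the running value).
COMPOUNDS = {
    "linear_relu": [("matmul", (1,)), ("add", (2,)), ("relu", ())],
}


def flatten(graph):
    """Expand every compound node into a chain of primitive nodes."""
    flat = []
    for node in graph:
        steps = COMPOUNDS.get(node.op)
        if steps is None:
            flat.append(node)  # already a primitive
            continue
        current = node.inputs[0]
        for i, (op, extra) in enumerate(steps):
            is_last = i == len(steps) - 1
            out = node.output if is_last else f"{node.output}_tmp{i}"
            extra_names = tuple(node.inputs[j] for j in extra)
            flat.append(Node(op, (current, *extra_names), out))
            current = out
    return flat


def lower(graph, native_ops):
    """Keep compound nodes the target runs natively; unpack the rest."""
    lowered = []
    for node in graph:
        if node.op in COMPOUNDS and node.op not in native_ops:
            lowered.extend(flatten([node]))
        else:
            lowered.append(node)
    return lowered


compound_graph = [Node("linear_relu", ("x", "w", "b"), "y")]
flat_graph = flatten(compound_graph)

for node in flat_graph:
    print(node.op, node.inputs, "->", node.output)
```

In this sketch, a target that lists `linear_relu` among its native operations receives the single fused node, while any other target receives the flat matmul, add, relu chain. That is the packing and unpacking choice described above, reduced to a toy case.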