SAN JOSE, Calif. -- CogniMem Technologies Inc. aims to demonstrate at Supercomputing 2012 the scalability of its non-von Neumann approach to computing based on pattern recognition. The startup will demo a system using more than 40,000 of its silicon processing elements.
The CogniBlox system will employ 10 boards, each carrying a Lattice FPGA, four of the startup's CM1K parallel processors and 4 Mbytes of magneto-resistive memory (MRAM). Each CM1K chip includes 1,024 processing/memory elements.
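The element count implied by those figures can be checked with a quick back-of-the-envelope sketch (all numbers from the article; the variable names are illustrative only):

```python
# Element count implied by the article's figures for the CogniBlox demo.
boards = 10               # boards in the demo system
chips_per_board = 4       # CM1K parallel processors per board
elements_per_chip = 1024  # processing/memory elements per CM1K

total_elements = boards * chips_per_board * elements_per_chip
print(total_elements)  # 40960 -- the "40,000 elements" in round numbers
```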
The chips use techniques born in the era of neural networks and fuzzy logic to learn and recognize patterns in data. Thus the system and its processors do not function like today's load-store machines and are not programmed using current techniques. However, they can be used for a wide range of applications, including stock and weather forecasting, data mining and analytics.
“I firmly believe this is the way things need to evolve--this is the real parallel computing model,” said Bruce McCormick, a 36-year Intel veteran who helped found the startup in April 2011.
The new approach solves the problem of scaling, the startup claims. Today's multicore CPUs and GPUs can scale, but not linearly--something McCormick said is possible and aims to demonstrate with CogniBlox.
“To our knowledge no one has put together a system where performance does not change as you add more hardware, with this many elements together in one system,” he said.
Earlier this year, the company tried but failed to get a more ambitious million-node demo system working, suggesting it hit issues achieving low latency over its FPGA links. It is still pursuing multiple paths toward that goal.
One downside to the CogniMem approach is cost. Each board costs about $3,000 in single-unit quantities, so the 10-board demo system with more than 40,000 elements could cost roughly $30,000. Shifting from expensive MRAM to cheaper NOR flash memory could slash costs, but NOR lacks the responsiveness of MRAM.
The demo system aims to execute 5.2 tera-operations per second. The architecture could be a contender for an exascale-class supercomputer, McCormick claimed.
CogniMem is privately funded. McCormick would not say who the startup’s investors are or how much they have put into the company.
One of CogniMem’s founders was a lead designer on the zero instruction set computer (ZISC) developed with IBM. The ZISC project is no longer active at IBM, but Big Blue did announce work last summer for the Defense Advanced Research Projects Agency (DARPA) on chips based on similar concepts.
Separately, Frost & Sullivan gave CogniMem its 2012 New Product Innovation award. The market watcher cited the system's novel approach to resolving the system-memory bottleneck to enable high levels of parallelism in high-performance computing.
Actually, the innovation is that the N CM1K chips residing on a stack (4 per stackable board) make a bank of N*4096 cognitive memories responding in parallel to an input pattern (like a massive lookup with parallel access). The RBF and KNN classifiers are inherent to the bank of CM1Ks. The FPGA is used to configure the CM1Ks into a single bank or multiple banks, handle comm with hosts, etc. With regards to your question, the CogniMem web site has an RBF tutorial for download as well as a reference guide.
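A rough software analogy of that parallel lookup, for anyone wanting the intuition behind the RBF and KNN modes. This is a hypothetical sketch, not CogniMem's implementation: the chip evaluates all prototype distances simultaneously in hardware, and the L1 (Manhattan) norm used below is an assumption for illustration.

```python
# Sketch of a CM1K-style bank: each "neuron" stores a prototype pattern
# plus a category, and the input is matched against every prototype
# (serially here; in the chip all comparisons happen in parallel).

def knn_classify(bank, pattern, k=3):
    """KNN mode: majority vote among the k closest prototypes.
    bank is a list of (prototype, category) pairs."""
    # L1 (Manhattan) distance between byte vectors -- an assumed metric
    dists = sorted(
        (sum(abs(a - b) for a, b in zip(proto, pattern)), cat)
        for proto, cat in bank
    )
    votes = {}
    for _, cat in dists[:k]:
        votes[cat] = votes.get(cat, 0) + 1
    return max(votes, key=votes.get)

def rbf_classify(bank_with_radii, pattern):
    """RBF mode: a neuron fires only if the input falls inside its
    influence field (distance < radius); returns all firing categories."""
    return [
        cat for proto, radius, cat in bank_with_radii
        if sum(abs(a - b) for a, b in zip(proto, pattern)) < radius
    ]

# Tiny made-up example: two categories, three stored prototypes.
bank = [([0, 0], "A"), ([10, 10], "B"), ([1, 1], "A")]
print(knn_classify(bank, [0, 1], k=3))          # "A"
rbf_bank = [([0, 0], 5, "A"), ([10, 10], 5, "B")]
print(rbf_classify(rbf_bank, [1, 1]))           # ["A"]
```

The point of the hardware is that the distance computations in both loops are evaluated concurrently across the whole bank, so recognition time does not grow as more prototypes (neurons) are added.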
It does indeed. Transputers had limitations, but nothing like those of the old-school machines. Like a lot of things, they did incredible stuff with the right people driving them, but were anathema to most people. It was sad to watch them get kicked to the side by the mainstreamers.
It seems that the innovation would be in the algorithm. The hardware looks very similar to other FPGA clusters, like SciEngines, Dini Group or COMBLOCK modules. Can anyone point to a good tutorial paper on non-linear classifiers (Radial Basis Functions and K-Nearest Neighbor), or maybe a patent?
Sounds like they need non-volatile SRAM, which MRAM (and FRAM) emulates nicely. Per Everspin's website: "Parallel MRAMs (8-bit and 16-bit) have SRAM read and write cycle times and asynchronous timing interfaces that use standard SRAM access timing."