Recent announcements from QuickSilver have shed new light on that company's adaptive computing machine (ACM) technology. At long last they've ripped the veils asunder and really given us a good look at what can only be described as a truly innovative, nay, revolutionary new digital IC architecture (and I don't use these words lightly).
But there's more to the QuickSilver announcements than this, because in addition to revealing the innermost machinations of the ACM architecture (whose current incarnation is known as the Adapt2000 ACM System Platform), they've also presented us with an associated all-singing-all-dancing tool suite, along with what can only be described as a "blow-your-socks-off" licensing program.
Truth to tell, I'm like a kid in a candy store, because I don't know what tasty topic treat to sink my teeth into first. But of course I'm a hardware design engineer by trade, and I can hear the plaintive, lonesome call of the wild silicon stirring the blood in my veins, so that's where we'll begin.
Who "nodes" where this will lead us?
Now this is a tad tricky, because it's one of those circular topics that really requires us to have some background information in order to place things in context. But we have to begin somewhere, so we'll start by stating that an ACM consists of a hierarchically connected matrix of algorithmic processing elements called "nodes," then we'll plunge right to the heart of the matter by mulling over a raw node.
The core of a node comprises two main elements: an algorithmic processing engine and some amount of associated memory (Figure 1). The algorithmic processing engine can be reconfigured (adapted) on a clock-by-clock basis to perform completely different tasks. Meanwhile, the local memory portion of a node core will typically account for 75% of that node. This opens the way for a new form of data processing, because a block of data can remain resident in the node while the processing function being performed by the node is reconfigured around that data.

Figure 1. The core of a generic node
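To make the idea of reconfiguring a function around resident data a little more concrete, here's a minimal Python sketch. The Node class, its adapt and clock methods, and the two toy functions are purely illustrative assumptions; none of them comes from QuickSilver's hardware or tools, and a real node would of course do this in silicon rather than in software.

```python
# Toy model of a node core: resident data plus a swappable processing function.
# All names here are hypothetical; this is not QuickSilver's API.

class Node:
    def __init__(self, data):
        self.memory = list(data)   # block of data that stays resident in the node
        self.function = None       # currently configured processing function

    def adapt(self, function):
        """Reconfigure the processing engine; the memory is left untouched."""
        self.function = function

    def clock(self):
        """Apply the current function to the data held in the node's memory."""
        if self.function is None:
            raise RuntimeError("node has not been configured")
        self.memory = [self.function(x) for x in self.memory]


node = Node([1, 2, 3])
node.adapt(lambda x: x * 2)   # first task: scale the data
node.clock()
node.adapt(lambda x: x + 1)   # second task: offset the same resident data
node.clock()
print(node.memory)            # [3, 5, 7]
```

The point of the sketch is simply that the data never moves; only the function wrapped around it changes from one step to the next.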
At the time of this writing, QuickSilver has designed and implemented a number of nodes, including an arithmetic node, a bit manipulation node, a programmable scalar node (that acts like a RISC processing engine to handle legacy code), and so on. Once again, each of these nodes can be reconfigured (adapted) on a clock-cycle-by-clock-cycle basis.
For example, an arithmetic node can be used to implement different (variable-width) linear arithmetic functions such as a FIR filter, a Discrete Cosine Transform (DCT), or a Fast Fourier Transform (FFT). Such a node can also be used to implement (variable-width) non-linear arithmetic functions such as ((1/sine A) x (1/x)) to the 13th power. Similarly, a bit manipulation node can be used to implement different (variable-width) bit-manipulation functions, such as a Linear Feedback Shift Register (LFSR), Walsh code generator, Gold code generator, or TCP/IP packet discriminator.
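To give a flavor of what one of these bit-manipulation functions looks like, the sketch below models a variable-width Fibonacci LFSR in Python. The function names, the 16-bit width, the tap positions (a commonly cited maximal-length choice), and the seed are all illustrative assumptions rather than anything specified by QuickSilver.

```python
# Software model of a Fibonacci LFSR (right-shifting form).
# Names, width, taps, and seed are hypothetical choices for illustration.

def lfsr_step(state, taps, width):
    """Advance the register one step; taps are 1-based polynomial exponents."""
    bit = 0
    for t in taps:
        bit ^= (state >> (width - t)) & 1      # XOR together the tapped bits
    return (state >> 1) | (bit << (width - 1))  # shift right, feed back at the top


def lfsr_period(seed, taps, width):
    """Count steps until the register returns to its seed.

    Assumes taps include `width`, so every state lies on a cycle.
    """
    state = lfsr_step(seed, taps, width)
    steps = 1
    while state != seed:
        state = lfsr_step(state, taps, width)
        steps += 1
    return steps


TAPS = (16, 14, 13, 11)   # x^16 + x^14 + x^13 + x^11 + 1
SEED = 0xACE1

print(hex(lfsr_step(SEED, TAPS, 16)))   # 0x5670
print(lfsr_period(SEED, TAPS, 16))      # 65535
```

Because the width and taps are ordinary parameters, the same routine covers registers of other sizes, which mirrors the "variable-width" flavor of the functions described above.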
A key point is that, although these nodes are widely different, each forms a "Turing complete" machine, which means that any node can perform any task. It's just that different nodes will be more efficient for specific tasks.
The really cunning stuff starts with the fact that each node is surrounded by a "wrapper" (Figure 2). The node wrapper accepts 32-bit words containing data, instructions, and control information and decides what to do with them. At the front end the wrapper comprises a pipeline input connection from the outside world (in the form of the rest of the ACM) and a data distribution unit (used to pass data around the node).
At the back end the wrapper features a data aggregator (used to gather and package data) and a pipeline output connection back to the outside world. The wrapper also contains functions such as a hardware task manager and a DMA engine.

Figure 2. Each node is surrounded by a wrapper
The clever thing is that, from the outside world (the rest of the ACM), the wrappers make each node appear to be identical. This means that when you are creating an application to run on an ACM (as discussed later in this column), you don't have to do anything special to handle different node types; you just indicate which type of node you think is best suited to each task and leave it up to the ACM to deal with any of the messy stuff.
Speaking of which, each ACM includes an on-chip operating system. We may envisage this as residing in a special system controller node and also as being distributed across all of the node wrappers. The system controller node is like a high-level traffic cop in that it is responsible for managing the data moving on, off, and through the chip. (This includes security management that validates inbound data using a public key-based digital signature system.)
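The "every node looks the same from outside" idea can be sketched in a few lines of Python. Everything here is a hypothetical stand-in: the NodeWrapper class, its send method, and the two toy cores are assumptions made for illustration and say nothing about how QuickSilver's wrapper RTL is actually organized.

```python
# Hypothetical sketch: two different node cores behind one wrapper interface.

WORD_MASK = 0xFFFFFFFF  # the wrapper deals in 32-bit words


class ArithmeticCore:
    def process(self, words):
        return [(w + 1) & WORD_MASK for w in words]   # toy arithmetic task


class BitManipulationCore:
    def process(self, words):
        return [w ^ WORD_MASK for w in words]         # toy bit task: invert every bit


class NodeWrapper:
    """Presents the same interface no matter which core sits inside."""

    def __init__(self, core):
        self.core = core

    def send(self, words):
        words = list(words)
        for w in words:
            if not 0 <= w <= WORD_MASK:
                raise ValueError("wrapper only accepts 32-bit words")
        return self.core.process(words)


# The caller drives both nodes in exactly the same way.
nodes = [NodeWrapper(ArithmeticCore()), NodeWrapper(BitManipulationCore())]
results = [n.send([0, 1]) for n in nodes]
print(results)   # [[1, 2], [4294967295, 4294967294]]
```

The calling code never asks what kind of core it is talking to, which is the property the wrapper is described as providing.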
Meanwhile, the node wrappers act like local traffic cops managing the tasks within their respective nodes. The task managers in the node wrappers provide the capability for each node to be working on up to 32 different tasks simultaneously.
The really cool thing is that application developers don't have to be multi-tasking or multi-threading experts or deal with individual nodes (unless they really want to), because everything is dealt with by the ACM itself. Furthermore, the fact that the vast majority of the application and task management system is performed using hardware task managers means that these functions are executed in a couple of clock cycles (as compared to hundreds of clock cycles if one were using a software task dispatcher).
And life continues to get better and better, because if you have multiple ACMs on a board, then their on-chip operating systems link up. From the viewpoint of the rest of the system, it appears as though you are dealing with a single ACM device.
There's one more point we need to consider before moving on to the really "juicy" stuff, which is how the various nodes communicate with each other. Each group of four nodes (called a "quad") has an associated matrix interconnect network (MIN), which is essentially a 32-bit, single word, packet switched, full duplex network (Figure 3).

Figure 3. Each "quad" of nodes is connected via a MIN
Via its associated MIN, any node in a quad can talk to any other node in that quad in a single clock cycle. The word-width of each connection is 32 bits (4 bytes), so when running at 300 MHz, each connection can handle 1.2 gigabytes-per-second. Furthermore, the MIN can handle five such connections simultaneously in each clock cycle, giving a total MIN bandwidth of 6 gigabytes-per-second, which is certainly more than sufficient for any of my data processing needs!
In reality, this is total overkill today, because the way in which the nodes tend to work (keeping large amounts of data in their node memory and processing it locally) means that you have a "fire hose" of data coming in and going out of the ACM, while the internal data communications drop off rapidly inside the chip. Having said this, it's good to know that this humongous bandwidth is available for the future, because if history has shown us anything, it's that we will want to use as much bandwidth as we can lay our hands on before long.
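For anyone who wants to check the MIN bandwidth figures quoted above, the arithmetic is easy to reproduce. The snippet below uses only the numbers given in the text (32-bit words, a 300 MHz clock, five connections per cycle); the variable names are just illustrative.

```python
# Reproduce the MIN bandwidth arithmetic from the figures in the text.

WORD_BITS = 32                 # width of each connection
CLOCK_HZ = 300_000_000         # 300 MHz
CONNECTIONS_PER_CYCLE = 5      # simultaneous connections the MIN can handle

bytes_per_word = WORD_BITS // 8                         # 4 bytes
per_connection = bytes_per_word * CLOCK_HZ              # bytes per second
total = per_connection * CONNECTIONS_PER_CYCLE          # bytes per second

print(per_connection / 1e9)    # 1.2 (gigabytes per second per connection)
print(total / 1e9)             # 6.0 (gigabytes per second for the whole MIN)
```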

Figure 4. Four "quads" are connected by a higher-level MIN
My node is better than your node!
One very important point that's worth noting is that QuickSilver doesn't actually produce ACMs themselves, with the exception of "proof-of-concept" chips, evaluation devices, and also the base chips used in the InSpire development board (as discussed later in this column). Instead, they give you the RTL source code for their nodes, the node wrapper, the MIN, the system controller, and so forth.
This has several implications. First of all, you get to decide the optimal mix of node types for your particular application area (maybe you need two arithmetic nodes for every bit manipulation node and so forth). You also get to decide how many nodes to implement on a chip: will four nodes serve your purposes, or would 8, 12, 16, 32, or 64 better meet your requirements?
In fact you can modify QuickSilver's nodes or design your own. Your internally-developed nodes might take the form of any legacy DSP or microprocessor cores you feel you just "must have," or you could decide to implement a completely new node from the ground up. The beauty of QuickSilver's scheme is that once you've embedded your node in their node wrapper, it looks just like any other node to the rest of the ACM.
Now this is the point where my mind starts to spin into "totally overloaded and boggled" mode, because you could include a complete FPGA core as a node, or maybe a structured ASIC, or just let your imagination run wild and free.
Once again, this changes the entire business model, because once you've created your first device (which takes a fraction of the time of a traditional ASIC if you use only the QuickSilver supplied nodes), this becomes your unique system platform. Subsequent products simply involve the creation and verification of new software algorithms to be executed by this platform.
The InSpire SDK tool set
There have been several occasions where a company has regaled us with a cunning new hardware concept, but left us flapping in the wind when it comes to the tools required to do anything useful with it. There's little point in being presented with a wonderful new technology like an ACM, and then being handed an abacus for use in the design process.
Fear not, because the guys and gals at QuickSilver are engineers from the old school, and they have all been burnt this way themselves in the past. For this reason they have leapt out of the starting gate with an all-singing-all-dancing development environment called InSpire, which provides a unified programming model for design definition and capture, simulation and emulation, and applications development.
Before we introduce InSpire in a little more detail, it's worth noting that ACM applications are created using a special form of C called SilverC. This is very similar to pure C/C++ with the addition of some special language statements that allow you to state which tasks should be performed in parallel, the preferred node type to be associated with each task, and so forth. The resulting applications can be compiled and simulated at speeds approaching pure C/C++. Ultimately, the end product is a Silverware module, which is a linkable/executable binary object.
In addition to simulation, the InSpire suite includes a hardware emulator and a development board featuring a QuickSilver-supplied base ACM. InSpire also includes a performance analyzer, a debugger, and something called the InSpire Switchboard. In an uncharacteristic display of modesty, the folks at QuickSilver describe this switchboard as an "unprecedented piece of technology."
In reality, the switchboard really is quite clever, because you can use it to assign any node or group of nodes to the simulator, another group to the emulator, and another group to the hardware development board. Irrespective of which tasks are running in which environment, they all appear to be a single chip to the other tools like the performance analyzer and debugger.
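As a rough mental model of the switchboard idea, consider the Python sketch below. It is emphatically not the InSpire Switchboard's interface: the Backend and Switchboard names, the methods, and the eight-node chip are all assumptions invented for this illustration.

```python
# Hypothetical model: nodes are assigned to different execution environments,
# yet the other tools still see one chip's worth of nodes.

from enum import Enum


class Backend(Enum):
    SIMULATOR = "simulator"
    EMULATOR = "emulator"
    BOARD = "development board"


class Switchboard:
    def __init__(self, node_ids):
        # Every node starts out in the simulator.
        self.assignment = {n: Backend.SIMULATOR for n in node_ids}

    def assign(self, node_ids, backend):
        """Move a node or group of nodes to the chosen environment."""
        for n in node_ids:
            if n not in self.assignment:
                raise KeyError(f"unknown node {n}")
            self.assignment[n] = backend

    def chip_view(self):
        """What a debugger or analyzer sees: one flat list of nodes."""
        return sorted(self.assignment)


sb = Switchboard(range(8))
sb.assign([4, 5], Backend.EMULATOR)
sb.assign([6, 7], Backend.BOARD)

print(sb.chip_view())              # [0, 1, 2, 3, 4, 5, 6, 7]
print(sb.assignment[6].value)      # development board
```

However the nodes are divided up, chip_view returns the same list, which is the "single chip" illusion described above.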
And just in case you were wondering, in those cases where an ACM is being used as an element in a larger system design, InSpire also features a SystemC interface. This allows you to model the rest of the system at any level of abstraction in SystemC and then to create function calls from that system to the ACM being modeled and simulated in InSpire.
Once again, you don't need to be a processor expert to design multi-tasking, multi-threading ICs. The InSpire node control kernel and the ACM on-chip controller (including the task managers in the node wrappers) take care of everything for you, so implementing these features in the most complex applications becomes a no-brainer.
However, this doesn't mean that you can't exercise control if you wish. Using the performance analyzer you can visually see which nodes (and node types) are available and how the various nodes are being loaded when multiple applications are being run simultaneously (as each new Silverware application is handed to the ACM, its on-chip OS reallocates nodes and resources so as to keep all of the applications running at peak efficiency). You can also click on a node in the interface to see which tasks are being executed on that node in real time, and you can select one or more of those tasks and move them over to another node if you wish (again, this can be performed in real time while the applications are running). All I can say is, "WOW!"
License program
In many respects, QuickSilver's licensing program is as interesting as the ACM architecture itself. Called the Adapt2000 ACM Cooperative License, this is like a cool version of an open source program. When you buy into this program, you also buy in to a rich technology pool.
In the case of the ACM hardware, the technology pool includes the RTL (soft cores) for all of the QuickSilver-developed nodes, the node wrapper, the matrix interconnect network (MIN), the system controller, sample library synthesis scripts and layout guidelines, and of course documentation.
Similarly, in the case of software, the technology pool includes full access to the source code for the SilverC compiler, node assemblers, the InSpire debugger, the InSpire simulation platform, the linker, the ACM access API, a variety of standard algorithmic modules, and documentation.
The modified open source license gives you the rights to use and extend ACM technology and IP. You can use existing QuickSilver nodes as-is, modify them, or create new nodes. Similarly, you can use QuickSilver development tools as-is, modify them, or create completely new ones. The only restriction is that, in the case of enhancements and improvements to any of QuickSilver's hardware and software IP, you have an obligation to return such enhancements or improvements to the technology pool at the time of your commercial release.
By comparison, you retain ownership of any uniquely new node designs or tools. In these cases, you can keep this IP for internal use, make it available to the technology pool as a whole, or resell it to (or trade it with) other participants in the pool. It's also important to note that you have complete control over your designs; for example, there are no restrictions with regard to foundries, so you can use whomever you want to build your ACM-based masterpieces.
The Adapt2000 ACM Cooperative License will cost you $300K to get into the pool, which is less than a tool suite for a typical standard cell-based ASIC design flow. There are also fees and royalties associated with commercial production of devices, but these are in line with off-the-shelf IP. Overall, this cooperative license model is unique for a new technology (truth to tell, I don't think anything like this has happened in this arena before).
In summary
There are a number of companies working in the field programmable node array (FPNA) space (note that FPNA is a term I made up myself, but please feel free to use it). Some of the more interesting contenders are:
Some of these devices are aimed at completely different markets. For example, picoChip components, which are intended to be reconfigured "every now and again," are predominantly targeted at wireless base stations. But in the case of low-power devices that can be reconfigured hundreds of thousands of times a second, QuickSilver appears to have a unique offering.
The bottom line is that the ACM architecture works, the InSpire design tools work, the license program is amazing, and the first customer chip is back and in testing. So all-in-all I would say that this deserves an official "Mega-Cool Beans" from me. Until next time, have a good one!
Clive (Max) Maxfield is president of Techbites Interactive, a marketing consultancy firm specializing in high-tech. Author of Bebop to the Boolean Boogie (An Unconventional Guide to Electronics) and co-author of EDA: Where Electronics Begins, Max was once referred to as a "semiconductor design expert" by someone famous who wasn't prompted, coerced, or remunerated in any way.