SAN FRANCISCO -- The eagerly anticipated Cell processor from IBM, Toshiba and Sony leverages a multicore 64-bit Power architecture with an embedded streaming processor, high-speed I/O, SRAM and dynamic multiplier in an effort, the partners hope, to revolutionize distributed computing architectures.
Although the technical aspects of the design, which has been in the works for nearly four years, are tightly held, details are emerging in excerpts from papers to be released today for the 2005 International Solid-State Circuits Conference (see story, page 94), as well as in patent filings.
The highly integrated Cell device has been billed as a beefy engine for Sony's PlayStation 3, due to be demonstrated in May. But the architecture also addresses many other applications, including set-top boxes and mobile communications. Workstations fitted with the Cell architecture, a $2 billion endeavor, are already in the hands of game developers.
Five ISSCC papers from members of the 400-strong Cell processor team (see related story, "Best Development Teams," page 64) open peepholes onto a highly modular and hierarchical first-generation device implemented in 90-nanometer silicon-on-insulator (SOI) technology.
At root, the Cell architecture rests on two concepts: the "apulet," a bundle comprising a data object and the code necessary to perform an action upon it; and the "processing element," a hierarchical bundle of control and streaming processor resources that can execute any apulet at any time.
The apulets appear to be completely portable among the processing elements in a system, so that tasks can be doled out dynamically by assigning a waiting apulet to an available processing element. Scalability can be achieved by adding processing elements.
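The dispatch idea described above can be illustrated with a minimal sketch. It is not the partners' design: the class names Apulet and ProcessingElement, the dispatch function and the round-robin assignment are all assumptions made for illustration, and the loop runs sequentially, where real hardware would run processing elements in parallel.

```python
from collections import deque
from dataclasses import dataclass
from typing import Any, Callable, List, Tuple


@dataclass
class Apulet:
    """Hypothetical model: a data object bundled with the code that acts on it."""
    data: Any
    code: Callable[[Any], Any]


@dataclass
class ProcessingElement:
    """Hypothetical model: a resource able to execute any apulet."""
    name: str

    def execute(self, apulet: Apulet) -> Any:
        # Run the apulet's own code against the apulet's own data.
        return apulet.code(apulet.data)


def dispatch(apulets: List[Apulet],
             elements: List[ProcessingElement]) -> List[Tuple[str, Any]]:
    """Hand each waiting apulet to the next element in round-robin order.

    Assumption: round-robin stands in for "the next available element";
    the source does not describe the actual scheduling policy.
    """
    waiting = deque(apulets)
    results = []
    turn = 0
    while waiting:
        element = elements[turn % len(elements)]
        results.append((element.name, element.execute(waiting.popleft())))
        turn += 1
    return results


# Four apulets that each square their data, spread over two elements.
work = [Apulet(data=n, code=lambda x: x * x) for n in range(4)]
pool = [ProcessingElement("pe0"), ProcessingElement("pe1")]
outcome = dispatch(work, pool)
```

Because every apulet carries its own code, adding a third entry to the pool changes only which element runs which apulet, not the values computed, which is the scalability property the architecture is said to aim for.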
These ideas are not easily achieved. According to data from Paul Zimmons, a PhD graduate in computer science from the University of North Carolina at Chapel Hill, they require a highly intelligent way of dividing memory into protected regions called "bricks," careful attention to memory bandwidth and local storage, and massive bandwidth between processing elements, even those lying on separate chips.
At the top level, the architecture appears to be a pool of "cells," or clusters of perhaps four identical processing elements. All of the cells in a system (or, for that matter, a network of systems) are apparently peers. According to one of the ISSCC papers on the Cell design, a single chip implements a single processing element. The initial chips are being built in 90-nm SOI technology, with 65-nm devices reportedly sampling.