SAN JOSE, Calif. Researchers at IBM Corp. are creating a prototype for a vast data-storage system built Lego-style from storage-array bricks that can be stacked into a cube-shaped server. Such systems would offer significant gains in data throughput and fault tolerance while sharply reducing the size, cost and power consumption of today's data center storage systems, the company said.
IBM's Ice Cube project aims to define a way for end users to easily maintain ever-growing amounts of data, while also breaking ground for a similar approach to computing systems.
As a proof of concept, IBM is attempting to build by the end of this year a 32-terabyte storage system out of a 3 x 3 x 3 array of 27 small, relatively simple hard-disk modules it calls collective intelligent bricks. "We like to say that our goal is you will be able to hug the Library of Congress by the end of this year," quipped Jai Menon, who heads storage research at IBM's Almaden Research Center.
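The geometry and capacity figures above imply a rough per-brick and per-drive budget. The sketch below is back-of-the-envelope arithmetic only; the per-drive figure is derived here, not stated by IBM (the 12-drives-per-brick count comes from the hardware description later in the article).

```python
# Back-of-the-envelope check of the prototype's stated geometry and capacity.
n = 3                      # bricks per edge of the cube
bricks = n ** 3            # 3 x 3 x 3 arrangement
total_tb = 32              # stated prototype capacity, in terabytes
drives_per_brick = 12      # 2.5-inch drives per brick

per_brick_tb = total_tb / bricks
per_drive_gb = per_brick_tb * 1000 / drives_per_brick

print(bricks)                    # 27 bricks
print(round(per_brick_tb, 2))    # ~1.19 TB per brick
print(round(per_drive_gb))       # ~99 GB per drive (derived, not stated)
```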
Designing software that can mask the complexity of making a collection of plug-and-play drive modules appear to a user as one cohesive file system is expected to be one of the core challenges of the project.
"The way we envision it, it will be a fully virtualized storage system. The hard part is the software," said Winfried Wilcke, a program director at Almaden who is creating the storage bricks with colleagues at IBM's research center in Mainz, Germany.
Indeed, only pieces of the software that dynamically spreads out and protects data across the collection of drives will be available in the prototype this year. A commercial-grade version of the software "is going to keep us busy for a while," said storage research manager Rich Freitas.
Expectations are high for IBM's storage cubes. Users will be able to add bricks as needed to expand capacity, and software will automatically generate a single system image of the data on the cube. Despite the highly dynamic and distributed nature of the physical storage, data will be replicated across disks as needed for fault tolerance and to allow it to be accessed within acceptable latencies.
The system will also automatically compensate for any hardware failures without user intervention. In fact, the philosophy behind the system is to allow failed modules, which might represent 10 percent of a system's modules over five years, to be simply left in place.
About half of all repair actions in a data center create other failures, which in turn require repairs. Under the brick approach, modules with hardware failures are simply left for dead in the cube. "A central tenet is you leave it alone," said Wilcke.
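The fail-in-place idea can be illustrated with a minimal sketch: blocks are replicated across distinct bricks, and when a brick dies its lost replicas are rebuilt on surviving bricks rather than the hardware being repaired. None of these names or the random placement policy come from IBM's software; this is only a toy model of the "leave it alone" approach described above.

```python
# Toy fail-in-place model: failed bricks are abandoned, not repaired,
# and their replicas are recreated on surviving bricks.
import random

class Cube:
    def __init__(self, bricks, replicas=2):
        self.replicas = replicas
        self.alive = set(range(bricks))
        self.placement = {}          # block id -> set of brick ids holding a copy

    def write(self, block):
        # Place each replica on a distinct live brick.
        self.placement[block] = set(random.sample(sorted(self.alive), self.replicas))

    def fail(self, brick):
        """Mark a brick dead; rebuild its lost replicas elsewhere."""
        self.alive.discard(brick)
        for homes in self.placement.values():
            if brick in homes:
                homes.discard(brick)
                spare = self.alive - homes
                if spare:                       # restore the replica count
                    homes.add(random.choice(sorted(spare)))

cube = Cube(bricks=27)
for b in range(100):
    cube.write(b)
cube.fail(5)    # brick 5 is left for dead; every block still has 2 live copies
```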
Although the software represents an enormous challenge, researchers said they have a firm grasp of the hardware issues. Each brick consists of twelve 2.5-inch hard drives, managed by three disk controllers tied to a microprocessor and a standard eight-port Ethernet switch. Future versions would likely use lower-latency, higher-throughput Infiniband switches.
Each switch is linked to six couplers, one on each side of a brick, which can communicate with adjoining bricks at rates up to 10 Gbits/second. The couplers are essentially capacitive plates that communicate wirelessly at 3.125 GHz using an alternating current.
The architecture gives each module an effective throughput of 60 Gbits/s. Throughput on a full cube system can scale based on how many of the external facing couplers are linked up to a wired interface. "You could have terabits per second in a large cube," said Wilcke.
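The per-brick and whole-cube numbers above follow from simple multiplication: six couplers at 10 Gbits/s each give 60 Gbits/s per brick, and a cube's external bandwidth scales with how many face-exposed couplers are wired up. The scaling formula below, which counts one coupler per exposed face of each surface brick, is an assumption for illustration, not IBM's published model.

```python
# Rough throughput model implied by the article's figures.
coupler_gbps = 10          # stated rate per coupler
couplers_per_brick = 6     # one per face of a brick

brick_throughput = coupler_gbps * couplers_per_brick   # 60 Gbit/s per brick

def external_couplers(n):
    """Face-exposed coupler count for an n x n x n cube: 6 faces of n*n bricks."""
    return 6 * n * n

def cube_throughput_gbps(n):
    """Aggregate external bandwidth if every exposed coupler is wired up."""
    return external_couplers(n) * coupler_gbps

print(brick_throughput)           # 60 Gbit/s per brick
print(cube_throughput_gbps(3))    # 540 Gbit/s for the 3x3x3 prototype
print(cube_throughput_gbps(5))    # 1500 Gbit/s, i.e. terabit-class for a 5x5x5 cube
```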
Patents on the couplers are still being written, said Wilcke, who declined to describe exactly how they work. The prototype system will use an iSCSI interface.
Thanks to the 3-D mesh nature of the cube's internal communications, latency would be only a few microseconds between the furthest drives in the stack. "For a storage server that is negligible," Wilcke said.
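The "few microseconds" claim follows from the mesh topology: the worst case is the corner-to-corner path, whose hop count is the Manhattan distance across the cube. The per-hop delay below is a hypothetical figure chosen only to show the scale; IBM did not state one.

```python
# Worst-case hop counts in an n x n x n 3-D mesh.
def max_hops(n):
    """Corner-to-corner hop count: (n-1) hops along each of three axes."""
    return 3 * (n - 1)

per_hop_us = 0.5   # assumed per-hop switch/link latency in microseconds (illustrative)

for n in (3, 10):
    print(n, max_hops(n), max_hops(n) * per_hop_us)
# 3x3x3 prototype: 6 hops; even a 10x10x10 cube is only 27 hops,
# keeping end-to-end latency in the low microseconds at sub-microsecond hops.
```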
Power from a 48-V supply passes through the brick via connectors at the top and bottom. "We had to aim for the highest power density in the design and that shot down an air-cooled system immediately," Wilcke said.
A water pipe rises through each vertical stack of bricks, linking to heat pipes on each module. The water cooling scheme is cheaper than air cooling, researchers said. Air conditioning units can take up 40 percent of the floor space of a data center and getting cool air to hot spots on a chip involves a difficult "black art" of fluid dynamics, said Wilcke.
Overall, IBM believes its cube structure could cut power requirements by a quarter compared to monolithic storage systems. Noise could be significantly reduced as well. But the biggest gain would be in floor space, which could shrink by a factor of seven to ten.
Most importantly, the cost of the modular system would be much lower than that of today's monolithic storage servers. For instance, the large external switch used by most storage systems represents about half of overall system cost; switching accounts for only 10 percent in the IBM scheme.
"I think when all is said and done, the cube will be cheaper than existing systems," Wilcke said.
Mix and match
While the storage bricks are under construction, researchers are already starting to imagine similar computer bricks that might be mixed and matched with the storage bricks in future data center cubes.
"It's really straightforward. We can easily handle 500-watt power dissipation in a brick like this," said Wilcke, describing a four- or six-way brick server.