San Mateo, Calif. - When designers discuss the problems of integration in a system-on-chip, generally the talk is about complex functional logic blocks, processor cores, hundreds of memory instances and the like. If I/O gets discussed at all, it's usually as an afterthought. That bias can have its disadvantages, especially if the I/O topology isn't carefully thought through during floor planning or if the impact of the pad ring on power distribution and consumption hasn't been anticipated.
But when the SoC in question is a high-bandwidth switching chip, the total bandwidth demanded of the I/O pads can be enormous. The I/O blocks themselves may be the most important external intellectual property in the design, and everything may depend on their successful integration onto the chip.
That was certainly the case when Internet Machines produced the SE200 switch chip. The device is a RAM-based 64-by-64-port nonblocking crossbar, handling an aggregate data rate of up to 200 Gbits/second. To do that, the device requires 64 instances of a 3-Gbit/s I/O port.
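The port arithmetic behind those figures can be checked with a short sketch. The 64 ports and the rounded 3-Gbit/s per-port rate come from the text above. The 3.125-Gbit/s value is an assumption, not a figure from the article, included only to show one line rate at which 64 ports would total exactly 200 Gbits/s.

```python
def aggregate_gbps(ports: int, rate_gbps: float) -> float:
    """Raw aggregate bandwidth with every port running at the same line rate."""
    return ports * rate_gbps


PORTS = 64  # port count quoted in the article

rounded = aggregate_gbps(PORTS, 3.0)    # article's rounded per-port figure
assumed = aggregate_gbps(PORTS, 3.125)  # ASSUMPTION: 3.125 Gbit/s line rate

print(f"{PORTS} ports x 3.0 Gbit/s   = {rounded:.0f} Gbit/s")
print(f"{PORTS} ports x 3.125 Gbit/s = {assumed:.0f} Gbit/s")
```

With the rounded rate the total comes to 192 Gbits/s, which is consistent with the "up to 200" figure being a round number.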
IM licensed a serial cell with integral serializer/deserializer (serdes) from Rambus Inc. That solved the immediate problem of the pin electronics, but it was just the beginning of the integration process.
Once the IP was licensed, a number of issues had to be dealt with in parallel. At these speeds, a layout that permitted data to flow between the I/O cells and the connecting cells on the die was clearly essential. It might be less obvious that this signal flow didn't start and end on the die: The entire path, including the circuit board, bumps, package and die, had to be modeled to ensure data integrity.
"Floor planning was a serious problem," said director of ASIC development Todd Khacherian. "We had to place 16 Rambus macros on the die but leave room for the accompanying logic, the switch fabric and the other I/Os. And we had to determine a bump configuration that would provide appropriate spacing for the signal pins. It required quite a few iterations to get the die, package and board layouts to all work together to meet the requirements of these signals."
Nor was signal routing the only issue at the floor plan level. Power and ground had to be provided for the macros, raising the questions of how many bumps would be needed (which depended on estimated current profiles and on package and board inductance), where the bumps should be placed (which of course influenced the signal placement question) and how much decoupling would be required. For power routing, the design team opted for major overkill, just to be sure.
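The way current profile and inductance drive the bump count can be illustrated with a first-order estimate. This is a minimal sketch, not the method the IM team used: it treats N bumps as ideal parallel inductors, ignores mutual inductance and the package and board contributions, and every number in it is hypothetical.

```python
import math


def min_power_bumps(di_amps: float, dt_seconds: float,
                    bump_inductance_h: float, noise_budget_v: float) -> int:
    """Smallest N for which (L_bump / N) * di/dt stays within the noise budget.

    First-order model only: N identical bumps in parallel, no mutual coupling.
    """
    di_dt = di_amps / dt_seconds                 # current slew rate, A/s
    single_bump_noise = bump_inductance_h * di_dt  # V = L * di/dt for one bump
    return math.ceil(single_bump_noise / noise_budget_v)


# HYPOTHETICAL values, chosen only to exercise the formula.
bumps = min_power_bumps(
    di_amps=2.0,              # current step drawn by the macros
    dt_seconds=1e-9,          # over one nanosecond
    bump_inductance_h=50e-12,  # 50 pH per bump
    noise_budget_v=0.03,      # 30 mV allowed supply bounce
)
print(f"power bumps needed (first-order estimate): {bumps}")
```

A real estimate would also count ground-return bumps and add margin, which is in the spirit of the overkill the team describes.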
To give the team more leverage on decoupling, IM deployed a pair of weapons. One was to use the intrinsic capacitance of the package, normally the enemy of signal integrity, to provide supply decoupling. The other was to develop a custom decoupling capacitor cell that used metal-1 to form a modest-value capacitor. Those cells were laid out in much of the space left over after the logic cells had been placed, eventually covering a large portion of the total die area and providing quite a bit of help with the decoupling problem.
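How a decoupling target translates into a count of filler cells can be sketched with a simple charge budget, C = I x dt / dV. This is an illustration under stated assumptions, not IM's sizing procedure: the current step, droop budget, package capacitance and per-cell capacitance below are all hypothetical, and the model ignores the inductance between the package capacitance and the die.

```python
import math


def required_decap_farads(di_amps: float, dt_seconds: float,
                          droop_budget_v: float) -> float:
    """Capacitance that can supply di for dt while drooping at most dV."""
    return di_amps * dt_seconds / droop_budget_v


def decap_cells_needed(total_f: float, package_f: float, cell_f: float) -> int:
    """On-die cells needed after crediting the package capacitance."""
    on_die_f = max(total_f - package_f, 0.0)
    return math.ceil(on_die_f / cell_f)


# HYPOTHETICAL values, for illustration only.
total = required_decap_farads(di_amps=2.0, dt_seconds=1e-9, droop_budget_v=0.03)
cells = decap_cells_needed(total_f=total,
                           package_f=20e-9,  # credited package capacitance
                           cell_f=50e-15)    # one modest-value decap cell

print(f"total decoupling target: {total * 1e9:.1f} nF")
print(f"on-die decap cells needed: {cells}")
```

Even with generous credit for the package, small per-cell values push the count toward a million cells in this example, which suggests why such cells can end up covering a large share of the die.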
Close eye on the clock
Another serious issue for any sort of synchronous I/O macro at these speeds is clock routing. Not only are there the obvious skew issues, but the Rambus cell, like any cell with a very fast serdes, has very tight specifications on reference clock jitter. "That made clock distribution really a big deal," observed IM vice president of engineering Chris Haywood.
The attention appears to have paid off, according to senior hardware designer Tim Bakken. "The initial results in the lab on jitter and supply noise look good. And by the way, the neat tools we can get now for characterizing jitter are very helpful."
Finally, the design's status as one of the early applications of foundry Taiwan Semiconductor Manufacturing Co.'s 130-nanometer LV low-k process provided some interesting moments. "The logic core speed required this process," Haywood said. "But we were early enough [on it] that some of the foundry Spice models were not mature, and the Rambus cell had never seen working silicon in this process before. We double-checked everything and ended up building in lots of margin when we couldn't be sure."
The low-k option proved an issue of two sorts. On the positive side, it provided added margin for routing high-speed signal traces between the Rambus cells and the cells they talked to. On the other hand, there were reliability questions about the low-k material on a die as large as the SE200. "We were pushing things, for back then," Khacherian said.
The combination of care, overdesign, clean timing of the Rambus macros and Magma timing-driven place and route appears to have done the job, since the SE200 is looking good in the lab. The chip stands as an illustration that sometimes a significant portion of the design task is destined to go into the I/O.