SAN MATEO, Calif. - If last spring's Networld+Interop trade show was a coming-out party for a second wave of network processors, this year's model will do the same for the devices that surround them, including switch fabrics and coprocessors.
OEMs' growing use of network processors instead of packet-forwarding ASICs is heating up the market for these ancillary devices, chip makers said. Switch fabrics, in particular, are becoming a keystone in OEM buying decisions, because the switch-fabric selection can affect the overall system architecture.
Mindspeed Technologies, Vitesse Semiconductor Corp. and ZettaCom Inc. have new entries in this hotly contested market, even as Applied Micro Circuits Corp. readies a multistage fabric targeting high-end, high-throughput core switches.
In coprocessors, Integrated Device Technology and NetLogic Microsystems will take the wraps off new entries at the N+I show, which will be held this week in Las Vegas. And a number of companies, including Music Semiconductors, are readying dense content-addressable memories for deep packet classification. Also, Acorn Networks and ZettaCom will tip next-generation traffic managers.
Mindspeed (Newport Beach, Calif.) last week entered the increasingly competitive merchant market for switch-fabric ICs with its iScale chip set. Vitesse (Camarillo, Calif.) will announce its GigaStream switch fabric this week, and startup ZettaCom (San Jose, Calif.) plans to announce volume shipments of its Zest switch fabric.
One key difference between the Vitesse and Mindspeed parts is speed: Vitesse's fabric is designed for 32 ports of OC-48 (2.5-Gbit/second) traffic, while Mindspeed's can handle up to four ports of OC-192 (10 Gbits/s). The two companies also take opposite approaches to the queuing and scheduling functions: Vitesse puts those functions on the line card, while Mindspeed places them on the switch side.
By putting all the queuing on the line card, next to the network processor, Vitesse was able to keep its switching card simple, using the VSC882 switching chip found in the company's old CrossStream product line.
On the line card, the VSC872 queuing engine handles load balancing, distributing packets to as many as eight switch cards. Each switch chip, in return, communicates its status to up to 16 queuing engines, telling them how much bandwidth is available on that link. The setup was designed with the help of OEM customers that wanted to add system-level ideas to the switch fabric, said Rob Sturgill, director of marketing at Vitesse.
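The feedback loop described above, in which switch chips report available bandwidth and the queuing engine spreads packets across switch cards, can be sketched roughly as follows. This is an illustration only: the article does not describe Vitesse's actual algorithm, so the "most reported headroom wins" policy and the names `SwitchLink`, `pick_link` and `dispatch` are assumptions made for the example.

```python
from dataclasses import dataclass

MAX_SWITCH_CARDS = 8  # from the article: one queuing engine feeds up to eight switch cards


@dataclass
class SwitchLink:
    """Hypothetical view of one link from a queuing engine to a switch card."""
    card_id: int
    available_gbps: float  # last bandwidth figure reported back by the switch chip


def pick_link(links, demand_gbps):
    """Return the link with the most reported headroom, or None if none fits."""
    if len(links) > MAX_SWITCH_CARDS:
        raise ValueError("a queuing engine drives at most eight switch cards")
    candidates = [link for link in links if link.available_gbps >= demand_gbps]
    if not candidates:
        return None
    # Assumed policy: prefer the least-loaded link; ties go to the first listed.
    return max(candidates, key=lambda link: link.available_gbps)


def dispatch(links, demands_gbps):
    """Assign each demand to a link and debit that link's reported headroom."""
    assignment = []
    for demand in demands_gbps:
        link = pick_link(links, demand)
        if link is None:
            assignment.append(None)  # traffic would wait in the line-card queue
            continue
        link.available_gbps -= demand
        assignment.append(link.card_id)
    return assignment
```

Because the decision is made on the line card from reported status, traffic that cannot be placed simply stays queued there, which is the property the fabric vendors cite for not having to overprovision the switch cards.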
"It looks like a good architecture for the performance they're targeting," said Bob Wheeler, an analyst with The Linley Group (Mountain View, Calif.). "The beauty of moving the scheduling out to the line card is that they can do their load balancing at the line cards and not have to overprovision the fabric." Vitesse will target higher-speed traffic with an upcoming chip set, reportedly called TeraStream.
Mindspeed's iScale, on the other hand, places two queuing chips on the switching card next to the crossbar fabric. The chips make use of Mindspeed's CMOS-based SkyRail transceivers, which offer 3.125 Gbits/s of bandwidth. "Single-stage is what our customers are really asking for in edge devices," said Michelle Zoncick, director of marketing at Mindspeed Technologies, a unit of Conexant Systems. "Single-stage is lower latency than a multistage fabric."
ZettaCom, for its part, mixes the two approaches, splitting the scheduling function between line and switching cards. Its Zest device is designed to handle 1 terabit of throughput in a single-stage architecture, though Paul Liesenberg, vice president of marketing, acknowledged that the mainstream market wants "160, maybe up to 640 Gbits."
Zest could reach higher throughputs in a multistaged configuration, but Liesenberg questioned the need to go that far. "If you do that, you're optimizing your architecture for the very high end of the market, and I'm skeptical whether 8-, 10- or 20-Tbit switches are going to make it into POPs [service providers' points of presence] in the next couple of years," he said.
Analyst Wheeler agreed that most OEMs will use a single-stage switch fabric. But Applied Micro Circuits (San Diego) is developing a multistage fabric based on technology acquired from Yuni Systems Inc. (San Diego) for beefy core switches that can't wait for single-stage fabrics to catch up to their needs.
"The ones who want more total bandwidth right now have to live with a multistage fabric," said Andy Gottlieb, vice president of marketing for AMCC. Though he agreed that most customers will prefer the single-stage arrangement, Gottlieb held that the highest-end switch vendors will be looking for raw throughput regardless of cost.
Like Vitesse's GigaStream, AMCC's nPX8000 device will handle scheduling on the line card, Gottlieb said.
AMCC has already announced a mainstream switch fabric, the nPX5800, tailored to work with the company's traffic management chip.
CAMs square off
Content-addressable memory (CAM) vendors also will be squaring off at N+I as competitors bring densities up to 9 Mbits and struggle with nonstandard interfaces to network processors. This is another area that is heating up, as many vendors agree that deep packet classification will be imperative at high speeds. "It seems like it's more and more of a requirement, especially when you get to OC-48 or OC-192," said Pat Lasserre, director of strategic marketing for Integrated Device Technology (Santa Clara, Calif.).
Some vendors are building coprocessors around their CAM architectures by adding logic functions. NetLogic Microsystems Inc. (Mountain View, Calif.) this week will release its CFP3128 search engine, the third in its series of classification and forwarding processors. Due to sample in July and ship in the fourth quarter, this ternary CAM has 128,000 entries and additional features such as aging of media-access control (MAC) entries.
The CFP3128 also sports NetLogic's Zero Table Management feature, which adds a priority bit to the CAM array. Standard CAMs are location-dependent, meaning if two matches are found, the one with the earlier register number gets selected. NetLogic's priority bit can be used to override that.
The purpose is to cut down table maintenance, said T.J. Mueller, vice president of marketing at NetLogic. Tables in a regular CAM must be strictly ordered to satisfy the location-dependent prioritization, and some service providers will add blank lines periodically to leave room for updates and changes.
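The difference between location-dependent matching and a priority override can be shown with a toy ternary CAM model. It is a sketch under stated assumptions, not NetLogic's design: the article says only that a priority bit is added to the array, so the `priority` flag, the tie-breaking rule and the names `TcamEntry`, `lookup_by_location` and `lookup_with_priority` are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class TcamEntry:
    value: int
    mask: int               # 1 bits are compared, 0 bits are "don't care"
    result: str
    priority: bool = False  # hypothetical stand-in for the priority bit


def matches(entry, key):
    """Ternary match: compare only the bit positions selected by the mask."""
    return (key & entry.mask) == (entry.value & entry.mask)


def lookup_by_location(table, key):
    """Standard CAM behavior: the match at the lowest location wins."""
    for entry in table:
        if matches(entry, key):
            return entry.result
    return None


def lookup_with_priority(table, key):
    """Assumed override: a flagged match beats unflagged ones, whatever its location."""
    hits = [entry for entry in table if matches(entry, key)]
    if not hits:
        return None
    flagged = [entry for entry in hits if entry.priority]
    return (flagged or hits)[0].result


# A broad rule sits ahead of a more specific one, the "wrong" order
# for a strictly location-dependent table.
table = [
    TcamEntry(0x0A000000, 0xFF000000, "10/8"),
    TcamEntry(0x0A010000, 0xFFFF0000, "10.1/16", priority=True),
]
key = 0x0A010203  # 10.1.2.3
print(lookup_by_location(table, key))    # 10/8
print(lookup_with_priority(table, key))  # 10.1/16
```

In the location-dependent case, getting the specific rule to win would mean re-sorting the table or having left a blank slot ahead of the broad rule, which is the maintenance work Mueller describes.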
IDT's Lasserre, however, said that company studies show table maintenance takes up only a small chunk of time. In addition, those extra bits exact a trade-off, he said. "The problem is the penalty of having those extra bits for each location, which gives you a bigger die. And you already have a big die," Lasserre said.
IDT has enhanced its new line of CAM chips with added logic for functions such as table maintenance. Sizes of the resulting coprocessors will range from 32,000 to 128,000 entries of 72 bits each, giving the largest device a capacity of roughly 9 Mbits. IDT uses a double-speed clock on the parts, so that the bus can do a 144-bit search in the same time as a 72-bit search, Lasserre said. A 288-bit search is also available, at half the speed.
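The capacity figure follows from simple arithmetic. The short check below assumes that the "32,000" and "128,000" entry counts are rounded forms of 32K and 128K (multiples of 1,024) and that a Mbit here means 1,024 x 1,024 bits; neither convention is stated in the article, and the function name `cam_capacity_mbits` is made up for the example.

```python
K = 1024          # assumption: entry counts are binary "K" values
MBIT = 1024 * 1024  # assumption: binary megabit


def cam_capacity_mbits(entries, width_bits=72):
    """Total storage of a CAM with the given entry count and entry width."""
    return entries * width_bits / MBIT


print(cam_capacity_mbits(32 * K))   # 2.25
print(cam_capacity_mbits(128 * K))  # 9.0
```

With decimal counts instead (128,000 entries and 1,000,000 bits per Mbit), the largest part works out to about 9.2 Mbits, so the "roughly 9 Mbits" figure holds under either reading.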
IDT is initiating sampling of two parts this week: the 75T43100 ternary CAM with 32,000 entries, and the 75T54100 binary CAM. A 64,000-entry ternary CAM is to sample in the fourth quarter, with a 128,000-entry part due to sample in the first half of 2002.
Music Semiconductors Inc. (Mountain View, Calif.) will unwrap at N+I the MUAE64K144, in versions boasting 9 or 4.5 Mbits of storage. The company also is announcing 1- and 2-Mbit parts, under the family name Harmony. All devices are due to sample in the third quarter.
One problem dogging search engines is the lack of standards for the interface between a CAM and a network processor. Normally, an FPGA is used as glue logic in this spot. "Sometimes Xilinx and Altera are the big winners here. They'll have an $800 or $900 part on this board," NetLogic's Mueller said.
NetLogic will announce at N+I that it is working with AMCC to jointly develop an interface between that company's network processors and NetLogic's CAMs. A similar deal is in the works with Vitesse, he said.
But IDT's Lasserre contended that the FPGA "gasket" approach isn't so bad. "Some customers say they like the flexibility of having that gasket chip because they're not tied to an NPU vendor," Lasserre said, adding that he's had customers ask IDT to keep the FPGA-based interface to the network processor.
A Network Processor Forum interface for the search engine isn't likely to come soon, he said. "They're standards bodies, it's tough to get a quorum."
Traffic management also has become a pressing concern, and is widely seen as a requirement for OC-192 systems. "Last year, when we started articulating the traffic management theme, no one wanted to hear it," ZettaCom's Liesenberg said. ZettaCom will release its QM traffic manager to volume availability this week. It handles 1 million active traffic flows; other chips aiming at 1 million flows can't keep more than about 4,000 active at a time, Liesenberg said.
For its part, Acorn Networks (Reston, Va.) will unroll an OC-192c traffic manager, the genFlow-10G, which sits alongside an NPU and regulates the flow of data into a switch fabric. The 256-port device can handle 1 million traffic flows, against 64,000 in Acorn's 2.5-Gbit/s traffic manager, the genFlow-2.5G, and can carry combinations of packet- and cell-based traffic. Acorn produced its 10-Gbit/s part simply by doubling the clock speed and bus width on the 2.5G part, said Mike Bowser, Eastern-area sales manager. It is set to sample at the end of 2001.
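The jump from 2.5 to 10 Gbits/s is consistent with doubling both parameters, since raw bus throughput scales with clock rate times bus width. A minimal check, assuming everything else about the datapath stays the same (the function name is illustrative):

```python
def scaled_throughput_gbps(base_gbps, clock_factor, width_factor):
    """Raw throughput scales with clock rate times bus width, all else equal."""
    return base_gbps * clock_factor * width_factor


print(scaled_throughput_gbps(2.5, 2, 2))  # 10.0
```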
Traffic management is a difficult task that has sunk at least one company: Extreme Packet Devices, acquired by PMC-Sierra Inc. and shuttered before producing a part. Bowser admitted that it's difficult to gauge the competition. "You're going up against a lot of paper right now," he said. "If they don't have an OC-48 part, I don't take them very seriously."
The genFlow-10G is designed to handle OC-48 traffic as well as OC-192, on the assumption that multiport OC-48 line cards will be available before single-port OC-192 cards. The device can handle four OC-48 traffic flows: two transmissions and two receives. The genFlow-10G also has functions such as segmentation and reassembly for ATM traffic, and an aggregation capability.