SANTA CLARA, Calif.--A new class of short-reach 10 Gbit/second Ethernet chips is on the way, opening the door to smaller, cheaper, lower-power switches for data centers.
The so-called 10GBase-T SR (short reach) physical layer chips could cut power consumption from 4W to 1.5W per port. That opens the door to 60W 10G top-of-rack switches for data centers, replacing 250W systems used today. The power reductions enable use of lower cost power supplies and other components so systems can be closer in size and cost to today’s gigabit switches.
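As a rough illustration of the per-port figures above, the sketch below totals PHY power for a hypothetical 24-port switch. The port count is an assumption made for the example, not a figure from the conference, and the result covers only the PHYs, not the switch silicon, fans or power-supply losses that make up the rest of a system's 60W or 250W budget.

```python
# Back-of-the-envelope PHY power comparison.
# PORTS is an assumed value for illustration; the 4 W and 1.5 W
# per-port figures are the ones quoted in the article.

PORTS = 24                   # hypothetical port count (assumption)
FULL_REACH_W_PER_PORT = 4.0  # today's 10GBase-T PHY, per the article
SHORT_REACH_W_PER_PORT = 1.5 # proposed short-reach PHY, per the article


def phy_power_watts(ports: int, watts_per_port: float) -> float:
    """Total power drawn by the PHYs alone, in watts."""
    return ports * watts_per_port


full_reach_total = phy_power_watts(PORTS, FULL_REACH_W_PER_PORT)
short_reach_total = phy_power_watts(PORTS, SHORT_REACH_W_PER_PORT)
savings = full_reach_total - short_reach_total
reduction = savings / full_reach_total  # fraction of PHY power saved

print(f"Full-reach PHYs:  {full_reach_total:.1f} W")
print(f"Short-reach PHYs: {short_reach_total:.1f} W")
print(f"Savings:          {savings:.1f} W ({reduction:.1%})")
```

Under that assumed port count, the PHYs alone would account for 60W of savings, a 62.5 percent cut in PHY power regardless of how many ports the switch has.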
The key change is that the new physical-layer chips do not have to support the full 100-meter distance of the 10GBase-T standard. Instead, they can automatically negotiate a connection, optionally supporting 100 meters or shorter distances.
Today’s data centers typically need only five-meter links inside a rack and 30 meters to an aggregation switch. “The IEEE 10GBase-T standard really overshot [distance requirements] and a new 40G effort realized that and may back off,” said Dan Dove, a senior director of technology at Applied Micro Circuits Corp., speaking at the Linley Data Center Conference here.
Ethernet standards traditionally supported 100 meters, but “that last 20 meters for 10G became a problem,” he said.
Dove made a short-reach proposal to the IEEE 40GBase-T group now starting its work. It aimed to include the 10G generation, specifying optional lengths of 10 and 30 meters and giving vendors an option to not support 100 meters at all for low-cost products.
The proposal drew vocal opposition from cable and patch-panel makers. “I didn’t even put it to a vote--we wouldn’t make it--so we will compete in the market,” Dove said.
Vendors have done some work on ad hoc multi-source agreements defining the approach and may create a marketing alliance to promote it. The change can be made to existing chips in firmware, Dove said.
If the work moves forward, “PHY vendors will jump on it and create devices,” said Kamal Dalmia, vice president of sales and marketing at Aquantia, a 10GBase-T vendor on a panel at the conference.
the cable lengths quoted are pretty peculiar - no in-rack cable needs to be 5 m - 1.5-2.5 m would be preferred to minimize the excess. and 30 m to an upper-level switch is also a bit extreme, since it's easy to arrange ~50 racks within a 10-12 m diameter (even for a spread-out air-cooled cluster). I wonder whether they said meters but were thinking feet...
that anecdote from the patch-panel maker is pretty sad. they seem to be thinking of traditional IT, not clusters. a cluster will not normally use any patch panels: nodes connect directly to the leaf/TOR switch, and they connect directly to spine switches. it would be sad if archaic cable-length concerns quashed this effort.
I wonder how much the PHY is keeping switches from getting reasonably priced. 10G won't really take off until its per-port switch+cable+NIC price is very close to 1G (let's say, around $100.)
Great article Rick. One point that was not captured is that it takes 75% to pass a technical motion in the IEEE Study Groups. While there was support among a broad range of participants, a popular proposal can lose with a 74% approval.
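To make the threshold concrete, here is a minimal sketch of the 75% rule described above. It assumes that only yes and no votes count toward the total and that exactly 75% is enough to pass; the function name and the vote counts are hypothetical, chosen only to illustrate the arithmetic.

```python
def technical_motion_passes(yes_votes: int, no_votes: int) -> bool:
    """Return True if yes votes are at least 75% of yes + no votes.

    Assumptions for this sketch: abstentions are ignored, and a
    motion with exactly 75% approval passes.
    """
    total = yes_votes + no_votes
    if total == 0:
        return False  # nothing to decide without any votes
    # Integer comparison avoids floating-point rounding at the boundary:
    # yes / total >= 3/4  is the same as  4 * yes >= 3 * total.
    return 4 * yes_votes >= 3 * total


# Hypothetical tallies out of 100 voters
print(technical_motion_passes(74, 26))  # 74% approval
print(technical_motion_passes(75, 25))  # 75% approval
```

With those made-up tallies, the 74-to-26 vote fails and the 75-to-25 vote passes, which is the gap the comment is pointing at.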