MUNICH, Germany - Xilinx Inc. will give a new twist to the market for digital signal processors when it rolls out its FPGA-based XtremeDSP initiative at the biennial Electronica exhibition this week. Chief executive officer Wim Roelandts said the company's plan to pack up to 192 multiplier units as hard cores in its Virtex-II chips could create a device with a theoretical performance of more than 100 times that of leading DSP cores.
Roelandts predicted that XtremeDSP, aimed at broadband communications gear, would take design wins from DSP vendors like Texas Instruments, Lucent Technologies, Motorola and Analog Devices. Xilinx is also positioning the move as the first step toward the automated design of DSP and data-path system architectures from C++ and Java.
But analysts and experts see significant hurdles ahead for Xilinx's DSP drive.
The power inefficiency, cost and die size penalties of embedding a DSP matrix in an FPGA will bar XtremeDSP from many applications, some said. In addition, observers said, Xilinx will be swimming against the tide of a huge installed base of tools and algorithms amassed by DSP vendors at a time when the San Jose, Calif., company is still shoring up key portions of its software support, including the availability of a high-level language.
Still, the program holds great promise, according to some industry watchers. "XtremeDSP is going to be a wake-up call to the competition," said Will Strauss, president of Forward Concepts (Tempe, Ariz.). "XtremeDSP does things you can't do with an ordinary DSP processor."
Under the XtremeDSP initiative, Xilinx will place on a Virtex-II FPGA as many as 192 18-bit x 18-bit single-cycle multipliers, associated registers, up to 3.5 Mbits of dual-port RAM and as many as 10 million gates of logic.
The result, the company claims, is a theoretical performance of 600 billion 8-bit x 8-bit multiply-accumulate cycles (MACs) per second. The multipliers are cascadable, allowing a smaller array of 32-bit multiplies to be done in parallel.
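As a rough illustration of what cascading means, the sketch below builds one 32-bit x 32-bit unsigned product from four narrower multiplies and a few shifted additions. It is not Xilinx's implementation: the 16-bit split, the function names and the figure of four narrow multipliers per wide multiply are all assumptions made for the example, and the article does not say how many hard multipliers a 32-bit operation actually consumes.

```python
MASK16 = (1 << 16) - 1


def mul16(a, b):
    """Stand-in for one narrow hardware multiplier (16-bit x 16-bit, unsigned).

    Hypothetical helper: 16-bit unsigned operands fit comfortably inside an
    18-bit x 18-bit multiplier, but the real cascading scheme may differ.
    """
    assert 0 <= a <= MASK16 and 0 <= b <= MASK16
    return a * b


def mul32_from_16(x, y):
    """Compose a 32-bit x 32-bit unsigned product from four narrow multiplies."""
    assert 0 <= x < (1 << 32) and 0 <= y < (1 << 32)
    x_hi, x_lo = x >> 16, x & MASK16
    y_hi, y_lo = y >> 16, y & MASK16
    # Four partial products, each shifted to its weight and summed.
    return (
        (mul16(x_hi, y_hi) << 32)
        + ((mul16(x_hi, y_lo) + mul16(x_lo, y_hi)) << 16)
        + mul16(x_lo, y_lo)
    )


def wide_multiplies_in_parallel(narrow_multipliers, per_wide=4):
    """How many wide multiplies fit, ASSUMING per_wide narrow units each."""
    return narrow_multipliers // per_wide


product = mul32_from_16(0xFFFFFFFF, 0x12345678)
parallel_32bit = wide_multiplies_in_parallel(192)
```

Under that assumed four-to-one ratio, 192 hard multipliers would support 48 such 32-bit multiplies at once, which is the sense in which the wider array is "smaller."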
Roelandts said that by 2002 XtremeDSP chips should be on 100-nanometer (0.1-micron) process technology, ratcheting up performance to trillions of MACs/second and leaving conventional DSP architectures in the dust. Fourth-generation cellular protocols could require performance of 1,500 billion MACs/s, Xilinx estimated.
The underlying Virtex-II devices will sample before the end of this year and be formally announced in the first quarter of 2001. They are designed to use a 0.15-micron CMOS process technology and should clock at up to 250 MHz, Roelandts said.
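The figures quoted so far allow a back-of-the-envelope check. The sketch below multiplies the 192 hard multipliers by the 250-MHz top clock rate, assuming one multiply per multiplier per cycle, and compares the result with the 600 billion MACs/s claim. The function name is hypothetical, and the article does not break down how the larger figure is reached, so the ratio is an observation rather than an explanation.

```python
def peak_multiplies_per_second(multipliers, clock_hz):
    """Peak rate, assuming every multiplier completes one multiply per cycle."""
    return multipliers * clock_hz


# Figures quoted in the article.
HARD_MULTIPLIERS = 192
MAX_CLOCK_HZ = 250_000_000           # "up to 250 MHz"
CLAIMED_8X8_MACS_PER_SECOND = 600e9  # "600 billion 8-bit x 8-bit" MACs/s

hard_peak = peak_multiplies_per_second(HARD_MULTIPLIERS, MAX_CLOCK_HZ)
ratio = CLAIMED_8X8_MACS_PER_SECOND / hard_peak

print(f"hard multipliers alone: {hard_peak / 1e9:.0f} billion multiplies/s")
print(f"claimed 8x8 figure is {ratio:.1f}x that rate")
```

On these assumptions the hard multipliers alone account for 48 billion 18-bit x 18-bit multiplies per second, so the 8-bit x 8-bit claim is 12.5 times larger and must rest on resources or packing the article does not detail.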
Xilinx estimated that an XtremeDSP can execute a 1,024-point, 16-bit-data fast Fourier transform in less than 1 microsecond while clocked at 140 MHz. The company estimates that the industry's fastest DSP core takes 7.7 microseconds to do the same operation when running at 800-MHz clock frequency.
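To see why such an estimate depends on parallelism rather than clock speed, consider the arithmetic below. It is a sketch under stated assumptions: a textbook radix-2 FFT, four real multiplies per complex multiply, no savings from trivial twiddle factors, and "less than 1 microsecond" treated as exactly 1 microsecond. Xilinx's actual FFT architecture is not described in the article, and the function names are illustrative only.

```python
import math


def radix2_real_multiplies(n_points):
    """Real multiplies in a textbook radix-2 FFT (n_points a power of two)."""
    stages = n_points.bit_length() - 1       # log2(n_points)
    butterflies = (n_points // 2) * stages   # one complex multiply each
    return butterflies * 4                   # 4 real multiplies per complex one


def cycles_available(duration_ns, clock_mhz):
    """Clock cycles that elapse in duration_ns at clock_mhz (integer math)."""
    return duration_ns * clock_mhz // 1000


multiplies = radix2_real_multiplies(1024)
fpga_cycles = cycles_available(1000, 140)    # 1 microsecond at 140 MHz
dsp_cycles = cycles_available(7700, 800)     # 7.7 microseconds at 800 MHz

# Multiplies that must complete every cycle to finish inside the budget.
min_parallel = math.ceil(multiplies / fpga_cycles)

print(multiplies, fpga_cycles, dsp_cycles, min_parallel)
```

With these assumptions the transform needs 20,480 real multiplies in 140 cycles, or at least 147 multiplies per cycle, which fits within 192 multipliers but is far beyond a processor issuing a handful of multiplies per clock. The DSP in the comparison has 6,160 cycles for the same work.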
"Designers can use Virtex-II devices to implement critical DSP elements of emerging broadband systems," said Chris Dick, the chief DSP architect at Xilinx. Among them, he cited sub-1-µs 1,024-point FFTs, hyperfast adaptive filters, third-generation (3G) system turbocoders, rake receivers and spread-spectrum calculations.
Dick said that thanks to drifting standards, the 3G basestation market is looking very promising for user-programmable solutions. "The turbo-codec specification within 3GPP, the 3G Project Partnership, just changed," he said. "Anybody with an ASIC solution could be facing a respin."
Analyst Strauss said that among the tasks XtremeDSP can take on are "chipping-rate calculations, which control how you spread and despread spectrum in CDMA." Though beyond the capability of ordinary DSPs, "you can do them with this architecture. Turbocoding also requires extremely fast processing."
Strauss said he was "not sure" XtremeDSP could be put to use in handsets. "There are some power consumption issues there. But 3G presents a tremendous opportunity for all the players," he said. "And it's notable that Ericsson, the world's largest supplier of mobile communication basestations, is one of Xilinx's customers."
Strauss said he felt the established DSP players would respond either by deploying multiple DSP cores on a chip, or by going for a more fine-grained approach based on hundreds of multipliers and perhaps adding their own field programmability.
Henry Wiechman, marketing manager for Texas Instruments Inc.'s C6000 DSP line, agreed. He pointed to the company's C5000 line, with four cores on board. "And the C64X will have 1-GHz performance on a single core," Wiechman said. "All you have to do then is take four, six or eight of those to get the rates FPGA vendors are talking about."
Xilinx's Roelandts countered that traditional DSP players would run into a wall with performance gains from a sequential architecture. "Yes, perhaps you could put 100 DSPs down on a die, but you don't get 100 times the performance," he said. "We can do at 200 MHz what ASIC DSPs can't do at 1.1 GHz. And we're going to have a big advantage in power consumption."
"I see some interesting applications in prototyping for basestations," said Michael Bolle, executive vice president of engineering at Systemonic AG (Dresden, Germany), a DSP chip startup formed to address the same broadband markets as XtremeDSP. "But we have to assume these Xilinx parts will be big die and big on power consumption. I would need to see some figures for power consumption per multiplier in an application. You would need a lot of headroom to allow for the generalized mapping of functions to the multipliers."
Theo Claasen, chief technology officer of Philips Semiconductors, observed that basestations are starting to be built so densely that power consumption and cooling are becoming issues that would count against the use of field-programmable logic. Nor does he see a need for programmable hardware in his company's semiconductor platforms.
"FPGAs have traditionally lagged in terms of power, cost and size, relative to a DSP," added Wiechman of TI. And besides the millions of lines of installed code for current programmable DSPs, designers are often concerned with getting the maximum number of channels for a given space. "So, you have to see how all that matches up," he said.
Nevertheless, Strauss estimated the market for reconfigurable chips for digital signal processing will grow from about $350 million in 2000 to some $1.75 billion in 2005, representing a 41.6 percent compound annual growth rate.
The XtremeDSP is one leg of Xilinx's three-pronged Platform FPGA Initiative. "The microprocessor platform is called Empower, the DSP platform is XtremeDSP and then we have the serial I/O called SystemIO," said Roelandts. "When you can do all three at the same time it starts to get really interesting."
A significant part of the DSP push involves internally and externally developed software. The initiative includes pre-engineered DSP algorithms, in the form of mappings to the XtremeDSP architecture, that make efficient use of the multipliers in Virtex-II, along with system development tools and partnerships with other tool providers, including The MathWorks Inc. (Natick, Mass.).
Xilinx has a road map to take engineers to high-level language (HLL) design entry in C++ or Java within two years. The company also claims to have an acquisition strategy going forward to help realize that road map.
In addition, the company has expanded its intellectual-property (IP) solutions engineering organization and is releasing 11 cores for implementing data communication and image-processing applications.
One of these is a filter generator integrated with MATLAB from MathWorks. Other cores include three color-space converters, three G.711 pulse-code modulation speech codecs, and cores for the discrete cosine transform and its inverse.
Xilinx's System Generator tool, which works with MathWorks' Simulink and was announced a few weeks ago, is another crucial part of the XtremeDSP initiative.
"Our Core Generator delivers optimized pieces of IP. System Generator sits at a level above that," said Per Holmberg, product marketing manager for DSP at Xilinx. "It plugs into MathWorks' Simulink and provides engineers with a visual design flow. What you see is Simulink with a Xilinx set of building blocks. You build, simulate and then you parse the design and it calls up Core Generator in the background."
At present System Generator for Simulink contains no provision for automatic hardware-software partitioning. But Xilinx does have a power estimation tool, Xpower, that lets designers do performance-vs.-power consumption trade-offs in simulation.
However, with no high-level language support for up to two years, Xilinx may have a hard time catching up to the breadth of software support companies such as TI have generated.
In addition, cellular developers often prefer to use one development environment for both handset and basestation systems, something Xilinx may not be able to tap into, with its focus on infrastructure gear.
To get the full utility from XtremeDSP, users may have to wait until the architecture is crossed with Xilinx's SystemIO initiative, which promises to offer OC-192 and Gigabit Ethernet serial data rates. Keeping up to 192 multipliers fed with data will require high on- and off-chip data bandwidth.
No pricing has been disclosed for the high-capacity Virtex-II devices, and this is likely to determine which applications can benefit from XtremeDSP.
Patrick Mannion contributed to this story.