SAN MATEO, Calif. - In an effort to ease the burden on engineers designing multiple-processor chips, Tensilica Inc. is readying a design environment that packs system-modeling tools, a shared-memory interface and enhanced debug capabilities for its Xtensa reconfigurable processor.
The new design environment, scheduled to be ready by next month, will be released at a time when more of Tensilica's customers are choosing to pack multiple processor cores on a die. Of the 50 devices that have been designed or are being designed by its 30 or so licensees, about 20 are using multiple processors on a chip, according to the company.
With 0.15-micron design rules, a chip offers some 100,000 usable gates per square millimeter, which leaves plenty of room for logic on a typical 8 x 8-mm die. "Any single-processor core is between 25,000 and 250,000 gates, so that's only 1 to 4 percent of the die," said Steve Roddy, director of product marketing for Tensilica (Santa Clara, Calif.).
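The die-area arithmetic behind that quote can be checked with a short calculation. The sketch below uses only the figures cited above (100,000 usable gates per square millimeter, an 8 x 8-mm die, cores of 25,000 to 250,000 gates); the function and variable names are illustrative and do not come from Tensilica.

```python
# Figures quoted in the article; all names here are illustrative.
GATES_PER_MM2 = 100_000   # usable gates per square millimeter at 0.15 micron
DIE_SIDE_MM = 8           # "typical" 8 x 8-mm die


def die_gate_budget(side_mm=DIE_SIDE_MM, gates_per_mm2=GATES_PER_MM2):
    """Total usable gates on a square die of the given side length."""
    return side_mm * side_mm * gates_per_mm2


def core_share(core_gates, budget=None):
    """Fraction of the die's gate budget consumed by a single core."""
    if budget is None:
        budget = die_gate_budget()
    return core_gates / budget


budget = die_gate_budget()
for gates in (25_000, 250_000):
    print(f"{gates:>7,} gates -> {core_share(gates):.2%} of {budget:,} gates")
```

By these numbers the die holds 6.4 million usable gates, so a single core takes roughly 0.4 to 3.9 percent of it, in line with the low single-digit share Roddy describes.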
But ample die area doesn't make multiprocessor design easy. CPU-DSP implementations, for example, often require separate tools and programming environments, though companies like Texas Instruments Inc. and the Intel-Analog Devices team are trying to alleviate some of those problems with a more unified development environment.
Though Tensilica claims to have an edge in this area with its ability to add DSP instructions and simulate the results within a few hours, things can get dicey when combining several processors on a chip. One trouble spot, for example, has been ensuring fast I/O performance between Xtensa CPUs. Tensilica's standard bus interface was geared for conventional I/O tasks that tolerate long latencies. Such latencies can't be tolerated when trying to get deterministic performance between processors. As a result, many of Tensilica's customers were having to modify the hardware in more ways than the company wanted. "They were having to pull apart our processor. We had a closed scratch pad and cache interface that they were breaking into," Roddy said.
To simplify things, Tensilica developed a low-latency, single-cycle local-memory interface for processor-to-processor I/O that is 128 bits wide and runs at 200 MHz. A separate processor interface will handle transactions with nondeterministic response times, such as peripheral I/O. Also as part of the new environment, the company has extended its cycle-accurate instruction-set simulator to support multiple processors. The simulator includes an API that lets a designer build models of different bus, memory and processor configurations.
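To illustrate what those figures imply, the sketch below computes the interface's peak transfer rate and uses a toy lockstep model to compare a single-cycle link with a slower bus. It is not Tensilica's simulation API: the `Link` class, the `cycles_to_transfer` function and the 10-cycle bus latency are assumptions made for illustration. Only the 128-bit width, the 200-MHz clock and the single-cycle latency come from the article.

```python
from dataclasses import dataclass, field


def peak_bandwidth_bytes_per_s(width_bits, clock_hz):
    """Peak rate, assuming one full-width word moves every clock cycle."""
    return width_bits // 8 * clock_hz


@dataclass
class Link:
    """Hypothetical one-way channel: a word arrives `latency` cycles after it is sent."""
    latency: int
    in_flight: list = field(default_factory=list)  # (arrival_cycle, word) pairs

    def send(self, cycle, word):
        self.in_flight.append((cycle + self.latency, word))

    def receive(self, cycle):
        """Return every word whose arrival cycle has been reached, and drop it from the link."""
        ready = [word for arrival, word in self.in_flight if arrival <= cycle]
        self.in_flight = [(a, w) for a, w in self.in_flight if a > cycle]
        return ready


def cycles_to_transfer(link, n_words):
    """Step a producer and a consumer in lockstep; return the cycle on which the last word arrives."""
    received, cycle = 0, 0
    while received < n_words:
        if cycle < n_words:
            link.send(cycle, cycle)      # producer issues one word per cycle
        received += len(link.receive(cycle))
        cycle += 1
    return cycle - 1


print(peak_bandwidth_bytes_per_s(128, 200_000_000) / 1e9, "Gbytes/s peak")
print("single-cycle link:", cycles_to_transfer(Link(latency=1), 4), "cycles")
print("assumed 10-cycle bus:", cycles_to_transfer(Link(latency=10), 4), "cycles")
```

The peak works out to 3.2 Gbytes/second. Because the toy link is pipelined, a longer latency delays when data arrives rather than reducing throughput, which is the effect that matters for deterministic processor-to-processor handoffs.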
Finally, Tensilica has revamped the debug environment so that designers can choose which processors must stop when the chip encounters an exception or trips a breakpoint.
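The idea of selectively halting processors can be sketched as a small model in which each core carries a configurable "stop group." Tensilica has not published its debug interface here, so the `DebugController` class, its method names and the core names below are hypothetical.

```python
class DebugController:
    """Toy model: each core has a set of cores that halt with it on a debug event."""

    def __init__(self, core_names):
        self.running = {name: True for name in core_names}
        # By default a core halts only itself.
        self.stop_groups = {name: {name} for name in core_names}

    def set_stop_group(self, trigger, also_stop):
        """Choose which other cores halt when `trigger` hits a breakpoint or exception."""
        unknown = (set(also_stop) | {trigger}) - set(self.running)
        if unknown:
            raise ValueError(f"unknown cores: {sorted(unknown)}")
        self.stop_groups[trigger] = {trigger, *also_stop}

    def hit_breakpoint(self, trigger):
        """Halt every running core in the trigger's stop group; return the cores halted."""
        halted = sorted(n for n in self.stop_groups[trigger] if self.running[n])
        for name in halted:
            self.running[name] = False
        return halted


# Hypothetical three-core chip: a breakpoint on dsp0 also stops dsp1, but not cpu0.
dbg = DebugController(["cpu0", "dsp0", "dsp1"])
dbg.set_stop_group("dsp0", ["dsp1"])
print(dbg.hit_breakpoint("dsp0"))
print(dbg.running)
```

In this sketch the control processor keeps running while the two DSP cores freeze together, which is the kind of choice the revamped debug environment is described as offering.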
Tensilica plans to make the design environment available in July for a $25,000 license fee.