MANHASSET, N.Y. - Tasking Inc. is joining Texas Instruments Inc. (TI) and others at the vanguard of companies seeking to demystify software development for processors that use multiple, and sometimes disparate, processing cores.
Using target-independent debug interfaces and a central coordinator, Tasking has devised a multiprocessor debug utility that simplifies designing with multiple processors in both homogeneous and heterogeneous environments.
Tasking's Multi-Core Debug System (MDS) addresses a burgeoning need for tools that can handle the complexities that result when RISC and DSP cores are placed on the same piece of silicon. Tasks such as processor interfacing, interprocessor communications, run-time execution and coordination, and data presentation have drastically slowed product development and lengthened time-to-market for heterogeneous chips.
Recent architectures from Intel Corp. and TI illustrate the trend. TI combined an ARM-925 RISC and a TI 55x DSP on a die to form the OMAP architecture. To help designers cope, TI plans to roll its Code Composer Studio 1.2 for DSPs and its CCS 1.3 for ARM processors into a single-install CD with additional plug-ins that will provide greater insight into OMAP's interworkings.
In Intel's case, the heterogeneous model for the pending Personal Client Architecture (PCA) depends on its own XScale RISC and the Micro Signal Architecture DSP that it co-developed with Analog Devices Inc. Intel recently hooked up with tool vendor CAD-UL (Ulm, Germany) to work out the debug issues.
"I would say that a good debugger, for even a single core, is not a solved problem," said Jeff Bier, general manager of Berkeley Design Technology Inc. (Berkeley, Calif.). "When you add multiple cores, whether they are homogeneous or heterogeneous, that is a significantly harder problem."
In such instances, Bier said, the debugger needs to understand first of all that there are multiple cores. Then it must communicate effectively with those cores and coordinate all their activities. "This means starting them all at one time, stopping them at once, being able to look at the information passing between them and then presenting all this information in a coherent, understandable way."
"Anybody can step and repeat a number of processors on a die," said Will Strauss, president of Forward Concepts (Tempe, Ariz.). "But getting them all working together in a meaningful manner is tough."
Tough or not, Strauss believes most designers will have to come to grips with the task. "Multiprocessor debugging used to be reserved for niche DSP applications, such as military and medical imaging, but now that we're down at 0.13-micron processes, we're seeing hundreds of processor cores on a die," he said, pointing to recent announcements by ARM and Tensilica licensees.
For Laslo Gross, director of business development at International Wireless Technologies LLC (IWT; South Plainfield, N.J.), the big problem isn't simply multiple cores on a die or board but the mixing of RISC and DSP processors, regardless of the number.
"They [RISC and DSP] both have different compile and debug environments. What many companies are trying to do is clobber the two together, and in doing that you lose a lot," said Gross. "You have to coordinate the communication between the two, while watching to see if it crashes, which it invariably does."
The resulting need to reboot continually and reload process results wastes scarce and costly engineering resources, Gross said. "That's why we chose to opt for multiple cores on a single homogeneous platform, with some RISC functionality added."
IWT took that approach in its recently announced MC2K baseband processor for software-defined radio. "This approach allows us to move things back and forth as we need it, and debugging in that environment is much simpler. The issues are less than with a heterogeneous environment."
Tailored solution
To support its three-core DSP solution, IWT tapped VLSI Systems in Finland. "They tailored their solution for a multicore debug environment, so you can simultaneously debug and run," said Gross.
Part of the problem with heterogeneous environments, as Gross sees it, is getting RISC and DSP houses to cooperate to realize effective and efficient solutions. "To date, the efforts to do this have been pretty poor," he said. "The problem is the two disparate environments.
"That's why an independent entity [such as Tasking] will do a better job than someone from one of the major vendor houses. That's where these third-party houses are most efficient," said Gross.
At 3DSP (Irvine, Calif.), which provides debug tools for complex DSP-based systems-on-chip, chief technical officer Kan Lu agreed on the potential effectiveness of third-party, independent software companies in crafting debug support for heterogeneous chips.
"While our customers haven't asked for this right now, there are no fundamental problems to doing it," he said. "But it's a case of us having to work beside ARM or MIPS to do it. Multiple vendors have to work together to succeed.
"Tasking is the first to come up with [such support], and we're talking with them about it, in case our customers start pushing [for it]," he said.
Tasking (Dedham, Mass.) had its own criteria for dealing with core-independent debug, according to chief technical officer Julian Horn. "We wanted to be able to reuse all the single-core debuggers we had, without any major changes. At the same time, we wanted to be able to reuse all the existing target systems that we already knew how to interface with," he said.
The design goal became to push all the intelligence into a middle section that would mediate all interactions between the debug utilities and the target simulations or hardware.
Generic interface
A key aspect of the Tasking approach is the Generic Debug Interface (GDI), which links the debug entity to the Multi-Core Debug System and maintains processor independence. "Essentially, the single-core debuggers are all ignorant of each other; each thinks it's the only debugger in the system," said Horn.
The debug entities could be simulations, hardware environments inside separate pieces of silicon or hardware environments sharing the same piece of silicon.
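The idea of a target-independent interface can be illustrated with a minimal sketch. It is not Tasking's actual GDI, whose details the company did not describe; the class and method names (DebugTarget, FakeCore, SingleCoreDebugger) are hypothetical. Each debugger holds exactly one target and has no knowledge of any other debugger or of what kind of core sits behind the interface.

```python
from abc import ABC, abstractmethod


class DebugTarget(ABC):
    """Hypothetical target-independent interface (not Tasking's actual GDI)."""

    @abstractmethod
    def run(self) -> None: ...

    @abstractmethod
    def halt(self) -> None: ...

    @abstractmethod
    def read_register(self, name: str) -> int: ...


class FakeCore(DebugTarget):
    """Stand-in for a simulator or hardware core; the kind is only a label."""

    def __init__(self, kind: str, registers: dict):
        self.kind = kind
        self.registers = dict(registers)
        self.running = False

    def run(self) -> None:
        self.running = True

    def halt(self) -> None:
        self.running = False

    def read_register(self, name: str) -> int:
        return self.registers[name]


class SingleCoreDebugger:
    """Sees exactly one target and knows nothing about any other debugger."""

    def __init__(self, target: DebugTarget):
        self.target = target

    def go(self) -> None:
        self.target.run()

    def stop(self) -> None:
        self.target.halt()

    def show(self, name: str) -> str:
        return f"{name} = {self.target.read_register(name):#x}"


# Two independent debuggers, one per core; neither refers to the other.
dsp_debugger = SingleCoreDebugger(FakeCore("DSP", {"pc": 0x100}))
cpu_debugger = SingleCoreDebugger(FakeCore("RISC", {"pc": 0x8000}))
```

Because the debugger is written only against the abstract interface, swapping a simulator for hardware would mean supplying a different DebugTarget subclass, with no change to the debugger itself.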
"A typical configuration is a DSP and a CPU," said Horn. "The tool doesn't care about the mix; it's blind to the kind of target, thanks to the GDI." All the targets are then multiplexed through the middle control section.
The MDS manager is the presentation function of the system. It shows the user all the processors in the system, as well as their relationships to one another, so the designer can group them into units arbitrarily.
"In a typical example," said Horn, "I would take two processors and run them as a unit. If one gets a breakpoint, I want them both to stop. If I say go to one, I want them both to go, so they step together, or however the user determines they should be grouped or run."
The MDS manager describes all that to the tool, and the information is sent to the core intelligence of the system. "The core now knows the roles and relationships, so it knows what to do when one of the debuggers upstairs sends a command down, thinking it's telling the target system to go," said Horn. "If [the debugger is] part of a synchronized group, the piece in the middle will take care of the details to orchestrate the related debuggers." That avoids confusion among debuggers about the state of the target.
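The grouping behavior Horn describes can be sketched in a few lines. This is an illustration under stated assumptions, not Tasking's implementation: the Core and Coordinator classes and their method names are hypothetical, and a real coordinator would also have to report state changes back to each debugger.

```python
class Core:
    """Hypothetical stand-in for one target behind a debug interface."""

    def __init__(self, name):
        self.name = name
        self.running = False


class Coordinator:
    """Middle layer: fans a command for one core out to its whole group."""

    def __init__(self):
        self.group_of = {}  # core name -> list of cores grouped with it

    def add_group(self, cores):
        members = list(cores)
        for core in members:
            self.group_of[core.name] = members

    def _members(self, core):
        # An ungrouped core behaves as a group of one.
        return self.group_of.get(core.name, [core])

    def go(self, core):
        for member in self._members(core):
            member.running = True

    def breakpoint_hit(self, core):
        for member in self._members(core):
            member.running = False


dsp, cpu, spare = Core("dsp"), Core("cpu"), Core("spare")
mds = Coordinator()
mds.add_group([dsp, cpu])

mds.go(dsp)              # the DSP debugger says "go"; the CPU starts too
mds.go(spare)            # an ungrouped core runs on its own
mds.breakpoint_hit(cpu)  # a CPU breakpoint halts the DSP as well
```

After these calls the DSP and CPU are both halted while the ungrouped core keeps running, which is the "stop together, go together" behavior for a synchronized group.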
The target side consists of a plug-in architecture that for now favors simulation over hardware. At the Embedded Systems Conference this week, the company demonstrated the debug environment using a simulated StarCore DSP and PowerPC RISC.
For simulations, a piece of software between the core and the target coordinates and executes all the simulators. "It's important that there be some kind of layer to coordinate the executions," said Horn, "because if they're just separate Windows processes with each executing as many instructions as it can, given the amount of time that Windows gives it, they'll get all out of sync. This layer enforces the time relationship across related processors."
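One simple way to enforce such a time relationship is to advance the simulators in lockstep, one small time slice at a time. The sketch below assumes that approach; the article does not say how Tasking's layer works internally, and the Simulator class, the cycles_per_tick parameter and the clock ratio are all hypothetical.

```python
class Simulator:
    """Hypothetical instruction-set simulator with a cycle counter."""

    def __init__(self, name, cycles_per_tick):
        self.name = name
        self.cycles_per_tick = cycles_per_tick  # relative clock rate (assumed)
        self.cycles = 0

    def step(self, cycles):
        # A real simulator would execute instructions here.
        self.cycles += cycles


def run_lockstep(simulators, ticks):
    """Advance every simulator one time slice at a time, in turn."""
    for _ in range(ticks):
        for sim in simulators:
            sim.step(sim.cycles_per_tick)


dsp = Simulator("dsp", cycles_per_tick=3)
cpu = Simulator("cpu", cycles_per_tick=2)
run_lockstep([dsp, cpu], ticks=100)
```

Because a single loop hands out the time slices, no simulator can run more than one slice ahead of the others, however the operating system schedules the host process.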
Targeted environment
For hardware layers, the connecting software is responsible for dealing with the differences between a single JTAG connection, attached conceptually to a single CPU, and a chained JTAG implementation that's potentially attached to multiple CPUs.
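The difference can be made concrete with a small sketch of a data-register scan. In a chained configuration, every device other than the one being addressed is normally placed in BYPASS, where it contributes a single bit to the scan path, so the software must pad the data it shifts in. The function name and the numbering of devices from TDI are assumptions made for illustration.

```python
def dr_scan_bits(chain_length, target_index, data_bits):
    """Build the bit sequence to shift into TDI for a data-register scan.

    Devices are numbered from TDI (index 0) to TDO. Every device except
    the target is assumed to be in BYPASS, so each adds one bit.
    """
    if not 0 <= target_index < chain_length:
        raise ValueError("target_index is outside the chain")
    after_target = chain_length - 1 - target_index  # devices nearer TDO
    before_target = target_index                    # devices nearer TDI
    # Bits shifted first travel farthest, so padding for the devices
    # nearer TDO goes first and padding for those nearer TDI goes last.
    return [0] * after_target + list(data_bits) + [0] * before_target


single = dr_scan_bits(1, 0, [1, 0, 1])   # one CPU: data goes in unpadded
chained = dr_scan_bits(3, 1, [1, 0, 1])  # middle of three: one pad bit each side
```

With a single connection the data passes through unchanged; in the three-device chain the same data is wrapped in one padding bit per bypassed device.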
"It has to know what those look like underneath, as the primary target for this kind of technology is a hardware environment with a single chip that has multiple CPUs in it," said Horn. But the system is equally happy with discrete processors on a board.
While the system has no theoretical scaling limits, Horn acknowledged that "this kind of interactive, step-through process does break down at a certain level. And this software version doesn't really address that."
Tasking will offer the Multi-Core Debug System as an extension to its existing line and plans modifications to its single-core debuggers to allow them to work with the multicore release.
"While we haven't figured out the exact bundling for deployment, we're looking at the end of the year for rollout," said Horn.