[Editor's note: For additional reading on the topics discussed in this article, see Using a multicore RTOS for DSP applications.]
You may find that the best combination of time to market, cost, and performance comes from a heterogeneous processor architecture, that is, a processor that includes both general-purpose processor (GPP) and digital signal processor (DSP) cores. Combining two or more processors in your design lets you draw on the strengths of each, increasing overall efficiency. Such a design, however, introduces new challenges for the software designer. How will you partition the system for optimal loading levels between the processors? How will you schedule work on independent processors so that dependent activities execute in order and with the lowest latency? And how can you optimize inter-processor communications so that the computational benefits of a heterogeneous design are not lost to data-transfer overhead?
In this article, we examine how to program a heterogeneous processor architecture using the proven method of the Remote Procedure Call (RPC), and how this method addresses the concerns listed above. We also explain the pitfalls the RPC introduces and show how they may be avoided.
Remote Procedure Call basics
The Remote Procedure Call concept dates back at least to the 1970s, and it first saw widespread use in the UNIX operating system in the 1980s. Whereas a Local Procedure Call (LPC), also known as a function call or subroutine call, is executed on the processor that issues the command, a Remote Procedure Call (RPC) is a command executed on a different processor.
In the terminology of the RPC, the command-issuing processor is known as the Client and the command-executing processor is known as the Server (see Figure 1). The Client sends the command and its parameters to the Server over a physical communications medium, possibly employing some kind of communications protocol or stack. We will refer to this physical medium and communications protocol together as the Inter-Processor Communications (IPC) Layer. For general-purpose computers, this layer is typically a network. For embedded processors, there are many options for implementing this layer, such as a PCI bus, a serial or parallel data port, or shared memory. Once the Server completes execution of the command, it sends a message back through the IPC Layer to the Client, providing any return values of the procedure.
Figure 1. A simplified view of the RPC software layers.
The actual RPC mechanism is implemented through stub functions. There is a client stub function and a server stub function for every remote procedure. The application calls the client stub function just as it would make a local procedure call. However, instead of performing the requested function, the client stub packages the command and any needed parameters into a message and passes the message to the IPC Layer for transmission. The IPC Layer on the Server receives the message and passes it to the server stub. The server stub then unpacks the command and parameters from the message and makes the function call, acting as a remote proxy for the application. When the function returns, the server stub packages any return values into a return message, which it sends via the IPC Layer back to the client stub. The client stub then returns the values to the application.
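The stub mechanism can be sketched in C as follows. This is a minimal illustration under stated assumptions, not a production design. The message layout (`rpc_msg_t`), the command ID `CMD_ADD`, and the function names `remote_add`, `server_stub_dispatch`, and `add_impl` are all hypothetical. The IPC Layer is simulated by two in-memory mailboxes, and the client stub invokes the server stub directly where a real system would send the message and wait for the reply. The sketch also assumes both cores share the same byte order and integer sizes, which a real heterogeneous design would have to guarantee in its marshalling code.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical command ID for the one remote procedure in this sketch. */
enum { CMD_ADD = 1 };

/* Hypothetical message layout: command ID, payload length, packed data. */
typedef struct {
    uint32_t command;      /* which remote procedure to run */
    uint32_t length;       /* number of valid payload bytes */
    uint8_t  payload[16];  /* packed parameters or return values */
} rpc_msg_t;

/* Stand-in for the IPC Layer: one mailbox per direction. On real hardware
   these might live in shared memory or be carried over a bus or port. */
static rpc_msg_t to_server;
static rpc_msg_t to_client;

/* The real procedure, which resides on the Server (for example, the DSP). */
static int32_t add_impl(int32_t a, int32_t b)
{
    return a + b;
}

/* Server stub: unpacks the request, calls the real procedure, and packs
   the return value into the reply message. */
static void server_stub_dispatch(void)
{
    int32_t a, b, result;

    assert(to_server.command == CMD_ADD);
    assert(to_server.length == 2 * sizeof(int32_t));

    memcpy(&a, to_server.payload, sizeof a);
    memcpy(&b, to_server.payload + sizeof a, sizeof b);

    result = add_impl(a, b);

    to_client.command = to_server.command;
    to_client.length = (uint32_t)sizeof result;
    memcpy(to_client.payload, &result, sizeof result);
}

/* Client stub: has the signature of an ordinary local function, but only
   packs the parameters, hands the message to the IPC Layer, and unpacks
   the reply. */
int32_t remote_add(int32_t a, int32_t b)
{
    int32_t result;

    to_server.command = CMD_ADD;
    to_server.length = (uint32_t)(2 * sizeof(int32_t));
    memcpy(to_server.payload, &a, sizeof a);
    memcpy(to_server.payload + sizeof a, &b, sizeof b);

    /* Simulated transfer: a real client would send the message and block
       (or poll) until the Server's reply arrives. */
    server_stub_dispatch();

    assert(to_client.command == CMD_ADD);
    assert(to_client.length == sizeof result);
    memcpy(&result, to_client.payload, sizeof result);
    return result;
}
```

From the application's point of view, a call such as `remote_add(2, 3)` looks exactly like a local procedure call and returns 5. All of the packing, transfer, and unpacking is hidden inside the two stubs.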