Many distributed real-time systems must meet strict real-time performance demands. Extremely high-performance middleware has evolved that can handle millions of messages per second with tenths of milliseconds of latency, many times faster than messaging systems developed for less demanding enterprise applications. Such technology is fast, flexible, scalable, deterministic and reliable. Just as important, the middleware supports a net-centric design paradigm that greatly eases system integration and evolution.
Real-world applications have driven the evolution of real-time middleware. For example, the Ship-Wide Area Network (SWAN) for the San Antonio class of U.S. Navy vessels comprises hundreds of computers and must be able to take a hit anywhere while continuing operations. That requires middleware supporting automatic discovery and redundant data sources, sinks, data paths and transports. A last-line defense system against incoming missiles and aircraft directs thousand-round-per-second depleted-uranium guns to track and shoot down incoming targets traveling at hundreds of miles per hour, coordinating high-speed radar and decision systems with fast automated guns. The system requires data to be distributed with submillisecond latencies to dozens of nodes, driving the middleware to meet extreme performance requirements. By connecting sensors, control and storage in this way, such applications also force real-time middleware to provide bandwidth management.
Real-time middleware today drives communications subsystems; fuses disparate sensors into a single world view; connects motion control, graphics and controls in flight simulators; and integrates sensors and command for unmanned vehicles. Each of these applications requires real-time response.
As is often the case, the best way to address complexity is with a simple concept: the publish-subscribe paradigm. Conceptually, with publish-subscribe, needed information is simply asked for and sent. The middleware matches senders to receivers, ensuring that each data "contract" is satisfied. In practice, many details must be specified; nonetheless, the overall model is intuitive and usable.
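The matching idea can be seen in a toy in-process sketch. The `Bus` class and topic name below are illustrative inventions, not any real middleware's API; a real implementation adds discovery, transports and QoS enforcement on top of this core.

```python
# Minimal sketch of publish-subscribe matching: subscribers ask for a topic,
# publishers send to it, and the "middleware" delivers to every match.
from collections import defaultdict

class Bus:
    """Matches publishers to subscribers by topic name."""
    def __init__(self):
        self._subscribers = defaultdict(list)

    def subscribe(self, topic, callback):
        # A subscriber declares interest; the bus records the match.
        self._subscribers[topic].append(callback)

    def publish(self, topic, sample):
        # A publisher sends; the bus delivers to all matched subscribers.
        for callback in self._subscribers[topic]:
            callback(sample)

bus = Bus()
received = []
bus.subscribe("RadarTrack", received.append)
bus.publish("RadarTrack", {"id": 7, "range_mi": 4.2})
```

Note that the publisher never names its receivers; the "contract" is the topic itself, which is what makes components easy to add and remove.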
Real-time designs differ most markedly from traditional middleware in tuning options. For example, most middleware is built on top of the Transmission Control Protocol. TCP was designed in the 1970s and provides reliable byte-stream connections between two computers. Because it ensures reliability, TCP is very useful, but it's also restrictive. For instance, TCP only supports communications between two computers. Its driving state machine has many timeouts, none of them user-settable. It supports reliability but hides away all the important details: how many times to retry dropped packets, how much memory to use, when to send retries and so on.
Real-time middleware, by contrast, is built on top of the User Datagram Protocol (UDP), a much simpler technology. The middleware offers quality-of-service (QoS) control of reliability, timing and timeouts; memory and resource usage; network transports; and priorities. It reacts gracefully when media drop packets, links go down and hardware fails. Most enterprise middleware uses a daemon- or broker-based architecture; real-time middleware is best based on a decentralized "peer-to-peer" protocol with no extra hops and no single point of failure. A local in-memory cache provides quick access to recent values without retransmission. Automatic self-discovery and configuration allow components to be added and removed dynamically on a live system without disruption.
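The difference from TCP's fixed state machine can be sketched as follows. Here the retry count is an application-chosen QoS parameter rather than a hidden protocol constant; `LossyLink`, `reliable_send` and the drop behavior are illustrative stand-ins for a UDP-like transport, not a real protocol.

```python
# User-settable reliability QoS over a lossy, UDP-like link (simulated).
class LossyLink:
    """Simulated link that drops the first few packets."""
    def __init__(self, drop_first_n):
        self.drops_left = drop_first_n
        self.delivered = []

    def send(self, sample):
        if self.drops_left > 0:
            self.drops_left -= 1
            return False          # packet lost; sender learns via NACK
        self.delivered.append(sample)
        return True

def reliable_send(link, sample, max_retries):
    """Retry a dropped sample up to a QoS-chosen number of times."""
    for attempt in range(max_retries + 1):
        if link.send(sample):
            return attempt        # retries actually used
    return None                   # give up: stale data is worthless in real time

link = LossyLink(drop_first_n=2)
retries_used = reliable_send(link, "track-42", max_retries=3)
```

With TCP, the equivalent of `max_retries` exists but is buried in the kernel; exposing it as QoS is what lets a designer trade reliability against latency per data stream.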
Real-time middleware must be blazingly fast. The first step is to slash overhead. Whereas older, client-server designs require a round-trip request/response cycle for each message, publish-subscribe has no request traffic for each message. Older middleware also uses central servers to coordinate data flow. Sending to an intermediate server at least doubles the latency of sending a "nonstop" peer-to-peer packet, because the packet must be both received and then sent a second time. In practice, servers may be loaded, congested or not immediately responsive for many reasons. Interposing a server into every transmission also doubles the total traffic on the network.
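The latency and traffic arithmetic is worth making explicit. The 50-microsecond hop cost below is an illustrative number chosen for the sketch; the doubling relationships hold regardless of the actual per-hop cost.

```python
# Back-of-the-envelope model of peer-to-peer vs. brokered delivery.
hop_us = 50                        # illustrative one-hop wire-plus-stack cost

p2p_latency = 1 * hop_us           # publisher -> subscriber directly
brokered_latency = 2 * hop_us      # publisher -> server -> subscriber

p2p_packets = 1                    # one packet crosses the network
brokered_packets = 2               # every message crosses the network twice
```

And this is the best case: a loaded or congested server adds queueing delay on top of the second hop.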
On current hardware and operating systems, the stack and raw network transport can handle about 50,000 messages per second. Batching multiple application-level messages into a single transport-level datagram allows server-based architectures to achieve throughput in the range of 100,000 messages per second with latencies of several milliseconds. Because peer-to-peer middleware skips all intermediaries, it is capable of nearly 3,000,000 application-level messages per second with batching. Peer-to-peer middleware can also deliver latencies below 65 microseconds with no appreciable variation.
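The rough arithmetic behind these figures: throughput is the raw datagram rate times the batch size. The batch sizes below are assumed values chosen to reproduce the article's numbers, not measured constants.

```python
# How batching multiplies application-level message throughput.
raw_datagrams_per_sec = 50_000     # what the stack and NIC can push

msgs_per_batch_broker = 2          # modest batching through a server
msgs_per_batch_p2p = 60            # assumed aggressive batching, no middleman

broker_throughput = raw_datagrams_per_sec * msgs_per_batch_broker  # 100,000
p2p_throughput = raw_datagrams_per_sec * msgs_per_batch_p2p        # 3,000,000
```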
Multicasting--the ability to send a single packet to many destinations--drastically cuts overall latency and raises effective throughput while increasing efficiency. Multicast can theoretically send information to 50 nodes 50 times faster than unicast. However, multicast reliability is a famously hard challenge.
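The theoretical speedup is simply the fan-out: with unicast the sender must transmit one copy per destination, while multicast puts a single packet on the wire.

```python
# Sender-side transmission count: unicast vs. multicast to N subscribers.
nodes = 50
unicast_sends = nodes              # one packet per destination
multicast_sends = 1                # one packet reaches all destinations
speedup = unicast_sends // multicast_sends   # 50x, in theory
```

In practice the gain is bounded by switch behavior and, above all, by the reliability problem the article notes: a retransmission strategy that works for one receiver can storm the network when fifty receivers NACK at once.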
QoS directly affects system performance and capability and includes:
• Reliability--choosing when, and how many times, to retry dropped transmissions.
• Bandwidth control--limiting the bandwidth a single publisher can use.
• Resource management--setting the memory for buffering and retries.
• Filters--delivering information only to those nodes that need it, by virtually partitioning the network, restricting the number of updates delivered per unit time or delivering only those updates whose content matches a filter.
• Fault tolerance--reacting to unexpected failures and situations.
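The QoS policies above amount to a per-stream contract that can be expressed as data. The field and profile names below are illustrative, not the DDS API; the point is that each subscriber to the same topic can hold a different contract.

```python
# Sketch of a QoS "contract" as plain data (field names are invented).
from dataclasses import dataclass
from typing import Optional

@dataclass
class QosProfile:
    reliable: bool = True                    # retry dropped transmissions?
    max_retries: int = 3                     # how many times to retry
    max_bandwidth_kbps: Optional[int] = None # cap a publisher's bandwidth
    history_depth: int = 16                  # memory for buffering and retries
    min_separation_s: float = 0.0            # time-based filter: max update rate
    content_filter: Optional[str] = None     # e.g. "range_mi < 5" (illustrative)

# Same topic, different contracts per subscriber:
hmi_qos = QosProfile(reliable=False, min_separation_s=10.0)
recorder_qos = QosProfile(reliable=True, history_depth=1024)
```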
Real-time middleware uses QoS settings to optimize network resources to fit the problem. The middleware, for instance, can simultaneously receive a thousands-of-updates-per-second stream of radar tracks, pass updates only every 10 seconds to an HMI operator station, reliably record every reading on a database server and update a collision checker only for those targets within a 5-mile radius, all while multicasting to hundreds of apps on the network.
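Two of the filters in that scenario, the 10-second HMI throttle and the 5-mile content filter, can be sketched directly. `TimeFilter`, `within_radius` and the track fields are illustrative helpers, not middleware APIs.

```python
# One radar stream, two differently filtered consumers.
class TimeFilter:
    """Time-based filter: at most one update per min_separation seconds, per target."""
    def __init__(self, min_separation):
        self.min_separation = min_separation
        self.last_sent = {}

    def accept(self, track):
        t_last = self.last_sent.get(track["id"], -float("inf"))
        if track["t"] - t_last >= self.min_separation:
            self.last_sent[track["id"]] = track["t"]
            return True
        return False

def within_radius(track, miles=5.0):
    # Content filter for the collision checker.
    return track["range_mi"] <= miles

hmi_filter = TimeFilter(min_separation=10.0)
stream = [{"id": 1, "t": 0.0,  "range_mi": 8.0},
          {"id": 1, "t": 0.5,  "range_mi": 4.5},
          {"id": 1, "t": 10.5, "range_mi": 3.0}]

hmi_updates = [s for s in stream if hmi_filter.accept(s)]    # throttled to 10 s
collision_updates = [s for s in stream if within_radius(s)]  # within 5 miles
```

The key architectural point is that the filtering happens in the middleware, per subscriber, so the publisher sends once and never knows which consumers throttled or filtered its data.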
Publish-subscribe middleware allows applications to pool information from many distributed sources and access it from many locations. Many label this new capability the "net-centric" or "data-centric" architecture.
A net-centric architecture fundamentally changes how easy it is to design and evolve a networked application; net-centric thinking transformed the design of the U.S. Navy's E-2C Hawkeye aircraft, for example. The net-centric architecture is more modular and maintainable than older, client-server based designs. It provides a structured overall design paradigm that allows expansion, changes and independent development.
The DDS standard
In 2005, the Object Management Group adopted a standard called the Data Distribution Service for Real-Time Systems. DDS is the first middleware specification that targets high-performance distributed systems. It includes both an API spec and a wire-protocol design.
DDS has become the rallying point for high-performance, standards-based middleware, especially in the military. The Defense Information Systems Agency has mandated DDS for all data-distribution applications in the U.S. military. Most major NATO weapons system designs are upgrading to DDS, and the technology forms the core communications capability for South Korea's most important new ship systems.
Real-time middleware delivers the performance and functionality needed to address the new realities facing the embedded industry.
Stan Schneider (Stan.Schneider@rti.com) is chief executive officer of Real-Time Innovations Inc., which he founded in 1991. He holds a PhD in electrical engineering and computer science from Stanford University.