They don't enjoy as much hype as hardware hot swap, but redundant network links play just as important a role in the system designer's quest for 99.999 percent ("five nines") availability and reliability. In fact, redundant links can serve a dual purpose: to give a network both greater fault tolerance and greater throughput. It doesn't matter whether the network is wide, local or, for that matter, very local, such as the network of line cards found within a telephony switch.
For instance, you could use multiple LAN links to connect two servers: If a link failed, data could be rerouted automatically across one or more of the remaining links. Likewise, you could use multiple links within a router to connect the line cards to the main CPU. If the main link went down, a standby link (typically a slower maintenance bus) could immediately transmit error information, alerting a system administrator to take action.
Reliability aside, redundant network links can boost throughput significantly. For instance, instead of an active/standby configuration, in which only one link is active at a time, one could implement an active/active configuration, where traffic is load-balanced across all links. With the latter approach, two identical links could double the throughput, three links could triple the throughput and so on, provided, of course, the CPU can generate data fast enough to match the capacity of the combined links.
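To make the active/active idea concrete, here is an illustrative sketch (not actual Qnet source) of one way a sender might pick among live links: queue each packet on the link that will finish transmitting it soonest, given what's already queued and each link's capacity. The `struct link` fields and the cost model are assumptions made for this sketch.

```c
#include <stddef.h>
#include <assert.h>

struct link {
    int    up;            /* 1 if the link is in service          */
    double capacity;      /* bytes per second the link can carry  */
    double queued_bytes;  /* bytes already waiting on this link   */
};

/* Return the index of the live link with the shortest expected
 * completion time for a packet of pkt_bytes, or -1 if none is up. */
int pick_link(struct link *links, size_t nlinks, double pkt_bytes)
{
    int best = -1;
    double best_time = 0.0;

    for (size_t i = 0; i < nlinks; i++) {
        if (!links[i].up || links[i].capacity <= 0.0)
            continue;
        double t = (links[i].queued_bytes + pkt_bytes) / links[i].capacity;
        if (best < 0 || t < best_time) {
            best = (int)i;
            best_time = t;
        }
    }
    if (best >= 0)
        links[best].queued_bytes += pkt_bytes;  /* enqueue on the winner */
    return best;
}
```

Note that graceful degradation falls out for free: a failed link is simply skipped, and the remaining links absorb the traffic.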
Unfortunately, conventional operating system architectures don't offer built-in support for redundant network links, whether those links are all the same media or some combination of switch fabric, fiber, Ethernet, serial and so on.
Consequently, it's up to the individual software developer to implement support for each link. That must be done "by hand" on an application-by-application basis.
For instance, each pair of software processes talking across a redundant network (for example, a client talking to a remote database) would have to be aware of each link: its characteristics, failure modes, integrity and so on. That's a problem, since the number of links, the kind of links and, significantly, the failover policy for each link can change from device to device, or even from installation to installation. In fact, a different failover policy may be required for each pair of processes talking across the links.
Should the developer try to code for every possibility, or recode every process whenever the network installation changes? That's an unattractive choice either way. Good design dictates a higher level of abstraction, where processes don't have to be hard-coded for each network configuration.
In a microkernel OS, where message passing forms the central method of interprocess communication, much of this abstraction is already in place. Say you have two processes: A and B. When A sends a message to B, it doesn't have to know which node B resides on. If B is local, the microkernel will route the message directly; if B is on another node, then a separate network manager can forward the message to that node. The important point is that it's exactly the same message either way-neither A nor B has to invoke any special code to "get networked."
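The shape of that transparency can be sketched in a few lines of C. This is a toy model, not QNX source: `msg_send()` stands in for the kernel's message-passing primitive, and `netmgr_forward()` stands in for the network manager. The names and structures are invented for illustration; the point is that the caller makes one identical call whether the destination is local or remote.

```c
#include <string.h>
#include <assert.h>

#define LOCAL_NODE 1

struct process {
    int  node;           /* node the process lives on        */
    char inbox[64];      /* last message delivered to it     */
    int  via_netmgr;     /* 1 if it arrived over the network */
};

/* Stand-in for the network manager: forwards to a remote node. */
static void netmgr_forward(struct process *dst, const char *msg)
{
    strncpy(dst->inbox, msg, sizeof dst->inbox - 1);
    dst->via_netmgr = 1;
}

/* One call for the application, local or remote alike. */
void msg_send(struct process *dst, const char *msg)
{
    if (dst->node == LOCAL_NODE) {   /* kernel routes it directly   */
        strncpy(dst->inbox, msg, sizeof dst->inbox - 1);
        dst->via_netmgr = 0;
    } else {                         /* hand off to network manager */
        netmgr_forward(dst, msg);
    }
}
```

Because the fork between "local" and "remote" lives entirely below `msg_send()`, neither sender nor receiver carries any network-specific code.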
With this level of network intelligence centralized in a separate network manager, it's logical to go one step further and have that manager provide failover and load-balancing policies as well. That way, any issues specific to redundant links also become neatly abstracted from each application.
Consider an actual implementation: the Qnet networking manager for the QNX Neutrino OS. Using Qnet, designers of redundant networks can choose from several network policies:
- Load balance. Packets are queued on the link that will deliver them the fastest, based on current load and link capacity. The policy uses the combined service of all links to maximize throughput and allows service to degrade gracefully if any link fails.
- Redundant. Every packet is sent over all links simultaneously. If a packet on link A arrives before the same packet on link B, the packet on link A "wins." Redundant packets that arrive later are quietly dropped. With this policy, service will continue without a stutter even if one link fails.
- Sequential. All packets are sent over one link until it goes down, at which point the second (or third or fourth) link is used.
- Preferred. This is like the sequential policy but falls back to load balancing if the specified link can't be used; that is, all available links are used to reach the remote node.
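The receive side of the redundant policy is worth a closer look, since it's the piece that makes "no stutter" possible. The sketch below is illustrative only, not Qnet source; per-packet sequence numbering is an assumption made for the sketch. The first copy of each packet to arrive, on whichever link, is delivered; later copies of the same packet are quietly dropped.

```c
#include <assert.h>

struct rx_state {
    unsigned next_seq;   /* next sequence number we expect to accept */
};

/* Return 1 if this packet should be delivered, 0 if a copy of it
 * already arrived on a faster link and it should be dropped. */
int rx_accept(struct rx_state *rx, unsigned seq)
{
    if (seq < rx->next_seq)
        return 0;               /* duplicate: already delivered */
    rx->next_seq = seq + 1;     /* accept and advance           */
    return 1;
}
```

If one link fails mid-stream, the receiver never notices: the surviving link's copies are simply the first (and only) ones to arrive.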
Of course, a policy appropriate for one application may not be appropriate for another; hence the need for the network manager to support multiple policies simultaneously.
So, what does happen when a failed link recovers? In a nutshell, the network manager sends periodic maintenance packets over any failed link; if the link has recovered, the manager will return the link to the pool of available links. What happens next depends on policy. For instance, if you've chosen sequential mode and the primary link recovers, then the network manager would stop using a secondary link and reroute traffic back to the primary link, the more desirable one.
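That probe-and-failback loop can be sketched as follows. This is an illustrative model, not Qnet source: `probe()` stands in for sending a maintenance packet, and the table-driven toy probe exists only so the sketch is self-contained.

```c
#include <assert.h>
#include <stddef.h>

struct seq_link {
    int up;  /* 1 if the link is in the pool of available links */
};

/* Called periodically: probe each down link with a maintenance
 * packet; a successful probe returns the link to the pool. */
void maintain(struct seq_link *links, size_t n, int (*probe)(size_t))
{
    for (size_t i = 0; i < n; i++)
        if (!links[i].up && probe(i))
            links[i].up = 1;
}

/* Sequential policy: always use the first (most preferred) live link,
 * so traffic moves back to the primary as soon as it recovers. */
int seq_pick(const struct seq_link *links, size_t n)
{
    for (size_t i = 0; i < n; i++)
        if (links[i].up)
            return (int)i;
    return -1;
}

/* Toy probe for the sketch: consults a table of which links answered. */
static const int *probe_table;
static int probe(size_t i) { return probe_table[i]; }
```

The failback itself costs nothing extra: once the primary is back in the pool, the sequential policy's normal selection naturally prefers it again.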
We've seen how a network manager, together with a microkernel OS architecture, can help insulate applications from networking concerns. But in reality, the network manager should itself be insulated from lower-level networking details, such as the protocol or media used for each network link. Say you need to write your own protocol to send data across a proprietary switch fabric. If the network manager makes any assumptions about the underlying protocol or media, then the manager would have to be recoded. Likewise, it may have to be recoded every time a protocol or a hardware driver is updated.
Microkernel architecture can address the problem, since the OS microkernel and its message-passing services act as a kind of software bus. Like the hardware buses of modern networking equipment, the software bus enables virtually any software service-be it an application, protocol stack or network driver-to run as a cleanly decoupled component that can be inserted, unplugged or upgraded independently of other services. In the case of the network manager, this means underlying protocols and drivers can change without the network manager's having to be recoded. In fact, those services could be upgraded dynamically on a live system without the network manager's knowing that anything has changed.
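One way to picture that decoupling in code: the network manager sees each driver only through a small interface of function pointers, so swapping the driver swaps the behavior without touching the manager. The interface shape and driver names below are assumptions invented for this sketch, not the QNX driver API.

```c
#include <assert.h>

struct net_driver {
    const char *name;
    int (*tx)(const char *pkt, int len);  /* returns bytes accepted */
};

/* The manager's transmit path never names a concrete driver. */
int manager_tx(struct net_driver *drv, const char *pkt, int len)
{
    return drv->tx(pkt, len);
}

/* Two interchangeable "drivers" for the sketch. */
static int ether_tx(const char *pkt, int len)
{
    (void)pkt;
    return len;                 /* pretend the whole frame went out */
}

static int fabric_tx(const char *pkt, int len)
{
    (void)pkt;
    return len > 8 ? 8 : len;   /* pretend fabric: 8-byte cells */
}

struct net_driver ether  = { "en0",     ether_tx  };
struct net_driver fabric = { "fabric0", fabric_tx };
```

In a real system the indirection would run through the OS's message-passing "software bus" rather than a bare function pointer, which is what allows the driver process itself to be stopped, upgraded and restarted live.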
So a microkernel OS can provide a "write once, network anywhere" architecture, where applications can communicate transparently across any number of redundant links, using any combination of network media. No network-specific code is necessary.
Nonetheless, to enable five-nines availability, an OS must provide more than network redundancy. It also has to stop and restart drivers to support hardware hot swap. It has to recover from software faults, without forcing a system reset. It has to allow applications, protocols and drivers to be upgraded dynamically, without service interruptions. All this requires an OS architecture in which components conventionally bound to the kernel-drivers, protocols, file systems and so on-exist as cleanly decoupled, memory-protected processes.