The looming wave of Internet-enabled "appliances" containing networked embedded systems is not only transforming the computer marketplace, it is now stimulating new views on what exactly a network is. This week's BookShelf contains excerpts from 'Protocol Management in Computer Networking' by Philippe Byrnes. He discusses a theory of networking that is inspired by control system theory-the natural milieu of the traditional embedded system. Control theory is viewed as the best context for understanding and managing data flow over networks. The theory offers a toolbox to constructively define how to manage a discrete event system, a particular case being the Internet itself. The full version of the book is available from Artech House Publishers at www.artech-house.com.
The Internet is arguably the greatest technological achievement since the microprocessor. A measure of its success can be seen in its explosive growth: since 1969 it has developed from an experimental testbed connecting a few dozen scientists into a commercial juggernaut, fueled by the explosion of the World Wide Web, connecting tens and even hundreds of millions of users every day.
When a technology has been this successful, it may be unwise to suggest a new approach to the subject. Nonetheless, that is the purpose of this book: To propose a new management model, derived from control systems engineering, with which to analyze communications protocols and data networks, from simple LANs to the Internet itself. As we hope to make clear, this is not a book about network management as the term is conventionally used but rather a book about management in networking.
There are several reasons for advancing a new management framework. First, it is our belief that management is ubiquitous in modern communications networks and their protocols, at every layer and in ways not commonly appreciated. Indeed we contend that layering, central to every modern protocol architecture, is itself an instance of (embedded) management: Layering works by mapping from the abstracted layer interface to implementation details, and this mapping is a management task. And yet, because of narrow and constricting definitions of what constitutes management, this fact is obscured.
Another example of this obscurity is the handling of faults. Most protocols include some mechanism(s) for the detection if not correction of communications faults that corrupt the transported data. Another type of fault recovery is provided by routing protocols and other methods of bypassing failed links and/or routers. Still another type of fault management is the collection of statistics for the purpose of off-line fault isolation. Our goal is to unify these, and for this we need a definition of management that treats all of these as instances of a common task.
Instead of such a common definition, today we are confronted by a reductionism that in the first place differentiates protocol management from network management. This book seeks to go beyond such distinctions by constructing from first principles a very broad formulation of management and utilizing it to identify management tasks whether they occur in communications protocols, routing updates, or even network design. To borrow a notable phrase from another field, we seek to demonstrate the "unity in diversity" of management in data networking.
And the foundation of this construction is the concept of the manager as a control system. More precisely, from control engineering we derive a framework for task analysis called Mesa, taking its name from its principal task primitives: measurement, estimation, scheduling, and actuation. Corresponding to these four tasks are four classes of servers that make up closed-loop control systems: sensors, which measure the system being controlled (also known as the plant); estimators (also known as observers), which estimate those plant parameters that cannot be easily measured; schedulers (also known as regulators), which decide when and how to change the plant; and actuators, which carry out the changes. The plant is a discrete event system, composed of a client and a server, along with storage for queued requests for service (RFSes). Everything in this book will revolve around this basic model, where management intervenes to control the rate at which work arrives from the client(s) and/or is executed by the server(s). The former we refer to as workload management and the latter we call bandwidth (or service) management.
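To make the framework concrete, here is a minimal sketch, in Python, of how the four Mesa primitives might be arranged around a discrete event plant. All class names, state variables, and the bandwidth-adding policy are invented for illustration and are not drawn from the book.

```python
# Hypothetical sketch: the four Mesa primitives (measurement, estimation,
# scheduling, actuation) arranged as a closed loop around a discrete event
# plant (client + server + queue of RFSes). Names and policy are invented.

from collections import deque

class Plant:
    """A discrete event system: clients submit RFSes, a server executes them."""
    def __init__(self, service_rate):
        self.queue = deque()              # storage for queued requests for service
        self.service_rate = service_rate  # how many RFSes are executed per interval

    def submit(self, rfs):
        self.queue.append(rfs)

    def step(self):
        for _ in range(min(self.service_rate, len(self.queue))):
            self.queue.popleft()

class Sensor:
    def measure(self, plant):
        return {"queue_length": len(plant.queue)}       # measurable state

class Estimator:
    def estimate(self, measurement):
        # Infer what cannot be measured directly, e.g. the backlog of offered work.
        return {"estimated_backlog": measurement["queue_length"]}

class Scheduler:
    def __init__(self, target_backlog):
        self.target = target_backlog                    # the goal (set-point)

    def decide(self, estimate):
        # Return difference between observed/estimated state and the goal.
        error = estimate["estimated_backlog"] - self.target
        return {"add_bandwidth": 1} if error > 0 else {}

class Actuator:
    def actuate(self, plant, decision):
        plant.service_rate += decision.get("add_bandwidth", 0)

def manage(plant, sensor, estimator, scheduler, actuator):
    """One iteration of the manager's closed loop (bandwidth management)."""
    m = sensor.measure(plant)
    e = estimator.estimate(m)
    d = scheduler.decide(e)
    actuator.actuate(plant, d)
    plant.step()

plant = Plant(service_rate=1)
mgr = (Sensor(), Estimator(), Scheduler(target_backlog=5), Actuator())
for t in range(20):
    plant.submit(f"rfs-{t}")    # the client offers work each interval
    manage(plant, *mgr)
```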
Foundation cracks
Another reason for a new approach is that networking suffers from cracks in its foundations. The theoretical framework on which networking has, at least nominally, been based for most of the last 20 years-epitomized by the OSI seven-layer model-has been rendered increasingly irrelevant by recent innovations and hybrids that have blurred once-solid distinctions. For example, from the 1960s onward, routing reigned supreme as the best-in fact the only-way to concatenate data links. When local-area networks (LANs) appeared, with their peer protocols and universally unique station addresses, it became possible to concatenate at Layer 2. Bridging, as this became known, and its modern ASIC-based incarnation, (frame) switching, are simpler and often faster at forwarding data than conventional Layer 3 routing.
Routing advocates counter that bridging/switching is considerably less efficient at using network resources and less robust at handling faults than routing. All this has led to heated arguments and uncertainty about how to design and implement large computer networks, as well as a slew of curious hybrids such as "Layer 3 switches," "Internet Protocol switches," and the like that may owe more to marketing hyperbole than any meaningful technical content.
The separation of the latter from the former requires, in part, a neutral vocabulary, and that is one by-product of our study of management. To repeat, our aim is to construct a unified theory of data networking, with unified definitions of tasks such as concatenating transporters, independent of the nature of the technology-bridges, routers, and so on-used in implementing networks.
Finally, the current model will not get us from where we are to where we want to go. The next generation of internetworks must be self-tuning as well as self-repairing. By monitoring traffic intensity (and its constituents, workload arrival and server bandwidth), throughput, and particularly response times, the managers of these networks will automatically adapt workload to changes in bandwidth, increasing the latter in reaction to growth in the former (for example, meeting temporary surges by buying bandwidth on demand, much as electric utilities do with the power grid interconnect).
Response time is particularly worth mentioning as a driving force in the next generation of internets: the visions of multimedia (voice and video) traffic flowing over these internets will only be realized if predictable response times can be assured. Otherwise, the effects of jitter (variable delays) will militate against multimedia usage. All of this points to the incorporation of the techniques and models of automatic control systems in the next generation of internetworking protocols.
Adapting to a goal that changes over time is precisely the tracking problem encountered in control engineering. Two tracking problems frequently used as examples in discussions of control systems are the home furnace/thermostat system, perhaps the simplest and certainly the most common control system people encounter, and the airplane control system, perhaps the most complex. Someone who wishes to keep a home at a certain temperature sets this target via a thermostat, which responds to changes in the ambient temperature by turning a furnace on and off. Similarly, a pilot who moves the throttle or positional controls (e.g., elevators, ailerons) is giving a goal to one or more control systems, which seek to match the target value. In both cases, as the goal changes the control system attempts to follow; hence, the term tracking problem.
A control system attempts to ensure satisfactory performance by the system being controlled, generally referred to as the plant. The control system's scheduler receives a high-level goal or objective. Obviously, the goal is expressed in terms of the desired state value(s) for the plant. The scheduler seeks to attain this state by means of actuations that change the plant. Naturally enough, the actuations are executed by the control system's actuator(s). The roles of sensor and estimator are complementary: For those state variables that can be measured, the sensor executes this task; however, in many instances the plant may have state variables that cannot be measured and must instead be estimated-this estimation is the task of estimators. Finally, this information on the state of the plant, obtained either by measurement or estimation, is fed back to the scheduler, which uses it to schedule the next actuation(s).
The obvious next question is "What about the plant are we measuring and/or actuating?" The plant is characterized by its state, that is, the set of variables and parameters that describe it. For example, the state of an aircraft is described by its motion, its position, its current fuel supply, and its currently attainable velocity; other variables and parameters may include its mass, its last servicing, the number of hours its engines have been ignited, and so on. The computer analogy of state is the program status word (PSW), the set of state information that is saved when a process is suspended by an operating system and that is retrieved when execution is resumed. The state is the set of information needed to adequately characterize the plant for the purposes of monitoring and controlling it.
The actuator is the server in the control system that changes the plant's state. When an actuator executes, it changes one or more of the state variables of the plant; otherwise, if no variable has changed, then by definition the execution of the actuator has been faulty or, put equivalently, the actuator has a fault. When the plant is an airplane, the actuators are its ailerons, its other wing surface controls (canards, variable geometry, and so on), its rudder, and its engines. By altering one or more of these, the airplane's position will change: open its throttle and the engine(s) will increase its acceleration and speed; pivot its rudder and the direction in which it is heading will correspondingly alter; change its wing surfaces and the amount of lift will alter, increasing or decreasing altitude. Note that not all of a plant's state variables may be directly actuatable; in this case those variables that cannot be directly actuated may be coupled by the plant's dynamics to variables that can be actuated, allowing indirect control.
The sensors in a control system provide information about the plant. Also, when it comes to sensors, a similar situation may arise: With many plants, it is impossible to measure all state variables for either technical or economic reasons. For this reason the literature on control and instrumentation introduces a distinction, which we will follow here, between state and output variables. Therefore, when we speak of a sensor it may be measuring a state variable or an output variable, depending on the plant involved.
An output variable does not describe the plant but rather its environment; in other words, an output variable represents the influence the plant has on its environment, and from which the state of the plant may be inferred. For example, there is no direct way to measure the mass of planetary bodies; however, by applying Newton's laws and measuring the forces these bodies exert, their respective masses can be estimated. This brings us to the role of the estimator in our control system model. Estimation is complementary to measurement. Using the state and/or output variables that can be measured, an estimator estimates those state variables that cannot be measured. Another term used by some authors for this is reconstruction.
Finally we come to the scheduler. Although this statement may seem obvious, a scheduler schedules. For example, at the heart of a flight control system is a scheduler (regulator) that decides when the various actuators should execute so as to change some aspect of the plant, such as the airplane's position or momentum. The classic feedback control system operates by comparing the state of the plant, measured or estimated, with the target state: Any discrepancy, called the return difference, is used to determine the amount and timing of any actuations to drive the plant toward the target state, often called the set-point. When the scheduler receives a target (goal), this is in fact an implicit request for service. The simplest example of a feedback control system is the furnace-thermostat combination. The room's desired temperature is set and a thermometer measuring the room's actual temperature provides the feedback; when the difference between the temperature setpoint and the actual temperature is sufficiently great, the furnace is triggered, actuating the room's temperature toward the setpoint.
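As an illustration of the return-difference idea, the following sketch simulates a crude thermostat/furnace loop. The room dynamics, deadband, and all constants are invented for the example.

```python
# A minimal thermostat/furnace feedback loop (illustrative only; the room
# dynamics and constants are made up for the sketch).

def thermostat_step(setpoint, room_temp, furnace_on, deadband=0.5):
    """Decide the furnace state from the return difference (setpoint - temp)."""
    error = setpoint - room_temp
    if error > deadband:
        return True        # too cold: turn the furnace on
    if error < -deadband:
        return False       # warm enough: turn the furnace off
    return furnace_on      # inside the deadband: leave the state unchanged

def simulate(setpoint=20.0, room_temp=15.0, steps=60):
    furnace_on = False
    for _ in range(steps):
        furnace_on = thermostat_step(setpoint, room_temp, furnace_on)
        heat_gain = 1.0 if furnace_on else 0.0    # furnace actuation
        heat_loss = 0.05 * (room_temp - 10.0)     # leakage to a 10 C exterior
        room_temp += heat_gain - heat_loss        # plant dynamics
    return room_temp

print(simulate())   # ends up oscillating near the 20 degree setpoint
```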
Although every control system has a scheduler and actuator, not all control systems rely on estimators or even feedback. Some schedulers make their decisions without regard to the state of the plant. These are referred to as open-loop control systems. In other instances, the plant's state variables are all accessible to measurement; this is called perfect information, and no estimator is required to reconstruct them. The most complicated control system, of course, is the scheduler-estimator-sensor-actuator, which is required when one or more state variables are inaccessible to measurement.
It is worth briefly mentioning the role of the model in control systems. There must be some understanding of the plant and its dynamics if the scheduler is to create the optimum schedule or, in many instances, any schedule at all. In addition, the design of the instrumentation components (sensors and estimators) is based on the model chosen; otherwise it will not be known what data must be collected. The sensors and estimators update the state of the model, and the scheduler uses the updated model to decide its next actuations.
A control system that uses feedback does so to correct discrepancies between the nominal model of the plant and the plant itself. Two common sources of these discrepancies are (exogenous) disturbances and uncertainties reflecting idealization in the mathematical models-the latter is introduced as a concession to the need for tractability. When a server suffers a fault, for example, this is an example of a disturbance.
When it suits our purpose we would like to abstract the details of a given control system (open-loop, closed-loop with perfect information, closed-loop with imperfect information) and simply denote its presence as a manager, which in our vocabulary is the same thing as a control system.
Let us now move back from a plant that is a plane being flown or a room being heated to a plant that is a discrete event system composed of a server and a client, where the client generates work for the server in the form of RFSes, which are stored in a queue awaiting execution if there is sufficient space.
Reduced to essentials, we are interested in controlling the performance of a discrete event system (also known as a queueing system). Later we will explain that the server is the computer network, itself a collective of individual servers, and the client is the collective set of computers seeking to exchange data; that is, in the case at hand, the plant is the communications network and its clients (i.e., digital computers). The function of the control system is to respond to changes in the network and/or its workload by modifying one or both of these. For now, however, let's keep it simple.
The performance of a discrete event system can be parameterized by such measures as delay, throughput, utilization, reliability, availability and so on. These are all determined by two (usually random) processes: the arrival process, which determines the rate at which work load arrives from the client; and the service process, which determines the rate at which the work load is executed by the server. This means that, for even the simplest discrete event (i.e., queueing) system, there are two degrees of freedom to the task of controlling the performance: actuating the arrival and service processes. Other ancillary factors include the maximum size of the queue(s), the number of servers, and details about how the queueing is implemented (input vs. output vs. central queueing), but the fundamental fact remains that overall performance is determined by the arrival and service processes.
The state of a discrete event plant is determined by the state of the client and the state of the server (leaving aside for now the question of queue capacity and associated storage costs).
In the terminology of queueing theory, the client is characterized by its arrival process/rate. The arrival rate, the reciprocal of the mean interarrival time, is denoted by the Greek letter lambda; more sophisticated statistical measures than simple means are sometimes used. Real-world clients generate requests for service (work) that can be described by various probabilistic distributions. For reasons of mathematical tractability the exponential distribution is most commonly used.
In addition, we must specify what the client requests. A client must, of necessity, request one or more types of tasks (this can include multiple instances of the same task). The task type(s) a client may request constitute its task set. Two clients with the same task set but with different arrival processes will be said to differ in degree. Two clients with different task sets will be said to differ in kind. Similarly, two servers with the same task set but with different service processes will be said to differ in degree, and two servers with different task sets will be said to differ in kind.
A server's task set is basically an atemporal or static characterization. To capture a server's dynamic behavior, we need to discuss its tasking-how many tasks it can execute at one time. The obvious answer to this is one or many-the former is single tasking and the latter is multitasking. However, we must further differentiate multitasking between serial multitasking and concurrent multitasking. In serial multitasking, two or more tasks may overlap in execution (i.e., their start and stop times are not disjoint) but at any given time the server is executing only one task. Concurrent multitasking, on the other hand, requires a server that can have two or more tasks in execution at a given moment. Concurrent multitasking implies additional capacity for a server over serial multitasking.
Obviously, if there is a mismatch between the task set of a server and the task set of its client(s), then there is a serious problem; any task that is requested that is not in the former will effectively be a fault. Corresponding to each task in a server's task set is a mean service rate, which we will refer to as the bandwidth of the server. It is denoted by the Greek letter mu. As with arrival rates, more sophisticated statistical measures than simple means are sometimes used. In all cases, however, one thing holds true: The service rate is always finite. This will, it turns out, have profound consequences on the type of work load manager that must be constructed.
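To see how lambda and mu jointly determine performance, consider the textbook M/M/1 case (exponential interarrival and service times, a single server, and an unbounded queue). The formulas below are standard queueing results rather than anything specific to this book, and the numbers are invented.

```python
# Illustrative M/M/1 figures; valid only when the arrival rate lam is
# strictly below the service rate mu (the finite-bandwidth constraint).

def mm1_metrics(lam, mu):
    assert lam < mu, "a finite service rate cannot absorb lam >= mu"
    rho = lam / mu                     # utilization
    mean_in_system = rho / (1 - rho)   # average number of RFSes in the plant
    mean_response = 1.0 / (mu - lam)   # average delay through queue + server
    return rho, mean_in_system, mean_response

print(mm1_metrics(lam=8.0, mu=10.0))   # (0.8, 4.0, 0.5)
```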
Now, there is an additional fact of life that complicates this picture: An actual server will have finite reliability. At times it will be unable to execute a task requested of it not because it has no additional bandwidth but because it has no bandwidth at all. There are several classes of faults that can cause problems in a server. A fatal fault, as its name indicates, "kills" a server; a server that has suffered a fatal fault cannot operate, even incorrectly. A partial fault will reduce a server's bandwidth and/or task set but neither totally.
We also must distinguish faults based on their duration. Some faults are persistent and disable a server until some maintenance action (repair or replacement) is undertaken by a manager. Other faults are transient: They occur, they disable or otherwise impair the operation of the server in question, and they pass. Exogenous disturbances are often the source of transient faults; when the disturbance ends the fault does, too. An immediate example is a communication channel that suffers a noise spike from an outside source such as lightning.
Management of a server in a discrete event plant amounts to managing its bandwidth, that is, its service rate, and, by extension, its task set. When we speak of bandwidth we mean its effective bandwidth (BWe), which is the product of its nominal bandwidth (BWn) and its availability A. Availability, in turn, is determined by the server's reliability R, typically measured by its mean time between failures (MTBF), and its maintainability M, typically measured by its mean time to repair (MTTR). A "bandwidth manager"-arguably a more descriptive term than "server manager" or "service rate manager"-can therefore actuate the bandwidth of a server by actuating its nominal bandwidth, its reliability, or its maintainability.
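Using the standard steady-state formula for availability, the relationship can be written out directly. The numbers below are invented, but they echo the point made next: very different servers can yield nearly the same effective bandwidth.

```python
# Effective bandwidth from the definitions above:
#   A   = MTBF / (MTBF + MTTR)   (standard steady-state availability)
#   BWe = BWn * A
# All figures are made up for illustration.

def availability(mtbf_hours, mttr_hours):
    return mtbf_hours / (mtbf_hours + mttr_hours)

def effective_bandwidth(nominal_bw, mtbf_hours, mttr_hours):
    return nominal_bw * availability(mtbf_hours, mttr_hours)

# A fast but less available server vs. a slower but highly available one:
print(effective_bandwidth(120.0, mtbf_hours=900, mttr_hours=300))  # 90.0
print(effective_bandwidth(92.0,  mtbf_hours=980, mttr_hours=20))   # 90.16
```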
Implementing the least-cost server entails making a set of trade-offs between these parameters. For example, a server with a high nominal bandwidth but low availability will have the same average effective bandwidth as a server with a low nominal bandwidth but high availability. Similarly, to attain a given level of average availability, a fundamental trade-off must be made between investing in reliability (MTBF) and maintainability (MTTR). A highly reliable server with poor maintainability (i.e., a server that seldom is down but when it is down is down for a long time) will have the same availability as a server that is less reliable but which has excellent maintainability (i.e., is frequently down but never for a long time). In both of those trade-off situations, very different servers can be implemented with the same averages, although it should be noted that the standard deviations will be very different.
When is a server's bandwidth (and/or other parameters) actuated? The first occasion is during its design and implementation; up to this point, the server does not exist. Implementation is an actuation of the server's nominal bandwidth from zero, which is what it is before the server exists, to some positive value, and an actuation of its task set from null to nonempty. Although it seems obvious to say, bandwidth management is open loop in the design phase since there is as yet nothing to measure. Based on measurements and/or estimates of the client's demand, management will schedule the actuation of the server and its components.
One question has been studied considerably: Can a reliable server be constructed out of unreliable ones? The answer is yes, and the key is the use of redundancy. For this reason, implementing a reliable server is much easier when the plant is a digital plant that can be replicated at will, subject to limitations in cycles and storage. We will explore ways to do this in more detail in the chapters that follow, from coding theory to data link redundancy (bonding) to protocol retransmission to dynamic routing.
After this there is a server extant, and this means that bandwidth management may be, if desired, closed loop. The next instance of actuating a server's bandwidth generally occurs after a fault. As we remarked earlier, all servers have finite reliability. A server that is disabled by a fault has a reduced bandwidth. A partial fault may reduce the bandwidth but still leave a functioning server, whereas a fatal fault reduces the bandwidth to 0. Restoring some or all of the server's lost bandwidth is obviously an instance of bandwidth management.
This task is typically divided into three components: fault detection, isolation, and repair (or replacement). Of these, fault detection involves the measurement of state and/or output variables to detect anomalous conditions. For example, high noise levels in a communications line can indicate a variety of faults, and vibrations at unusual frequencies can indicate mechanical faults. Fault isolation generally requires estimators, since it entails a process of inference to go from the "clues" that have been measured to identifying the failed component(s) of the server. The actuation of the server is effected in the last phase, repair or replacement. The reason this is bandwidth actuation is that after a successful repair the bandwidth of the server is restored to the status quo ante.
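A toy sketch of the three components follows; the threshold, component names, and wiring are all hypothetical.

```python
# Hypothetical sketch of the three maintenance components: fault detection
# (measurement), isolation (estimation/inference), and repair (actuation
# that restores the status quo ante bandwidth).

NOISE_THRESHOLD = 0.7   # invented threshold for "anomalous" line noise

def detect(measurements):
    """Flag anomalous conditions from measured output variables."""
    return [line for line, noise in measurements.items() if noise > NOISE_THRESHOLD]

def isolate(suspect_lines, wiring):
    """Infer which component failed from which lines show symptoms."""
    return {wiring[line] for line in suspect_lines}

def repair(bandwidths, failed_components, nominal):
    """Replace the failed components, restoring their nominal bandwidth."""
    for component in failed_components:
        bandwidths[component] = nominal[component]
    return bandwidths

measurements = {"line1": 0.2, "line2": 0.9}
wiring = {"line1": "cardA", "line2": "cardB"}
failed = isolate(detect(measurements), wiring)                        # {'cardB'}
print(repair({"cardA": 10, "cardB": 0}, failed, {"cardA": 10, "cardB": 10}))
```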
It might seem from the preceding discussion that a bandwidth manager must be closed loop to effect maintenance; and while feedback undoubtedly reduces the time from the occurrence of a fault to the server having its bandwidth restored, there are circumstances under which open-loop maintenance policies might be used instead. Such policies as age replacement and block replacement require the bandwidth manager to replace components of the server irrespective of their condition; such a policy will result in any failed components eventually being replaced, and many failures being prevented in the first place, albeit at the cost of discarding many components with useful lifetimes left.
Typically, however, bandwidth managers responsible for maintaining servers are closed loop. Indeed, in the absence of sensors and estimators to infer the server's condition, the incidence of latent faults will only increase. A major part of most bandwidth managers is therefore the instrumentation of the server to monitor its condition. In fact, it can even be argued that the maintainability of a server is one measure of the service rate of the bandwidth manager responsible for fault detection, isolation, and recovery (repair or replacement); on that view, an investment in instrumentation that reduces downtime increases the bandwidth of the bandwidth manager itself.
Finally we come to deliberately upgrading or improving the server's bandwidth, as opposed to merely restoring it after a fault. The two basic degrees of freedom here are (1) the bandwidth of the server and (2) its task set. Consider first the instance where we have a server that can execute multiple types of tasks but we can change neither its total bandwidth nor its task set. By holding both of these constant, we can still change the bandwidth allocated to each task and this is still a meaningful change. An example would be to alter the amount of time allocated to servicing the respective queues of two or more competing types of tasks, such as different classes of service or system vs. user applications. Because we are not changing the tasks the server can execute, the "before" and "after" servers differ only in degree, not kind. We will therefore refer to this as actuation of degree.
Another variant of actuation of degree is possible, namely, holding the server's task set constant but changing the total bandwidth of the server. An example would be replacing a communications link with one of higher speed, say going from 10Base-T to 100Base-T without adding any stations. The task set would be unchanged but the bandwidth would be increased. We refer to this as actuation of degree as well, but to distinguish the two cases we will call this actuation of degree2 and the first type actuation of degree1. If a server can execute only one task, this of course collapses to a single choice, namely, actuation of degree2.
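The two actuations of degree can be contrasted in a few lines of illustrative code; the task classes and figures are invented.

```python
# Actuation of degree1: reallocate bandwidth among task types, with the
# total bandwidth and the task set held constant.
# Actuation of degree2: scale the total bandwidth, with the task set
# (and relative shares) held constant.

def actuate_degree1(allocations, shifts):
    new = dict(allocations)
    for task, delta in shifts.items():
        new[task] += delta
    assert sum(new.values()) == sum(allocations.values())   # total unchanged
    return new

def actuate_degree2(allocations, factor):
    return {task: bw * factor for task, bw in allocations.items()}

alloc = {"system": 4, "user": 6}                          # Mb/s per task class
print(actuate_degree1(alloc, {"system": 2, "user": -2}))  # {'system': 6, 'user': 4}
print(actuate_degree2(alloc, 10))                         # a 10Base-T to 100Base-T style upgrade
```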
Changing a server's task set does transform it into a different type of server and we call this last type of change actuation of kind. Changing the task set of a server often entails significant alteration of its design and/or components. Of course, it can be as simple as adding a new station to a LAN. Generally, though, actuation of kind is the most complicated and extensive of the changes possible in bandwidth management.
Note that changing the nominal service rate and/or task set is not something undertaken easily or often. In some cases servers have two or more nominal service rates that a bandwidth manager can actuate between, perhaps incurring higher costs or an increased risk of faults as the price of the higher bandwidth. For example, increasing the signal levels in a communications channel can improve noise resistance but reduce the lifetime of the circuits because of increased heat. An example of a server that has several service rates is a modem that can operate at several speeds, depending on the noise of the communications channel.
Now we come to workload managers. The need for, indeed the very existence of, workload management is a concession to the inescapable limits of any implementable server. This means, as we just discussed, accommodating a server's finite bandwidth and reliability. And, just as we identified three levels of actuation for changing the task set and/or service rate(s) of a server, so there are three levels of workload actuation.
The first level of workload management is access and flow control. A server with limited (i.e., finite) bandwidth cannot service an unlimited number of RFSes. (In addition, although we did not dwell on it in the state description given earlier, the limits on queue size often constitute even greater constraints than the fact that bandwidth is necessarily finite.) A basic workload manager will actuate only the interarrival distribution, that is, the arrival process. We will refer to this as actuation of degree1.
Various mechanisms can be used to actuate the arrival rate so as to allocate scarce resources (bandwidth, queue space, and so on). These mechanisms can be broadly divided into coercive and noncoercive. Coercive mechanisms include tokens, polling, and other involuntary controls; arrival rates can also be changed coercively by buffering and/or discarding workload, either inbound or outbound. Noncoercive mechanisms revolve around issues of pricing and cost: raising "prices" to slow down arrivals, lowering them to increase arrivals. Note that coercive and noncoercive mechanisms can be combined.
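As one concrete instance of a coercive, token-based mechanism, a token bucket paces admissions into the server's queue. The sketch below is a generic illustration with invented parameters, not a mechanism described in this book.

```python
# A token bucket that actuates the arrival rate: requests are admitted only
# when tokens are available; otherwise they are buffered or discarded.

import time

class TokenBucket:
    def __init__(self, rate, capacity):
        self.rate = rate            # tokens added per second (sustained rate)
        self.capacity = capacity    # maximum burst size
        self.tokens = capacity
        self.last = time.monotonic()

    def admit(self, cost=1):
        now = time.monotonic()
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= cost:
            self.tokens -= cost
            return True             # RFS admitted toward the server
        return False                # RFS held back: the arrival process is actuated

bucket = TokenBucket(rate=100.0, capacity=10)
admitted = sum(bucket.admit() for _ in range(50))
print(admitted)   # roughly the burst capacity (about 10) on a cold start
```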
Message vs. packet
The preemptive nature of packet switching is what distinguishes it from message switching, another technique for serially reusing (sharing) a transporter. With message switching, transporters are shared serially among two or more clients but the plant each client requests to be transported is sent intact.
Time slicing has another benefit, namely, fault management. Because the plant is now divided into components, these constitute units of recovery that are much smaller than the whole message.
A fault that occurs during the transportation of the plant is unlikely to affect all of the components, meaning that only the affected components need be retransmitted.
Such replication is a powerful technique for managing transient faults in digital servers; by replicating the plant for either concurrent execution or serial reexecution, the effects of transient faults can be mitigated.
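To put a rough number on the fault-management benefit of time slicing, the following back-of-the-envelope comparison assumes independent, equally likely corruption of each packet, which is an idealization.

```python
# How much data must be retransmitted after a single pass when a message is
# sent intact (message switching) versus split into packets (packet switching),
# under independent per-packet corruption with probability p_packet_error.

def expected_retransmit_whole(message_bytes, packets, p_packet_error):
    # Message switching: any error forces the whole message to be resent.
    p_any_error = 1 - (1 - p_packet_error) ** packets
    return message_bytes * p_any_error

def expected_retransmit_packets(message_bytes, packets, p_packet_error):
    # Packet switching: only the corrupted packets (units of recovery) are resent.
    return message_bytes * p_packet_error

print(expected_retransmit_whole(1_000_000, packets=100, p_packet_error=0.01))   # ~634,000 bytes
print(expected_retransmit_packets(1_000_000, packets=100, p_packet_error=0.01)) # 10,000 bytes
```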