In embedded environments, where the reliability of one element of a design often dictates the reliability of the whole, issues such as real-time correctness are best addressed through "design-in" techniques rather than the traditional "test-in" procedures that check timing performance at the end of development. This avoids the costly problems that can arise when timing faults are found late in testing or, worse still, after deployment.
There are two common misconceptions regarding hard real-time systems. The first is that hard real-time means "very fast." However, hard real-time is best described as "when it absolutely positively has to be done on time." The second misconception is that hard real-time means safety-critical, and that if a system cannot kill someone, there is no need to worry too much about meeting deadlines.
In fact, a hard real-time system is one where the consequences of missing a deadline are serious. If injury or death results from a missed deadline then this is clearly serious and we have a hard real-time system. However, there are also many systems where the consequences of missing a deadline, even occasionally, are economically unacceptable, and these are hard real-time systems too.
It's clear that if we want to build a hard real-time system then it is essential to be able to find out, before the system is deployed, whether deadlines could be missed or not. There are a number of scheduling techniques that are appropriate for building hard real-time systems. Static techniques, such as simple cyclic scheduling and major/minor cyclic scheduling, offer good support for determining real-time correctness. However, their deficiencies with regard to efficiency and flexibility can make it impossible to build an affordable solution. Preemptive scheduling overcomes these problems, and by using scheduling mathematics such as Deadline Monotonic Analysis (DMA) it is possible to analyze a system to determine the worst-case response times for tasks and interrupts and verify the timing behavior of the overall system.
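To illustrate the kind of calculation such an analysis performs, the following is a minimal sketch of the standard fixed-priority response-time recurrence, R = C + sum over higher-priority tasks j of ceil(R / T_j) * C_j, with priorities assigned in deadline-monotonic order. The task names and figures are hypothetical. The sketch assumes independent tasks with deadlines no longer than their periods, and it ignores blocking, release jitter and scheduling overhead, all of which a real analysis would need to include.

```python
def response_times(tasks):
    """Worst-case response times under preemptive fixed-priority scheduling.

    tasks: list of (name, C, T, D) with integer times, where C is the
    worst-case execution time, T the minimum inter-arrival time and D the
    relative deadline (assumed D <= T).
    Returns {name: (response_time, meets_deadline)}.
    """
    # Deadline-monotonic assignment: the shorter the deadline, the higher
    # the priority.
    ordered = sorted(tasks, key=lambda task: task[3])
    results = {}
    for i, (name, c, _t, d) in enumerate(ordered):
        higher = ordered[:i]
        r = c
        while r <= d:
            # Each higher-priority task can be released ceil(r / T_j) times
            # in a window of length r; -(-a // b) is integer ceiling division.
            nxt = c + sum(-(-r // tj) * cj for _, cj, tj, _ in higher)
            if nxt == r:      # fixed point reached: r is the response time
                break
            r = nxt
        # If the loop ended with r > d, r is only the first value found
        # beyond the deadline, which is enough to show a possible miss.
        results[name] = (r, r <= d)
    return results


# Hypothetical task set: (name, C, T, D) in arbitrary integer time units.
tasks = [("sensor", 1, 4, 4), ("control", 2, 8, 6), ("logger", 3, 20, 20)]

for name, (r, ok) in response_times(tasks).items():
    print(f"{name}: worst-case response {r}, deadline met: {ok}")
```

For this hypothetical task set the computed worst-case response times are 1, 3 and 7 time units, all within their deadlines.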
A better solution is to use temporal development processes that emphasize "design-in" rather than "test-in". Testing is commonly used to try to find out the worst-case timing behavior of a system. However, it is a technique that is no longer adequate for many highly deterministic, real-time embedded systems. In testing, the longest observed response time is generally less than the worst-case response time. This is because the worst-case response time only occurs when a task's execution follows the worst-case path through the implementation and, at the same time, suffers the maximum amount of interference from other system events such as interrupts and other tasks.
As a consequence it might be assumed that a deadline is satisfied, but in fact the deadline could be missed once the system is deployed. This is particularly problematic for products in mass production. The amount of effort that can be invested in testing is insignificant compared to the amount of use the product will get after deployment.
For example, an average automobile has a driving lifetime of 2500 hours. Multiply this by the number of automobiles of a particular model and it is easy to see that, in the field, billions of hours of use are possible. In testing, only a tiny fraction of this is economical, and therefore there will almost always be a residual probability of a missed deadline. While this residual probability may be extremely small, given the high usage the product will experience following deployment, it is highly likely that the missed deadline will be experienced in the field.
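The arithmetic behind this argument can be sketched as follows. Only the 2500-hour lifetime comes from the text; the fleet size, the number of test hours and the residual miss rate are assumptions chosen purely for illustration, and modeling misses as a Poisson process with a constant rate is also an assumption.

```python
import math

HOURS_PER_VEHICLE = 2500        # driving lifetime quoted in the text
FLEET_SIZE = 1_000_000          # assumption: vehicles of one model
TEST_HOURS = 10_000             # assumption: total hours of testing
MISS_RATE_PER_HOUR = 1e-7       # assumption: residual deadline-miss rate


def probability_of_at_least_one_miss(rate_per_hour, hours):
    """P(at least one miss) = 1 - exp(-rate * hours) under a Poisson model."""
    # expm1 keeps precision when rate * hours is very small.
    return -math.expm1(-rate_per_hour * hours)


field_hours = HOURS_PER_VEHICLE * FLEET_SIZE
p_test = probability_of_at_least_one_miss(MISS_RATE_PER_HOUR, TEST_HOURS)
p_field = probability_of_at_least_one_miss(MISS_RATE_PER_HOUR, field_hours)

print(f"field exposure: {field_hours:,} hours")
print(f"chance of seeing a miss during testing: {p_test:.4%}")
print(f"chance of a miss somewhere in the field: {p_field:.4%}")
```

With these assumed figures the fleet accumulates 2.5 billion hours of use. The chance of observing the miss during testing is about 0.1 percent, while a miss somewhere in the field is effectively certain.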
One of the goals during development is to limit the length of time between the introduction of an error and its detection and, ultimately, its removal. This is important because the longer an error goes undetected, the more expensive it will be to fix.
The costs grow exponentially with the time between the insertion of the error and its detection, due to the increasing number of design decisions that need to be revisited. So, returning to the issue of temporal errors, it can be seen that testing will not detect the error until the integration or acceptance testing stage. However, the problem may have been introduced as early as the architectural design, and therefore many of the design decisions will need to be examined and possibly changed to fix the problem.
One approach would be to employ schedulability analysis, such as DMA, as part of integration or acceptance testing. This would then confirm whether or not the system met its temporal requirements prior to deployment, thus removing the risk of failing to detect temporal errors through testing. On the surface this seems great, as the cost of product recalls and the potential damage to a company's reputation can be enormous if problems make it into the field.
However, while the application of this analysis does prevent the deployment of systems with temporal errors, it does nothing to reduce the amount of rework necessary for such systems. Temporal errors will still only be discovered during integration or acceptance testing, even if the error was introduced during architectural design.
The problem here is that the feedback loop from the introduction of a temporal error to its detection is too long. This leads to large rework costs and high project risk. While the analysis will prevent temporal errors from making it into the field, if they are only discovered weeks before deployment the project is exposed to huge risk.
For this reason the temporal behavior should be "designed-in" throughout the development process. The verification of temporal behavior should begin as soon as possible and continue throughout the development, as is the case for functional behavior. This minimizes the length of time from introduction of an error to its detection.
The best place to employ such temporal design techniques is early in the development process at design verification. Using event-response requirements established as part of the requirements specification, and execution time estimates for the implementation, a proposed design can be verified. Schedulability analysis can be applied to allow the developer to examine whether a proposed design will be able to meet its temporal requirements.
Clearly at this stage it is not possible to perform a completely accurate analysis. However, fundamental temporal problems can be uncovered.
Analysis can also be used at this stage to investigate the boundaries of the design. This can be used in two ways. First, it can be used to identify areas of high risk. Sensitivity analysis determines the margin for error in the estimates used in the analysis. Where these margins are small, there are obviously higher levels of risk. Second, for systems where temporal problems have been identified, sensitivity analysis can identify the most beneficial areas at which to target corrective action.
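One simple form of sensitivity analysis can be sketched as follows: for each task, search for the largest execution time that still leaves every task schedulable, and report the difference from the current estimate as the margin. The task set, the function names and the simplified response-time test (independent tasks, no blocking, jitter or overhead, deadlines no longer than periods) are all assumptions made for illustration, not a description of any particular tool.

```python
def schedulable(tasks):
    """tasks: list of (C, T, D) integer tuples, highest priority first."""
    for i, (c, _t, d) in enumerate(tasks):
        r = c
        while r <= d:
            # Response-time recurrence; -(-a // b) is ceiling division.
            nxt = c + sum(-(-r // tj) * cj for cj, tj, _ in tasks[:i])
            if nxt == r:
                break
            r = nxt
        if r > d:
            return False
    return True


def max_execution_time(tasks, index):
    """Largest integer C for tasks[index] that keeps all tasks schedulable.

    Returns None if the task set is unschedulable even with the current C.
    """
    c, t, d = tasks[index]

    def ok(candidate):
        trial = list(tasks)
        trial[index] = (candidate, t, d)
        return schedulable(trial)

    if not ok(c):
        return None
    # Increasing C never improves schedulability, so binary search is valid.
    # C can never usefully exceed the task's own deadline.
    lo, hi = c, d
    while lo < hi:
        mid = (lo + hi + 1) // 2
        if ok(mid):
            lo = mid
        else:
            hi = mid - 1
    return lo


# Hypothetical estimates: (C, T, D), already in priority order.
tasks = [(1, 4, 4), (2, 8, 6), (3, 20, 20)]

for i, (c, _t, _d) in enumerate(tasks):
    limit = max_execution_time(tasks, i)
    print(f"task {i}: estimate {c}, limit {limit}, margin {limit - c}")
```

For these hypothetical figures the limits are 2, 4 and 9 time units. The highest-priority task can only absorb one extra time unit, so its estimate is the one that deserves the most scrutiny.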
As the design proceeds into the detailed design stage, the model used for the analysis can be refined and its accuracy improved. The estimates defined during the design verification are used as budgets for the task implementations. During the detailed design these budgets can be broken down and allocated to individual elements of functionality.
Developers are then responsible for providing an implementation of this functionality that consumes no more than its allotted execution time budget. Also during detailed design, shared resources and inter-task communications will be identified.
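A budget breakdown of this kind could be tracked with something as simple as the sketch below. The task budget, the element names and the measured figures are hypothetical; the point is only that allocations must fit within the task budget and that each element can be checked against its own share.

```python
TASK_BUDGET_US = 2000  # hypothetical task budget from design verification

# Hypothetical allocation of the task budget to elements of functionality.
allocations = {"read_adc": 300, "filter": 900, "update_output": 600}

# Hypothetical execution times obtained later from the implementation.
measured = {"read_adc": 280, "filter": 1010, "update_output": 450}


def unallocated(task_budget, allocations):
    """Budget still available; a negative value means over-allocation."""
    return task_budget - sum(allocations.values())


def over_budget(allocations, measured):
    """Map each element that exceeds its budget to the size of the overrun."""
    return {
        name: measured[name] - budget
        for name, budget in allocations.items()
        if name in measured and measured[name] > budget
    }


print("unallocated budget (us):", unallocated(TASK_BUDGET_US, allocations))
print("overruns (us):", over_budget(allocations, measured))
```

Here 200 microseconds remain unallocated and the hypothetical filter element overruns its share by 110 microseconds, which flags it for rework or for a revised allocation followed by a re-run of the analysis.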
As the development proceeds to the coding stage the model can be refined still further to reflect the actual implementation - execution time, resource sharing and inter-task communication information can all be updated. This process is likely to continue into the testing stages as unit testing and integration testing may well be used to obtain execution time information.
In addition, the application of advanced schedulability analysis may well provide a feedback loop into various aspects of the implementation. For example, it is possible for the analysis to determine the maximum buffer sizes required, the most efficient interrupt scheme and the clock speed required to satisfy all temporal requirements.
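As one example of this feedback, the sketch below picks the lowest clock frequency from a list of candidates at which every task still meets its deadline. The cycle counts, periods, deadlines and candidate frequencies are hypothetical. The sketch assumes that execution time scales inversely with clock frequency, which ignores effects such as memory wait states, and it uses the same simplified response-time test as before (independent tasks, no blocking, jitter or overhead).

```python
def schedulable(tasks):
    """tasks: list of (C, T, D) integer tuples, highest priority first."""
    for i, (c, _t, d) in enumerate(tasks):
        r = c
        while r <= d:
            # Response-time recurrence; -(-a // b) is ceiling division.
            nxt = c + sum(-(-r // tj) * cj for cj, tj, _ in tasks[:i])
            if nxt == r:
                break
            r = nxt
        if r > d:
            return False
    return True


def min_clock_mhz(cycle_tasks, candidates_mhz):
    """Lowest candidate frequency at which all deadlines are met, or None.

    cycle_tasks: list of (cycles, T_us, D_us), highest priority first.
    """
    for f in sorted(candidates_mhz):
        # cycles / MHz gives microseconds; round up to stay pessimistic.
        timed = [(-(-cycles // f), t, d) for cycles, t, d in cycle_tasks]
        if schedulable(timed):
            return f
    return None


# Hypothetical workload: (worst-case cycles, period in us, deadline in us).
cycle_tasks = [
    (8_000, 1_000, 1_000),
    (24_000, 5_000, 4_000),
    (40_000, 20_000, 20_000),
]

print(min_clock_mhz(cycle_tasks, [8, 16, 24, 32, 48]), "MHz")
```

For this hypothetical workload the second task misses its deadline at 8 MHz, and 16 MHz is the lowest candidate at which all three tasks are schedulable.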
Applying temporal verification repeatedly, from an early stage of development, shortens the feedback loops. Hence, it allows temporal problems to be located as quickly as possible after their introduction. It also makes temporal correctness part of the design process from the inception of the project.