[Part 3 explains how the RTOS kernel prioritizes tasks. It also explains dynamic memory allocation, system calls, and hardware interrupts. Part 5 shows how to protect critical code and resources using semaphores, spin locks, and other techniques. It also explains how to synchronize interdependent tasks.]
This article will outline the major issues with implementing multitasking systems. We will first discuss the danger of deadlock. We will also discuss the ways in which multitasking systems share common resources. Designers of DSP-based systems and embedded systems in general must constantly manage the allocation of scarce resources; these may include on-chip memory, the DMA controller, a peripheral, an input or output buffer, etc. This article will introduce ways to synchronization on shared resources as well as other forms of task synchronization.
One common danger is that of "deadlock," also known as deadly embrace. A deadlock is a situation in which two computer programs sharing the same resource are effectively preventing each other from accessing the resource, resulting in both programs ceasing to function. The earliest computer operating systems ran only one program at a time. In these situations, all of the resources of the system were always available to the single running program, and thus deadlock wasn't an issue. As operating systems evolved, they began to support timesharing and to interleave the execution of multiple programs. In early timesharing systems, programs were required to specify in advance what resources they needed so that they could avoid conflicts with other programs running at the same time. Still, deadlocks weren't a problem. But then, as timesharing systems evolved, some began to offer dynamic allocation of resources. These systems allowed a program or task to request further allocation of resources after it had begun running. Suddenly deadlock was a problem—in some cases a problem that would freeze the entire system several times a day.
The following scenario illustrates how deadlock develops and how it is related to dynamic (or partial) allocation of resources:
- Program 1 requests resource A and receives it.
- Program 2 requests resource B and receives it.
- Program 1 requests resource B and is queued up, pending the release of B.
- Program 2 requests resource A and is queued up, pending the release of A.
- Now neither program can proceed until the other program releases a resource. The OS cannot know what action to take. At this point the only alternative is to abort (stop) one of the programs.
As our short history of operating systems illustrates, several conditions must be met before a deadlock can occur:
- Mutual exclusion – Only one task can use a resource at once.
- Multiple pend and wait – There must exist tasks which are holding resources while waiting for others.
- Circular pend – Circular chain of tasks hold resources that are needed by others in the chain (cyclic processing).
- Preemption – A resource can only be released voluntarily by a task.
Unfortunately, these prerequites are quite naturally met by almost any dynamic, multitasking system, including the average real-time embedded system.
"Indefinite Postponement" is another dangerous system condition in which a task that wishes to gain access to a resource is never allowed to do so because there are always other tasks gaining access before it. This condition is also called starvation or lockout.
There are two basic strategies for handling deadlock: make the programmer responsible or make the operating system responsible. Large, general-purpose operating systems usually make it their responsibility to avoid deadlock. Small footprint RTOSs, on the other hand, usually expect the programmer to follow a discipline that avoids deadlock.
Deadlock Detection and Recovery
Some operating systems are able to manage deadlock for the programmer, either by detecting deadlocks and recovering or by refusing to grant (avoiding) requests that might lead to deadlock. This approach is difficult because the deadlock is generally hard to detect. Resource allocation graphs can be used but this can also be a very difficult task to perform in a large real-time system. Recovery is, in most cases, not easy. One extreme approach is to reset the system. Another approach is to perform a rollback to a pre-deadlock state. In general, deadlock detection is more appropriate for general-purpose operating systems than RTOSs.
Perhaps the crudest form of deadlock detection and recovery is the watchdog timer. In event of a deadlock, the timer will elapse (deadlock detection), and reboot the system (recovery). As a last ditch safety net for very infrequent or as yet undiscovered deadlocks this is acceptable.