Since the introduction of Plug and Play and operating-system-directed power management, hardware vendors and industry working groups have faced many challenges when designing technologies to operate with Microsoft Windows operating systems.
Plug and Play requirements for devices are divided roughly into two areas: discovery and configuration. Discovery includes detecting system topology, device identification information (device IDs) and device scanning requirements. Configuration concerns to be addressed in a specification include required functionality, event notification, control of surprise removals and hardware-management instrumentation.
On the discovery side of the equation, Windows must be able to accurately discover the physical topology of the entire system. That's because the topological relationship of buses and devices is the most fundamental information required for successful Plug and Play and power management.
Under Plug and Play, the devices on each bus are enumerated by that bus's driver according to the industry standard for the bus. When the Plug and Play Manager asks a bus driver for the devices on its bus, the driver determines its list of children according to its bus protocol. For example, the ACPI driver looks in the ACPI namespace, the Peripheral Component Interconnect (PCI) driver queries PCI configuration space and a Universal Serial Bus (USB) hub driver follows the USB protocol.
In the Plug and Play process, Windows detects buses and loads their drivers. Each bus driver reports which devices are connected to its bus, and the process repeats for any of those devices that are themselves buses, until Windows can construct a hierarchical software representation of the system. This representation is called the device tree.
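The enumeration described above can be sketched as a simple recursion. This is only an illustration: the `BUS_CHILDREN` table is a hypothetical stand-in for what real bus drivers would discover by querying hardware, and `DevNode`, `enumerate_bus` and `render` are names invented for this sketch, not Windows interfaces.

```python
from __future__ import annotations
from dataclasses import dataclass, field

@dataclass
class DevNode:
    """One node in the device tree (illustrative, not a Windows structure)."""
    name: str
    children: list[DevNode] = field(default_factory=list)

# Hypothetical answers each "bus driver" would give when asked for the
# devices on its bus. Real drivers read this from the hardware.
BUS_CHILDREN = {
    "root": ["ACPI"],
    "ACPI": ["PCI bus"],
    "PCI bus": ["USB host controller", "display adapter"],
    "USB host controller": ["USB hub"],
    "USB hub": ["mouse", "joystick"],
}

def enumerate_bus(name: str) -> DevNode:
    """Ask the bus for its children, then recurse into each child."""
    node = DevNode(name)
    for child in BUS_CHILDREN.get(name, []):
        node.children.append(enumerate_bus(child))
    return node

def render(node: DevNode, depth: int = 0) -> list[str]:
    """Flatten the tree into indented lines, parents before children."""
    lines = ["  " * depth + node.name]
    for child in node.children:
        lines.extend(render(child, depth + 1))
    return lines

device_tree = enumerate_bus("root")
print("\n".join(render(device_tree)))
```

Each level of indentation in the printed output corresponds to one bus in the path between the root and the device.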
The device tree describes the physical relationships of all hardware in the system. Based on topological information in the device tree, the operating system makes a wide range of assumptions about how to manage hardware. If hardware does not accurately describe itself to the operating system, the resulting device tree will not accurately reflect the hardware topology and operating system assumptions will be incorrect, resulting in interoperability problems.
For example, with certain kinds of buses, correct allocation of resources depends on the system topology. Resources that are required by devices behind a bridge must be assigned to the bridge, which passes these resources to its devices. If this physical topology is inaccurately represented to the operating system, it will not assign the resources to the bridge and the devices will never receive the resources they need.
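As a rough illustration of why topology matters for allocation, the sketch below totals the memory needed behind each bridge. The topology, device names and sizes are hypothetical, and real buses add alignment and granularity rules that are ignored here.

```python
# Hypothetical topology: each bridge lists what sits behind it.
behind = {
    "bridge A": ["network card", "bridge B"],
    "bridge B": ["disk controller"],
}
# Hypothetical memory requirement of each leaf device, in bytes.
needs = {"network card": 0x4000, "disk controller": 0x1000}

def required_window(name: str) -> int:
    """Memory a node needs: its own, or the total of everything behind it."""
    if name in behind:
        return sum(required_window(child) for child in behind[name])
    return needs[name]

print(hex(required_window("bridge A")))  # 0x5000
print(hex(required_window("bridge B")))  # 0x1000
```

If the disk controller were wrongly reported as a sibling of bridge B rather than a device behind it, the total computed for bridge B would be zero and the controller would be left without usable resources.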
If a parent device is removed, its children must be also. For example, if the user detaches a USB hub from the computer, the operating system must remove the hub and any USB devices attached to that hub (for example, a mouse or a joystick) from the device tree.
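The removal rule can be sketched as a subtree deletion. The dictionary layout and the `remove_device` helper are assumptions made for this example only.

```python
# Hypothetical device tree, stored as parent -> list of children.
tree = {
    "USB host controller": ["USB hub", "keyboard"],
    "USB hub": ["mouse", "joystick"],
    "keyboard": [],
    "mouse": [],
    "joystick": [],
}

def remove_device(tree: dict, name: str) -> list:
    """Remove `name` and all its descendants; return them children-first."""
    removed = []
    # Iterate over a copy because the child list shrinks as children go.
    for child in list(tree.get(name, [])):
        removed.extend(remove_device(tree, child))
    tree.pop(name, None)
    # Detach the device from whichever parent still lists it.
    for children in tree.values():
        if name in children:
            children.remove(name)
    removed.append(name)
    return removed

removed = remove_device(tree, "USB hub")
print(removed)  # ['mouse', 'joystick', 'USB hub']
```

The children are removed before the hub itself, so no device is ever left in the tree without its parent.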
Another example is a dockable mobile PC, which has multiple devices that will be ejected together. When it is docked, the mobile PC should appear as a parent in the hierarchy so its devices can be enumerated. If all devices in the mobile PC appear to the operating system as siblings without any intervening parent device, complex workarounds are required to express removal relations.
Without correct removal relations, the operating system cannot detect that the devices are missing when the mobile PC is undocked. It cannot notify drivers that devices are about to be removed, so a driver has no chance to veto a removal request that would cause data to be lost. The upshot is that the operating system may continue to send I/O to the missing devices, a case that drivers usually are not designed to handle and that can lock the system.
A child device cannot be in a higher power state than its parent. The parent must always be in the same or a higher power state than its children; otherwise, the children cannot get the power they need.
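The power rule can be checked mechanically. The sketch below assumes ACPI-style device power states, where D0 is fully on and D3 is off, so a smaller number means a higher power state. The device names and helper functions are hypothetical.

```python
# Child -> parent relationships and current D-states (0 = D0, 3 = D3).
parent_of = {"USB hub": "USB host controller", "mouse": "USB hub"}
power = {"USB host controller": 0, "USB hub": 2, "mouse": 0}

def power_violations(parent_of: dict, power: dict) -> list:
    """Return (child, parent) pairs where the child outpowers its parent."""
    return [
        (child, parent)
        for child, parent in parent_of.items()
        if power[child] < power[parent]
    ]

def can_enter(device: str, new_state: int, parent_of: dict, power: dict) -> bool:
    """True if `device` may move to D<new_state> without breaking the rule."""
    parent = parent_of.get(device)
    if parent is not None and new_state < power[parent]:
        return False  # would be in a higher power state than its parent
    # A parent may not drop to a lower power state than any of its children.
    return all(
        power[child] >= new_state
        for child, p in parent_of.items()
        if p == device
    )

print(power_violations(parent_of, power))       # [('mouse', 'USB hub')]
print(can_enter("USB hub", 3, parent_of, power))  # False: mouse is still in D0
```

In practice this means children are powered down before their parent and powered up after it.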
If you are defining a new bus technology, define a protocol for the bus driver to enumerate devices on the bus. Consider defining a quick scanning technique that can be used to determine which devices reside on a particular bus, where these devices reside in the bus topology and what the current state of the bus topology is. A quick scan should also yield the device IDs and required configuration information for each function of every device present in the bus topology.
When designing any technology (bus or device class), consider the hierarchical nature of Plug and Play to ensure that your hardware describes itself accurately to the operating system. While the Windows operating system is focused conceptually on hierarchical tree structures, the topology for some hardware can be represented only as a graph or a network.
The Plug and Play process in Windows relies on a unique identifier for the device in order for the operating system to determine the proper driver to load. A device ID is a vendor-defined numeric value that uniquely identifies a device. The underlying bus driver for a device returns the device ID to the operating system, which uses that ID to create a subkey for the device in the registry.
If you are creating a specification that establishes a new bus (requiring a new enumerator under the operating system), then you should establish an independent clearinghouse that assigns unique vendor ID numbers. As an alternative, you can define vendor IDs to be exactly the same ones used in PCI (which is what USB did). The uniqueness of all other ID numbers associated with a given device is the responsibility of each individual vendor.
The following example shows how the operating system constructs a PCI device ID from the ID values described above:
PCI\VEN_1113&DEV_1211&SUBSYS_12111113&REV_10
In this example, PCI is the bus that enumerates the device, VEN_1113 is the unique vendor ID assigned to the manufacturer by the PCI Special Interest Group, DEV_1211 is the device ID assigned by the vendor, SUBSYS_12111113 is the subsystem ID assigned by the vendor and REV_10 is the revision ID of the device.
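The construction described above can be sketched in a few lines. The function names are invented for this illustration, and the field widths (four hexadecimal digits for the vendor and device IDs, eight for the subsystem ID, two for the revision) follow the example values in the text.

```python
def pci_device_id(vendor: int, device: int, subsys: int, rev: int) -> str:
    """Format the four PCI ID values into a single device ID string."""
    return (
        f"PCI\\VEN_{vendor:04X}&DEV_{device:04X}"
        f"&SUBSYS_{subsys:08X}&REV_{rev:02X}"
    )

def parse_device_id(device_id: str) -> dict:
    """Split a device ID back into its enumerator and named fields."""
    enumerator, _, rest = device_id.partition("\\")
    fields = {"enumerator": enumerator}
    for part in rest.split("&"):
        key, _, value = part.partition("_")
        fields[key] = value
    return fields

device_id = pci_device_id(0x1113, 0x1211, 0x12111113, 0x10)
print(device_id)  # PCI\VEN_1113&DEV_1211&SUBSYS_12111113&REV_10
print(parse_device_id(device_id)["SUBSYS"])  # 12111113
```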
In addition to helping the operating system load the best matching driver for a device, specific device IDs make it easier for drivers to enable specific customizations for particular devices. The more granularity that exists in identifying products, the more flexibility the vendor has in terms of loading drivers and providing special-case handling during run-time.
For example, if a device has unexpected problems that are not identified until the device is tested in various system configurations, an operating system vendor can provide special-case handling for the device (for example, increase timeouts or change how the device is power-managed) without affecting performance or feature sets of any other devices in the class.
Subsystem IDs can be particularly helpful in those cases, because a device might require special-case handling in one system configuration but not in another. The subsystem ID can provide the necessary distinction.
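One way to picture this special-case handling is a lookup that tries the most specific ID first and falls back to more general ones. The quirk table, timeout values and function names below are hypothetical and do not describe how Windows itself stores such settings.

```python
# Hypothetical special-case table, keyed by device ID strings.
QUIRKS = {
    "PCI\\VEN_1113&DEV_1211&SUBSYS_12111113": {"timeout_ms": 500},
    "PCI\\VEN_1113&DEV_1211": {"timeout_ms": 200},
}
DEFAULTS = {"timeout_ms": 100}

def candidate_ids(vendor: int, device: int, subsys: int, rev: int) -> list:
    """List ID strings for one device, most specific first."""
    base = f"PCI\\VEN_{vendor:04X}&DEV_{device:04X}"
    return [
        f"{base}&SUBSYS_{subsys:08X}&REV_{rev:02X}",
        f"{base}&SUBSYS_{subsys:08X}",
        base,
    ]

def settings_for(vendor: int, device: int, subsys: int, rev: int) -> dict:
    """Return the settings of the most specific matching entry."""
    for device_id in candidate_ids(vendor, device, subsys, rev):
        if device_id in QUIRKS:
            return QUIRKS[device_id]
    return DEFAULTS

# Same device, two different system configurations (subsystem IDs).
print(settings_for(0x1113, 0x1211, 0x12111113, 0x10))  # {'timeout_ms': 500}
print(settings_for(0x1113, 0x1211, 0x00000000, 0x10))  # {'timeout_ms': 200}
```

The subsystem ID is what lets the first call receive the longer timeout while the same chip in a different system keeps the general setting.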
The other major set of requirements falls into the configuration area. A mechanism is needed to ascertain the capabilities of every possible configuration of a given device, including all the hardware resources it can use. A device may require typical hardware resources for which support already exists. These include interrupts, various address spaces (such as I/O and memory), the amount of bus bandwidth the device requires for acceptable performance, and its power states and wake capabilities.
In addition to those standard types of hardware resources, a manufacturer may be designing a device with additional new hardware resources. In such a case, OEMs need to contact Microsoft engineers to ensure that support can be implemented for that capability in the operating system.
A mechanism is also needed for requesting that a device report all its current configuration settings and capabilities. Plug and Play uses those mechanisms to determine hardware resource requirements and allocate resources to devices on the system. If a device is designed to operate within ranges of resource assignments and to report all the possible configurations that it can use, the operating system can balance allocation of resources system-wide to meet the requirements of individual devices. This activity is sometimes called resource arbitration.
For example, one device might require only one interrupt vector but be able to use any one of a range of vectors. A legacy device on the same system might require a specific interrupt vector within that range, but its interrupt might be set with a jumper on the card and therefore not be configurable by software. If both devices accurately report all possible configuration capabilities as described above, the operating system can assign the specific interrupt vector required by the legacy device and assign a different yet equally acceptable interrupt vector to the Plug and Play device, thus allowing both devices to operate.
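The interrupt example above can be sketched as a small search that handles the most constrained device first and backtracks if it reaches a dead end. This is an illustration of the idea only, not the arbiter Windows uses, and the device names and vector numbers are made up.

```python
from typing import Optional

def arbitrate(requirements: dict) -> Optional[dict]:
    """Assign one distinct interrupt vector to each device.

    `requirements` maps a device name to the set of vectors it can use.
    Returns a device -> vector mapping, or None if no assignment exists.
    """
    # Handle the most constrained devices first (fewest acceptable vectors).
    order = sorted(requirements, key=lambda d: len(requirements[d]))

    def assign(index: int, used: frozenset) -> Optional[dict]:
        if index == len(order):
            return {}
        device = order[index]
        for vector in sorted(requirements[device] - used):
            rest = assign(index + 1, used | {vector})
            if rest is not None:
                return {device: vector, **rest}
        return None  # nothing fits here, so the caller tries another vector

    return assign(0, frozenset())

# The legacy card is jumpered to vector 5; the other card accepts a range.
print(arbitrate({"legacy card": {5}, "pnp card": {3, 4, 5, 7}}))
# {'legacy card': 5, 'pnp card': 3}
```

If two devices both insist on the same single vector, `arbitrate` returns `None`. That is the situation accurate reporting of alternative configurations is meant to avoid.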
Plug and Play keeps track of resource assignments by creating a list of them for each device. It also changes assigned resources as devices are added to or removed from a system and resources are subsequently reallocated from the resources available on the system.
The implementation of the event mechanism should use interrupts to notify the operating system rather than rely on polling, which consumes bus bandwidth and can degrade performance of the whole system. If the interrupt is level-triggered (which it should be), some bus-level mechanism must exist to acknowledge and clear the interrupt. Otherwise, the operating system will experience an interrupt storm and stall.
Surprise removal occurs when the user removes a hot-pluggable device or card without using the user interface that controls card removal. There are two approaches to dealing with it, prevention and damage control, and both need to be used.
Although surprise removal of devices on USB, SCSI and Fibre Channel buses will not stall the system, for many devices it can cause device-side data corruption. For example, before a disk is removed, the operating system must be notified so that it can flush the cache to the disk; otherwise, outstanding I/O will be lost.
If technology allows ejection of multiple devices in one action, a specification should provide a mechanism to indicate which devices will be removed from the system based on the action.
For example, a storage slice on a portable computer might consist of a floppy and hard drive combined in a single physical package.
Although the floppy drive and hard drive operate independently of each other, it is not possible to eject one without also ejecting the other.
Such enclosures should be enumerable and should be presented to the operating system as a parent with children so it is possible to determine which devices "travel" together. The specification should define parameters for implementation that accommodate the limitations of target operating systems, dependencies on other buses and the reality of available hardware.
To guarantee that surprise removal works properly on any bus, the design should ensure that removal of all devices is under the control of the operating system. This means allowing the user to remove a device only by using a software interface that provides operating system notification before unlocking the device, or by using a mechanical locking mechanism, such as a switch.
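The software-controlled removal sequence described above can be sketched as follows. The `Device` class and `request_eject` function are toy constructs for illustration; they only model the order of events (query, flush, unlock) and are not driver interfaces.

```python
class Device:
    """Toy device with a write cache and a software-controlled lock."""

    def __init__(self, name: str, busy: bool = False):
        self.name = name
        self.busy = busy          # e.g. an open file that cannot be closed
        self.cache = ["block 7"]  # pending writes not yet on the media
        self.media = []
        self.locked = True

    def query_remove(self) -> bool:
        """The driver's chance to veto: return False to refuse removal."""
        return not self.busy

    def flush(self) -> None:
        """Write all cached data to the media."""
        self.media.extend(self.cache)
        self.cache.clear()

def request_eject(device: Device) -> bool:
    """Run when the user asks for removal through the software interface."""
    if not device.query_remove():
        return False            # vetoed: the device stays locked
    device.flush()              # write outstanding I/O before removal
    device.locked = False       # only now may the device physically leave
    return True

disk = Device("disk")
print(request_eject(disk), disk.media, disk.locked)  # True ['block 7'] False
```

Because the lock is released only after the flush, the user cannot pull the device while data is still in the cache.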
JONATHAN V. SMITH, ADRIAN ONEY, JAKE OSHINS AND ANDY GLASS, ALL OF THEM MEMBERS OF THE WINDOWS 2000 BASE TEAM, CONTRIBUTED TO THIS REPORT.