Implementing electronics design functionality using a combination of hardware blocks and software modules based on advanced, multiprocessor platforms is now commonplace. As the dominant implementation style for leading products such as cell phone handsets and consumer multimedia devices, this architectural trend offers advantages of reduced risk, lower cost and faster time-to-market. However, it also drives traditional design methodology issues to extremes.
Since the design functionality is contained in both hardware and software, hardware engineers and software specialists must work closely to ensure that their design components interact consistently. This interaction is usually detailed in a specification written in plain English. That approach has often proved inadequate, because ambiguities in the operational descriptions produce particularly obscure bugs. To find these bugs, exhaustive verification methods are required. Today, the only methods available to accomplish the required level of verification are to prototype the design or to use an emulation system.
To illustrate these issues and their resolution, we will describe a relatively simple processor platform and attempt to boot up an operating system on it using an emulation system. We will then demonstrate the effect of a complex bug and detail its resolution.
The hardware design used was an embedded processor core, a Tensilica Diamond 232L, with a random-access and flash memory subsystem and a universal asynchronous receiver/transmitter (UART) controller. We used the Linux OS distribution from MontaVista based on the version 2.6 kernel, coupled with various software capabilities to complete the necessary OS infrastructure. The OS contained driver routines for the UART and other system components. The entire system was mounted onto an EVE ZeBu-UF 0.5 emulator, connected externally through two transaction-level interfaces. A UART transactor was used to drive the console window on a host workstation, and a JTAG standard transactor connected the Tensilica debug tool set into the system.
The system was initialized and the Linux image loaded into the emulated processor model. Linux started to boot, generating output on the console window. Partway through the process, the system generated the error message "serial8250: too much work for irq4," resulting in a kernel panic.
First, we tracked the error from the resulting message back to the actual cause of the problem. Then, we located the appropriate segment within the driver source code. An inspection of that code revealed that a looping structure timed out after not receiving an expected interrupt signal from the UART. The UART generates this interrupt when it is ready to receive more characters from the main processor. The question became: Why was this interrupt not being generated? The answer required a hardware debug operation.
We needed to examine the signals around the UART during the time frame in which the error was occurring. This highlights a key issue with hardware/software debugging. The error initially manifests itself on the software side. Simply setting a software breakpoint to capture the right moment, however, will halt the program execution but not necessarily the entire system. On the other hand, tracing the design for the entire emulation run is impractical, as a 1-billion-cycle waveform trace would take prohibitively long.
Within an emulation environment, it is possible to set a hardware trigger that will halt the entire system. The ideal trigger is the processor program counter's reaching a specific value close to the time of interest. Using the System.map file created when the kernel was built, we determined the address at which the driver code was stored. We then set a trigger on the program counter's reaching that address.
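The address lookup described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: it relies on the usual System.map line layout of hexadecimal address, symbol type and symbol name, while the sample addresses, the choice of `serial8250_interrupt` as the function of interest and the helper names `load_symbols` and `first_trigger_cycle` are hypothetical and do not come from the emulator's tool set.

```python
def load_symbols(lines):
    """Parse System.map-style lines ("<hex address> <type> <name>") into a dict."""
    symbols = {}
    for line in lines:
        parts = line.split()
        if len(parts) != 3:
            continue  # skip blank or malformed lines
        address, _symbol_type, name = parts
        symbols[name] = int(address, 16)
    return symbols


def first_trigger_cycle(pc_trace, trigger_address):
    """Return the index of the first cycle whose program counter equals the
    trigger address, or None if the address is never reached."""
    for cycle, pc in enumerate(pc_trace):
        if pc == trigger_address:
            return cycle
    return None


# Illustrative map contents; the addresses are made up for this sketch.
SAMPLE_MAP = """\
c0008000 T _stext
c01a2b40 t serial8250_interrupt
c01a2c10 T serial8250_register_port
"""

symbols = load_symbols(SAMPLE_MAP.splitlines())
address = symbols["serial8250_interrupt"]

# A toy program-counter trace standing in for the emulated processor.
pc_trace = [0xC0008000, 0xC0008004, 0xC01A2B40, 0xC01A2B44]
halt_cycle = first_trigger_cycle(pc_trace, address)
print(f"trigger at 0x{address:08x}, first hit in cycle {halt_cycle}")
```

In a real emulator the comparison against the program counter is done in hardware; the software side only has to supply the address.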
The emulator's graphical user interface can be extended with Tcl scripts that add functionality for a specific operation. We wrote a small amount of Tcl code that allowed us to enter a function name and enable the hardware trigger on the execution of that function. The ability to add specialized profiling, debug and analysis operations that pertain to a particular design is useful with a complex system.
We reran the emulation to the trigger point, at which time the system halted. We instructed the emulation hardware debug environment to record signal values around the UART, including the interrupt signal back to the processor and key registers required to handle the interrupt. By observing the waveforms, we noted data transmitted on the bus to the UART. However, we also observed the UART's failure to set the interrupt, which halted the transmission of more data and explained the error message during the Linux boot operation.
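The kind of check we performed by eye on the waveforms can also be expressed as a short script. The Python sketch below assumes the recorded signals have been reduced to per-cycle samples. The `Sample` fields, the `writes_without_interrupt` helper and the response window are hypothetical names and values chosen for illustration, not part of any emulator interface.

```python
from typing import NamedTuple


class Sample(NamedTuple):
    cycle: int
    uart_write: bool  # a bus write to the UART occurred in this cycle
    uart_irq: bool    # level of the UART interrupt line in this cycle


def writes_without_interrupt(samples, window):
    """Return the cycles of UART writes that are not followed by an asserted
    interrupt within `window` cycles."""
    missing = []
    for index, sample in enumerate(samples):
        if not sample.uart_write:
            continue
        answered = any(
            later.uart_irq and 0 < later.cycle - sample.cycle <= window
            for later in samples[index + 1:]
        )
        if not answered:
            missing.append(sample.cycle)
    return missing


# Toy trace: the first write is answered by an interrupt, the second is not.
trace = [
    Sample(0, True, False),
    Sample(1, False, True),
    Sample(5, True, False),
    Sample(6, False, False),
    Sample(9, False, False),
]
unanswered = writes_without_interrupt(trace, window=3)
print(unanswered)
```

A non-empty result points at the cycles worth inspecting in the waveform viewer.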
Note that it would have been difficult to use a software debugger to examine the value of the UART interrupt register: Reading it actually clears the pending interrupt. However, through the emulator's back-door access to all internal signals, we were able to observe that register without causing side effects.
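The difference between the two kinds of access can be shown with a small behavioral model. The Python sketch below is not the actual UART register definition. The class and the method names `bus_read` and `backdoor_peek` are hypothetical, chosen only to contrast a read that clears pending bits with an observation that leaves them untouched.

```python
class InterruptStatusRegister:
    """Toy model of a read-to-clear interrupt status register."""

    def __init__(self):
        self._pending = 0

    def raise_interrupt(self, bit):
        """Hardware side: mark interrupt source `bit` as pending."""
        self._pending |= 1 << bit

    def bus_read(self):
        """Processor or software-debugger access: returns the pending bits
        and clears them as a side effect."""
        value = self._pending
        self._pending = 0
        return value

    def backdoor_peek(self):
        """Emulator-style observation: returns the pending bits without
        changing the register state."""
        return self._pending


register = InterruptStatusRegister()
register.raise_interrupt(1)

peek_first = register.backdoor_peek()   # observes the pending bit
peek_second = register.backdoor_peek()  # still pending: no side effect
read_first = register.bus_read()        # returns the bit and clears it
read_second = register.bus_read()       # nothing left to report
print(peek_first, peek_second, read_first, read_second)
```

A debugger that inspects the register through `bus_read` destroys the very state it is trying to examine, which is why the non-intrusive view mattered here.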