United Business Media EE Times




Debug-based design covers the bases








EE Times


By designing an embedded system to collect data for debugging problems that occur after release of the product, engineers will develop a product with fewer overall bugs, and one that takes less time in debugging prior to release.

Good development practices can reduce defects, but they cannot eliminate them. Because defects are inevitable, developers must devote resources to solutions that cover the technical problems that will arise, much as insurance covers the financial problems that arise in business.

It is frequently impossible to test requirements directly. Instead, test cases must be developed from requirements, and the process of developing test cases from requirements is, at best, an imperfect process. For example, one requirement for a system may be that the system is coded so that it does not exhibit any memory leaks. To test this requirement, several test cases can be developed that exercise memory allocation and de-allocation. However, it is unlikely that a test case can be developed for every execution path in a system that allocates or de-allocates memory. Hence, the final system may pass all of the test cases, yet not meet the requirement because there is an errant execution path that was not tested.

Consequently, even though we may live up to a zero-defect standard as measured by delivering a system that meets our test cases, it is unlikely that we will be able to deliver a truly defect-free product. According to a report by The Ganssle Group (Columbia, Md.), "More often than not, some 50 percent of a project's development time is spent chasing bugs. This appalling number surely indicates a problem with the software design process. But even the best methodologies will never eliminate all defects."

Defects are a reality, so it is best to be prepared to resolve them. As was mentioned earlier, businesses use insurance to cover potential financial problems that arise in connection with doing business. In the same way, developers need to collect data that cover technical defects in new products.

For intermittent problems that occur in remote locations, it may not be possible to reproduce the problem in a controlled manner. Additionally, some problems demand immediate attention, which will not allow for the time-consuming process of isolating and duplicating the problem and coming up with a verified fix. Hence, there is a need for a better way to deal with these problems.

That better way may rest with historical operation information about the malfunction. For example, if a historical log showed that a specific process failed to "check in" with the watchdog process (prior to each reboot), a developer has a basis to focus the search for the cause of the failure. Developing this historical information can require significant engineering time and product cost, so there is a need to decide wisely on the information recorded. However, it is frequently impossible to develop this information unless the capabilities for recording it were provided for in the product design. Retrofitting for recording debugging information can be difficult, if not impossible.
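
The watchdog example above can be sketched as follows. This is a minimal illustration, assuming a hypothetical scheme in which each monitored task calls wd_checkin() once per watchdog period; the names and the eight-task limit are invented for the example and do not come from any particular operating system.

```c
#include <stdint.h>

#define WD_MAX_TASKS 8u   /* assumed upper bound on monitored tasks */

/* One bit per task; a task sets its bit when it checks in. */
static uint32_t wd_checked_in;

/* Mask of tasks that missed the most recent period. In a real product
 * this would be kept in memory that survives a reboot. */
static uint32_t wd_last_missed;

void wd_checkin(unsigned task_id)
{
    if (task_id < WD_MAX_TASKS) {
        wd_checked_in |= (uint32_t)1u << task_id;
    }
}

/* Called once per watchdog period. Records and returns the mask of
 * tasks (0 .. task_count-1) that failed to check in, then starts a
 * new period. */
uint32_t wd_period_end(unsigned task_count)
{
    uint32_t expected;

    if (task_count > WD_MAX_TASKS) {
        task_count = WD_MAX_TASKS;
    }
    expected = ((uint32_t)1u << task_count) - 1u;
    wd_last_missed = expected & ~wd_checked_in;
    wd_checked_in = 0u;
    return wd_last_missed;
}

/* Retrieval hook for whoever reads the log after the fact. */
uint32_t wd_last_missed_mask(void)
{
    return wd_last_missed;
}
```

With three tasks, if only tasks 0 and 2 check in, the recorded mask has bit 1 set, which tells the developer where to start looking.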

With the need established for historical operational information for a postrelease environment, the question arises, how are the needs for debugging in a prerelease environment going to be met? This question can best be answered in two parts. The first part is to consider the debugging needs in the unit-test phase of development.

Under the covers

In this phase, there are many robust debug/emulation tools available that allow a developer to get under the covers and see the internal details of the processor operation. Unless intrusion of debugging tools is a problem, there is no need to design additional tools: the needed information can be obtained from existing tools. If intrusion of debugging tools is an issue, debugging can be done without getting under the covers.

In the second phase of development, beyond unit test, it is difficult to get under the covers for debugging data. From a debugging standpoint, this phase of development is similar to the postrelease phase because a developer does not have access to internal details of the operation of the processor. To resolve the defect in either case, a developer needs to know what happened before the bug. In both phases, there is a need to record such diagnostic information in the system design. As we shall see, it is more difficult to solve the data collection needs in the postrelease environment.

We must collect data about a product to make debugging easier in the field. Implementing the recording of historical operation data means weighing many factors and making trade-offs so that the needed data is captured within cost constraints. Three key decisions must be made: What data will be recorded? How will the data be stored? How will the data be retrieved?

Another closely related issue: once code is fixed in response to a bug, how will the fixed code be delivered? The first topic to consider is the recording of exceptions. In general, it is important to log anything that is an exception to the normal operation of the system, because this can offer a clue as to what caused a malfunction.

Because of the debugging time they save, exceptions are among the essential things to collect. However, embedded systems frequently are limited in the amount of memory they can devote to diagnostic purposes. As a result, logging more exceptions can reduce the memory available to perform other functions in the system.

Diagnostic information, in particular, can be valuable both in the prerelease process and the postrelease process. However, diagnostic data can be the most storage-consumptive part of the log. As a way to reduce memory demands for logging, diagnostic events can be partitioned into functional areas, which can be either logged or not logged, depending on a switch that is tied to the functional area of the code.
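
One way to picture the per-area switch is a bit mask with one bit per functional area. The sketch below is only an illustration: the area names are hypothetical, and the counter stands in for whatever routine actually writes to the log.

```c
#include <stdint.h>

/* Hypothetical functional areas; a real product would define its own. */
enum diag_area {
    DIAG_COMMS = 0,
    DIAG_STORAGE = 1,
    DIAG_UI = 2,
    DIAG_AREA_COUNT
};

static uint32_t diag_enabled_mask;  /* one switch bit per functional area */
static unsigned diag_logged_count;  /* stand-in for the real log writer */

void diag_enable(enum diag_area area)
{
    diag_enabled_mask |= (uint32_t)1u << area;
}

void diag_disable(enum diag_area area)
{
    diag_enabled_mask &= ~((uint32_t)1u << area);
}

/* Returns 1 if the event was logged, 0 if it was dropped because the
 * switch for its functional area is off. */
int diag_log(enum diag_area area, uint16_t event_code)
{
    (void)event_code;  /* a real logger would store this */
    if ((diag_enabled_mask & ((uint32_t)1u << area)) == 0u) {
        return 0;
    }
    diag_logged_count++;
    return 1;
}

unsigned diag_count(void)
{
    return diag_logged_count;
}
```

Because a disabled area costs only a mask test, the diagnostic calls can stay in the shipped code and be switched on when a problem needs investigating.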

Accurate picture

The more exceptions that are detected and logged, the more accurate a picture a developer has for diagnosing a malfunction. Ideally, if logging were "free," each branch in the code would have a diagnostic logging point. In that way, a developer would have a very clear snapshot of exactly how the code was executed. Additionally, that information can be used to determine if all code was being covered during a test cycle. However, logging is not free: it consumes both memory and processing cycles. Often, as a compromise, the entry to each function is logged, instead of each branch. In that way, the developer can see which functions are being executed.
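
Function-entry logging can be made cheap by storing only a pointer to the function's name. The following is a rough sketch under that assumption; LOG_ENTRY, the depth of four, and the two instrumented functions are all hypothetical.

```c
#include <stddef.h>

#define ENTRY_LOG_DEPTH 4u  /* assumed depth; a power of two, tiny for illustration */

static const char *entry_log[ENTRY_LOG_DEPTH];
static unsigned entry_log_total;  /* number of entries ever recorded */

/* Store only a pointer to the name, so each entry costs one pointer. */
void entry_log_record(const char *func_name)
{
    entry_log[entry_log_total % ENTRY_LOG_DEPTH] = func_name;
    entry_log_total++;
}

/* Placed as the first statement of each instrumented function. */
#define LOG_ENTRY() entry_log_record(__func__)

/* age 0 is the most recent entry. Returns NULL if no such entry is held. */
const char *entry_log_recent(unsigned age)
{
    if (age >= ENTRY_LOG_DEPTH || age >= entry_log_total) {
        return NULL;
    }
    return entry_log[(entry_log_total - 1u - age) % ENTRY_LOG_DEPTH];
}

/* Example instrumented functions (hypothetical). */
void pump_on(void)
{
    LOG_ENTRY();
}

void valve_open(void)
{
    LOG_ENTRY();
}
```

After pump_on() and then valve_open() run, entry_log_recent(0) points at "valve_open" and entry_log_recent(1) at "pump_on", giving the order in which the functions were entered.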

Perhaps the next most important area to consider is the recording of memory usage. One of the most challenging problems to face is debugging a memory leak, especially in remotely located embedded systems. One approach is to record the amount of free memory (the heap) as well as the largest free memory block on a periodic basis and see if these values are stable over time. If they are, then one can presume there is no memory leak. Not using the heap at all may be a better approach.
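
The periodic sampling approach might look something like the sketch below. How the free-memory figures are obtained depends on the allocator and is assumed to be supplied by the caller; the history length of eight and the "falling at every step" test are arbitrary choices for illustration, and a falling trend is a hint of a leak, not proof.

```c
#include <stddef.h>
#include <string.h>

#define HEAP_HISTORY 8u  /* assumed number of samples retained */

struct heap_sample {
    size_t free_bytes;     /* total free heap at sample time */
    size_t largest_block;  /* largest single free block, kept for inspection */
};

static struct heap_sample heap_hist[HEAP_HISTORY];
static unsigned heap_hist_count;

/* Record one periodic sample, discarding the oldest when full. */
void heap_sample_record(size_t free_bytes, size_t largest_block)
{
    if (heap_hist_count == HEAP_HISTORY) {
        memmove(&heap_hist[0], &heap_hist[1],
                (HEAP_HISTORY - 1u) * sizeof heap_hist[0]);
        heap_hist_count--;
    }
    heap_hist[heap_hist_count].free_bytes = free_bytes;
    heap_hist[heap_hist_count].largest_block = largest_block;
    heap_hist_count++;
}

/* Returns 1 if the history is full and free memory fell at every
 * step across it; otherwise returns 0. */
int heap_trend_falling(void)
{
    unsigned i;

    if (heap_hist_count < HEAP_HISTORY) {
        return 0;
    }
    for (i = 1u; i < HEAP_HISTORY; i++) {
        if (heap_hist[i].free_bytes >= heap_hist[i - 1u].free_bytes) {
            return 0;
        }
    }
    return 1;
}
```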

A somewhat more rigorous approach to testing for memory leaks is to build synchronization points into the code. The concept is that the amount of free memory should be the same each time the code passes through a given synchronization point.
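
A synchronization point can be as simple as remembering the free-memory figure from the first pass and comparing later passes against it. The sketch below assumes the caller can supply the current free-memory value; the function name and the limit of four points are hypothetical.

```c
#include <stddef.h>

#define SYNC_POINTS 4u  /* assumed number of synchronization points */

static size_t sync_baseline[SYNC_POINTS];
static unsigned char sync_seen[SYNC_POINTS];

/* Call at a point where the code should always hold the same amount of
 * memory. The first visit stores a baseline and returns 0. Later visits
 * return the number of bytes lost relative to the baseline, or 0 if
 * nothing was lost. An out-of-range id also returns 0. */
size_t sync_point_check(unsigned id, size_t free_now)
{
    if (id >= SYNC_POINTS) {
        return 0u;
    }
    if (!sync_seen[id]) {
        sync_seen[id] = 1u;
        sync_baseline[id] = free_now;
        return 0u;
    }
    return (free_now < sync_baseline[id]) ? sync_baseline[id] - free_now : 0u;
}
```

A nonzero return value is the kind of exception worth writing to the log, along with the identifier of the synchronization point that detected it.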

Some operating systems provide facilities for detecting memory leaks. However, we must make sure they will work when the product is in a customer's hands.

Another serious situation deals with getting information if the system reboots. If possible, try to save all processor registers as well as the contents of the stack (or stacks, if more than one context is present) in order to get some idea of what occurred before the reset. If the reset was user-initiated (as opposed to being generated by application of the /RESET line), indicate the source event causing the reset.
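
As a rough sketch of what such a reset record might contain: the structure below holds the reset source, a register snapshot, and the top of the stack, guarded by a marker value and a simple checksum so that stale or corrupted contents are not mistaken for a valid record. Everything here is an assumption made for illustration, including the register count, the stack depth, and the field layout. On real hardware the record would need to sit in memory that is not cleared at startup, and capturing the registers is processor specific.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define RESET_MAGIC 0x52535431u  /* arbitrary marker value */
#define REG_COUNT   8u           /* assumed number of saved registers */
#define STACK_WORDS 16u          /* assumed stack snapshot depth */

enum reset_source {
    RESET_UNKNOWN = 0,
    RESET_HW_LINE,       /* the /RESET line */
    RESET_WATCHDOG,
    RESET_USER_REQUEST
};

struct reset_record {
    uint32_t magic;
    uint32_t source;
    uint32_t regs[REG_COUNT];
    uint32_t stack[STACK_WORDS];
    uint32_t checksum;
};

/* Would be placed in RAM that survives a reset; plain static here. */
static struct reset_record reset_rec;

static uint32_t reset_sum(const struct reset_record *r)
{
    uint32_t sum = r->magic + r->source;
    size_t i;

    for (i = 0; i < REG_COUNT; i++) {
        sum += r->regs[i];
    }
    for (i = 0; i < STACK_WORDS; i++) {
        sum += r->stack[i];
    }
    return sum;
}

/* Save what is known just before the reset takes effect. */
void reset_record_save(enum reset_source source,
                       const uint32_t regs[REG_COUNT],
                       const uint32_t *stack_top, size_t stack_words)
{
    size_t n = (stack_words < STACK_WORDS) ? stack_words : STACK_WORDS;

    memset(&reset_rec, 0, sizeof reset_rec);
    reset_rec.magic = RESET_MAGIC;
    reset_rec.source = (uint32_t)source;
    memcpy(reset_rec.regs, regs, sizeof reset_rec.regs);
    memcpy(reset_rec.stack, stack_top, n * sizeof reset_rec.stack[0]);
    reset_rec.checksum = reset_sum(&reset_rec);
}

/* After reboot: returns 1 and copies the record out if it is valid. */
int reset_record_load(struct reset_record *out)
{
    if (reset_rec.magic != RESET_MAGIC ||
        reset_rec.checksum != reset_sum(&reset_rec)) {
        return 0;
    }
    *out = reset_rec;
    return 1;
}
```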

One important thing to consider is what data can be obtained from the development tools that are in use for this product, as well as the portability of the development tools. For example, if an emulator is used in development and the emulator is highly portable, then it may not be necessary to develop tools for reading the memory space of the processor.

However, shipping an emulator with every product is usually not feasible, so some of the diagnostic information probably will still be necessary.

A powerful piece of debugging information is a program trace. This shows not only what happened, but also the sequence in which it happened, to assist a developer in determining the cause of a malfunction. However, unlike desktop computers, embedded systems frequently cannot keep a log file large enough to store a large amount of trace information, because available memory can be very limited. Further, access to the log file may require a user to connect an additional computer to the embedded device to upload the log over a "slow" serial link. Consequently, the upload process may make large log files difficult to support.

A common technique with many embedded devices is to output trace messages through a serial port as the device is operating. For example, when a system is under development, a developer can connect a PC that is running a terminal emulator to a serial port on the embedded device to see what the embedded device is doing. If the embedded device is programmed to write a serial string to show details of the boot process, these details also can be written to a wrap-around trace file. If they are, then they will be available to a developer after the fact.
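
A wrap-around trace can be sketched as a small ring of fixed-size entries in which the newest message overwrites the oldest. The slot count, message length, and timestamp field below are assumptions for illustration; in practice trace_write() would be called from the same place that sends the string to the serial port.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define TRACE_SLOTS   4u   /* assumed depth; a power of two, tiny for illustration */
#define TRACE_MSG_LEN 24u  /* assumed maximum message length, including the terminator */

struct trace_entry {
    uint32_t timestamp;        /* e.g. a tick count supplied by the caller */
    char msg[TRACE_MSG_LEN];
};

static struct trace_entry trace_buf[TRACE_SLOTS];
static uint32_t trace_total;  /* messages ever written */

/* Write one message, overwriting the oldest when the buffer is full.
 * Messages longer than the slot are truncated. */
void trace_write(uint32_t timestamp, const char *msg)
{
    struct trace_entry *e = &trace_buf[trace_total % TRACE_SLOTS];

    e->timestamp = timestamp;
    strncpy(e->msg, msg, TRACE_MSG_LEN - 1u);
    e->msg[TRACE_MSG_LEN - 1u] = '\0';
    trace_total++;
}

/* index 0 is the oldest retained entry. Returns NULL past the end. */
const struct trace_entry *trace_read(uint32_t index)
{
    uint32_t held = (trace_total < TRACE_SLOTS) ? trace_total : TRACE_SLOTS;

    if (index >= held) {
        return NULL;
    }
    return &trace_buf[(trace_total - held + index) % TRACE_SLOTS];
}
```

Reading from index 0 upward replays the retained messages in the order they were written, which is what a developer needs when reconstructing events after the fact.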

Deep buffer

If the trace buffer is stored internally, the obvious question is how deep the buffer needs to be. Though bigger is usually better, it can be difficult to determine how much is enough. Often, buffers do not need to be extremely deep to be useful. One guideline is to consider the maximum number of levels (L) of function calls in the code, as well as how many debug messages (M) there are at each level. Then the product, L x M x maximum message length, provides an estimate of a minimum trace buffer size. A final consideration for trace buffer messages is to include the time in each message: this allows for correlation with external events.
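
As a worked example of this guideline, with purely illustrative numbers: a design with L = 6 call levels, M = 3 messages per level, and 32-byte messages needs at least 6 x 3 x 32 = 576 bytes. The small helper below computes the same estimate; the extra per-message timestamp term is an addition to the guideline, included only to show what time stamping costs.

```c
#include <stddef.h>

/* Minimum trace buffer estimate: L x M x (message length + timestamp).
 * Passing timestamp_bytes = 0 gives the plain L x M x length product. */
size_t trace_min_bytes(size_t call_levels, size_t msgs_per_level,
                       size_t max_msg_len, size_t timestamp_bytes)
{
    return call_levels * msgs_per_level * (max_msg_len + timestamp_bytes);
}
```

With a 4-byte timestamp on each message, the same example grows to 6 x 3 x 36 = 648 bytes.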

Unlike desktop systems, embedded systems face difficult challenges in accessing debugging data stored inside the unit. Some of the more common obstacles include getting physical access to the unit, having time to examine the unit, and getting permission to service the unit. There are many ways to access the data, such as physically connecting a diagnostic device to the unit, connecting to a serial port on the unit, accessing the unit via a modem, using a wireless connection, or connecting to the unit over a TCP/IP link.

There are several techniques that, if done properly, can reduce the amount of analysis and experimentation needed to resolve a product defect. These include: accurate records of what was tested in the past; procedures to easily duplicate problems in a lab environment; up-to-date test plans to regression-test corrected code; and proper version-control documentation.











All materials on this site Copyright © 2009 TechInsights, a Division of United Business Media LLC All rights reserved.