Barr Code

Five more top causes of nasty embedded software bugs

Michael Barr

11/2/2010 10:32 PM EDT

What do memory leaks, deadlocks, and priority inversions have in common? They're all Hall of Famers in the pantheon of nasty firmware bugs.

Finding and killing latent bugs in embedded software is a difficult business. Heroic efforts and expensive tools are often required to trace backward from an observed crash, hang, or other unplanned run-time behavior to the root cause. In the worst scenario, the root cause damages the code or data in a way that the system still appears to work fine or mostly fine—at least for a while.

In an earlier column ("Five top causes of nasty embedded software bugs," April 2010, p.10, online at www.embedded.com/columns/barrcode/224200699), I covered what I consider to be the top five causes of nasty embedded software bugs. This installment completes the top 10 by presenting five more nasty firmware bugs as well as tips to find, fix, and prevent them.

Bug 6: Memory leak
Eventually, systems that leak even small amounts of memory will run out of free space and subsequently fail in nasty ways. Often legitimate memory areas get overwritten and the failure isn't registered until much later. This happens when, for example, a NULL pointer is returned by a failed call to malloc() and the caller blindly proceeds to overwrite the interrupt vector table or some other valuable code or data starting from physical address 0x00000000.
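The failure mode described above starts with an unchecked return value. A minimal sketch of the defensive pattern follows; the helper names buffer_create() and message_store() are hypothetical and exist only for illustration. Every path that receives NULL from malloc() reports the failure instead of writing through the pointer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: returns a zeroed buffer of len bytes, or NULL if
 * the heap is exhausted. The caller must check the result. */
unsigned char *buffer_create(size_t len)
{
    unsigned char *buf = malloc(len);
    if (buf == NULL) {
        return NULL;            /* report failure; never touch address 0 */
    }
    memset(buf, 0, len);
    return buf;
}

/* Hypothetical caller: copies msg into a freshly allocated buffer.
 * Returns 0 on success, or -1 if the allocation failed. On success the
 * caller owns *out and must eventually free() it. */
int message_store(const unsigned char *msg, size_t len, unsigned char **out)
{
    assert(msg != NULL && out != NULL);

    unsigned char *buf = buffer_create(len);
    if (buf == NULL) {
        *out = NULL;
        return -1;              /* propagate the error to the caller */
    }
    memcpy(buf, msg, len);
    *out = buf;
    return 0;
}
```

What the caller does with the -1 (drop the message, raise an alarm, reset) is a system-level decision; the point of the sketch is only that the decision gets made explicitly.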

Memory leaks are mostly a problem in systems that use dynamic memory allocation.[1] And memory leaks are memory leaks whether we're talking about an embedded system or a PC program. However, embedded systems run for long periods, and some safety-critical systems can fail in deadly or spectacular ways, so this is one bug you definitely don't want in your firmware.

Memory leaks are a problem of ownership management. Objects allocated from the heap always have a creator, such as a task that calls malloc() and passes the resulting pointer on to another task via message queue or inserts the new buffer into a meta heap object such as a linked list. But does each allocated object have a designated destroyer? Which other task is responsible and how does it know that every other task is finished with the buffer?

Best practice: There is a simple way to avoid memory leaks and that is to clearly define the ownership pattern or lifetime of each type of heap-allocated object. Figure 1 shows one common ownership pattern involving buffers that are allocated by a producer task (P), sent through a message queue, and later destroyed by a consumer task (C). To the maximum extent possible, this and other safe design patterns should be followed in real-time systems that use the heap.[2]
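The producer/consumer ownership pattern can be sketched in a few lines of C. This is an illustration under stated assumptions, not the column's implementation: the queue here is a plain single-threaded ring buffer standing in for an RTOS message queue, and msg_queue_t, producer_post(), consumer_poll(), and buffers_outstanding() are hypothetical names. The rule it demonstrates is that the producer owns a buffer only until the send succeeds, after which the consumer is the sole destroyer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define QUEUE_DEPTH 4u

/* Stand-in for an RTOS message queue (not thread-safe; illustration only). */
typedef struct {
    void  *slots[QUEUE_DEPTH];
    size_t head;    /* index of the oldest queued pointer */
    size_t count;   /* number of pointers currently queued */
} msg_queue_t;

static long g_outstanding;  /* buffers allocated but not yet freed */

long buffers_outstanding(void) { return g_outstanding; }

/* Returns 0 on success, -1 if the queue is full. */
static int queue_send(msg_queue_t *q, void *buf)
{
    if (q->count == QUEUE_DEPTH) {
        return -1;
    }
    q->slots[(q->head + q->count) % QUEUE_DEPTH] = buf;
    q->count++;
    return 0;
}

/* Returns the oldest queued pointer, or NULL if the queue is empty. */
static void *queue_receive(msg_queue_t *q)
{
    if (q->count == 0) {
        return NULL;
    }
    void *buf = q->slots[q->head];
    q->head = (q->head + 1) % QUEUE_DEPTH;
    q->count--;
    return buf;
}

/* Producer (P): creates the buffer and hands ownership to the queue. */
int producer_post(msg_queue_t *q, size_t len)
{
    void *buf = malloc(len);
    if (buf == NULL) {
        return -1;
    }
    g_outstanding++;
    if (queue_send(q, buf) != 0) {
        free(buf);          /* ownership never transferred: P cleans up */
        g_outstanding--;
        return -1;
    }
    return 0;               /* from here on, P must not touch buf */
}

/* Consumer (C): the designated destroyer. Returns 1 if a buffer was
 * processed and freed, 0 if the queue was empty. */
int consumer_poll(msg_queue_t *q)
{
    void *buf = queue_receive(q);
    if (buf == NULL) {
        return 0;
    }
    /* ... process the buffer contents here ... */
    free(buf);
    g_outstanding--;
    return 1;
}
```

The outstanding-buffer counter is optional, but a count that only ever grows during a long soak test is a cheap early warning of a leak.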


[Figure 1: Ownership pattern with a producer task (P), a message queue, and a consumer task (C)]





piyush_

11/4/2010 11:02 AM EDT

Bugs are always cursed but they have a vital function too. They force us to take a hard look at the design and the code (this usually does not happen in code reviews) and keep us familiar with the software to fight future new ones and general maintenance of the software. A sort of periodic refresh. Somewhat similar to the viruses and bacteria that keep the immune system up-to-date and alive. Bugs are part of the interconnected web of software life.




Duane Benson

11/4/2010 12:48 PM EDT

Adherence to coding standards is great advice, as are code reviews. But what do you do if you're an independent or a one-person software team? Obviously, following best practices is vital in that case, as in all cases. If you or your company have the money to hire out for a code review, problem solved. But what do you do if, logistically or financially, those aren't options?

No programmer should ever be forced into a "No time to do it right, but time enough to do it over" scenario, but the real world all too often has other ideas.

Back through the way-back machine, I found myself in such a situation every now and then. No one to look over my shoulder and no resources to hire out a code review. I'd typically take my code to a friend or family member. They didn't have programming experience, but I'd walk through my code, explaining how each function worked.

I certainly wasn't going to get someone to point out a flaw, but just through the act of explaining things to someone, I would frequently find a large number of my errors or usage of lousy coding practices.

Anyone have any thoughts on how to best hunt down bugs in such a non-optimal environment where peer or independent code reviews are not an option?




sharps_eng

12/24/2010 6:29 PM EST

@Duane - lone programming is of course bad news. One tip; write a module's code in one language, and then transcode it to a second language. Transcoding by hand is surprisingly quick, quicker than just rereading the code over and over when bug-hunting, and because you are in a different mindset you will see things that you missed first time round. Because you have to convert the code, you also have to actually read what you wrote, not just copy it.
A bonus is that you then have two programs to do the same thing which can be compared if you get some weird behaviour later on. One program may be slow (esp if you use Python or VB as the second language), but should still be functionally identical.
My preference is prototyping in Forth, then transcoding to C/C++. It is also possible to call the C functions via Forth wrappers, allowing function-by-function migration into the C code, and calling library C from Forth.
All the above is only about coding, though, and doesn't help with the initial design work, which is where all the real grunt should take place.




one_armed_bandit

11/4/2010 12:53 PM EDT

Memory leaks - better still, never use malloc() and statically define the buffers (or only malloc once for the fixed buffer pool). KEEP STATS on the buffer pools. The end-notes are correct.

Deadlock - Dining Philosophers.

Race conditions - Don't allow them, wait until the race is guaranteed over, and everything else. Only the first two actually work.
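The fixed buffer pool with statistics suggested above might look like the following sketch. It assumes a single block size and no concurrent access (a real system would guard the free list with a mutex or by disabling interrupts); all names, sizes, and the choice of statistics are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define POOL_BLOCKS     8u
#define POOL_BLOCK_SIZE 64u

/* Each block doubles as a free-list node while it is not in use. */
typedef union pool_block {
    union pool_block *next;
    uint8_t           data[POOL_BLOCK_SIZE];
} pool_block_t;

typedef struct {
    unsigned in_use;      /* blocks currently handed out */
    unsigned high_water;  /* worst-case in_use seen so far */
    unsigned failures;    /* allocation attempts on an empty pool */
} pool_stats_t;

static pool_block_t  g_blocks[POOL_BLOCKS];  /* statically defined storage */
static pool_block_t *g_free_list;
static pool_stats_t  g_stats;

/* Threads every block onto the free list and clears the statistics. */
void pool_init(void)
{
    g_free_list = NULL;
    for (size_t i = 0; i < POOL_BLOCKS; i++) {
        g_blocks[i].next = g_free_list;
        g_free_list = &g_blocks[i];
    }
    g_stats.in_use = 0;
    g_stats.high_water = 0;
    g_stats.failures = 0;
}

/* Returns a POOL_BLOCK_SIZE-byte block, or NULL if the pool is empty. */
void *pool_alloc(void)
{
    pool_block_t *b = g_free_list;
    if (b == NULL) {
        g_stats.failures++;
        return NULL;
    }
    g_free_list = b->next;
    g_stats.in_use++;
    if (g_stats.in_use > g_stats.high_water) {
        g_stats.high_water = g_stats.in_use;
    }
    return b->data;
}

/* Returns a block obtained from pool_alloc() to the pool. */
void pool_free(void *p)
{
    if (p == NULL) {
        return;
    }
    pool_block_t *b = (pool_block_t *)p;  /* data sits at offset 0 */
    b->next = g_free_list;
    g_free_list = b;
    g_stats.in_use--;
}

pool_stats_t pool_get_stats(void)
{
    return g_stats;
}
```

Because every block is the same size, the pool cannot fragment, and the high-water mark shows how close the system has come to exhaustion during testing.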






ShepSiegel

11/15/2010 9:15 AM EST

Nice collection of "design patterns for bugs". Most of these have corresponding patterns in circuit design for hardware. Nice. -Shep




sharps_eng

11/20/2010 5:22 PM EST

Good to revisit these topics, even though we've been repeating this stuff for decades. How come our tools still let us even consider the bad practices described? Is it because we are collectively too cheap, and too intellectually insecure, to climb up one further level of abstraction? We made it from hex to assembler to C, didn't we? I'm not saying C++, Java or, say, Eiffel are an obvious next level. But we MUST define what a next-level tool would inhibit, what it would assume, what it would auto-suggest. I for one want it to learn from my programming/debug cycles and MAKE ME THINK before I repeat them on the next module or project. Where do we get such a tool from? We commission the open source community to make it, and offer to make a contribution to charity for each copy sold. Anyone up for coordinating this? Anyone partway there?




cdhmanning

12/16/2010 9:09 PM EST

sharps_eng : Engineering is the art of compromise.

Embedded engineers will always have to live with constrained resources in many applications. That makes the use of big fluffy environments a non-starter.

No matter what your environment you can still leak memory, though C/C++ etc makes it a bit easier.

Java can run on small environments. Look at Lejos (google will find). It does run slower than C, but is adequate for many uses.

Even if you don't actually leak memory it is wise to use malloc() as little as possible. Memory fragmentation can still starve out memory or cause performance reduction.




sharps_eng

12/21/2010 12:32 PM EST

@cdhmanning - I am not sure if you are expecting a reply, sorry.
I am aerated by the fact that I want to get from idea to product efficiently, but the tools are manifestly inadequate. No-one would suggest returning to a non-IDE, non-source-browsing command line for general programming, so I am trying to focus on the next step up in tools. I am just trying Sigasi for FPGA/VHDL development. Sigasi has live, on-line pseudo-compilation running as you work, to trap carelessness and frequently-made errors, sort-of code-hand-holding while you focus on the real problem. That seems like a good use for those spare desktop CPU cycles.
I'll try to report back on its usability.




cdhmanning

12/22/2010 10:57 PM EST

sharps_eng:

Tools like that look pretty neat. I've found Eric IDE helps like this for some programming.

Forth and Python programmers will tell you how productive these environments are because of their inherently interactive nature.

There is a lot we can do to make C programming slicker. We can run all the wrapper stuff in desktop simulations then only run the real code in embedded space. See the James Grenning interview at http://www.eetimes.com/design/embedded/4008919/An-interview-with-James-Grenning (his book is worth a read).

I have almost never developed anything significant on the embedded target. I always start in a desktop environment then move the code to the embedded platform. Almost everything is easier on the desktop: better logging, better debugging, faster compilation cycles, etc. Desktops provide tools like Valgrind, which will do extensive memory checking, allowing you to verify your code before you drop it into an embedded platform.

A lot of the pre/post condition stuff that helps make Eiffel robust can also be achieved in C with a bunch of assert-style macros that you enable in your test framework. If need be, these can be turned off in the final embedded code.
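As a rough sketch of the assert-style macros described above: the names REQUIRE, ENSURE, CONTRACTS_ENABLED, and contract_failed() are made up for this example, and the failure handler merely counts and logs so the behavior can be observed in a desktop test. A real target might halt, log to nonvolatile memory, or reset instead.

```c
#include <assert.h>
#include <stdio.h>

/* Contracts default to on; a release build could pass
 * -DCONTRACTS_ENABLED=0 to compile every check away. */
#ifndef CONTRACTS_ENABLED
#define CONTRACTS_ENABLED 1
#endif

unsigned g_contract_failures;  /* inspected by the test framework */

#if CONTRACTS_ENABLED
/* Hypothetical failure handler: counts and reports the violation. */
static void contract_failed(const char *kind, const char *expr,
                            const char *file, int line)
{
    g_contract_failures++;
    fprintf(stderr, "%s failed: %s (%s:%d)\n", kind, expr, file, line);
}
#define REQUIRE(cond) \
    ((cond) ? (void)0 \
            : contract_failed("precondition", #cond, __FILE__, __LINE__))
#define ENSURE(cond) \
    ((cond) ? (void)0 \
            : contract_failed("postcondition", #cond, __FILE__, __LINE__))
#else
#define REQUIRE(cond) ((void)0)
#define ENSURE(cond)  ((void)0)
#endif

/* Example function with a precondition and a postcondition:
 * limits x to the range [lo, hi]. */
int clamp(int x, int lo, int hi)
{
    REQUIRE(lo <= hi);

    int result = (x < lo) ? lo : ((x > hi) ? hi : x);

    ENSURE(result >= lo && result <= hi);
    return result;
}
```

Calling clamp(5, 10, 0) violates the precondition, and the resulting value then fails the postcondition as well, so both checks fire. With CONTRACTS_ENABLED set to 0 the macros expand to nothing and add no code to the embedded image.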



