Debugging is, in my view, the most challenging of all engineering-project phases. Whether it's hardware alone, software, or an unclear mix of the two, it's a skill that you learn by hard experience and mentoring, rather than in the classroom. [For a look at an excellent book on the art and techniques of circuit, software, and system debugging (published by the American Management Association, of all places), see "Debugging by the book" and "Debugging: The 9 Indispensable Rules for Finding Even the Most Elusive Software and Hardware Problems".]
Sometimes we debug because we have to, to get that project or product released. Other times, it's a matter of ongoing engineering curiosity. For example, there's a spot on a road I often drive where the broadcast-band FM signals nearly always fade away for a few seconds. I assume it's because of the way the overhead power, phone, cable, and other wires cross the road right there: perhaps they are creating some sort of dead zone or interference pattern. It's no big deal.
But the other day I was in a friend's car, listening to XM Sirius satellite radio, and we experienced a signal fade-out on that road—but not quite at the same spot as the FM problem: it was a few hundred yards beyond, under clear, open skies. Even stranger, we had another fade-out on the return leg, again under open skies, but on the other side of the FM-problem area. "What's going on here?" I wondered. This was a case of wanting to debug solely for engineering curiosity, with some confusing observations which pointed perhaps to different causes—or perhaps not.
It turns out the confusing symptoms actually make sense, once you know the "secret" of what's happening. FM radio is broadcast in real time, of course, but satellite radio uses a buffering technique to minimize effects of overhead interference and blockage. Therefore, what you hear via satellite radio was actually sent several seconds before it is played. By the time the gap caused by a blockage works its way through the buffer and reaches the speaker, the car has already moved past the obstruction. That would account for the XM Sirius fade-out I was hearing on either side of the overhead obstruction that affects the FM radio (depending on travel direction). No big deal, but knowing why that was happening did feel good.
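To put rough numbers on that explanation, here is a minimal sketch of the arithmetic: the audible fade-out lags the obstruction by the distance the car covers during the buffer delay. The 40-mph speed and 10-second delay are my assumptions for illustration only; I did not measure either one, and the actual buffer length of the receiver is not something I know. The function names are likewise just illustrative.

```python
MPH_TO_M_PER_S = 0.44704   # exact conversion factor
METERS_PER_YARD = 0.9144   # exact conversion factor


def fade_offset_m(speed_m_per_s: float, buffer_delay_s: float) -> float:
    """Distance the car travels between the signal being blocked
    and the resulting gap actually being heard."""
    return speed_m_per_s * buffer_delay_s


def heard_position_m(obstruction_m: float, direction: int,
                     speed_m_per_s: float, buffer_delay_s: float) -> float:
    """Road position where the fade-out is heard.

    direction is +1 or -1 for the two travel directions.
    """
    return obstruction_m + direction * fade_offset_m(speed_m_per_s, buffer_delay_s)


# Assumed values, not measurements.
speed = 40 * MPH_TO_M_PER_S   # assumed 40 mph
delay = 10.0                  # assumed 10 s of buffering

offset_m = fade_offset_m(speed, delay)
offset_yd = offset_m / METERS_PER_YARD

outbound = heard_position_m(0.0, +1, speed, delay)
inbound = heard_position_m(0.0, -1, speed, delay)

print(f"Offset: {offset_m:.0f} m ({offset_yd:.0f} yd)")
print(f"Heard at {outbound:+.0f} m outbound, {inbound:+.0f} m on the return leg")
```

With those assumed values, the fade-out lands a couple of hundred yards past the obstruction, and on opposite sides of it for the two travel directions, which at least matches the pattern I observed.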
I was lucky: the "problem" I was trying to debug was not holding up product release or shipments. But it reminded me that debugging is tricky: the problem you observe is often not directly linkable to the actual root cause(s), your clues are often obscure or in conflict, and it takes skill, experience, discipline, and yes, lots of luck to understand what is actually happening and what is causing it to happen—and that's before you even get to a solution to the problem.
Have you had to debug and troubleshoot where you were perplexed by clues and evidence that made little or no sense or, more frustrating, appeared to conflict with each other?