Power-supply problems can show up in a system in many ways, and they can steer you down many wrong paths as you look for the cause.
I've always been a proponent of a rock-solid power supply subsystem, based on my design and debug experience. A good supply -- regulators, rails, physical distribution, and ground returns -- means a headache-free foundation for the ICs, the circuitry, and the software that operates in the product.
I was reminded of this yet again when I had some odd behavior in a low-cost (i.e., cheap) MP3 player I use to take music with me when bike riding; it was only $20, plus it has a digital FM broadcast-band tuner, handy for listening to the local college jazz station. (Yes, I could use my smartphone and stream, but this unit is smaller and I worry about it less on the road.) The Coby unit (see photo) has a "scroll wheel" for the user to select tracks, adjust volume, and perform other functions. Although it looks like a circle, the implementation actually has four switches under the disc, at the left/right/top/bottom pointers.
This MP3 player has an interesting idiosyncrasy: when the battery is at half or below, it doesn't sense two of the four scroll-wheel switches.
Occasionally, the left and right switches would not sense that they were being pushed. I attributed this to dirt or something similar -- especially as this MP3 player actually went through the washing machine once (unintentionally, of course!) and amazingly, it worked fine after I opened it and dried it with air flow from a fan. The switch problem seemed to be one of those intermittents you just learn to live with, especially as this is not a mission-critical application.
But then I noticed a pattern. The switch problem only occurred when the battery-charge indicator showed the unit was at half-charge or below. In fact, I could consistently clear the switch-sensing problem simply by charging the player to full.
Apparently, there is some sort of interaction among the state of charge, the IC that manages the switches, and possibly the resistance in the switches themselves, and it only affects two of the four sensed switches. If I were on the team doing product development for this player, I might have spent time chasing software bugs when the switches weren't sensing, yet software may not be the problem at all.
Since I don't have a schematic of the unit, nor the time or tools to investigate further, I'm dropping this issue; let's be real: it is only a petty annoyance. Plus, experience teaches us that these sorts of problems can be very hard to diagnose and are generally unfixable. After all, if the problem is that the IC has some internal sensitivity to a low supply-rail voltage at its digital I/O, there's not much I can do, nor can I get into the switch-contact areas to clean them.
The lesson here is that the power supply can have subtle interactions with functions that seem far removed from the supply rail. It's not just battery-sourced DC supplies, either. Several years ago, I visited a major analog semiconductor company. While we were chatting informally, they told me they had a batch of desktop PCs, about six months old, that were crashing at random (the Blue Screen of Death -- BSOD) at intervals ranging from a few hours to a few days.
Naturally, they first assumed it was some bug in the Windows operating system. But they were also smart enough to know that the assumption was an easy way to rationalize and dismiss the problem. Since they had the necessary equipment and expertise, they did some more poking and probing.
Long story short: The real cause was that the AC/DC supplies were marginal and unable to handle some load transients that occurred during regular PC operations. Digging deeper, they found the problem was not the supply's design itself, but the substandard and very likely counterfeit bulk capacitors in those supplies. These capacitors would be good enough to make it through final test at the factory, but would then age and dry out within a few months, unlike the quality capacitors that should have been there.
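To see why dried-out bulk capacitors can turn an otherwise adequate supply into a marginal one, a back-of-the-envelope estimate helps. The Python sketch below is not from that company's investigation. It uses the common first-order approximation that a load-current step produces an immediate drop of (current step x ESR), plus a discharge droop of (current step x response time / capacitance) until the regulation loop catches up. The function name and every numeric value are hypothetical, chosen only for illustration; the one physical assumption is that an aging electrolytic capacitor loses capacitance while its equivalent series resistance (ESR) rises.

```python
def transient_droop(delta_i, esr, capacitance, response_time):
    """First-order estimate of rail droop (volts) after a load step.

    delta_i        load-current step, in amperes
    esr            capacitor equivalent series resistance, in ohms
    capacitance    bulk output capacitance, in farads
    response_time  time before the regulation loop responds, in seconds
    """
    esr_step = delta_i * esr                            # instantaneous I*R drop
    discharge = delta_i * response_time / capacitance   # dV = I * dt / C
    return esr_step + discharge


# Hypothetical values, for illustration only (not measured data).
healthy = transient_droop(delta_i=5.0, esr=0.02,
                          capacitance=2200e-6, response_time=20e-6)
aged = transient_droop(delta_i=5.0, esr=0.20,
                       capacitance=1100e-6, response_time=20e-6)

print(f"healthy capacitor droop: {healthy:.3f} V")
print(f"aged capacitor droop:    {aged:.3f} V")
```

With these made-up numbers, the aged capacitor's droop comes out several times larger than the healthy one's for the same load step. That is the kind of lost margin that can pass final test at the factory and then cause trouble a few months later.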
Have you ever had power supply and power rail issues that caused subtle circuit and performance problems that appeared elsewhere, and took some serious digging and debugging to uncover? How did you figure this out? How long did it take for you to find the problem?