One of the many tasks I have had as a Hardware and Software Engineer is troubleshooting unintended consequences of design decisions made many years previously. I was working on a VMEBus-based system that ran on the pSOS real-time operating system. This system used an FDDI (Fiber Distributed Data Interface) network interface, and due to the interface's unique configuration, we thought it would be a good idea to perform some sort of BIT (Built-In Test) on it.
Nine years later, the manufacturer of the CPU board we used decided to replace it with a new model. All of a sudden, our FDDI BIT started failing. This was mighty suspicious to me because I recalled that the vendor actually supplied the BIT (it was built into the circuit card's firmware).
The BIT Test Executive that our company wrote would just start the test, wait for a period of time, and then read the status register to see if it passed. I actually tested it myself under the "watchful eye" of a VMEBus analyzer. I started the test, saw it reported as failed, immediately read the status register myself, and saw that the test had PASSED!!!
At this point, I smelled a rat. I switched the VMEBus analyzer over to asynchronous mode and ran the BIT on both the new and the old CPU boards, swapping one for the other. First, I noticed that the new CPU executed a VMEBus read in 60% of the time that it took the old one. Then, I noticed that the BIT Test Executive was continuously reading the status register. I finally got my hands on the software and confirmed my suspicions: the BIT Test Executive was performing a read loop, reading the status register a fixed number of times and then comparing the status bit with a "pass" condition. Since the new CPU was faster, it completed its "n" reads before the FDDI card had completed its BIT.
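To make the failure mode concrete, here is a minimal simulation of that read loop in C. It is only a sketch: the original Test Executive source is not shown here, so every name in it (`sim_card`, `read_status`, `bit_passed_fixed_reads`, `SIM_STATUS_PASS`) and every timing value is an assumption chosen for illustration. The only figure taken from the story is that a read on the new CPU costs 60% of what it cost on the old one.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define SIM_STATUS_PASS 0x01u /* hypothetical "pass" bit, not the real register layout */

/* Simulated card: the status register shows PASS only once the
 * card's self-test has had bit_duration_ns to finish. */
typedef struct {
    uint64_t now_ns;          /* simulated elapsed time */
    uint64_t bit_duration_ns; /* how long the card's BIT takes (assumed) */
    uint64_t read_cost_ns;    /* how long one bus read takes on this CPU (assumed) */
} sim_card;

/* One simulated bus read: advances time by read_cost_ns,
 * then returns the status as of that moment. */
uint32_t read_status(sim_card *c)
{
    c->now_ns += c->read_cost_ns;
    return (c->now_ns >= c->bit_duration_ns) ? SIM_STATUS_PASS : 0u;
}

/* The flawed pattern: use a fixed number of reads as the "delay",
 * then judge the test on the last value read. */
bool bit_passed_fixed_reads(sim_card *c, unsigned n_reads)
{
    assert(n_reads > 0);
    uint32_t status = 0;
    for (unsigned i = 0; i < n_reads; i++) {
        status = read_status(c);
    }
    return (status & SIM_STATUS_PASS) != 0;
}
```

With an assumed self-test time of 800,000 ns and 1,000 reads, a CPU that needs 1,000 ns per read waits long enough, while one that needs 600 ns per read gives up at 600,000 ns and reports a failure. A single read a moment later sees the pass bit, which is the behavior I saw on the analyzer.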
I guess you could call this simply a matter of lazy programming: you have to read the register anyway, so why not just do that "n" times? A better way to do it would be to "WAIT(TICKS)", utilizing the built-in timing in pSOS, or even to decrement a CPU register. The latter is a reliable technique, since the time interval is based on CPU clock speed and not interface speed. However you do it, DOCUMENT, DOCUMENT, DOCUMENT!!! That way, some poor sot like me won't have to work WEEKS to find it!!!
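For comparison, here is a rough sketch in C of the tick-based approach: poll the status register until it shows a pass or until a deadline measured in timer ticks expires. None of this is the real pSOS API or the real card interface. The callbacks, `wait_for_bit`, `STATUS_PASS_BIT`, and the `fake_card` stand-in are all hypothetical names, and the simulated card exists only so the sketch can be exercised without hardware.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define STATUS_PASS_BIT 0x01u /* hypothetical "pass" bit */

typedef uint32_t (*tick_fn)(void *ctx);   /* returns the current tick count */
typedef uint32_t (*status_fn)(void *ctx); /* reads the status register */

typedef enum { BIT_PASS, BIT_TIMEOUT } bit_result;

/* Poll until the pass bit appears or timeout_ticks have elapsed.
 * The deadline depends on the tick source, not on how fast a read is. */
bit_result wait_for_bit(tick_fn ticks, status_fn status, void *ctx,
                        uint32_t timeout_ticks)
{
    assert(ticks != NULL && status != NULL);
    uint32_t start = ticks(ctx);
    for (;;) {
        if (status(ctx) & STATUS_PASS_BIT) {
            return BIT_PASS;
        }
        /* Unsigned subtraction stays correct if the counter wraps. */
        if ((uint32_t)(ticks(ctx) - start) >= timeout_ticks) {
            return BIT_TIMEOUT;
        }
    }
}

/* Simulated card and clock, for illustration only. */
typedef struct {
    uint32_t now_ticks;      /* simulated tick counter */
    uint32_t bit_done_tick;  /* tick at which the card's BIT completes */
    uint32_t ticks_per_read; /* simulated cost of one status read */
} fake_card;

uint32_t fake_ticks(void *ctx)
{
    return ((fake_card *)ctx)->now_ticks;
}

uint32_t fake_status(void *ctx)
{
    fake_card *c = (fake_card *)ctx;
    c->now_ticks += c->ticks_per_read;
    return (c->now_ticks >= c->bit_done_tick) ? STATUS_PASS_BIT : 0u;
}
```

In this sketch a slow reader and a fast reader both see the pass, because the loop ends on the status bit or the deadline rather than on a read count, and a card that never finishes produces a timeout instead of a misleading verdict. On a real RTOS the loop would presumably sleep between polls rather than spin, which is the spirit of the "WAIT(TICKS)" suggestion.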