- An analog engineer and a digital engineer join forces, use their respective skills, and pull a few bunnies out of a hat to troubleshoot a system with which they are completely unfamiliar.
Our sales department had just accepted a new challenge on behalf of Engineering. They promised a customer that yes, of course, we can repair a telecom product that we have never seen before and for which we have no systems, no test fixtures, and no schematics. (The OEM no longer supported this product.)
Engineering was once again expected to shake our rattles, do our magic voodoo dance, and pull bunnies out of hats. About fifteen of these backplane-pluggable boards showed up in my office for initial evaluation and perusal of their inner workings. Each had a proprietary socketed SIMM (single in-line memory module), which on several units turned out to be bad. Temporarily substituting memory modules taken from other cards, whose obvious failure mode was smoke damage, brought these units back to life when powered while lying flat on the bench. (Remember, there was no test chassis available.) They would then boot and talk to us over their RS232 ports.
These modules were populated with four SRAMs and four flash memories. Each flash shared an 8-bit-wide data bus with one SRAM, and each pair of SRAMs was enabled together by the same chip select. I proposed to the boss that we build a small test fixture that would take the DUT memory module, run SRAM tests, and if necessary reprogram the flash.
A digital/software colleague three cubes away was assigned to work with me on this project. He had previously designed and laid out a PCB that used a surface-mount PIC microcontroller as a universal I/O for our current and future test fixtures. It turned out that it had just enough I/O lines to handle the address and data buses on the DUT memory module, with two spares, as long as I tied the four separate DUT data buses together into two pairs on the fixture. So we decided to use it.
I ordered the necessary SIMM connector and a plated-through-hole protoboard, along with some ribbon cable and IDC header sockets to connect to the PIC board. It was somewhat annoying that the 72-pin SIMM connector was spaced at a .05-inch pitch, so the protoboard also had to be this pitch. Its tiny .025-inch-diameter holes did not accept .025-inch-square pins (a square pin that size measures about .035 inch across its diagonal), so wire-wrap was impossible. (Now I know where that old adage, "Can't fit a square peg into a round hole," came from.)
I had to solder ribbon cable directly to the protoboard and string short 30AWG wires to the SIMM connector. As long as the stranded ribbon wires were tinned only lightly, just enough to keep the strands together, they actually fit into the protoboard holes.
Endeavor brings back cuss words long since forgotten
Another annoyance was that the SIMM connector had plastic retaining tabs that quickly wore out from repeated insertions of memory modules. The maker had designed them for maybe a single SIMM replacement over the lifetime of the product. We wanted to plug DUTs in and out constantly.
Fortunately, I had used socket pin strips in the protoboard for the SIMM connector in anticipation of eventually needing to replace it easily. I subsequently found a connector with metal retaining tabs. This particular feature does not show up in vendors’ online part descriptions. I had to check the mechanical drawings of many candidate connectors to find one marked "W/ Metal Latch."
The first test of the fixture went well. My colleague coded a walking-ones SRAM test that immediately identified bad SRAM chips on a couple of the DUT (Device Under Test) boards. We replaced them, and the boards now booted, but with the disconcerting message "RAM is BAD." Due to availability we had used 12 nsec SRAMs in place of the original 20 nsec SRAMs, so speed was probably not the issue. Hmmm, maybe we needed to improve the test algorithm.
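For readers who have not met it, a walking-ones test writes a single 1 bit into each data-bit position in turn and reads it back, so a stuck or shorted data line shows up as a mismatch. The sketch below is a host-side Python illustration of that idea only; it is not the fixture's PIC code, and the SimulatedSram class, its stuck_low_mask parameter, and the 16-byte size are hypothetical stand-ins for real hardware access.

```python
class SimulatedSram:
    """Hypothetical 8-bit-wide SRAM model (an assumption, not the real DUT).

    Bits set in stuck_low_mask always read back as 0, imitating a
    data line that is stuck low.
    """

    def __init__(self, size, stuck_low_mask=0x00):
        self.cells = bytearray(size)
        self.stuck_low_mask = stuck_low_mask

    def write(self, addr, value):
        self.cells[addr] = value & 0xFF

    def read(self, addr):
        return self.cells[addr] & ~self.stuck_low_mask & 0xFF


def walking_ones_test(ram, size, width=8):
    """Walk a single 1 through every data bit at every address.

    Returns a list of (address, written, read_back) tuples, one per mismatch.
    """
    failures = []
    for addr in range(size):
        for bit in range(width):
            pattern = 1 << bit
            ram.write(addr, pattern)
            read_back = ram.read(addr)
            if read_back != pattern:
                failures.append((addr, pattern, read_back))
    return failures


good_ram = SimulatedSram(16)
bad_ram = SimulatedSram(16, stuck_low_mask=0x04)  # data bit 2 stuck low

good_failures = walking_ones_test(good_ram, 16)
bad_failures = walking_ones_test(bad_ram, 16)
print("good part mismatches:", len(good_failures))
print("bad part mismatches:", len(bad_failures))
```

Because each location is written and then read straight back, a test of this shape exercises the data lines but says little about address-line faults or interaction between cells, which is one reason a part can pass it and still misbehave in the system.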
Then we got brave. We copied about five different versions of firmware from the flash of the good memory modules and tried to re-write the new firmware into a module that had semi-booted at first but complained about a "missing application loader." After the firmware re-load, it would no longer even talk to us over its RS232 port. Somehow a 'known good' firmware load messed it up. My colleague verified that the firmware in the good and bad modules was identical. So why did one boot and not the other? Speed?
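Verifying that two flash images are identical comes down to a byte-for-byte comparison of the two dumps. A minimal Python sketch follows, assuming each dump is already available as a bytes object; the function name first_mismatch and the sample data are hypothetical, and how the images were actually read out and compared is not described here.

```python
def first_mismatch(image_a, image_b):
    """Return the offset of the first differing byte, or None if identical.

    If one image is a prefix of the other, the offset where the shorter
    image ends is returned.
    """
    for offset, (byte_a, byte_b) in enumerate(zip(image_a, image_b)):
        if byte_a != byte_b:
            return offset
    if len(image_a) != len(image_b):
        return min(len(image_a), len(image_b))
    return None


# Made-up sample data standing in for two flash dumps.
good_dump = bytes([0x12, 0x34, 0x56, 0x78])
same_dump = bytes([0x12, 0x34, 0x56, 0x78])
diff_dump = bytes([0x12, 0x34, 0xFF, 0x78])

print(first_mismatch(good_dump, same_dump))
print(first_mismatch(good_dump, diff_dump))
```

A result of None means the contents match, which rules out a corrupted image but not a part that holds the right data and delivers it too slowly.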