Pause and consider for a moment a world in which passing tests are not an indicator that all is well, but rather a warning flag: not enough functionality is being tested, vital checks that confirm proper operation of the design are missing, or problems in the verification infrastructure are masking test failures and hiding RTL bugs that merrily make their way to fabrication.
This statement of anguish, rarely uttered in the annals of functional verification history, is certainly absurd on its face. After all, verification engineers continually strive for the holy grail of clean regressions, toiling day after day to cleanse the system of failing tests on their way to tape-out nirvana. But pause again to consider such a world, and the musings might lead one to ask: who's verifying the verification environment?
A verifier's hard life
First, let's dispense with the truisms that form the basis of all good EDA articles: designs are getting bigger and more complex; state space is (still, believe it or not) exploding; verification is becoming ever more difficult; the complexity of today's verification environments often rivals or exceeds that of the design itself; [add your favorites here]. So what does all of this mean? It means that verification engineers, that hardy breed who take the RTL code authored by others (or perhaps by themselves in a previous, forsaken life) and try to make sure it conforms in some reasonable way to the original design specification, are doing the best they can under very difficult circumstances.
The dirty little secret, the issue that keeps verification engineers up at night fretting about the potential success of their project, the design, and perhaps the company's future, is this: it is extremely hard to measure how "good" your verification environment is, that is, whether it is robust and comprehensive enough to catch the RTL bugs that could otherwise derail a successful, on-time, on-budget design project. Most certainly, tools have been developed and deployed that give engineers hints about their test coverage and progress, and whether they've at least achieved the minimal level of completeness that provides some glimmer of hope for signing off a functionally correct design. But the information provided by these tools is partial and subjective, and in the worst case it can give the verification engineer and project management team a false sense of security about how well the design is verified.
It's better than nothing, but...
To understand the potential perils of relying on today's popular solutions, let's focus first on the most common of these: code coverage. As we all know, code coverage provides a measure of how much the RTL code has been exercised by a given set of patterns, with said measurement typically a percentage of RTL lines executed, signals toggled, paths traversed, or some equally concrete and seemingly comforting number. The problem with code coverage is that it says nothing about the other vital necessities of bug detection: the ability of the tests to propagate the effect of a bug to an observable point, and the presence at that point of a checker or monitor that detects the propagated and unexpected effect. Thus, any competent verification engineer (and likely a large selection of incompetent verification engineers) could easily create a suite of tests that provides something close to 100 percent code coverage yet does a very poor job of truly verifying proper operation of the design as measured by "bug detection potential".
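To make the gap concrete, here is a deliberately simplistic sketch in Python, standing in for RTL and testbench (all names are invented for illustration). The test below executes every line of a buggy function, so a line-coverage tool would report 100 percent, yet the result is never propagated to a check, so the bug survives:

```python
# A deliberately buggy "design": an adder that is supposed to saturate at 255.
def saturating_add(a, b):
    total = a + b
    if total > 256:           # BUG: off-by-one; the check should be "> 255"
        total = 255
    return total

def test_saturating_add():
    saturating_add(200, 100)  # takes the saturation branch
    saturating_add(200, 56)   # returns 256 instead of 255, but nobody looks
    # Every line of saturating_add has now executed, so a line-coverage
    # tool happily reports 100 percent. With no assertion on the results,
    # the off-by-one bug is exercised yet never detected.

test_saturating_add()
print("PASS")                 # the regression looks clean
```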
Beyond the obvious shortcomings of tests that exercise but don't propagate localized activity, and of missing or defective checkers on the outputs of the design, imagine if you will the extreme case of a verification environment that has been hardwired to return PASS for all tests, no matter the actual results of the test itself. Unlikely, huh? Now consider the case of a forlorn and forgotten test whose PASS/FAIL check was long ago "temporarily" commented out to enable the regression process to continue while a designer fixed an RTL bug detected by the test. Unlikely? Based on past experience, probably not.
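That second scenario is easy to picture. In this hypothetical sketch (again Python, with an invented device model and an invented TODO note), the assertion was disabled "temporarily" and never restored, so the test now passes unconditionally:

```python
class Fifo:
    """Stand-in for the DUT; its overflow flag is (still) broken."""
    def __init__(self, depth):
        self.depth = depth
        self.data = []
        self.overflow = False   # BUG: never set when the FIFO is full

    def push(self, value):
        if len(self.data) < self.depth:
            self.data.append(value)
        # a correct model would set self.overflow here on a full push

def test_fifo_overflow():
    dut = Fifo(depth=8)
    for i in range(9):
        dut.push(i)             # the ninth push should raise the overflow flag
    # TODO (temporary!): re-enable once the RTL overflow bug is fixed
    # assert dut.overflow, "FIFO failed to flag overflow"
    print("test_fifo_overflow: PASS")  # passes forever, checks nothing

test_fifo_overflow()
```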
The point is, a verification environment is an extremely complex thing, with multiple layers of functionality from design to assertion to testbench to simulation wrapper to batch execution script and beyond. Each layer presents unique opportunities to screw things up, with the worst possible result being a set of false positives: erroneously passing tests that make everybody happy but hide the potential for serious design problems that may not be caught until late in the process, if ever.
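Masking at the infrastructure layer is just as plausible. As a hedged illustration, assume a batch script that grades each simulation by scanning its log; the log format and script here are invented, but the failure mode is familiar:

```python
import re

def grade_simulation(log_text):
    """Naive batch-script grading: declare PASS unless 'ERROR' appears."""
    return "FAIL" if re.search(r"\bERROR\b", log_text) else "PASS"

# After a simulator upgrade, failure messages changed casing. The scoreboard
# mismatch below sails straight past the wrapper and is reported as PASS.
log = "Error: scoreboard mismatch at time 1042 ns\n"
print(grade_simulation(log))    # prints "PASS": a false positive
```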
Asking a hard question: Should my tests be passing?
Oftentimes, the best solution is found by standing the problem on its head. In the case of measuring the "goodness" of a verification environment, instead of focusing on failing tests and relatively suspect measurements of code coverage, why not focus instead on passing tests? Now, only a fool would advocate the investigation and manual debug of all passing tests; after all, if everything is supposedly working as expected, where do you start debugging? But what if there were an automated way to scour the environment and answer the question, "So my tests are passing: is that because everything is really OK, or is it because there are serious deficiencies in my verification environment?" This is the promise of Functional Qualification.
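Functional qualification is typically built on mutation analysis: artificial bugs (faults, or mutations) are injected into the design one at a time, and the existing tests are re-run. If no test fails, you have found, automatically, a bug the environment could never catch. A deliberately minimal sketch of the idea in Python, with invented names standing in for RTL and testbench:

```python
# Reference "RTL": an 8-bit adder, modeled as a Python function.
def design_ok(a, b):
    return (a + b) & 0xFF

# The same design with one injected fault: '+' mutated to '-'.
def design_mutated(a, b):
    return (a - b) & 0xFF

def run_tests(design):
    """Run the whole test suite; return True if every test passes."""
    stimuli = [(1, 2), (0, 0), (255, 1)]
    for a, b in stimuli:
        result = design(a, b)
        # Weak checker: it only confirms the result is in range, so it
        # cannot tell the mutated design apart from the original.
        if not 0 <= result <= 255:
            return False
    return True

assert run_tests(design_ok)      # baseline: the regression is clean
if run_tests(design_mutated):
    print("Fault NOT detected: this environment cannot catch that bug")
```

Tools in this space typically go further than the sketch, classifying each injected fault by whether it was ever activated by the stimulus, propagated to an observable point, and finally detected by a checker, which pinpoints exactly which layer of the environment is deficient.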