Driving Blindfolded – Solving The Triage Challenge
Daniel Hansson - Verifyter
9/10/2012 10:25 AM EDT
Automatic Triage
Mapping test results to revision information
The version control system (VCS) contains the code of the device under test, the test bench, the tests, and every change made to those files. For each revision of each file or module, the VCS records the committer, the time of the commit and the lines of code that were changed. This is exactly the information needed for triaging: who (committer), what (revision) and when (time of commit). What is needed is a tool that maps test failures to a revision in the VCS and extracts that information, so that a bug report can be created and assigned to the right person.
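As a rough illustration of the who/what/when mapping, the sketch below models the data a triage tool might pull from the VCS for one revision and turn into triage information for a failing test. The names `Revision` and `triage_info`, and the sample values, are hypothetical and not taken from any particular VCS or tool.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class Revision:
    """Hypothetical view of what a VCS records for one revision."""
    rev_id: str             # what: the revision identifier
    committer: str          # who: the person who committed it
    committed_at: datetime  # when: the time of the commit
    changed_files: tuple    # the files touched by the commit


def triage_info(test_name: str, revision: Revision) -> dict:
    """Collect the who/what/when needed to file and assign a bug report."""
    return {
        "test": test_name,
        "who": revision.committer,
        "what": revision.rev_id,
        "when": revision.committed_at.isoformat(),
        "files": list(revision.changed_files),
    }


# Made-up example data, for illustration only.
rev = Revision("r1042", "alice", datetime(2012, 9, 3, 14, 30), ("alu.v", "alu_tb.sv"))
info = triage_info("test_alu_overflow", rev)
```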

To take advantage of the information in the VCS, an automatic triage tool needs to interface with both the VCS and the regression test system in order to map test results to revision information (see Figure 1). It also needs to communicate its diagnosis automatically to the bug tracking database, by email, and to files on the web and on the normal file system. Finally, it needs to save the test results and its diagnosis in a test result database, both to speed up analysis and to build up a history of test results that later analyses can draw on.
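One way a test result database can speed up analysis is by remembering which test has already been run on which revision, so that a rerun can be skipped. The sketch below uses Python's standard `sqlite3` module. The table layout and the function names `open_results_db`, `record_result` and `cached_result` are assumptions made for illustration, not the schema of any real triage tool.

```python
import sqlite3


def open_results_db(path=":memory:"):
    """Open (or create) a small result store keyed by test and revision."""
    conn = sqlite3.connect(path)
    conn.execute(
        """CREATE TABLE IF NOT EXISTS results (
               test     TEXT    NOT NULL,
               revision TEXT    NOT NULL,
               passed   INTEGER NOT NULL,
               PRIMARY KEY (test, revision)
           )"""
    )
    return conn


def record_result(conn, test, revision, passed):
    """Store the outcome of running one test on one revision."""
    conn.execute(
        "INSERT OR REPLACE INTO results VALUES (?, ?, ?)",
        (test, revision, int(passed)),
    )
    conn.commit()


def cached_result(conn, test, revision):
    """Return True/False if the outcome is known, or None if it must be run."""
    row = conn.execute(
        "SELECT passed FROM results WHERE test = ? AND revision = ?",
        (test, revision),
    ).fetchone()
    return None if row is None else bool(row[0])
```

A triage run would call `cached_result` before launching a simulation and `record_result` afterwards, so the history grows with every regression.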
The revision in which a bug was introduced is found by rerunning the failing tests on older revisions until the earliest revision at which these tests fail is found. If the tests passed on the previous revision and started to fail on this one, then we know that the changes made in this revision contained a bug that caused the tests to fail. The revision that contains the bug is called the faulty revision.
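The walk back through older revisions can be sketched as follows. This is a minimal illustration that assumes the revisions are available as a list ordered from oldest to newest and that the test fails on the newest one. The function `find_faulty_revision` and the `test_passes` callback (which stands in for checking out a revision and rerunning the test) are hypothetical names.

```python
from typing import Callable, Optional, Sequence


def find_faulty_revision(
    revisions: Sequence[str],
    test_passes: Callable[[str], bool],
) -> Optional[str]:
    """Step back from the newest revision until the test passes.

    Returns the oldest revision in the unbroken run of failures that ends
    at the newest revision, or None if no passing revision was found
    (or if the newest revision does not fail at all).
    """
    oldest_failing = None
    for rev in reversed(revisions):
        if test_passes(rev):
            # The revision right after this passing one is the faulty one.
            return oldest_failing
        oldest_failing = rev
    return None


# Made-up history: the test passed up to r2 and has failed since r3.
history = ["r1", "r2", "r3", "r4", "r5"]
passing = {"r1", "r2"}
faulty = find_faulty_revision(history, lambda rev: rev in passing)
```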
Regression testing is a type of verification whose purpose is to ensure that the device under test does not regress in quality. Regression tests should normally pass; when they do not, it is possible to trace back through the revisions until the earliest revision is found at which the test(s) fail. This is the faulty revision. However, if a test has never passed, e.g. a new test that was just introduced, the result is instead always-failed. To avoid going back too far, a limit must be set beyond which a test is concluded to be always-failed.
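The effect of the limit can be illustrated with a small sketch. It assumes the limit is expressed as a maximum number of revisions to rerun. The parameter name `max_lookback`, the `Diagnosis` type and the status strings are illustrative choices, not terms from any specific tool.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Sequence


@dataclass(frozen=True)
class Diagnosis:
    status: str              # "faulty-revision", "always-failed" or "not-failing"
    revision: Optional[str]  # set only when status is "faulty-revision"


def diagnose(
    revisions: Sequence[str],
    test_passes: Callable[[str], bool],
    max_lookback: int,
) -> Diagnosis:
    """Walk back at most max_lookback revisions from the newest one."""
    candidates = list(reversed(revisions))[:max_lookback]
    oldest_failing = None
    for rev in candidates:
        if test_passes(rev):
            if oldest_failing is None:
                return Diagnosis("not-failing", None)
            return Diagnosis("faulty-revision", oldest_failing)
        oldest_failing = rev
    # No passing revision was seen within the limit.
    return Diagnosis("always-failed", None)
```

Note that a limit set too low will report always-failed for a test that did pass further back, so the limit trades rerun cost against the risk of that misclassification.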
Diagnosing a test failure down to the faulty revision (or, less commonly, concluding that the test has always failed) is a robust method. The result is deterministic (“this test has failed since this faulty revision”), unlike many other diagnosis methods, which only list the most likely sources of an error. For triage it is important to be able to point out the committer deterministically, so that the bug report can be assigned to the correct engineer automatically.
The committer of the faulty revision may not have introduced an actual error. Instead, the committer may have made an incomplete update, where the changes were appropriate but the associated tests or configuration files were not updated. It may not even be the committer's responsibility to update the other files needed to make the update complete. Even in this scenario, however, the committer is the best person to receive the bug report, because this person can quickly see what is missing, add the required changes to the bug report and reassign it to the appropriate person. Automatic triaging is not about blame, but about finding the person best suited to analyze a failure so that the bug is fixed as soon as possible.
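Because each diagnosis names exactly one revision, turning diagnoses into assigned bug reports is mechanical. The sketch below assumes, purely for illustration, that tests sharing a faulty revision are grouped into one report. The function `reports_by_revision` and the report fields are hypothetical and not the format of any real bug tracker.

```python
from collections import defaultdict


def reports_by_revision(diagnoses, committers):
    """Build one report per faulty revision, assigned to its committer.

    diagnoses:  mapping of test name -> faulty revision
    committers: mapping of revision  -> committer
    """
    tests_for = defaultdict(list)
    for test, rev in diagnoses.items():
        tests_for[rev].append(test)

    return [
        {
            "assignee": committers[rev],
            "revision": rev,
            "tests": sorted(tests),
            "title": f"{len(tests)} test(s) fail since revision {rev}",
        }
        for rev, tests in sorted(tests_for.items())
    ]


# Made-up diagnoses, for illustration only.
reports = reports_by_revision(
    {"test_a": "r3", "test_b": "r3", "test_c": "r7"},
    {"r3": "alice", "r7": "bob"},
)
```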
Figure 1. Automatic Triage Tool in System
Next: Linear vs. binary search