The cost of testing complex system-on-chip designs will soon surpass the cost of manufacturing them. Clearly this is an unstable situation. It is simply too hard to keep up with Moore's Law. While the automatic test equipment (ATE) industry keeps delivering an impressive and steady annual productivity improvement of 15 to 25 percent, this is not enough to keep pace with Moore's Law.
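To make the size of that gap concrete, here is a minimal Python sketch that compounds the 15 to 25 percent annual improvement against an exponential growth curve. The 18-month doubling period is an assumption chosen to match the process cadence mentioned in this article, not a figure the article gives for Moore's Law itself, and the function names are illustrative.

```python
# Rough comparison of compounding rates. The doubling period is an
# ASSUMPTION (18 months); only the 15-25 percent range comes from the text.

def cumulative_gain(annual_rate: float, years: float) -> float:
    """Total improvement factor after compounding annual_rate for `years`."""
    return (1.0 + annual_rate) ** years


def doubling_gain(doubling_months: float, years: float) -> float:
    """Total growth factor when capacity doubles every `doubling_months`."""
    return 2.0 ** (years * 12.0 / doubling_months)


def equivalent_annual_rate(doubling_months: float) -> float:
    """Annual growth rate implied by a given doubling period."""
    return 2.0 ** (12.0 / doubling_months) - 1.0


if __name__ == "__main__":
    years = 6
    print(f"Implied annual growth: {equivalent_annual_rate(18):.1%}")
    print(f"Complexity growth over {years} years: {doubling_gain(18, years):.1f}x")
    for rate in (0.15, 0.25):
        print(f"ATE at {rate:.0%} per year: {cumulative_gain(rate, years):.2f}x")
```

Under that assumption, even the upper end of the ATE improvement range compounds far more slowly than design complexity, which is the instability the opening paragraph describes.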
Costs associated with system-on-chip (SoC) testing are pretty simple. There is a setup for each chip test that includes a tester program and a test jig. Then there is the tester itself. For complex, high-speed SoC devices, testers cost between $2,000 and $9,000 per pin. Commonly, these chips have many hundreds of pins, so testers cost many millions of dollars. The current path of test equipment evolution cannot be continued. Testing 100-million-transistor designs through 200 access pins with an 800-MHz clock creates a burden too heavy for ATE providers to bear.
With new process technologies becoming available every 18 months, new testers are required in the same time frame for the high-end devices. These facts make tester time very expensive; in fact, the cost of tester time dwarfs the one-time costs of developing the tester program and fabricating the test jig.
There are only a few strategies for reducing the cost of test:
- Test less. This is not viable because the further bad devices get into the system assembly process, the more expensive the failure becomes.
- Test more efficiently: apply methods that reduce the time it takes to apply a set of tests.
- Test differently: use alternative test types and strategies that cost less.
- Lower the cost of testers. This is everybody's favorite answer, and there is actually some hope it might come to pass.
A number of interesting approaches have recently been deployed. But there are many contributing factors to consider when trying to understand the increasing complexity of testing today's SoC designs, and projecting how to test the next generation.
The first factor is the nature of the new designs themselves. They have more clocks, with more clock rates. Clocks are gated in new ways, which wreaks havoc with tools. There are a lot more memories of different types that are "buried" in the chip logic, making access more difficult. Embedded cores and new, heterogeneous process features are all adding to the complexity of device test.
Wafers are getting larger, and consequently there is more process variation across a single die. Performance testing (for example, for delay faults) is further complicated by this growing process variation. Critical and near-critical paths predicted prior to fabrication may not be performance-limiting paths because of the distribution of process variation. Also, there is a trend toward fewer levels of logic between registers; this means that more near-critical paths will be more susceptible to process variation.
Another trend is toward a higher percentage of the masks being related to interconnect. The complexity of six to seven layers of metal arranged in complex 3-D structures with more than 4 kilometers of wire length on a chip is tremendous.
The higher interconnect density has led to a higher percentage of interconnect-related failures. This is significant because interconnect failures often do not manifest themselves in the way assumed by many of the test-related tools and their associated fault models.
Two basic strategies are used for identifying defects created during the manufacturing process. The first method, called functional testing, creates tests that simulate expected functional stimulus and response of a design. The idea is to see whether or not a chip is performing its intended functions. While this sounds reasonable, it is simply not practical to test each chip to see if it can perform all of its functions, and even that would not guarantee finding all types of failures.
The second method of testing, called structural testing, directs its efforts at determining whether or not a particular physical failure has occurred. The failures typically targeted by structural testing are described by the so-called stuck-at fault models, which model interconnect wires being shorted to power or ground.
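The stuck-at idea can be illustrated with a minimal Python sketch of single stuck-at fault simulation. The three-input circuit, its net names and the helper functions are hypothetical and chosen only for illustration; production fault simulators are far more elaborate.

```python
# Toy single stuck-at fault simulation for y = (a AND b) OR c.
# The circuit and all names here are hypothetical illustrations.

NETS = ("a", "b", "c", "n1", "y")


def simulate(a, b, c, fault=None):
    """Evaluate the circuit; `fault` is (net, stuck_value) or None."""
    def force(net, value):
        # A stuck-at fault pins one net to 0 (ground) or 1 (power).
        if fault is not None and fault[0] == net:
            return fault[1]
        return value

    a, b, c = force("a", a), force("b", b), force("c", c)
    n1 = force("n1", a & b)
    return force("y", n1 | c)


def all_faults():
    """Every net stuck at 0 and stuck at 1."""
    return [(net, value) for net in NETS for value in (0, 1)]


def fault_coverage(patterns):
    """Fraction of faults whose output differs from the good circuit."""
    detected = set()
    for fault in all_faults():
        for pattern in patterns:
            if simulate(*pattern) != simulate(*pattern, fault=fault):
                detected.add(fault)
                break
    return len(detected) / len(all_faults()), detected


if __name__ == "__main__":
    patterns = [(1, 1, 0), (0, 1, 0), (1, 0, 0), (0, 0, 1)]
    coverage, _ = fault_coverage(patterns)
    print(f"{len(patterns)} patterns, stuck-at coverage {coverage:.0%}")
```

In this toy case, four patterns detect every modeled fault, while exhaustively exercising the function would take all eight input combinations. The gap between the two grows rapidly with circuit size, which is part of the appeal of the structural approach.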
Structural testing has its limitations as well. The current collections of standard fault models do not adequately cover the growing proportion of interconnect-related defects. One of the better qualities of structural testing is that the associated ATE can be significantly less expensive than its full-blown, general-purpose counterparts.
Recently, two factors have become clear. Functional testing is too expensive to use alone as a test strategy. And, improvements in design-for-test are needed to realize its full potential and to completely win over designers. With the myriad of new issues that must be addressed in SoC design, it is clear that wherever automation is possible, it ought to be leveraged.
Most designers are aware of design-for-test (DFT) and built-in-self-test (BIST), and most use some form of the two for at least portions of their designs.
What has not happened so far is the optimization of those techniques for the purpose of reducing the test time needed to achieve a desired defect level. In other words, DFT needs to be optimized to reduce test cost. There is some promising work related to test pattern compression that addresses the central test time issue. There is also a growing number of sources for DFT automation that provide adequate-quality solutions.
From the design side of the test equation, DFT techniques look to be the clear winner in the near and somewhat more distant future. At this point, it seems that all of the designer's eggs are in the DFT basket. This is good because consensus like that helps to drive product entries into the marketplace.
The ATE industry needs a clear focus on the "testing economy." One of the bright spots in the morass of the test issue is that a number of ATE products have been announced for structural testing, built specifically to support the tester requirements of DFT and BIST. The advantage of this focused approach is that these testers can be sold for significantly less than their general-purpose ATE brethren. The vendors project pricing in the $300 to $1,000 per-pin range, compared with $2,000 to $9,000 for full-function testers.
The disaggregation of tester functionality into a series of "lighter-weight" products is analogous to the shift from mainframe computing to workstations. In this case, however, there will not be the same order of magnitude of economies of scale.
Tools will play a critical role in closing the test gap, and tools can help designers get results. By creating more-optimal solutions, tools leave much more margin to absorb the performance and area costs associated with DFT. Microarchitectures that work well without DFT structures in place will not always work once those structures are added. Working from a higher level of abstraction offers designers the ability to create microarchitecture alternatives.