Design Article

Complex SoC Testing with a Core-Based DFT Strategy

Sandeep Kaushik, Synopsys and Paul Policke, Qualcomm

2/26/2008 10:08 AM EST

As process technology scales and design sizes grow, power consumption during test and test data volume have increased dramatically, making it almost impossible to test an entire design once it reaches manufacturing. A core-based test strategy combined with scan compression offers one of the most effective ways to limit both the huge data volumes and the high power consumption of complex SoC tests.

Traditional scan-based test techniques are losing ground against today's SoC designs. The growth in chip size and in the number of scan flip-flops translates into an overwhelming increase in both the number of automatic test pattern generation (ATPG) patterns and the number of shift cycles per ATPG pattern. Adding delay testing to the scan architecture increases the number of ATPG patterns again, which places even greater demands on automatic test equipment (ATE) memory.
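The scaling described above can be made concrete with a rough estimate. The sketch below assumes the tester stores one stimulus bit and one expected-response bit per scan chain per shift cycle; the function names and all design numbers are hypothetical and are not taken from the Qualcomm core.

```python
def scan_shift_cycles(patterns: int, longest_chain: int) -> int:
    """Shift cycles needed to load every pattern (unloading overlaps the next load)."""
    return patterns * longest_chain


def scan_data_bits(patterns: int, longest_chain: int, scan_chains: int) -> int:
    """Stimulus plus expected-response bits the tester must store (simplified)."""
    return 2 * scan_chains * scan_shift_cycles(patterns, longest_chain)


# Hypothetical design: 16 scan chains, 3,000 flops on the longest chain.
stuck_at_only = scan_data_bits(patterns=10_000, longest_chain=3_000, scan_chains=16)
# Adding a hypothetical 20,000 delay-test patterns triples the pattern count.
with_delay = scan_data_bits(patterns=30_000, longest_chain=3_000, scan_chains=16)

print(f"stuck-at only:    {stuck_at_only / 1e6:.0f} Mbit")
print(f"with delay tests: {with_delay / 1e6:.0f} Mbit")
```

Because the estimate is a product of pattern count and chain length, growth in either one multiplies the ATE memory requirement.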

Power consumption during test has also been increasing due to the tremendous switching activity of ATPG patterns and leaky processes. High dynamic power during scan shifting and capture can burn the device, while high instantaneous power can lead to excessive IR drop and ultimately device failure.

A core-based divide-and-conquer approach helps to overcome the high power consumption and huge data volume generated during testing. This article describes the results achieved by Qualcomm, with the help of Synopsys Professional Services, using a multi-mode test architecture on its 65nm DSP core; DFT MAX was used for scan compression, and DFT Compiler was used for core-isolation implementation.

Multi-mode approach

In a core-based test strategy, the design is partitioned into reasonably sized cores that can be tested independently. At the top level, a test schedule is created based on the I/Os available for test and the cores' power consumption. This strategy enables a DFT approach, based on the IEEE 1450 and IEEE 1500 standards, that avoids power problems and can also help reduce test costs while improving test quality.
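One way to picture such a test schedule is as a packing problem: cores are grouped into test sessions so that no session exceeds the available test I/Os or the power budget. The following is a minimal first-fit sketch, not the scheduling method used by the authors. The 17-pin figure for the DSP comes from the article; every other core name, pin count, power figure and budget is hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CoreTest:
    name: str
    scan_pins: int   # test I/Os the core needs while it is under test
    power_mw: int    # estimated test power (hypothetical values below)


def schedule(cores, max_pins, max_power_mw):
    """Group cores into sessions; cores in one session are tested concurrently."""
    sessions = []
    # Place the most power-hungry cores first (first-fit decreasing).
    for core in sorted(cores, key=lambda c: c.power_mw, reverse=True):
        if core.scan_pins > max_pins or core.power_mw > max_power_mw:
            raise ValueError(f"{core.name} cannot fit in any session")
        for session in sessions:
            pins = sum(c.scan_pins for c in session) + core.scan_pins
            power = sum(c.power_mw for c in session) + core.power_mw
            if pins <= max_pins and power <= max_power_mw:
                session.append(core)
                break
        else:
            sessions.append([core])  # no existing session has room
    return sessions


cores = [
    CoreTest("dsp", 17, 400),
    CoreTest("modem", 20, 350),
    CoreTest("gpu", 10, 300),
    CoreTest("audio", 8, 100),
]
plan = schedule(cores, max_pins=40, max_power_mw=700)
for number, session in enumerate(plan, start=1):
    print(f"session {number}: {[core.name for core in session]}")
```

A production schedule would also weigh test time per core; this sketch only enforces the two constraints named in the text.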

Each core is tested independently and therefore requires test wrapper cells that isolate the core from the rest of the design and provide test access at the core's I/Os. The test wrapper operates in various modes under the control of a test controller. The wrapper cell shown in Figure 1 has been customized from the default DFT Compiler wrapper cell with additional overrides that control the wrapper modes. The modes supported by this wrapper include:

  • INTEST (wrp_if) mode — For testing the core logic. Wrapper cells on the input side isolate the core from capturing data from outside, and the input wrapper cells capture the scan data. Wrapper cells on the output side capture data from the core.
  • EXTEST (wrp_of) mode — For testing top-level user-defined logic (UDL). The wrapper cells on the core inputs capture data from the UDL, while the wrapper cells on the output side isolate the UDL from capturing data from the core and also launch test data into the UDL. During capture, the input wrapper cells capture the scan data from the UDL.
  • Internal_scan mode — For supporting traditional top-level test. The wrapper cells do not provide isolation but are treated like internal scan chains of the core, and the wrapper cells on input and output are transparent as in functional mode. This mode helps to test paths through the core I/Os while keeping all the core scan chains (including wrapper scan chains) intact. An additional override signal added to the default DFT Compiler cell helps make the cell transparent while keeping the wrapper chains intact in this mode.
  • Mission mode — For supporting the design's functional mode. The wrapper cells are disabled.

Figure 1. Custom wrapper for each core to be tested independently.
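The wrapper modes listed above can be summarized in a small behavioral model. This is a simplified sketch rather than the actual DFT Compiler wrapper cell: the class and method names are hypothetical, and the capture rules (isolating cells hold their scan data, mission-mode cells are disabled, all other cells capture the functional value) are an assumption read from the mode descriptions.

```python
from enum import Enum


class WrapperMode(Enum):
    INTEST = "wrp_if"
    EXTEST = "wrp_of"
    INTERNAL_SCAN = "internal_scan"
    MISSION = "mission"


class WrapperCell:
    """Behavioral stand-in for one wrapper cell on a core input or output."""

    def __init__(self, on_core_input: bool, state: int = 0):
        self.on_core_input = on_core_input
        self.state = state  # value held in the cell's scan flop

    def _isolates(self, mode: WrapperMode) -> bool:
        # Input cells isolate in INTEST; output cells isolate in EXTEST.
        if mode is WrapperMode.INTEST:
            return self.on_core_input
        if mode is WrapperMode.EXTEST:
            return not self.on_core_input
        return False  # transparent in INTERNAL_SCAN and MISSION

    def drive(self, mode: WrapperMode, functional_in: int) -> int:
        """Value the cell presents downstream (to the core or to the UDL)."""
        return self.state if self._isolates(mode) else functional_in

    def capture(self, mode: WrapperMode, functional_in: int) -> None:
        """Capture-clock behavior under the assumptions stated above."""
        if mode is WrapperMode.MISSION or self._isolates(mode):
            return  # disabled, or holding the scanned-in test data
        self.state = functional_in


cell_in = WrapperCell(on_core_input=True, state=1)
print(cell_in.drive(WrapperMode.INTEST, functional_in=0))   # core sees the scanned-in value
print(cell_in.drive(WrapperMode.MISSION, functional_in=0))  # cell is transparent
```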

Along with these wrapper modes, the test strategy uses several different test modes:

  • Scan compressed mode — For testing a core. The wrapper chains are part of the scan-compression internal scan channels and are driven by the scan compression logic. For this test mode (Figure 2), wrapper cells are configured in their INTEST (wrp_if) mode.

    Figure 2. Scan compressed mode for testing a core (design blocks not to scale).

  • Reconfigured scan mode — This design has two reconfigured test modes for uncompressed scan (Figure 3). The reconfigured scan mode with 17-pin scan chain interface is the default mode created as part of scan compression insertion by DFT Compiler. The second re-configured scan mode has a 90-pin scan chain interface and suits this core for designs in which a top-level scan architecture of the whole design is used for test. Wrapper chains are also part of the re-configured scan chains. The wrapper cells are configured in their INTEST (wrp_if) mode.

    Figure 3. Reconfigured test modes for uncompressed scan.

  • EXTEST mode — For testing UDL. The wrapper chains are used while the scan-enable and test clocks for other core internal scan chains are gated off. In this test mode, the wrapper cells are configured in their EXTEST (wrp_of) mode (Figure 4).

    Figure 4. EXTEST mode for testing user-defined logic.

To reduce the test data volume and the test cost, the cores can be tested using scan compression based on adaptive scan technology. The wrapper chains are configured (in INTEST mode) as internal scan channels of the scan compression logic. To provide a test access mechanism for the cores, Qualcomm developed a custom IEEE 1500 Core JTAG Interface (CJI).

DSP example

The Qualcomm DSP core provides a good example of how to implement the core-based test strategy. With a total of about 5 million transistors, the DSP has approximately 56K scan flops arranged in 17 scan channels.

The design team for this chip developed a fully automated DFT insertion flow and integrated it with the Synopsys Pilot Design Environment. DFT insertion was done using DFT Compiler, whose wrapper-insertion features handled core isolation. DFT MAX performed scan compression. Most of the steps in the implementation flow (Figure 5) are based on automated features of DFT MAX and DFT Compiler, although the wrapper customizations require low-level Design Compiler (DC) commands, which were automated using a design-specific script.

The core uses a hierarchical physical implementation flow. To ensure that core blocks can be designed in parallel, the DFT insertion flow was also done hierarchically. A balanced scan chain architecture was created based on the number of available I/Os (17), the number of scan flops, the scan compression ratio (10X) and the number of test clock domains (2). The scan chain architecture allowed mixing of clock edges but not of clock domains. Also, the scan chains from two core design blocks (A and D) come out as is for scan compression insertion instead of being merged with scan chains from the rest of the core logic.
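A balanced architecture that never mixes clock domains can be sketched as a small allocation problem: give every domain its own chains, then keep handing the next chain to whichever domain currently has the longest chains. The code below is an illustrative sketch, not the DFT Compiler algorithm. The 204-chain count is the macro-level figure reported in the article, while the domain names and the split of flops between the two domains are hypothetical.

```python
from __future__ import annotations

import math


def allocate_chains(
    flops_per_domain: dict[str, int], total_chains: int
) -> dict[str, tuple[int, int]]:
    """Split `total_chains` across clock domains without mixing domains in a chain.

    Flops on both clock edges of a domain may share chains, as in the article.
    Returns {domain: (chain_count, longest_chain_length)}.
    """
    if total_chains < len(flops_per_domain):
        raise ValueError("need at least one chain per clock domain")

    chains = {domain: 1 for domain in flops_per_domain}

    def longest(domain: str) -> int:
        return math.ceil(flops_per_domain[domain] / chains[domain])

    # Hand each remaining chain to the domain whose chains are currently longest.
    for _ in range(total_chains - len(chains)):
        chains[max(chains, key=longest)] += 1

    return {domain: (chains[domain], longest(domain)) for domain in chains}


# 204 chains is the macro-level count from the article; the flop split is hypothetical.
plan = allocate_chains({"clk_a": 40_000, "clk_b": 16_000}, total_chains=204)
for domain, (count, length) in plan.items():
    print(f"{domain}: {count} chains, longest {length} flops")
```

Keeping the domains separate can cost a flop or two of chain length compared with a fully mixed architecture, which is usually a small price for avoiding cross-domain shift timing issues.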


Figure 5. DFT insertion flow.

Table 1 summarizes the scan architecture at different levels of hierarchy where DFT insertion was performed.

Design Block   Scan Chains   Longest Chain Length
A              54            275
D              126           275
Macro          204           275
Top Level*     17/17/90/1    275/3563/822/1161

Table 1: DSP Core Scan Architecture.
*Note: Top-level numbers are for the different test modes: scan-compressed chains / reconfigured scan-mode chains / scan chains in expanded reconfigured scan mode / wrapper chains in EXTEST (wrp_of) test mode.
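The figures in Table 1 lend themselves to a quick sanity check. The sketch below copies the chain counts and longest chain lengths from the table; the dictionary keys and function names are hypothetical labels. Chains multiplied by the longest chain length gives an upper bound on the scan cells in a mode, and the ratio of longest chain lengths shows how many more shift cycles per pattern an uncompressed mode needs. That ratio is not the same thing as the data compression actually achieved, which also depends on pattern counts.

```python
# (scan chains, longest chain length) per top-level test mode, copied from Table 1.
TOP_LEVEL_MODES = {
    "scan_compressed": (17, 275),
    "reconfigured_17_pin": (17, 3563),
    "reconfigured_90_pin": (90, 822),
    "extest_wrapper": (1, 1161),
}
MACRO = (204, 275)  # macro-level row of Table 1


def max_scan_cells(chains: int, longest_chain: int) -> int:
    """Upper bound on scan cells: every chain as long as the longest one."""
    return chains * longest_chain


def shift_depth_ratio(mode: str, baseline: str = "scan_compressed") -> float:
    """Shift cycles per pattern in `mode` relative to `baseline`."""
    return TOP_LEVEL_MODES[mode][1] / TOP_LEVEL_MODES[baseline][1]


print("macro-level upper bound:", max_scan_cells(*MACRO), "scan cells")
for mode in ("reconfigured_17_pin", "reconfigured_90_pin"):
    print(f"{mode}: {shift_depth_ratio(mode):.2f}x the compressed-mode shift depth")
```

The macro-level bound of 204 x 275 is consistent with the roughly 56K scan flops quoted for the DSP.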

Core design blocks A and D were scan-inserted first. Scan insertion was done in Physical Compiler to avoid any re-ordering flow for these design blocks. Scan and wrapper insertion was then done at the macro level. A multi-mode DFT insertion was used to achieve the required number of scan channels and wrapper chains in the respective test modes. The DFT insertion was done in the logical domain. Here is the multi-mode definition and scan configuration at the macro level:

Code Example 1

Once the wrapper and scan chains were inserted, the DFT team used a custom script to customize the wrapper cells and created a new CTL (Core Test Language) model of the macro level of the design for scan compression insertion. The new CTL model was created using the DFT Compiler scan extraction flow, treating wrapper chains like any other internal scan chains. Finally, scan compression insertion was done at the top level using the new macro CTL model. Again, multi-mode scan insertion was used to implement the required test modes. All the test control signals were connected to the Core JTAG Interface (CJI) in RTL. The internal-pin feature of DFT Compiler was used to define the CJI outputs as control signals. Here is the multi-mode definition and scan configuration at the top level:

Code Example 2

Implementation challenges

The Qualcomm DSP core uses many custom circuits, especially for register files and memories. A great deal of DFT planning and time was spent making these macros DFT compliant. Unlike the rest of the standard logic, most of these custom circuits use latches instead of flops. Most of these latches were made scannable by adding parasitic scan latches. DFT logic was inserted to provide a memory bypass and a write-through mode so that the shadow logic is testable.

The team also took extensive care to reduce the unknown captures (Xs) in the design during test. Most of these unknowns were around the memory elements, and were avoided either by gating off clocks during capture or by gating the scan-enable signals of the cells capturing from the X generators.

For delay testing, any timing-related Xs generated by timing exceptions were avoided by gating the required capture clocks. Also, the test wrapper cells used for core isolation were customized to launch during launch-on-capture-based transition delay tests.

In functional mode, all the parasitic scan latches and most of the scan nets were gated off to reduce the power consumed by the test logic.

An on-chip programmable clock control was also designed to generate a maximum of seven capture pulses from an on-chip PLL. The logic lets TetraMAX® ATPG control every capture pulse on a per-pattern basis. The PLL and on-chip clock control for this core were part of the top-level clock control logic and were placed outside the core boundary.

Conclusion

Table 2 shows the area used by the DFT logic compared to the total standard logic in the design. Note that the design's total standard logic area does not include the custom macro area. Additionally, the wrapper cells used for this design had a safe mode (not shown in Figure 1) for putting the core in a known state when it is not being tested.

Design Block       Target   Percentage of Standard Logic Area
Wrapper Cells      1161     0.83
Scan Compression   10X      0.15

Table 2: DFT Logic Area

Table 3 summarizes the scan compression ATPG results for the DSP core using a 10X compression ratio. Across stuck-at and transition delay ATPG combined, a total compression of 11.60X was achieved compared with standard scan ATPG.

ATPG               Test Coverage      Scan Compression Achieved
Stuck-at           Greater than 97%   12.36X
Transition Delay   Greater than 90%   11.45X

Table 3: ATPG Summary
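The article does not say how the combined 11.60X figure was derived from the two rows of Table 3. One plausible reading, stated here as an assumption, is that it is the ratio of total standard-scan data to total compressed data. Under that assumption the combined ratio is a weighted average of the two per-test ratios, weighted by each test's share of the compressed data volume, and the reported numbers imply what that share must have been. The function names below are hypothetical.

```python
def combined_compression(ratio_a: float, ratio_b: float, share_a: float) -> float:
    """Overall compression when test A contributes `share_a` of the compressed data.

    Derivation: (ratio_a * Ca + ratio_b * Cb) / (Ca + Cb) with share_a = Ca / (Ca + Cb).
    """
    return share_a * ratio_a + (1.0 - share_a) * ratio_b


def implied_share(total: float, ratio_a: float, ratio_b: float) -> float:
    """Share of compressed data from test A that reproduces the reported total."""
    return (total - ratio_b) / (ratio_a - ratio_b)


STUCK_AT, TRANSITION, TOTAL = 12.36, 11.45, 11.60  # values from Table 3 and the text

share = implied_share(TOTAL, STUCK_AT, TRANSITION)
print(f"implied stuck-at share of compressed data: {share:.1%}")
print(f"check: {combined_compression(STUCK_AT, TRANSITION, share):.2f}X overall")
```

Read this way, the total sits close to the transition delay figure because transition delay patterns would account for most of the compressed data, which fits the earlier observation that delay testing adds heavily to the pattern count.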

Thus, the proposed core-based test strategy with scan compression achieves excellent test coverage with only a small area penalty, while limiting the test scope to individual cores. As a result, the strategy greatly reduces both power consumption during test and test data volume.

About the Authors:

Sandeep Kaushik is a Staff Design Consultant at Synopsys, Inc. He holds an MSEE degree from Stanford University and a B.Tech in EE from the Indian Institute of Technology, Delhi. Sandeep may be contacted at: kaushik@synopsys.com
Paul Policke is a Staff Engineer at Qualcomm, Inc. He has a Bachelor of Science degree in Physics and Mathematics from Lenoir-Rhyne College, a Master of Science degree in Physics from the University of North Carolina at Greensboro and a Master of Engineering degree from the University of North Carolina at Charlotte. Paul can be reached at: ppolicke@qualcomm.com

