While the increasing use of design intellectual property (IP) has considerably reduced the design effort per gate for the chip designer, it has had the opposite effect on chip-level integration and functional verification effort. IP verification and correct integration have become a dominant source of effort and risk in system-on-chip (SoC) projects.
In this article, we explain how complete formal functional verification enables us, as IP providers, to certify the highest IP quality, and to do so cost-effectively, with a productivity of 2,000 to 4,000 lines of verified RTL code per engineer-month. The resulting IP quality significantly reduces the IP integrator's effort, cost and risk. Such results have the potential to fundamentally change the proliferation rate of IP and the profitability of the IP business.
IP Development and Integration: The Effort and Risk
With the growing use of IP as the main building blocks in SoC design, SoC design quality is increasingly determined by IP quality. Unfortunately, there are no measurable criteria or standards to enable fast and objective IP quality assessment. Consequently, IP integrators are obliged to resort to evidence such as the IP provider's reputation, or to restrict IP use to that which has already been successfully integrated into many SoC designs. Any IP integrator who wishes to have a more analytical assessment must undertake time-consuming reviews and inspections of the IP provider's verification process and results for the specific block.
Most redesigns and respins are caused by functional errors in the modules and IP, and by the interaction between them. The lack of objective metrics for module and IP quality is a primary effort sink for IP providers and IP integrators trying to control the resulting risks. Nonetheless, all too often, this considerable effort still results in error escapes, costly respins and delayed time-to-market.
Delivering and integrating the highest quality IP is gradually becoming a condition for staying in business for both providers and integrators.
The Needs of the IP Integrator
The IP integrator needs IP that operates error-free and exactly to specification, together with an objective certification to this effect; and a precise description of the conditions under which the IP can be reliably integrated into a chip, thus ensuring and speeding its correct integration. The degree to which the IP meets (or fails to meet) the integrator's requirements determines the risk of using the IP.
The IP Provider's Challenge
The primary challenge for IP providers is to meet the IP integrator's requirements, and to do so with a productivity high enough to meet the IP's time-to-market and profitability objectives.
But how can you do this? How can you ensure error-free operation using coverage-driven verification methodologies that are inherently incapable of testing all possible functional scenarios under all possible conditions? How can you give the IP integrator an objective certification of the IP's quality? And how can you rigorously define the IP's integration conditions if the verification methodology examines only a subset of the possible operating conditions and scenarios?
This is how we do it in the Intellectual Property & Reuse department (IPR) of Infineon's Communication Solutions business group (COM).
The IP Provider
IPR supplies COM's worldwide product groups with in-house and third-party IP components. The department's comprehensive IP reuse strategy comprises building its own IP, qualifying third-party IP, and qualifying and packaging existing blocks from Infineon's business units for safe and broad reuse. IPR must ensure the IP's right-first-time functional operation and early availability. For this purpose, IPR employs a mix of advanced verification approaches, including testbench automation and formal verification techniques.
IP Example: A Network Processor
IPR developed a network processor, the PPv2, to meet the demanding throughput requirements of COM's wire-line chips across a wide range of applications. The PPv2 is a compact, high-performance, configurable 32-bit RISC processor with an application-specific set of 40 instructions, a seven-stage pipeline and fine-grained multithreading (see Figure 1).
Figure 1. The pipeline and partitioning of the PPv2.
The PPv2 was designed from scratch while retaining full backwards compatibility with its predecessor, the PPv1. The PPv2 supports multithreaded execution with up to four contexts, which can be switched with no constraints, overhead or response delay. This feature enables up to four "virtual machines" to execute on the same architecture, so that one machine can execute while another is stalled, for instance while waiting for a response from the periphery.
Contexts can be arbitrarily switched and/or restarted by both external events and dedicated instructions. In addition to multiple branch instruction types, the context restart can also be employed as a branch. Branches are categorized as "short" branches, which decide whether or not to branch based on the value of status flags, and "long" branches, which make a decision after first performing a more complex operation such as a decrement with a subsequent register value test.
Every branch instruction inherently incurs a time delay penalty, that is, a number of cycles during which the processor executes no instructions. To increase the processor's performance, the PPv2 architecture supports configurable "delay slots" for each branch instruction. However, the use of delay slots tends to increase program size, so the PPv2 offers a trade-off between performance and code volume by optimizing the use of branch instructions and delay slots. Moreover, the PPv2 micro-architecture has to resolve the complicated conflicts that arise from operations under external exceptions and/or operations in the delay slots. Such conflict resolution must be executed while retaining the consistency of the programmer's view and at no cycle cost. The resolution logic deploys quite sophisticated structures to dynamically buffer and reproduce instructions and program addresses, that is, it resolves conflicts "on the fly" while the pipeline continues to execute.
The whole design amounts to roughly 11,000 lines of VHDL code, but this is a poor metric of its operational complexity, of the challenge of verifying the design, and of the risks of not doing so thoroughly. Because the PPv2 is a programmable IP, any corner-case malfunction that remains undetected during verification has the potential to jeopardize the hardware and firmware development in subsequent reuse, together with the processor's software development tool chain, which was developed concurrently with the hardware.
The IP Verification Challenge
The verification objectives were broadly similar to those of any processor verification. On the hardware side, we had to ensure the correct pipelined processing of multiple instructions, guaranteeing no undesired interference between instructions, and ensure the correct operation of permissible but unpredictable behaviors such as traps and interrupts. In addition, we had to comprehensively verify data paths with complex bit manipulations, and ensure independent execution of multiple threads under all possible combinations of instructions, thread switches, traps and interrupts.
To simplify and ease software development, we had to guarantee that pipelining, together with related forwarding and stalling behavior, was transparent to the software programmer. Of the 40 instructions, six are three-register instructions, eleven are two-register instructions, thirteen are one-register instructions, and ten are no-register instructions. For one pipeline stage, there were thus $6 \cdot 16^3 + 11 \cdot 16^2 + 13 \cdot 16 + 10 = 27{,}610$ scenarios to verify. Given that four pipeline stages are used in forwarding, and that each stage can have a different context, this produces a total of $(27{,}610^4)^4 = 27{,}610^{16} \approx 1.14 \times 10^{71}$ scenarios, and this does not even take data values into account.
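The scenario count above can be reproduced with a few lines of Python. This is only a sketch of the arithmetic in the text: the constant and function names are ours, and the register count of 16 is an assumption inferred from the base 16 that appears in the formula.

```python
# Number of register operands -> number of PPv2 instructions in that class,
# as listed in the text (6 + 11 + 13 + 10 = 40 instructions).
INSTRUCTIONS_BY_REGISTER_OPERANDS = {3: 6, 2: 11, 1: 13, 0: 10}

# Assumption: 16 addressable registers per operand, inferred from the formula.
REGISTERS = 16

# From the text: four pipeline stages take part in forwarding,
# and each stage can belong to a different one of four contexts.
FORWARDING_STAGES = 4
CONTEXTS = 4


def scenarios_per_stage():
    """Instruction/register combinations that can occupy one pipeline stage."""
    return sum(
        count * REGISTERS ** operands
        for operands, count in INSTRUCTIONS_BY_REGISTER_OPERANDS.items()
    )


def total_scenarios():
    """Total count as computed in the text: (per_stage ** stages) ** contexts."""
    return (scenarios_per_stage() ** FORWARDING_STAGES) ** CONTEXTS


if __name__ == "__main__":
    print(scenarios_per_stage())        # 27610
    print(f"{total_scenarios():.2e}")   # 1.14e+71
```

Python integers have arbitrary precision, so the total is computed exactly and only rounded when it is formatted for display.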
Moreover, we had to ensure that there was no branch configuration conflict that could "break" the programmer's view. Each branch instruction can be configured to have 0, 1, 2 or 3 delay slots. With twelve branch instructions (long and short), that yields $4^{12} = 16{,}777{,}216$, or nearly 16.8 million, different scenarios to be verified.
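To make the size of this configuration space concrete, the following sketch counts the delay-slot configurations and lazily enumerates the first few. The names are illustrative only; a configuration is modeled here simply as a tuple holding one slot count per branch instruction, which is our simplification and not the PPv2's actual configuration format.

```python
from itertools import islice, product

# From the text: each branch instruction can have 0, 1, 2 or 3 delay slots,
# and there are twelve branch instructions (long and short).
SLOT_CHOICES = (0, 1, 2, 3)
BRANCH_INSTRUCTIONS = 12


def count_configurations():
    """Number of distinct delay-slot configurations: 4 ** 12."""
    return len(SLOT_CHOICES) ** BRANCH_INSTRUCTIONS


def first_configurations(n):
    """Lazily enumerate the first n configurations without building all of them."""
    all_configs = product(SLOT_CHOICES, repeat=BRANCH_INSTRUCTIONS)
    return list(islice(all_configs, n))


if __name__ == "__main__":
    print(count_configurations())  # 16777216
    for config in first_configurations(3):
        print(config)
```

Each of these configurations would still have to be combined with every relevant instruction sequence, which is why the count is a lower bound on the verification effort rather than the effort itself.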
In addition to these challenges, we had a specification of 130 pages that, like most specifications, contained ambiguities. This specification had to be completed and clarified throughout verification.
Finally, we had to precisely describe the integration conditions that must be met by any hardware/software environment into which the PPv2 is integrated, to ensure the processor's correct operation.
Given the huge number of scenarios arising from the combination of configurability and context switching, we deemed it impossible to thoroughly verify PPv2 using simulation. Simulation would have necessitated choosing a representative subset of scenarios to verify, resulting in verification gaps and possible error escapes. Choosing not to verify parts of the design was clearly not an option. We needed a complete verification.
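A rough back-of-the-envelope calculation illustrates why exhaustive simulation was out of the question. The sketch below uses the forwarding scenario count from the text; the simulation rate is a purely hypothetical and deliberately generous assumption, not a measured figure from our project.

```python
SECONDS_PER_YEAR = 365 * 24 * 60 * 60

# Forwarding scenario count from the text (about 1.14 * 10^71).
FORWARDING_SCENARIOS = 27_610 ** 16

# HYPOTHETICAL assumption: one billion scenarios simulated per second.
# Real RTL simulation is far slower; the figure is chosen only to be generous.
ASSUMED_SCENARIOS_PER_SECOND = 10 ** 9


def years_to_simulate(scenarios, scenarios_per_second):
    """Years needed to run every scenario once at the given rate."""
    return scenarios / (scenarios_per_second * SECONDS_PER_YEAR)


if __name__ == "__main__":
    years = years_to_simulate(FORWARDING_SCENARIOS, ASSUMED_SCENARIOS_PER_SECOND)
    print(f"{years:.1e} years")
```

Even under this optimistic assumption the result exceeds the age of the universe by many orders of magnitude, so any simulation-based approach can only ever sample the scenario space.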
We had employed a combination of simulation and formal verification to verify the PPv1, and found that the completeness and productivity of the OneSpin 360 Module Verifier delivered superior bug-detection effectiveness and efficiency. Consequently, we decided to verify the PPv2 using only this formal verification solution.