This article discusses the non-trivial challenge of detecting and correcting the – often elusive – functional defects that unavoidably arise in the design of complex system-on-chip (SoC) devices. How do we mitigate the conflict between the dramatic increase in SoC design complexity and the need to deliver the design in a shorter time with the same or better design quality?
Clearly, we need new design and verification methods, and we need “all hands on deck” to develop them. That’s why a consortium of six companies and six research institutes set up the Herkules project, with support from the German government. Over the past three years, the project brought together design and verification engineers from leading chip companies, developers of commercial verification tools from EDA companies, and leading technology research institutes. Their goal was to develop a right-first-time verification approach for large digital and mixed-signal designs – and to ensure that it is widely applicable to the development of automotive and telecommunication systems that must comply with very high quality standards.
And why is Melexis interested in these verification issues? We develop and produce a broad range of mixed-signal, high-voltage ASICs and ASSPs for automotive applications, increasingly equipped with integrated flash and one or more microcontrollers, a local interconnect network (LIN) physical layer or a complete LIN controller, and controller area network (CAN) components. In common with most providers of complex SoCs, we implement a re-usable intellectual property (IP) methodology to speed time to market and reduce development costs.
For us, it is of vital strategic importance that this IP is free of error and malfunction, and that we achieve this quality as early in the design flow as possible. A thoroughly simulated IP block with extensive verification coverage may sound good. But integrate several such IP blocks, and experience from many past projects shows that the probability that the ensemble will fail because of undetected bugs increases significantly. In the automotive world, the consequence could be extremely costly recalls – or worse.
The challenges addressed by the Herkules project were: How can we achieve the confirmed absence of functional errors in automotive and communications IP well before tape-out? How do we know with certainty that we’ve detected and corrected every error? How do we know that our specifications are complete and free of ambiguities?
In this article we address the technical and business implications of failing to adequately answer these questions, and cover the essential ingredients of a verification approach – newly developed within the Herkules project – that delivers the requisite certainty.
Verification holes – The implications
Every designer knows the situation. The design team finishes the design of the “perfect” chip – perfect in its view, anyway – and starts full-blown development on the next design. Months later, the first problems surface – problems that the team thinks shouldn’t even exist. After all, it’s an experienced team, which has simulated the design with state-of-the-art methods, with test patterns so crammed with coverage that the return rate should be nearly negative. And still customer returns show up. But why? Is it a defective chip? Is the customer applying the chip incorrectly? Is there a use case that violates the specification? Or is it perhaps a combination of any or all of these?
This nasty surprise has become an industry dilemma. End-user applications demand ever more on-chip processing performance, with ever more cost-effective wafers, in ever smaller process technologies, loaded with ever larger memories, and integrating ever more functionality – additional functionality that previously often occupied an entire discrete chip of its own.
And it’s not enough that the chip looks increasingly like a motherboard. To service different application requirements, the chip often integrates entire groups of functionality with reconfigurable implementations – to simplify supply logistics and, above all, to reduce development costs. But this nesting of complex functionality is not the only source of exploding complexity. In comparison to a multi-million-gate data processing design with high levels of parallelism, a multi-function chip with a much lower gate count can actually be more complex. For example, it may have entire register banks, every single bit of which can control any of the multifarious functions that the chip is required to fulfill. So, the number and intricacy of internal dependencies are huge (see figure 1) – and the likelihood of missed bugs is high.
Figure 1: This “small control flow” illustrates the intricacy of internal dependencies
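To give a rough feel for why configuration registers drive complexity faster than gate count does, the following back-of-the-envelope sketch counts the settings a bank of control bits can take. It is an illustration only: the function names, the register widths, and the assumed simulation throughput of one million configurations per second are hypothetical and not figures from the Herkules project.

```python
SECONDS_PER_YEAR = 365 * 24 * 3600


def configuration_count(control_bits: int) -> int:
    """Number of distinct settings of `control_bits` independent on/off bits."""
    return 2 ** control_bits


def pairwise_interactions(control_bits: int) -> int:
    """Number of distinct pairs of control bits that could influence each other."""
    return control_bits * (control_bits - 1) // 2


def years_to_enumerate(control_bits: int, configs_per_second: float) -> float:
    """Time to visit every configuration once at an assumed (hypothetical) rate."""
    return configuration_count(control_bits) / configs_per_second / SECONDS_PER_YEAR


if __name__ == "__main__":
    # Hypothetical register widths and throughput, chosen only for illustration.
    for bits in (8, 16, 32, 64):
        print(
            f"{bits:2d} control bits: "
            f"{configuration_count(bits)} configurations, "
            f"{pairwise_interactions(bits)} bit pairs, "
            f"{years_to_enumerate(bits, 1e6):.3g} years at 1e6 configs/s"
        )
```

Even under these generous assumptions, visiting every configuration once quickly becomes impractical as the number of control bits grows, and that is before any sequence of operations is exercised within each configuration.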