ASIC Technology
Designing the UltraSPARC-I
Advanced tools and design methodology were essential, but a major aspect of the design involved tradeoffs and special considerations.
Figure 3. Hierarchical approach used to independently verify various sub-units with a high level of confidence and then integrate them into the full-chip simulation environment.
During timing verification, a difficult task was HSpice netlist extraction--HSpice is a product from Meta-Software Inc. (Campbell, CA). The design team needed to extract sufficient parasitics from the physical layout for accurate evaluation of potential deep submicron effects. CheckMate gave the team flexibility to exactly balance speed versus accuracy, allowing in-depth timing verification without swamping the database and significantly slowing verification.
The design team used Lsim for switch-level verification of large custom blocks such as caches, TLB, and register files. As for datapath design, UltraSPARC-I contains over 20 large datapaths. Manually constructing all of them would have consumed months of engineering effort. With Mentor Graphics' Datapath, engineers automatically generated a majority of the datapaths from structural descriptions, typically going from RTL to the physical design within 24 hours.
Designers experimented with different datapath structures, quickly zeroing in on the optimal configuration. Approximately 90 percent of the chip's datapaths were generated with the tool. By automating the bulk of the datapath design, the team was free to focus on the critical paths that demanded handcrafting.
Hierarchical approach
Functional verification represented one of the project's most ambitious and challenging efforts. Three key verification goals were set: (1) to achieve fully functional first silicon that could boot the multi-user Solaris operating system and run OpenWindows in a host system, (2) to guarantee SPARC V8 32-bit compatibility, and (3) to guarantee SPARC V9 64-bit compliance. Timing verification was performed independently of functional verification, using static timing analysis.
A hierarchical approach was used to independently verify sub-units with a high level of confidence and then integrate them into the full-chip simulation environment (see Figure 3). Assuring basic functionality in a host system required simulating the processor and associated ASICs to eliminate system bugs. A comprehensive multi-processor simulation environment was also developed.
It was also critical to validate the total system design and improve test coverage. The key factors were (1) the complexity arising from a new, more powerful processor and associated ASICs, (2) new firmware, and (3) a new release of the operating system. Validating the system required bringing up the operating system and running application programs.
Testing the large instruction streams of an operating system is not practical with traditional simulation and requires hardware emulation. The decision to emulate the design in-circuit was based on success with the earlier microSPARC-II emulation.
Two Verilog simulators were used: Verilog-XL, from Cadence Design Systems Inc. (San Jose, CA), during the early stages and Verilog Compiled Simulator (VCS), from Viewlogic Inc.'s Chronologic Simulation Group (Los Altos, CA), for stand-alone tests. VCS was used once the design grew and stabilized. It performed full-chip and multi-processor simulations and regressions, using both RTL and gate netlists.
VCS was chosen because it was the fastest commercially available Verilog simulator. Speed is important because the simulator's throughput, in cycles simulated per second, determines the total number of cycles that can be simulated in a given time. Also, compiled-code simulators require less memory than interpretive simulators.
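The throughput argument above is simple arithmetic, sketched here in Python. The throughput and run-time figures are hypothetical placeholders chosen for illustration; they are not measurements from the UltraSPARC-I project.

```python
def cycles_simulated(cycles_per_second: float, wall_clock_hours: float) -> float:
    """Total simulated clock cycles for a given simulator throughput and run time."""
    return cycles_per_second * wall_clock_hours * 3600.0


# Hypothetical throughputs for an overnight (12-hour) regression run.
slow = cycles_simulated(cycles_per_second=5.0, wall_clock_hours=12.0)
fast = cycles_simulated(cycles_per_second=20.0, wall_clock_hours=12.0)

print(f"slower simulator: {slow:,.0f} cycles")
print(f"faster simulator: {fast:,.0f} cycles")
```

With these assumed numbers, a simulator that is four times faster covers four times as many cycles in the same regression window.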
Unit-level simulation
The chip's full-custom design style demanded that libraries as well as individual functional units be completely verified. Two simulation and verification approaches were employed, depending on the unit. The integer and floating-point units, which execute instructions, could use SPARC assembly code for functional tests, run in a Verilog stub-model environment. Bus-oriented units, such as the load/store unit and the external memory subsystem cache controller, used a common driver and stimulus generator checker program (CUDL/MSG). The units' test environments, with their low-level interface, provided better control of signals and bus transactions.
After stand-alone tests verified sub-units, the strategy was to verify the chip in several system environments. The default simulation environment included a fully integrated UltraSPARC-I design with simple behavioral models for the rest of the system. Another environment modeled the uniprocessor system and instantiated Verilog models corresponding to the real system ASICs. A third environment instantiated four UltraSPARCs for multi-processor simulation. In each environment, different processor modes were simulated to enable or disable various processor features.
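One way to picture the combination of system environments and processor modes is as a regression matrix. The sketch below is only an illustration: the environment labels paraphrase the three environments described above, and the feature switches (`icache`, `dcache`) are hypothetical stand-ins, since the article does not list the actual processor modes.

```python
from itertools import product

# Labels paraphrasing the three environments described in the text.
ENVIRONMENTS = [
    "default_behavioral_system",
    "uniprocessor_with_asic_models",
    "four_processor_system",
]

# Hypothetical feature switches; the real mode list is not given in the article.
FEATURES = ["icache", "dcache"]


def regression_matrix(environments, features):
    """Return one run description per (environment, feature on/off combination)."""
    runs = []
    for env in environments:
        for settings in product([False, True], repeat=len(features)):
            runs.append({"env": env, "modes": dict(zip(features, settings))})
    return runs


runs = regression_matrix(ENVIRONMENTS, FEATURES)
for run in runs:
    print(run)
```

The number of runs grows as the number of environments times two to the power of the number of switches, which is one reason simulator throughput matters for full regressions.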
Gate-level validation
Gate-level simulation verified the integrity of the netlist after synthesis, place & route, and layout. Since control blocks, datapaths, and megacells used different design flows, slightly different netlist versions were used at different stages of the design cycle. Also, since emulation tools require structural netlists with no behavioral code, gate-level simulations verified correctness of the netlists delivered to the emulation team.
Figure 4. Emulation methodology consisted of four major phases: pre-configuration, configuration, testbed, and in-circuit emulation.
Early in the design, gate-level verification was employed for the design's control blocks by substituting synthesized versions of the blocks into the chip netlist. As the design stabilized and layout began, netlists extracted from layout were incorporated into the gate-level simulations. A flexible, modular methodology allowed the units' netlists to be moved from pre-layout to post-layout over a number of design releases.
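The modular substitution described above can be sketched as a per-unit choice of netlist view for each design release. This is a minimal illustration, not the team's actual flow: the view names, unit names, and the rule "use the most physical view available" are all assumptions.

```python
from typing import Dict, Set

# Assumed ordering from most abstract to most physical netlist view.
VIEW_ORDER = ["rtl", "synthesized", "post_layout"]


def pick_views(available: Dict[str, Set[str]]) -> Dict[str, str]:
    """For each unit, choose the most physical netlist view currently available."""
    chosen = {}
    for unit, views in available.items():
        usable = [view for view in VIEW_ORDER if view in views]
        if not usable:
            raise ValueError(f"no known netlist view for unit {unit!r}")
        chosen[unit] = usable[-1]
    return chosen


# Hypothetical snapshot of one design release.
release = {
    "control_block_a": {"rtl", "synthesized"},
    "datapath_b": {"rtl", "synthesized", "post_layout"},
    "megacell_c": {"rtl"},
}

print(pick_views(release))
```

As later releases add post-layout netlists for more units, the same selection moves each unit forward without changing the rest of the chip-level configuration.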
Hardware emulation
The emulation methodology consisted of four major phases: pre-configuration, configuration, testbed, and in-circuit emulation (ICE). Figure 4 shows the relationship between phases. The emulation of the processor and system ASICs comprised a total of 1.2 million gates. The Enterprise Hardware emulator from Quickturn Design Systems Inc. (Mountain View, CA) was used. An automated in-house flow mapped the UltraSPARC-I design's custom library cells to Quickturn emulation primitives. MEM cards implemented large memory arrays such as caches and register files. Instead of traditional debugging tools, failures were analyzed with over 3,000 probes--distributed over five built-in logic analyzers, one per Enterprise box--and three DAS systems from Tektronix Inc. (Beaverton, OR).
The Tektronix DAS's sophisticated trigger programming capability helped track instruction flow through the pipe and other critical pipe events. Unix booted three weeks prior to tapeout, and the process uncovered one discrepancy between the UltraSPARC-I RTL and gate-level models. Emulation also found problems in the firmware/software for the new system.
Hardware emulation supplemented verification and was useful for post-silicon system debug and for verifying subsequent tapeouts. A key advantage of emulation was that it provided a common focus: booting the operating system.
| UltraSPARC 64-bit microprocessor |
|---|
| UltraSPARC is a 64-bit SPARC V9 processor with four-way instruction dispatch, superscalar processing, and advanced multimedia capabilities. It has a tightly coupled instruction prefetch and dispatch unit, integer execution unit (IEU), floating-point/graphics unit (FGU), load/store unit, and memory unit. It has external cache control and system interface logic on board. The chip was designed to maximize system efficiency and promote optimal throughput when executing complex, memory-intensive applications while maintaining full binary compatibility with all existing SPARC applications. |
| The microprocessor has 16 Kbytes of on-chip instruction cache and 16 Kbytes of on-chip data cache. It contains about 5.2 million transistors. The chip is fabricated on an advanced four-layer metal 0.5-µm process (see figure). Other components complete the UltraSPARC chipset. A pair of data buffers connect UltraSPARC to the system and isolate second-level cache activity from the system bus. Each data buffer is a 70-kgate gate array containing logic and datapaths that allow overlapping of operations, resulting in shorter miss latencies and larger bandwidths to and from the system. The external cache is implemented using standard, pipelined 1-Mbit SRAMs that are organized as 32K by 36 bits. As is traditional with Sun's systems, the core logic of the system was implemented in ASICs. The ASICs totaled approximately 240 kgates of logic and 13 Kbits of SRAM. |
| Figure: UltraSPARC block diagram |
Static timing analysis
PEARL, from Cadence Design Systems Inc. (San Jose, CA), was used for static timing analysis. The tool traces paths to determine the minimum and maximum times at which a signal can change state; thus, no input stimulus was required.
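The idea of tracing paths without stimulus can be illustrated with a small sketch that propagates earliest and latest arrival times through a timing graph. This is a generic textbook formulation, not PEARL's algorithm; the graph, node names, and delay values (in arbitrary time units) are hypothetical, and the sketch relies on Python's standard `graphlib` module (Python 3.9 or later).

```python
from graphlib import TopologicalSorter


def arrival_windows(edges):
    """Compute (earliest, latest) arrival times for every node of an acyclic timing graph.

    edges: list of (src, dst, min_delay, max_delay).
    Nodes with no fan-in are treated as inputs that switch at time 0.
    """
    preds = {}
    for src, dst, d_min, d_max in edges:
        preds.setdefault(src, [])
        preds.setdefault(dst, []).append((src, d_min, d_max))

    # TopologicalSorter expects a mapping of node -> set of predecessors.
    graph = {node: {src for src, _, _ in fanin} for node, fanin in preds.items()}

    windows = {}
    for node in TopologicalSorter(graph).static_order():
        fanin = preds[node]
        if not fanin:
            windows[node] = (0.0, 0.0)
        else:
            earliest = min(windows[s][0] + d_min for s, d_min, _ in fanin)
            latest = max(windows[s][1] + d_max for s, _, d_max in fanin)
            windows[node] = (earliest, latest)
    return windows


# Hypothetical four-edge timing graph.
EDGES = [
    ("a", "n1", 1.0, 2.0),
    ("b", "n1", 0.5, 1.0),
    ("n1", "out", 1.0, 3.0),
    ("b", "out", 0.2, 0.4),
]

for node, window in arrival_windows(EDGES).items():
    print(node, window)
```

Every structural path contributes to the window, whether or not any real input pattern can exercise it, which is why the analysis needs no stimulus.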
Timing analysis identified maximum operating speed and range of operating conditions for which clocking hazards are avoided. In edge-triggered systems, the tasks are tied to two phenomena: zero- and double-clocking. Zero-clocking occurs when combinational logic in the design is too slow to produce a valid set of outputs within a given clock period. Hence, the phenomenon determines maximum operating speed. By contrast, double-clocking occurs when combinational logic is fast enough to produce more than one set of outputs for a given clock period. Besides the speed of combinational logic, clock skew and timing parameters of the state elements directly affect the two phenomena.
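The two hazards correspond to the familiar setup-style and hold-style constraints on a register-to-register path. The sketch below uses a common textbook formulation with clock skew applied pessimistically to both checks; the field names and the nanosecond values are assumptions for illustration, not parameters from the UltraSPARC-I design.

```python
from dataclasses import dataclass


@dataclass
class PathTiming:
    """Timing of one register-to-register path, in nanoseconds (hypothetical values)."""
    clk_to_q_min: float
    clk_to_q_max: float
    logic_min: float
    logic_max: float
    setup: float
    hold: float
    skew: float  # worst-case clock skew magnitude, applied pessimistically


def min_period(p: PathTiming) -> float:
    """Shortest clock period for which the slowest data still arrives in time."""
    return p.clk_to_q_max + p.logic_max + p.setup + p.skew


def zero_clocking_slack(p: PathTiming, period: float) -> float:
    """Negative slack means the logic is too slow for this period (zero-clocking)."""
    return period - min_period(p)


def double_clocking_slack(p: PathTiming) -> float:
    """Negative slack means the fastest data races through too early (double-clocking)."""
    return (p.clk_to_q_min + p.logic_min) - (p.hold + p.skew)


path = PathTiming(clk_to_q_min=0.5, clk_to_q_max=1.0,
                  logic_min=0.25, logic_max=3.5,
                  setup=0.5, hold=0.25, skew=0.25)

print("minimum period:", min_period(path))
print("zero-clocking slack at 6.0 ns:", zero_clocking_slack(path, 6.0))
print("double-clocking slack:", double_clocking_slack(path))
```

Note that the double-clocking check does not depend on the clock period: slowing the clock can cure zero-clocking but never double-clocking.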
"False paths" are a problem inherent in static timing analysis. These are paths that are never exercised because of functional relationships between signals. Potential false paths were identified and reviewed by the design team. Monitors were placed in the functional simulation to verify that none of the diagnostic patterns ever exercised these false paths. While this methodology does not eliminate all false paths--the approach is conservative--at least all true paths are identified and verified.
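A false-path monitor of the kind described can be sketched as a check that scans a simulation trace for any cycle in which a supposedly false path would be sensitized. The trace format, signal names, and sensitization condition below are hypothetical; the team's monitors ran inside the functional simulation itself, and their form is not described in the article.

```python
def false_path_hits(trace, false_paths):
    """Return (cycle, path_name) for every cycle in which a declared false path is sensitized.

    trace: sequence of dicts mapping signal name -> value, one dict per cycle.
    false_paths: dict mapping a path name to a predicate that returns True
                 when the signal values would sensitize that path.
    """
    hits = []
    for cycle, values in enumerate(trace):
        for name, sensitized in false_paths.items():
            if sensitized(values):
                hits.append((cycle, name))
    return hits


# Hypothetical example: a path through two muxes is only live when both
# selects are 1, which the design is assumed never to do.
FALSE_PATHS = {
    "mux_a_to_mux_b": lambda v: v["sel_a"] == 1 and v["sel_b"] == 1,
}

clean_trace = [
    {"sel_a": 0, "sel_b": 1},
    {"sel_a": 1, "sel_b": 0},
    {"sel_a": 0, "sel_b": 0},
]
bad_trace = clean_trace + [{"sel_a": 1, "sel_b": 1}]

print(false_path_hits(clean_trace, FALSE_PATHS))
print(false_path_hits(bad_trace, FALSE_PATHS))
```

An empty result only shows that the applied patterns never exercised the path; it supports, but does not prove, the claim that the path is false.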
The UltraSPARC-I project team met or exceeded all the project goals. The team developed a methodology that took into account the realities of deep submicron design and today's EDA tools. The authors would like to acknowledge the effort of the UltraSPARC-I project team members in making the design a reality.
Shrenik Mehta is a hardware manager with Sun Microelectronics, a division of Sun Microsystems Inc. He was the validation manager for UltraSPARC-I and validation lead for the microSPARC-I and microSPARC-II processors at Sun. His current responsibilities include verification of UltraSPARC-I derivative products and evaluation of new verification tools and flows.
Robert Garner managed the UltraSPARC-I microprocessor project. He was responsible for the architecture, logic, and verification teams. Garner was a co-architect of the SPARC architecture and lead designer of Sun's first SPARC product, the Sun4/200. He is currently director of Java Media Processors at Sun Microelectronics.
Hemraj Hingarh is director of engineering at Sun Microelectronics. Since joining Sun in 1992, Hingarh has been responsible for design and development of the UltraSPARC microprocessor family.
Dennis Chen is a senior engineering manager at Sun Microelectronics, responsible for logic and verification teams of next-generation designs. He was the founding member of the UltraSPARC-I microprocessor design team, responsible for the load/store and memory management units as well as both the verification and hardware emulation groups.
David Greenhill is a circuit design manager at Sun Microelectronics. He joined Sun in 1992 to work on UltraSPARC. He has worked on circuit design, datapath methodology, timing analysis, and the memory management unit design.
Peter Fu is a staff engineer at Sun Microelectronics and was involved in the design and validation of the UltraSPARC-I processor. His current involvement includes high- and low-level testing through modeling, simulation, and emulation as well as the coordination of engineering changes and tapeout processes.
To voice an opinion on this or any Integrated System Design article, please e-mail your message to michael@asic.com.
All materials on this site Copyright © 2009 TechInsights, a Division of United Business Media LLC. All rights reserved.