System Design
General Instrument Designs a Set-Top Box Chipset
A design methodology for a satellite and broadcast terrestrial receiver decoder chipset.
By Brian Balthazor, Mike Lodman, Steve Pollmann, Rick Price, Serdar Yilmaz, and Linley Young
As computers and video begin to merge, the arrival of a whole host of digital video products is imminent. In this article we address the General Instrument (San Diego, CA) design approach to a five-chip DigiCipher II decoder implementation for both satellite and broadcast terrestrial applications. We fabricated the 2.5-million-transistor chipset in the 0.65µm triple-layer metal (3LM) process from Motorola (Phoenix, AZ). This chipset represents in excess of 50 man-years of development time.
Figure 1. The encoded data stream is transmitted via the satellite encoder and is subsequently received by either a home backyard satellite dish, local broadcast station, or cable headend.
... controlled by an embedded microcontroller containing an 8-bit processor, ROM, battery-backed RAM, power management circuit, various security circuits, and a three-port DMA interface. The ACP also contains a universal asynchronous receiver/transmitter (UART) interface which communicates with an optional external ASIC via a secure serial protocol used to extend the on-chip access control features. The packet processor provides front-end MPEG-2 transport layer parsing to route ECM packets internally. This module is configured via a synchronous serial bus that is fully compatible with the Motorola serial peripheral interface (SPI).
Figure 2. The demodulated MPEG-2 compatible data stream is corrected by the FEC, sent to the ACP for authentication, and simultaneously routed to the GTP, ADP, and VDP for OSD, audio, and video processing.
Graphics and transport processor (GTP)
The GTP performs three primary functions: (1) processing MPEG-2 transport stream and message-oriented data, (2) routing audio data to the ADP, and (3) superimposing on-screen display images on VDP video and directing the combined image to the video encoder. The GTP also interfaces to the 68xxx processor to allow dynamic OSD font loading and manipulation, and several ancillary firmware functions.
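As a rough illustration of the transport processing and routing described above, the following Python sketch parses the fixed 4-byte MPEG-2 transport packet header and routes a packet by its PID. The header field layout follows the MPEG-2 systems standard; the routing table, its PID values, and the helper names are hypothetical and are not taken from the DigiCipher II design.

```python
from typing import NamedTuple

TS_PACKET_SIZE = 188  # bytes in one MPEG-2 transport packet
SYNC_BYTE = 0x47      # first byte of every transport packet


class TsHeader(NamedTuple):
    pid: int                 # 13-bit packet identifier
    scrambling_control: int  # 2-bit transport_scrambling_control field
    continuity_counter: int  # 4-bit per-PID packet counter


def parse_ts_header(packet: bytes) -> TsHeader:
    """Extract the routing-relevant fields from a transport packet header."""
    if len(packet) != TS_PACKET_SIZE or packet[0] != SYNC_BYTE:
        raise ValueError("not a valid transport packet")
    pid = ((packet[1] & 0x1F) << 8) | packet[2]
    scrambling = (packet[3] >> 6) & 0x03
    continuity = packet[3] & 0x0F
    return TsHeader(pid, scrambling, continuity)


# Hypothetical PID assignments, for illustration only.
ROUTES = {0x100: "VDP", 0x101: "ADP", 0x1F0: "ACP"}


def route(packet: bytes) -> str:
    """Return the destination block for a packet, or "discard" if unmapped."""
    return ROUTES.get(parse_ts_header(packet).pid, "discard")


def make_packet(pid: int, scrambling: int = 0, cc: int = 0) -> bytes:
    """Build a payload-only test packet with a zero-filled payload."""
    header = bytes([
        SYNC_BYTE,
        (pid >> 8) & 0x1F,
        pid & 0xFF,
        (scrambling << 6) | 0x10 | (cc & 0x0F),  # 0x10: payload only
    ])
    return header + bytes(TS_PACKET_SIZE - len(header))
```

In hardware the same decision is made on the fly as header bytes arrive; the dictionary lookup here stands in for whatever PID-matching logic the chip actually uses.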
Figure 3. The design flow is an iterative process moving from an RTL description to a Verilog netlist and finally a layout database.
Power consumption and packaging
We desired conventional, low-cost plastic packaging for all ASICs to provide the lowest cost component, which further required power limiting to achieve targeted heat dissipation goals. All circuits ultimately satisfied their targeted limits, requiring no thermal enhancements.
Testability
Each ASIC included logic to facilitate both debug and production test, and all but the GTP utilized special operating test modes to provide visibility of (and in some cases, control over) internal signals. The architecture of each particular ASIC dictated the design of the test logic. In the case of the FEC, test multiplexers are used to both view and supply internal values to the different stages of the pipeline. The VDP uses an 8-bit output port to view internal control signals, while the ACP uses two hardware test modes. The ACP also makes use of a suite of embedded firmware routines for built-in self-test (BIST).
The chipset also implements boundary-scan technology to support board-level testing. Each ASIC contains a Test Access Port (TAP) compatible with the IEEE boundary-scan standard 1149.1. The TAP is used to access the boundary-scan controller and chain, which support the following test instructions: IDCODE, BYPASS, SAMPLE/PRELOAD, and EXTEST. On the ADP, the TAP is also used to access memory BIST logic. In addition, care has been taken in design and cell utilization to allow IDDQ testing of each chip to complement full clock rate functional testing.
Development Methodology
Figure 3 presents the overall DigiCipher II ASIC design methodology. Designers initially worked from a set of system requirements and definitions generated by the GI systems group. A C-model demonstrating gross functionality was created by either systems or ASIC designers to assure correct bitstream processing.
RTL descriptions were then created in Verilog and subsequently simulated to verify that the modeled hardware behavior was identical to the C-model. The design was then synthesized to gates and re-simulated to verify agreement with the RTL models. Following layout, we used back-annotated timing delays to simulate actual layout conditions and verify the RTL model behavior.
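The model-versus-simulation agreement checks in this flow amount to comparing two sample streams and reporting the first point of divergence. The sketch below shows one way such a comparison could be written in Python; the function name and the trace representation (plain sequences of integers) are assumptions made for illustration and do not describe the actual GI tools.

```python
from typing import Optional, Sequence, Tuple

Mismatch = Tuple[int, Optional[int], Optional[int]]


def first_mismatch(reference: Sequence[int],
                   simulated: Sequence[int]) -> Optional[Mismatch]:
    """Return (index, expected, actual) for the first difference, else None.

    A missing sample (one trace shorter than the other) is reported as None
    on the side that ran out.
    """
    for i, (expected, actual) in enumerate(zip(reference, simulated)):
        if expected != actual:
            return (i, expected, actual)
    if len(reference) != len(simulated):
        i = min(len(reference), len(simulated))
        expected = reference[i] if i < len(reference) else None
        actual = simulated[i] if i < len(simulated) else None
        return (i, expected, actual)
    return None
```

Reporting the index of the first mismatch, rather than a simple pass/fail, gives the designer a starting point in the waveform when the RTL or gate-level result diverges from the reference model.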
Figure 4. In subsystem verification, several transport streams are simulated to ensure MPEG-2 syntax and good functional coverage.
The main functions of the encoder model include video data compression; video, audio, and ECM packetization; multiplexing of the video, audio, and data packets into bitstreams; and generating a detailed trace file for each vector set. The VDP chip designers behaviorally simulated about 70 bitstreams representing 500-plus frames. The same bitstreams were also decompressed by the decoder model, which output reference video data that was compared bit by bit to the VDP's video data. The GTP chip designers simulated approximately 30 MPEG-2/DCII bitstreams with an average of 30 packets each for a total of about 2,000 ECM messages. About 50 AC-3 frames and 100 MUSICAM packets were simulated on the GTP and ADP designs hooked together. About 700 NTSC and 300 PAL frames were behaviorally simulated to verify OSD functionality in the GTP design, and over 50 packets were simulated to verify PLL control and master clock generation.
Behavioral coding
Designers spent well over 50 percent of the design time coding each design in Verilog. Behavioral coding took on various forms depending on each designer's preference. Care was taken to use non-blocking assignments, which remove the order dependency within a begin/end block. This was needed to ensure that the behavioral model and the synthesized version of the model would be logically equivalent. Designers typically partitioned the design on a functional basis with special attention paid to isolating clock domains in an effort to minimize asynchronous interfaces.
Designers were able to choose between two simulators for running behavioral simulations: VCS from Chronologic Simulation (Los Altos, CA), or Verilog from Cadence Design Systems (San Jose, CA).
There were trade-offs between the compile and run times of the simulators, so it was the designer's decision to choose the simulator with the least overall compile and run time. Because VCS has a slower compile time but a faster simulation time than Verilog, designers typically chose Verilog for simulating short vector sets, and VCS for long vector sets. Simulation results from both simulators were analyzed using a waveform display program called SignalScan.
The systems group and designers of the VDP and GTP used two different tools to visually check the decoded video data and to aid in debugging: a shareware application called Xview, and the Viewgraphic Viewstore. Xview is a Sun tool used in the X-windows environment, which accepts a YUV data format and outputs video to a Sun workstation monitor. A C program was first used to convert video data from CCIR-656 format to YUV. Xview further allowed designers to zoom in on selected lines and samples of the frame.
Designers also used Viewgraphic's Viewstore hardware for real-time playback of digitized video, which provided the ability to visually measure video quality across several frames and perform side-by-side comparisons of video quality in the various prediction modes as defined by the DigiCipher II/MPEG-2 standards. We used a C program to convert frames of CCIR-656 video to the RGB format accepted by Viewstore. Users can step through frame by frame and display field one and field two separately for debugging purposes. Figure 4 graphically describes the video subsystem verification process.
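The point made under behavioral coding, that non-blocking assignments remove order dependency within a begin/end block, can be illustrated with a small Python model of one clock edge. This is only a sketch of the semantics, not Verilog and not the designers' code: registers are dictionary entries, and each statement is a hypothetical (destination, source) pair.

```python
from typing import Dict, List, Tuple

Statement = Tuple[str, str]  # (destination register, source register)


def clock_edge_blocking(regs: Dict[str, int],
                        statements: List[Statement]) -> Dict[str, int]:
    """Blocking style: each statement sees the results of earlier ones."""
    state = dict(regs)
    for dest, src in statements:
        state[dest] = state[src]
    return state


def clock_edge_nonblocking(regs: Dict[str, int],
                           statements: List[Statement]) -> Dict[str, int]:
    """Non-blocking style: sample every source first, then commit together."""
    sampled = [(dest, regs[src]) for dest, src in statements]
    state = dict(regs)
    for dest, value in sampled:
        state[dest] = value
    return state


# A two-stage shift register: b takes a, c takes b.
regs = {"a": 1, "b": 2, "c": 3}
forward = [("b", "a"), ("c", "b")]
reverse = [("c", "b"), ("b", "a")]
```

With the blocking model the result depends on statement order (the forward order pushes the value of a through both stages in one edge), while the non-blocking model gives the same shifted result for either order, which is the behavior a pair of flip-flops would have after synthesis.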
Figure 5. GTP Simulation Environment. The LM1200 hardware modeler along with a Chronologic VCS PLI interface and a ViewStore box provides a system-level simulation environment for extensive testing of the GTP.
MPEG-2 Test Vector Sets
Numerous transport streams were exchanged between General Instrument and external sources including Teracom, Scientific Atlanta, Sarnoff, GoldStar, Alcatel, and Samsung. The main goal was to verify our interpretation of the MPEG-2 syntax and also to ensure good coverage of a wide variety of DigiCipher II/MPEG-2 modes. Approximately 40 frames were behaviorally simulated on the VDP. This served to verify packet processing and video decompression at various transport rates. The GTP chip designers simulated about 20,000 packets to verify packet processing and PLL control. The ACP chip designers also simulated about 20,000 packets to verify forced encryption mode, unencrypted packet processing, and processing of packets which had the scrambling control bits randomly set to encrypted.
The LM1200 hardware modeler was selected for hardware modeling of the Motorola MC68xxx component and for exhaustive testing of the GTP. Figure 5 displays the simulation environment using the LM1200. Logic Modeling provides a Verilog Programming Language Interface (PLI) for the Chronologic VCS simulator which allows the 68xxx hardware model to be instantiated into the simulation. The 68xxx processor physically resides on a circuit card within the LM1200, which talks over the Ethernet network to the VCS PLI interface. As this is a functional-only simulation environment, all DRAMs, SRAMs, and ROMs were written as zero-delay models to assure the quickest simulation time.
The 68xxx firmware was created using the Microtek Development Systems (Hillsboro, OR) development environment and brought into the simulation environment as a single contiguous ROM image in a Verilog-compatible hex ASCII format. A custom C utility performed the Microtek object code to hex ASCII translation. A separate debug model was also included in the environment, which monitored GTP pins and generated log messages indicating GTP read/write register activity, providing further observability.
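The ROM-image translation mentioned above can be pictured with a short Python sketch that turns a binary image into hex ASCII text, one word per line, in the general form accepted by Verilog's $readmemh system task. The 16-bit word width and the function name are assumptions for illustration; the original utility was written in C, and the Microtek object format it read is not described here.

```python
def rom_image_to_hex_ascii(image: bytes, word_bytes: int = 2) -> str:
    """Render a binary ROM image as hex ASCII text, one word per line.

    Bytes are emitted in image order, so each line reads as a big-endian
    word. The default 16-bit word width is an assumption for illustration.
    """
    if word_bytes < 1:
        raise ValueError("word_bytes must be at least 1")
    if len(image) % word_bytes != 0:
        raise ValueError("image length is not a whole number of words")
    lines = [
        image[i:i + word_bytes].hex().upper()
        for i in range(0, len(image), word_bytes)
    ]
    return "\n".join(lines) + "\n"
```

For example, the four bytes 4E 71 60 FE become the two lines 4E71 and 60FE, which a ROM model could load into consecutive word addresses.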
Custom PLI provided the ability to write binary file video data at the output pads of the simulation, and then a post-processor transformed the data into ViewStore-compatible format. The video data was then displayed on a monitor after downloading to the ViewStore.
Synthesis methodology
After encountering several performance problems with version 3.1a of the development tool suite from Synopsys (Mountain View, CA), we went with version 3.0 to synthesize the ASICs. We preferred Synopsys dc_shell scripts over interactive synthesis because of the iterative nature of synthesizing large designs. We used a combination of bottom-up and hierarchical compile strategies. Modules were partitioned into 5,000-10,000 gate blocks and were compiled hierarchically and individually. We then performed a bottom-up compile to connect these blocks together at the top level.
As per the example in Figure 6, assume an ASIC had a top level called TOP and submodules called A, B, and C. Also, assume A has submodules A1, A2, and A3. Modules A, B, and C were compiled hierarchically and TOP was compiled with a dont_touch attribute on A, B, and C. Hierarchical compilation is the process of reading in modules A1, A2, and A3 and compiling them simultaneously under module A. This approach lets Synopsys optimize the intermodule paths and is the most straightforward because constraints need to be applied only at module level A and not at the submodule level A1, A2, and A3.
Figure 6. A typical design hierarchy illustrating the bottom-up synthesis in which constraints are applied only to modules A, B, and C.
Static timing analysis
We performed a final static timing check for setup and hold following layout to ensure that all nets in the ASIC design meet timing requirements for best- and worst-case operating conditions. Prior static timing checks used estimated delays based on estimated wire lengths. For this timing check, we used actual calculated wire loads from the layout based on length, fanout, and technology. The delay calculator in the Motorola ASIC tool kit used the net loads from the layout to generate net delays for best- and worst-case operating conditions. This delay calculator returns a Verilog SDF format file that Synopsys can read in to back-annotate the corresponding nets. Clock constraints, input timing constraints, and multi-cycle and false path attributes are then applied.
Two reports need to be generated for a complete static timing analysis: (1) a setup check on all flip-flops using the worst-case timing files, followed by (2) a hold check on all flip-flops using the best-case timing files. These two checks should ensure that the design will perform properly within the operating range specified. A constraint report for both operating conditions is then generated within Synopsys, with all violators listed. It is likely that one will see some false paths in the report that were missed when the constraints were written, for example, paths crossing clock boundaries or input and output ports of a memory tied to a bidirectional bus. These can be marked as false paths, the design retimed, and another report generated.
Workstation configurations
Both synthesis and simulation can consume considerable workstation resources. Generally speaking, module level synthesis, RTL simulation, and gate simulation did not cause many problems.
Assuming one uses the Synopsys-recommended rule of 15 kgates or less per module, modules could be synthesized on a workstation with 128 MB of RAM and anywhere from 128 to 640 MB of virtual memory. Even when one ensures that no logic exists at the ASIC top level, Synopsys has a voracious memory appetite, which is further whetted when multiple clocks are employed. Top level simulations almost always required a machine with 640 MB of RAM and several gigabytes of virtual memory. All machines were SPARCstation 10s with either one, two, or four processors.
Design verification methodology
We used a number of Motorola-supplied tools after netlist synthesis for clock tree insertion and to verify the correctness of each design and vector set. Once each design was synthesized, designers used an electrical rules check (ERC) tool to check for maximum fanout violations, illegal I/O pad configurations, multiple drivers, and unconnected inputs. For those cases categorized as warnings, the designers had to determine whether the warnings were false or whether a fix had to be made. Error messages resulted in a design fix.
Next, designers ran a delay calculation tool which read in the netlist and user-specified technology libraries for best- and worst-case conditions. The tool output SDF files for each condition, along with edge reports listing nodes having excessively slow rise/fall times. To solve these problems, designers opted to fix the netlist manually if the number of problems was manageable, or otherwise to re-compile the netlist with Synopsys.
After the problems were corrected, layout engineers worked with designers to develop a floorplan and place macrocells. This was followed by running a Motorola-supplied clock tree synthesis tool that bases its optimization algorithm on placement information.
In addition, the designer also supplied the tool with a set of initial conditions and goals (such as the maximum skew across each clock network, the tree depth, and the preferred buffer type). Finally, the tool inserts clock buffers into the design and creates a new Verilog and DEF netlist.
After running and verifying best- and worst-case simulations, the designers fed the Verilog Value Change Dump (VCD) files from the simulations into a tester tool that checks for violations of tester restrictions. The tool also indicates strobable regions within a cycle for each output. The final step in vector verification involves running a dithering tool which attempts to mimic the tester skew on inputs. The tool randomly skews both clock and data by the maximum tester skew amount and checks whether the outputs still match the outputs from a pre-dithered simulation.
High-level design a necessity
In a demanding market such as home electronics, time to market is extremely critical. Schedules tend to be very demanding, providing little time for methodology shortcomings and surprises. VLSI design at this level of complexity (2.5 million transistors) is unusually intensive in computer hardware and software resources. It is estimated that close to 100,000 SPARCstation 10 equivalent CPU hours were consumed during the initial design phase, up to first-pass VLSI tapeout.
Managing the simultaneous development of five very complex VLSI devices can only be accomplished through the use of behavioral design languages and logic synthesis tools as described herein. Attempting this project with conventional schematic design entry and verification techniques, simultaneously with the evolution of compressed video/audio entertainment delivery systems, would be worse than impossible. It is estimated that General Instrument's cost of tools per VLSI engineer is in excess of $100,000. While this appears to be very high, the alternatives are poor considering the marketplace opportunities.
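As a closing illustration of the vector dithering step described under design verification, the sketch below models the idea in Python: skew clock and data randomly by the maximum tester skew and check that a registered input still captures the same values as in the unskewed run. The single-flop capture model, the timing numbers, and all function names are hypothetical simplifications and do not describe the Motorola-supplied tool.

```python
import random
from typing import List, Sequence


def capture(values: Sequence[int], data_times: Sequence[float],
            clock_times: Sequence[float], setup: float) -> List[int]:
    """Simplified flop model: each cycle, the new value is captured only if
    it arrives at least `setup` before the clock edge; otherwise the
    previously held value is kept (hold checks are ignored)."""
    held = 0
    captured = []
    for value, t_data, t_clock in zip(values, data_times, clock_times):
        if t_data <= t_clock - setup:
            held = value
        captured.append(held)
    return captured


def dither_check(values: Sequence[int], data_offset: float,
                 clock_offset: float, setup: float, max_skew: float,
                 trials: int = 100, seed: int = 0) -> bool:
    """Return True if every dithered run matches the pre-dithered capture."""
    rng = random.Random(seed)
    n = len(values)
    nominal = capture(values, [data_offset] * n, [clock_offset] * n, setup)
    for _ in range(trials):
        # Skew each edge by the full tester skew in a random direction.
        data_times = [data_offset + rng.choice((-max_skew, max_skew))
                      for _ in range(n)]
        clock_times = [clock_offset + rng.choice((-max_skew, max_skew))
                       for _ in range(n)]
        if capture(values, data_times, clock_times, setup) != nominal:
            return False
    return True
```

A vector set whose data edges sit close to the clock edge passes the nominal simulation but fails this kind of check, which is the class of tester-sensitive vector the dithering step is meant to expose.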
* Brian Balthazor has design experience in telecommunications, computers, and multimedia; Mike Lodman has design experience in digital audio, RISC CPU, memory cache, and DSP ASICs; Steve Pollmann has design experience in FPGA, neural network, forward error correction, and DSP ASICs; Rick Price has design experience in focal plane array (FPA) readouts and custom ASICs; Serdar Yilmaz has design experience in massively parallel computers and DSP ASICs; Linley Young has design experience in DSP ASICs.
To voice an opinion on this or any Integrated System Design article, please e-mail your message to: michael@asic.com.
Integrated System Design, March 1995
Copyright © 1996 - Integrated System Design Magazine