
ASIC Design Flow Scores on First Pass

Engineers at Crystal Semiconductor and Cirrus Logic team up to design a wavetable synthesizer chip.

by Shekhar Patkar and Pran Kurup


Crystal Semiconductor's CS9236 (code-named "Bach") is a single-chip audio wavetable synthesizer used in multimedia personal computers, game boxes, karaoke machines, and low-cost sound modules. It provides significantly more functionality than current offerings at a relatively low cost. The chip features a high-quality General MIDI (Musical Instrument Digital Interface) sample set that includes 128 melodic instruments and 47 percussion sounds as specified by the General MIDI Level 1 specification. The synthesis engine can generate up to 32 notes simultaneously, and digital reverberation and chorusing effects are also included.

The key to the successful implementation of any design lies in its design methodology. In this respect, Bach was no exception. Previously, the designers had always followed a well-chiseled custom design flow. However, the complexity of Crystal's chip designs had grown beyond the capabilities of the existing custom design methodologies. Further, design reusability was critical, and any new module designs had to be flexible enough to be modifiable for other designs.

The designers were able to reproduce high-quality audio yet reduce the amount of memory required to hold the digitized sound samples by 75 percent without degrading the sound. Furthermore, they were able to deliver fully functional samples to customers immediately after receiving first silicon.

All the accomplishments were made possible thanks to extraordinary collaboration between Crystal's design engineers in Austin, Texas, and the methodology engineers at Cirrus Logic R&D in Fremont, Calif. The Austin engineers custom-designed the memories, the clock generator block, and the pads. The rest of the chip consisted of synthesized logic blocks built out of standard cells designed in Fremont. Although the division of labor had obvious advantages, the amalgamation of the two efforts presented some unique problems. As the semiconductor industry strives to achieve system-on-a-chip integration, collaborative efforts like the one used on this project gain special significance.

The design

The architecture of the CS9236 can be divided into three elements: the MIDI interpreter, the pitch generator, and the effects processor. The design includes 20 on-chip macros (RAMs, pads, clock generator, and a sample ROM) and several datapath elements. The blocks that make up the chip were synthesized using 0.35-µm standard-cell libraries.

We addressed the front- and back-end parts of the design flow separately: The front-end activity was executed in Austin, with setup and training provided by engineers from Fremont who were temporarily stationed in Austin for that purpose. But the back-end work presented some unique challenges, so the initial place-and-route efforts were done in Fremont. The Austin engineers were trained in this methodology at the same time, and they performed the final placement and routing on their own, highlighting the success of the joint project. Parasitic extraction and clock analysis were done in Fremont, and LVS and DRC in Austin.

Front-end design flow

The front-end design flow was fairly straightforward. Design entry was done with Verilog RTL. This step was followed by behavioral simulation using Verilog-XL. The verified Verilog HDL was then synthesized into gates using Design Compiler from Synopsys.

An important aspect of the design flow was verification using emulation. This effort, based on Synopsys' Arkos, a processor-based emulator, was critical to identifying several bugs and building overall confidence in the design database. Arkos's cosimulation capability allowed the use of existing Verilog testbenches. Further, using Arkos, the Austin designers were able to move from a Verilog testbench to a C testbench, significantly increasing simulation speed. As a result, they could run tests overnight and have long sound samples to play back every morning, which contributed greatly to the success of the design.


Figure 1. The back-end design flow requires extensive extraction, simulation, and analysis to ensure chip performance.

The logic synthesis methodology has a significant impact on the turnaround time from verified RTL code to synthesized gates, especially in designs with large gate counts. The synthesis flow developed in Fremont facilitates structuring the various netlist formats (EDIF, Synopsys, gates, RTL, and so on), synthesis and simulation scripts, and source code in a manner that is easy to understand and maintain. An in-house program was used to figure out the dependencies between the various modules and submodules in the design and then create a dependency tree. The Unix make utility was then used to maintain the entire design and synthesize only those parts that were necessary. Further, this approach facilitated the distribution of synthesis tasks among multiple workstations, thereby resulting in a significant speedup of the synthesis process.
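The dependency-driven rebuild described above can be illustrated with a short Python sketch. It is not the in-house program, whose details are not given here; the module names in DEPS, the function names, and the bottom-up synthesis order are all assumptions made for illustration. Given which modules instantiate which, it finds every module that must be resynthesized after an edit and arranges them into batches whose members do not depend on one another, so each batch could be spread across several workstations.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical hierarchy: module -> submodules it instantiates.
DEPS = {
    "chip": ["midi", "pitch", "effects"],
    "effects": ["reverb", "chorus"],
    "pitch": ["interp"],
    "midi": [], "reverb": [], "chorus": [], "interp": [],
}

def stale_modules(deps, changed):
    """Return the changed modules plus every ancestor that instantiates them."""
    parents = {}
    for mod, subs in deps.items():
        for sub in subs:
            parents.setdefault(sub, set()).add(mod)
    stale, stack = set(), list(changed)
    while stack:
        mod = stack.pop()
        if mod not in stale:
            stale.add(mod)
            stack.extend(parents.get(mod, ()))
    return stale

def synthesis_batches(deps, stale):
    """Order stale modules bottom-up; modules in one batch are independent."""
    sorter = TopologicalSorter(deps)  # submodules are treated as predecessors
    sorter.prepare()
    batches = []
    while sorter.is_active():
        ready = sorter.get_ready()
        batch = sorted(m for m in ready if m in stale)
        if batch:
            batches.append(batch)
        sorter.done(*ready)
    return batches

if __name__ == "__main__":
    stale = stale_modules(DEPS, {"reverb", "interp"})
    for number, batch in enumerate(synthesis_batches(DEPS, stale), start=1):
        print(f"batch {number}: {', '.join(batch)}")
```

In a make-based flow the same information would appear as prerequisites in the makefile, with make's timestamp comparison deciding which targets are out of date.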

Back-end design flow

At the back end, we performed gate-level simulation using Cadence Design Systems' Verilog-XL and SDF generated from Design Compiler (see Figure 1). The Arcadia tool from Epic Design Technologies was used for full-chip postlayout lumped-parasitic extraction. The load data was then converted into a Synopsys set_load script and back-annotated into Design Compiler. The SDF generated from Design Compiler was then used for postlayout simulation. Simultaneously, full-chip transistor-level simulations were performed using Epic's TimeMill.
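As a rough illustration of the load-conversion step, the Python sketch below turns a table of lumped net capacitances into set_load commands. The input format (one "net capacitance" pair per line, with "#" comments) is hypothetical and is not Arcadia's actual output, the capacitance is assumed to be in the library's units, and the commands are written in the Tcl-style form with get_nets; a 1997 flow would likely have used the older dc_shell syntax.

```python
def parse_lumped_caps(text):
    """Parse hypothetical 'net capacitance' lines into a dict, summing repeats."""
    caps = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and blank lines
        if not line:
            continue
        net, cap = line.split()
        caps[net] = caps.get(net, 0.0) + float(cap)
    return caps

def to_set_load_script(caps):
    """Emit one set_load command per net, sorted by net name."""
    lines = [
        f"set_load {cap:.4f} [get_nets {{{net}}}]"  # braces protect [ ] in names
        for net, cap in sorted(caps.items())
    ]
    return "\n".join(lines) + "\n"

if __name__ == "__main__":
    # Net names and values are made up for the example.
    extracted = """
    # net            cap
    u_fx/acc[3]      0.125
    clk_int          0.5
    """
    print(to_set_load_script(parse_lumped_caps(extracted)), end="")
```

Sourcing the generated script before writing SDF lets the delay calculation use postlayout loads in place of wire-load estimates.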

With the use of innovative architectures and algorithms, the Austin designers reduced the timing criticality to the point that full-chip RC extraction was deemed unnecessary; extracting capacitance only shortened the verification cycle significantly. Clock skew analysis, however, must be very stringent, so RC extraction was carried out on the clock nets, again using Arcadia. The resulting data was then fed to the Ultima Delay Calculator from Ultima Interconnect Technology for clock skew analysis. The tool lists the delays and other information for each clock buffer, and clock skew numbers were generated by postprocessing those results. In this manner, the verification was completed in a very short time without sacrificing any accuracy.
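The postprocessing step can be pictured with a small Python sketch. It assumes, purely for illustration, that the per-buffer report can be reduced to a table giving each clock buffer's driver and its stage delay; the actual report format is not described here, and the buffer names and delays below are invented. Skew is taken as the difference between the largest and smallest accumulated delay at the leaf buffers.

```python
def insertion_delay(buffers, name):
    """Sum stage delays from a buffer back up to the clock root."""
    total = 0.0
    while name is not None:
        parent, delay = buffers[name]
        total += delay
        name = parent
    return total

def clock_skew(buffers):
    """Return (skew, arrivals) over all leaf buffers of the clock tree."""
    drivers = {parent for parent, _ in buffers.values() if parent is not None}
    leaves = [name for name in buffers if name not in drivers]
    arrivals = {name: insertion_delay(buffers, name) for name in leaves}
    return max(arrivals.values()) - min(arrivals.values()), arrivals

if __name__ == "__main__":
    # Hypothetical tree: buffer -> (driving buffer, stage delay in ns).
    tree = {
        "root": (None, 0.5),
        "b1": ("root", 0.25),
        "b2": ("root", 0.5),
        "leaf1": ("b1", 0.25),
        "leaf2": ("b2", 0.25),
    }
    skew, arrivals = clock_skew(tree)
    for leaf, t in sorted(arrivals.items()):
        print(f"{leaf}: {t} ns")
    print(f"skew: {skew} ns")
```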

The biggest problems were encountered during floorplanning and placement and routing. Bach has a huge sample ROM that takes up almost half the chip area, leaving a narrow reverse-L-shaped area for the standard cells (see Figure 2). This shape was not conducive to a three-layer metal ASIC process because of the inherent asymmetry in metal usage: one arm had two metal layers in the dominant direction, whereas the other had only one (metal 2). The design also had a large number of intermodule connections, which resulted in many long wires traversing the two arms.

The large number of macros, including memories, exacerbated the latter problem and caused the chip to be very routing-intensive. However, because of the shape-related problem, it was difficult to create enough routing resources to take care of all the requirements in one arm of the L, and congestion occurred in the regions traversed by the long wires. Careful floorplanning was therefore critical, leading us to use Cadence's Preview.


Figure 2. This floorplan revealed routing congestion problems caused by the high density of the standard cells and the long connections between the arms of the L.

During this stage, a datapath compiler developed at Cirrus Logic was used to automate datapath design and preplacement. The tool works either at the behavioral level (using function calls) or at the RT level (instantiating generic components) and is capable of generating a placement constraints file that can be entered into floorplanning and place-and-route tools. Constraints and directives were provided to the datapath compiler through a setup file.

The methodology for using this tool is closely tied to the Designware libraries designed for Design Compiler. Directives to Design Compiler are included in the source HDL code to choose among different implementations of prebuilt Designware parameterizable blocks like adders and multiplexers. These blocks, written in VHDL, can be targeted to any Cirrus Logic ASIC process library while compiling in Design Compiler. They also contain encoded placement information that can be used by the datapath compiler. After synthesis, we generated an EDIF netlist and used the in-house compiler to produce a placement constraints file, which was later read into Preview for floorplanning.

To preplace or to group?

The placement information from the datapath compiler affected a very large percentage of the standard cells. Further, the compiler runs much faster than commercially available timing-driven placement tools. Consequently, two approaches were possible. One involves preplacing the logic and restricting the place-and-route tool (Cadence's Cell3) to work around it. The other approach is to "group" the logic during actual placement and routing so that it remains together, rather than being spread out across the available chip area.

For Bach, the effects processor, the sample rate converter, and the envelope generator all had numerous datapath elements. This meant that these datapaths might need to be preplaced, as described in the previous section, to meet the timing requirements. On the other hand, much is gained by allowing the P&R tool more freedom: grouping elements on the basis of the information from the datapath compiler rather than having it place the elements rigidly. As mentioned earlier, the fast cycle from design to floorplan enabled designers to reduce the timing criticality, so that the latter choice became preferable.
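The difference between the two approaches can be sketched in Python. The data layout here is an assumption, not the datapath compiler's file format: each cell carries a group name and a suggested location, and the cell and group names are invented. Preplacement would pass those coordinates to the place-and-route tool as fixed locations, whereas grouping, in one possible reading, keeps only a loose region per group and leaves the arrangement inside it to the tool.

```python
def group_regions(preplaced, margin=0.0):
    """Collapse per-cell locations into one bounding region per group.

    preplaced maps cell name -> (group, x, y); the result maps
    group -> (x_min, y_min, x_max, y_max), widened by `margin` on each side.
    """
    boxes = {}
    for group, x, y in preplaced.values():
        x0, y0, x1, y1 = boxes.get(group, (x, y, x, y))
        boxes[group] = (min(x0, x), min(y0, y), max(x1, x), max(y1, y))
    return {
        group: (x0 - margin, y0 - margin, x1 + margin, y1 + margin)
        for group, (x0, y0, x1, y1) in boxes.items()
    }

if __name__ == "__main__":
    # Hypothetical datapath cells with suggested locations in microns.
    cells = {
        "add0": ("effects_dp", 0, 0),
        "add1": ("effects_dp", 0, 10),
        "mux0": ("envelope_dp", 50, 5),
    }
    for group, box in sorted(group_regions(cells, margin=5).items()):
        print(group, box)
```

A larger margin gives the placement tool more room to relieve congestion, at the cost of looser control over wire lengths within the group.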

The experimentation performed with the datapaths gave another reason for preferring grouping over rigid placement. The vertical arm of the reverse L shape was completely blocked by a huge datapath in the effects processor that used up routing resources required by the memories. This problem is inevitable in a design that involves large macros and tight die size requirements. It makes routing a very tricky task and calls for very imaginative floorplanning.

Digitizing sound for compact devices
Crystal's CS9236, or "Bach," chip is a 32-voice MIDI (Musical Instrument Digital Interface) wavetable synthesizer that features full voice, effects, and pitch programming. To accomplish these functions on a single chip, the method used to store the digital representation of the analog audio signals needed modification. The biggest issue Crystal confronted was how to minimize the chip's memory requirements.

The sample ROM block on a wavetable synthesizer chip consists of audio data that has been converted into digital signals. The synthesizer plays back those signals in specific sequences to produce sounds. The signal data typically require 1 to 4 Mbytes of ROM, which would easily fill three to four chips.

To reduce the size of the sample ROM block, the designers at Crystal broke down each sound into its fundamental components. Then they compressed the resulting data to fit on a single chip by eliminating the frequencies in each sound that weren't essential to produce the sound. In this way the designers made room for the other necessary functional blocks without sacrificing functionality.

The Bach chip interfaces to the system through an external MIDI input stream--such as a computer keyboard--and uses an external digital-to-analog converter (DAC) to convert the digitized sounds into analog signals. There are no extensive bus or memory requirements. The reduced size and low cost of the Bach chip make it suitable for embedded systems like sound cards in desktop computers, keyboards, and karaoke machines.

-- Lori Maupas

Eventually, the grouping technique in the floorplanning environment, with input from the datapath compiler, was used to localize the critical blocks, and the placement tool was given the freedom to select the best strategy for placing the cells within the groups. The design was thus kept loose, leaving room for corrections and future enhancements. Although better from a timing perspective, the datapath preplacement approach would have restricted routing resources to the point where adding gates and routing would have been almost impossible without increasing the die size further.

The exercise resulted in several improvements to the design that saved time and gates and gave the engineers a very good idea of how and where automatic preplacement should be used, so that trying to optimize the datapaths was well worth the effort.

Acknowledgments

The authors would like to thank all those who contributed to the success of this project. Murali Munnuswamy of Crystal Semiconductor deserves special thanks for bridging the gap between the design team in Austin and the methodology team in Fremont.

Shekhar Patkar is a design engineer in the R&D Department at Cirrus Logic Inc. (Fremont, Calif.).

Pran Kurup is an R&D project manager at Cirrus Logic. He is the co-author of Logic Synthesis Using Synopsys, 2nd edition.

To voice an opinion on this or any Integrated System Design article, please e-mail your message to miker@isdmag.com.


Integrated System Design, August 1997



Copyright © 1997 Integrated System Design Magazine
