Verilog Simulation Bridges the Gap Between PLDs and ASICs

Designers used a Verilog simulator and an ASIC design flow to create a PLD that would emulate an ASIC, enabling faster and more accurate verification and testing.

by Allen Vexler

When we design a programmable logic device (PLD), the problems we face look remarkably similar to those we must solve when we design an ASIC. These days, device complexity, length of development cycles, and short delivery deadlines are surprisingly similar for the two technologies. At System Design Group (SDG) in San Diego, we develop both ASICs and programmable logic devices.

SDG recently met a design challenge that required us to bridge the gap between PLDs and ASICs. One of our long-time clients asked us to create a PLD that could quickly stand in for an ASIC that would need months to fabricate. Working as part of the customer's design team, we were asked to complete the design in three months, starting with the day we received the detailed specification and ending with the day we handed over the deliverables.

To meet this challenge, SDG team members chose the Verilog hardware description language and the Modelsim Verilog simulator from Model Technology. Our PLD design methodology was identical to the approach we generally use for ASIC designs, a natural choice because the PLD was to become an ASIC. Many other designers are making the same methodology choice, even when their programmable design isn't destined for migration to ASICs.

Our marching orders were to implement the functionality on a programmable device that could serve for ASIC emulation--keeping in mind the three-month deadline and the need to move the design immediately into ASIC fabrication. Although the requirement may sound vague, we've worked successfully with this customer for more than five years; they have come to consider us an integral part of their design team--there's a high level of trust between our two organizations. Consequently, they feel comfortable turning us loose with a detailed functional spec and waiting for us to give them the results. In this case, we delivered an Altera 10K70 intended to emulate a portion of a design that would become an ASIC destined for use in a high-speed computer peripheral.

The customer's primary goal was to add functionality to an existing ASIC design--placing all the new logic on one chip, effectively reducing component count on the board and offloading several compute-intensive tasks from the microprocessor. The customer wanted to start with a functionally equivalent PLD knowing that SDG could turn it around quickly and get a prototype up and running, ready for developers to begin testing code. The actual ASIC fabrication would take several months longer; the client didn't want to wait that long for product development.
Figure 1 - The digital servos
The digital control loop implemented is a second-order loop. The circuit receives velocity input from an encoder circuit and generates a pulse-width-modulated (PWM) output, which in turn drives a motor. The system performs error checking at selected points within the logic to determine if any of the resultant values are outside of predetermined bounds.

In addition, verifying the device's functionality in a PLD was a fast, efficient way to test the design. It relieved some of the pressure to design extensive test benches and increased the chances that the ASIC would work correctly on its first spin. Emulating an ASIC with a PLD also allowed the engineer to bring internal signals to the outside of the device for capture with a logic analyzer. This useful feature enabled us to capture real-time debugging information that would be difficult to obtain from a simulator, either because of the large number of vectors required or because the vectors that cause failure are unknown. Bringing test signals to the outside world is better suited to finding logic faults than timing bugs: the additional logic required to route a signal out doesn't interfere with the chip logic, but it does alter the chip timing.

Self-servo

The digital control loop that we implemented was a second-order loop. The circuit receives velocity input from an encoder circuit and generates a pulse-width-modulated (PWM) output, which in turn drives a motor (see Figure 1). The value of the PWM output is a function of both the current velocity and the previous velocity errors, where each error is the difference between the current and target velocities. The 16-bit encoder count arrives at the input on the int_count lines. The two least significant bits are dropped because only 14 bits of resolution are necessary. The difference between this value and the tar_count value yields a 9-bit error value called period_err. The period_err value then passes through a programmable averager; the programmer uses register settings to select the number of values to be averaged.
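As a rough illustration of the front end described above, the following Python sketch models the error computation and the programmable averager. It is a behavioral model under stated assumptions, not the original Verilog: the article does not say how the 14-bit difference is reduced to 9 bits (saturation is assumed here), which operand is subtracted from which, or how the averaging depth is encoded (a power-of-two depth is assumed). The names period_error, ProgrammableAverager, and log2_depth are hypothetical.

```python
from collections import deque


def period_error(int_count, tar_count, err_bits=9):
    """Drop the two LSBs of the 16-bit encoder count, subtract the target,
    and limit the result to a signed err_bits-wide value."""
    count14 = (int_count & 0xFFFF) >> 2      # 16-bit count -> 14 bits of resolution
    diff = count14 - tar_count               # assumption: count minus target
    lo = -(1 << (err_bits - 1))              # -256 for 9 bits
    hi = (1 << (err_bits - 1)) - 1           # +255 for 9 bits
    return max(lo, min(hi, diff))            # assumption: saturate rather than wrap


class ProgrammableAverager:
    """Moving average over 2**log2_depth samples (depth set by a 'register')."""

    def __init__(self, log2_depth):
        self.log2_depth = log2_depth
        depth = 1 << log2_depth
        self.window = deque([0] * depth, maxlen=depth)

    def update(self, period_err):
        self.window.append(period_err)       # oldest sample falls out automatically
        # Dividing by a power of two is a plain shift in hardware.
        return sum(self.window) >> self.log2_depth
```

Restricting the depth to powers of two keeps the divide out of the logic entirely, which matters when the target device is short on gates.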

The circuit originally required two separate multipliers and shifters to process both the filt_err and err_accum values simultaneously. However, the resources required to implement two separate multipliers and barrel shifters would have overflowed the selected PLD. In order to circumvent this problem, we implemented one multiplier and barrel shifter and then multiplexed the two inputs into them at different times. The overall throughput of the system decreased, but the customer was satisfied nonetheless.

Our workaround therefore consisted of several steps. First, we multiplexed the filt_err value into the multiplier along with gv_mult and sent the result to the barrel shifter. The shift_vout value determined the number of bits by which the multiplier output was shifted. That intermediate value was latched in the gv_rslt register. Next, we multiplexed the error accumulation value, err_accum, into the multiplier along with gp_mult, then shifted the product by the amount determined by shift_pout. This value was summed in a three-input adder along with the previously latched value in the gv_rslt register and the current PWM output. The result was then latched as the new PWM output. At the same time, the err_accum register was updated. The system performed error checking at selected points within the logic to determine whether any of the resultant values lay outside predetermined bounds. If an error occurred, control logic caused the incorrect PWM output to revert to a predefined nominal value.
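The two-pass use of one multiplier and one barrel shifter can be sketched behaviorally in Python. The signal names filt_err, err_accum, gv_mult, gp_mult, shift_vout, shift_pout, and gv_rslt come from the text; everything else is an assumption made for illustration: the shift direction (right), the accumulator update rule (adding filt_err), the bounds check being applied only to the final PWM value, and the cfg dictionary with its pwm_min, pwm_max, and pwm_nominal keys.

```python
def shared_multiply_shift(operand, gain, shift):
    """Model of the single multiplier feeding the single barrel shifter."""
    return (operand * gain) >> shift         # assumption: arithmetic right shift


def servo_update(filt_err, err_accum, pwm, cfg):
    """One loop iteration, using the shared hardware twice in sequence."""
    # Pass 1: velocity term; the result is latched in gv_rslt.
    gv_rslt = shared_multiply_shift(filt_err, cfg["gv_mult"], cfg["shift_vout"])

    # Pass 2: accumulated-error term through the same multiplier and shifter.
    gp_rslt = shared_multiply_shift(err_accum, cfg["gp_mult"], cfg["shift_pout"])

    # Three-input adder: latched pass-1 result, pass-2 result, current PWM.
    new_pwm = gv_rslt + gp_rslt + pwm

    # Assumption: the accumulator integrates the filtered error.
    new_accum = err_accum + filt_err

    # Error check: fall back to the nominal value when out of bounds.
    if not (cfg["pwm_min"] <= new_pwm <= cfg["pwm_max"]):
        new_pwm = cfg["pwm_nominal"]

    return new_pwm, new_accum
```

The sequential calls to shared_multiply_shift mirror the throughput cost mentioned earlier: two products per update now take two passes through the same logic instead of one pass through duplicated logic.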

Figure 2 - PLD design process
The PLD design process we used resembled the current standard for ASIC design. Note that the resimulation and gate-level simulation steps were necessary only if we had run into a problem with the PLD.

Decisions, decisions

SDG chose Verilog HDL for one simple reason: Our customer's ASIC design flow was already in Verilog. Since the PLD was going to be implemented as an ASIC and the code had to be transportable, this approach made the most sense. To serve as our HDL simulator, SDG chose Modelsim because it offered a good price/performance ratio.

In attempting to move a PLD onto an ASIC, our biggest challenge lay in writing code that we could easily transport and eventually use in ASIC production. Successful emulation depended on how fast we could make the PLD run and how many gates we could pack onto the chip. Therefore, the ultimate portability of the design was a direct result of how effectively we were able to synthesize Verilog code for the design that would function in both environments.

Eventually the trade-offs balanced out. Because programmable devices aren't as dense as the ASICs they emulate, we might have needed additional parts for equivalent functionality, a situation that would have caused the system to operate more slowly. PLDs are, in general, slower than the ASICs they emulate. But if we had implemented logic using device-specific features to enhance the PLD run speed, our code would have lost some of its portability.

Speed, density, and state machines

Our first design consideration was speed. In this case, the technical challenges stemmed from implementing high-speed arithmetic functions in a minimal amount of logic. The design required lean, well-disciplined code that could optimize speed and minimize the amount of area consumed. We also knew that we would have to sacrifice a certain amount of portability if we wanted to emulate the speed of an ASIC on a single programmable logic device.

Typically, ASIC emulation requires multiple PLDs, which are inherently less dense than ASICs. In this case, however, the high speeds demanded that one PLD contain all of the design requirements--going off-chip would have slowed the system below acceptable levels. Our most difficult technical challenge, therefore, lay in implementing high-speed arithmetic functions in a minimal amount of logic.

In spite of the need for tight code, we had to make design trade-offs to reach targeted speeds, an example being the state machine design. In general, a binary-coded state machine represents the most easily transportable structure. However, we couldn't work with a binary state machine because of the slow speed associated with that strategy.

In fact, PLD state machines are usually implemented as the one-hot type to make them run faster. This increased speed usually comes, however, at the expense of many additional flip-flops. In contrast, ASICs usually use binary-coded state machines to reduce logic requirements. Our decision was challenging, therefore, because the state machine portion of the PLD design would need to change as we migrated the design to an ASIC. In this case, the synthesized Verilog code made the jump seamlessly, but the customer needed to spend additional time simulating this portion during the conversion to an ASIC.
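The flip-flop trade-off between the two encodings is easy to see with a small Python sketch. The state names below are hypothetical and are not taken from the actual design; the functions simply assign codes and show how many state bits (flip-flops) each style needs.

```python
def binary_encoding(states):
    """Dense encoding: ceil(log2(N)) state bits."""
    width = max(1, (len(states) - 1).bit_length())
    return {s: format(i, f"0{width}b") for i, s in enumerate(states)}


def one_hot_encoding(states):
    """One flip-flop per state; exactly one bit is set at any time."""
    n = len(states)
    return {s: format(1 << i, f"0{n}b") for i, s in enumerate(states)}


# Hypothetical state list, for illustration only.
STATES = ["IDLE", "LOAD_V", "MULT_V", "LOAD_P", "MULT_P", "SUM"]

binary = binary_encoding(STATES)     # 3 flip-flops for 6 states
one_hot = one_hot_encoding(STATES)   # 6 flip-flops for 6 states
```

With one-hot codes, recognizing a state means looking at a single bit, so the next-state logic tends to be shallow; with binary codes, every state test decodes all the state bits, which saves flip-flops at the cost of deeper logic.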

The choice of multipliers in the PLD also caused concern. We had the option to describe the multiplier in the PLD exactly as we would have described it in the ASIC, thus allowing for 100-percent transportability. However, the resulting increase in PLD gate count would have been unacceptable, as would the accompanying decrease in speed. We therefore decided to use PLD-specific functions that didn't appear in the ASIC and to perform additional simulation when converting to the ASIC. We used RAM in the Altera PLD to implement the multipliers and, using the embedded array block (EAB) cells as lookup tables, significantly reduced the amount of logic required. In this design, we implemented a 16 x 8 multiplier using eight EABs, each of which implemented a 4 x 4 multiplier. When the customer reached the ASIC portion of the design, they again had to rely on clean Verilog code and rigorous simulation to support synthesis because they needed to convert various PLD structures to the ASIC form. Nevertheless, because speed demands were driving the design, we needed to use on-chip RAM to optimize the PLD speed. This choice followed the general rule of thumb for PLD-to-ASIC portability: memory blocks, even small ones, should be implemented in on-chip RAM rather than flip-flops.
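The arithmetic behind "eight 4 x 4 multipliers make one 16 x 8 multiplier" can be checked with a short Python model. A 4 x 4 product has 8 address bits and an 8-bit result, so it fits a 256-entry, 8-bit-wide lookup table. The 16-bit operand splits into four nibbles and the 8-bit operand into two, giving eight partial products. This sketch assumes unsigned operands and sums the partial products in a loop; how the original design handled signs and combined the partial products is not stated in the article. The names LUT_4X4, mul4x4, and mul16x8 are hypothetical.

```python
# 256 x 8 lookup table: address = {a[3:0], b[3:0]}, data = a * b (at most 225).
LUT_4X4 = [(addr >> 4) * (addr & 0xF) for addr in range(256)]


def mul4x4(a, b):
    """4 x 4 unsigned multiply done purely by table lookup."""
    return LUT_4X4[((a & 0xF) << 4) | (b & 0xF)]


def mul16x8(a, b):
    """16 x 8 unsigned multiply built from eight 4 x 4 table lookups."""
    total = 0
    for i in range(4):                       # four nibbles of the 16-bit operand
        for j in range(2):                   # two nibbles of the 8-bit operand
            a_nib = (a >> (4 * i)) & 0xF
            b_nib = (b >> (4 * j)) & 0xF
            # Each partial product is weighted by the position of its nibbles.
            total += mul4x4(a_nib, b_nib) << (4 * (i + j))
    return total
```

In hardware the eight lookups would happen in parallel, one per memory block, leaving only the shifted additions to be built from general-purpose logic.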

Other portability issues we faced included multiple clock frequencies, clock switching, pinout incompatibility, and partitioning. To run state machines at different frequencies, the ASIC designer may want to employ various clock frequencies derived from a master clock. This is easy to achieve in an ASIC, but we encountered problems in the PLD, where static timing analysis tools and the general mechanics of dividing a clock down and then distributing it throughout a chip made multiple clock frequencies difficult. Instead, we generally tried to run all the state machines at one frequency and then use the lower-frequency clock as an enable for the state machines.
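The clock-enable approach can be modeled cycle by cycle in Python. In this sketch every element is clocked by the one master clock; a divider produces a one-cycle enable pulse every divide_by cycles, and the state machine advances only when that pulse is high. The class names, the simple wrap-around state sequence, and the run helper are all hypothetical and stand in for whatever the real state machines did.

```python
class EnableDivider:
    """Produces a one-cycle enable pulse every divide_by master clocks."""

    def __init__(self, divide_by):
        self.divide_by = divide_by
        self.count = 0

    def tick(self):
        enable = self.count == self.divide_by - 1
        self.count = 0 if enable else self.count + 1
        return enable


class EnabledStateMachine:
    """Clocked on every master clock edge, but advances only when enabled."""

    def __init__(self, num_states):
        self.num_states = num_states
        self.state = 0

    def tick(self, enable):
        if enable:
            self.state = (self.state + 1) % self.num_states
        return self.state


def run(master_cycles, divide_by, num_states):
    """Return the state observed after each master clock edge."""
    divider = EnableDivider(divide_by)
    fsm = EnabledStateMachine(num_states)
    return [fsm.tick(divider.tick()) for _ in range(master_cycles)]
```

For example, run(8, 4, 3) advances the state once every four master cycles. Because there is only one clock domain, the timing analysis only has to consider the master clock.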

Sometimes the designer of an ASIC will include the capability to switch off a clock in order to minimize power consumption. Because clock switching presents the same static timing analysis problems as multiple clock frequencies do, we found it very difficult to accomplish in a PLD.

Additionally, the PLD device package and pinouts typically don't match up with the desired ASIC pinout; we had to build a special board to accommodate the PLD-ASIC emulator. Similarly, if our ASIC hadn't fit into one PLD, we would have had to spend a considerable amount of time partitioning the functional blocks into multiple PLDs. In partitioning, designers must work within numerous constraints: keep blocks that share many signal lines within the same PLD, which minimizes the number of I/O pins required if pins are at a premium; keep blocks that run at high speed and share signal lines within the same PLD to minimize delays due to I/O buffers; and make sure that the module outputs that connect multiple PLDs are registered (synchronous) instead of combinatorial (asynchronous).

Demanding simulation speed

Finally, the performance of the simulator determined our ability to meet the three-month delivery date; simulation speed was critical. In fact, the speed of our simulator governed our performance at every phase of the design that required simulation (see Figure 2).

Simulation performance became progressively more important as we moved through the design process--nearly an order of magnitude more critical at the gate level. Because of the significantly slower simulation times, we performed the gate-level simulation of the FPGA only when absolutely necessary--for instance, when we were trying to track down a logic or timing problem that RTL simulation or a static timing analyzer couldn't uncover. The tool consistently met the team's expectations, especially at the gate level where simulation can slow considerably.

After all the complexities of creating this design, the end of the story was surprisingly unremarkable. There were no long hours, no gallons of coffee, no red eyes. In fact, we endured no last-minute design crisis at all. We delivered the code on time and the PLD did just what it was supposed to do. There wasn't anything remarkable about the delivery and debug of the resultant ASIC either. The project met our simple metric for creating happy customers--meeting deliverables, coming in on time, and staying under budget.


Allen Vexler cofounded System Design Group in 1993 and continues to serve as its chief technical officer. He spent the first seven years of his career working at Hewlett-Packard in Rancho Bernardo, Calif., followed by five years in the consulting business.

Brian Sinofsky and Brian Walkington contributed to this article.

To voice an opinion on this or any other article in Integrated System Design, please e-mail your comments to jeff@isdmag.com.


Copyright © 2000 Integrated System Design Magazine
