United Business Media EE Times



Reconfigurable Prototyping Speeds Hardware/Software Integration

By integrating all aspects of a new tape drive design before fabricating ASICs, the design team saved time and produced code that worked nearly flawlessly on the final hardware.

By Vince Dugar, Ben Baron, and Randy Fout


With many design variables involved and a short time-to-market window, our team of hardware and software developers at Storagetek Corp. (Louisville, Colo.) faced an especially challenging design task for the company's 9840 tape drive. The design variables included a new tape format that would boost storage capacity and reduce access time, thus requiring the development of new ASICs, extensively revised software, and a new transport mechanism.

Since software simulation alone can't execute enough cycles in a reasonable time, we were initially concerned about verification. Another problem, and an even greater limitation, related to the drive's asynchronous operation. Because simulation is inherently synchronous, it would allow us to test neither the system's response to asynchronous events such as error-handling responses nor the operation of multiple asynchronous data channels.

Further complicating the verification effort was the requirement for real-time testing. To maximize performance, the thin-film tape head on the 9840 is tuned to a specific read/write speed. Running the drive at a lower speed would require detuning the head and would not sufficiently test the interaction of the electronics and software with the transport mechanism.

Therefore, we needed to build a prototype that implemented all of the drive's electronics, software, and transport hardware in real time, and to do so before spinning ASICs so that we could keep time to market short. The solution to our problem came in the form of a reconfigurable rapid-prototyping system that combined FPGAs for logic emulation with off-the-shelf components such as DRAMs. Developed by Aptix Corp. (San Jose, Calif.), this technology allowed us to integrate the software and the logic elements gradually as various team members finished their designs.

Anatomy of a design process

The 9840 tape drive targets applications ranging from financial institutions and broadcasters (video servers) to general-purpose storage area networks (SANs). These drives are usually combined with an automated tape-library system that rapidly loads and unloads cartridges on demand. To increase storage capacity over that of previous drives, the 9840 uses a proprietary tape format that packs 20 GBytes of data onto a standard 1/2-inch cartridge. Compressing the data (using on-board compression hardware) typically increases capacity to 80 GBytes. Throughput is also critical in a drive of this type, and the 9840 moves compressed data at rates as high as 20 MBytes per second. To offer better random access to data, the 9840 uses serpentine track recording and step-head technology, which moves the head up and down to access the tape's multiple data wraps.
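The capacity and throughput figures above imply a couple of derived numbers. The calculation below is a back-of-the-envelope sketch, not an analysis from the design team. It assumes decimal units (1 GByte = 1000 MBytes), reads the 20-MByte/s figure as the rate at which compressible host data is absorbed, and assumes that peak rate is sustained for a whole cartridge, which a real workload would not guarantee.

```python
# Figures quoted in the article.
NATIVE_GBYTES = 20        # uncompressed cartridge capacity
COMPRESSED_GBYTES = 80    # typical capacity with on-board compression
PEAK_MBYTES_PER_S = 20    # peak rate quoted for compressed data

# Assumption (not from the article): decimal units.
MBYTES_PER_GBYTE = 1000


def compression_ratio(native_gb: float, compressed_gb: float) -> float:
    """Effective compression ratio implied by the two capacity figures."""
    return compressed_gb / native_gb


def fill_time_minutes(capacity_gb: float, rate_mb_per_s: float) -> float:
    """Minutes needed to stream capacity_gb at a constant rate."""
    seconds = capacity_gb * MBYTES_PER_GBYTE / rate_mb_per_s
    return seconds / 60


ratio = compression_ratio(NATIVE_GBYTES, COMPRESSED_GBYTES)
minutes = fill_time_minutes(COMPRESSED_GBYTES, PEAK_MBYTES_PER_S)
print(f"implied compression ratio: {ratio:.0f}:1")
print(f"time to stream a full cartridge at peak rate: {minutes:.0f} minutes")
```

Under these assumptions the typical compression ratio works out to 4:1, and streaming a full cartridge at the peak rate takes a little over an hour.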

Figure 1 - Device-side formatter ASIC
On a write operation, the ASIC converts parallel data from the data buffer to multiple channels of encoded serial data, including error correction (ECC), that are then written to the tape media. On a read operation, the ASIC reformats the data into parallel form for movement to the data buffer.

The logic design for the 9840 includes about 300,000 gates that are implemented in a data-formatter ASIC. Along with coordinating the movement of data with the transport servo control, this ASIC performs on-the-fly data formatting between the read/write subsystem and the drive's data buffer (see Figure 1).
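To make the formatter's role concrete, here is a toy model of the parallel-to-serial conversion described for Figure 1. It is only an illustration under stated assumptions: the 9840's tape format is proprietary, so the round-robin byte striping, the channel count of four, and the MSB-first bit order are all hypothetical, and the channel encoding and ECC that the real ASIC adds are omitted entirely.

```python
from typing import List


def stripe(data: bytes, channels: int) -> List[bytes]:
    """Write path, step 1: deal buffer bytes round-robin across channels."""
    return [data[i::channels] for i in range(channels)]


def to_bits(chunk: bytes) -> List[int]:
    """Write path, step 2: serialize one channel's bytes, MSB first."""
    return [(byte >> shift) & 1 for byte in chunk for shift in range(7, -1, -1)]


def from_bits(bits: List[int]) -> bytes:
    """Read path, step 1: reassemble a serial bit stream into bytes."""
    out = bytearray()
    for start in range(0, len(bits), 8):
        byte = 0
        for bit in bits[start:start + 8]:
            byte = (byte << 1) | bit
        out.append(byte)
    return bytes(out)


def unstripe(chunks: List[bytes]) -> bytes:
    """Read path, step 2: interleave channel bytes back into buffer order."""
    out = bytearray(sum(len(chunk) for chunk in chunks))
    for i, chunk in enumerate(chunks):
        out[i::len(chunks)] = chunk
    return bytes(out)


CHANNELS = 4  # hypothetical channel count, chosen only for the example
buffer_data = b"9840 tape drive"

serial_streams = [to_bits(chunk) for chunk in stripe(buffer_data, CHANNELS)]
recovered = unstripe([from_bits(bits) for bits in serial_streams])
print(recovered == buffer_data)
```

The round trip shows the shape of the problem the ASIC solves in hardware: one wide data path on the buffer side and several independent serial streams on the tape side, converted in both directions on the fly.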

The software for the 9840 evolved from code carried over from a previous tape drive model. Many changes were needed, however, so the project required a team of 10 software developers. The changes included accommodations for the new logic design and for the new tape format, with its new data arrangement, block sizes, and error-correction code (ECC) format. The software team wrote code to the I/O specification published by the logic group, which meant interfacing with the device-side formatter chip.

To simulate or to prototype?

Choosing between software simulation and hardware prototyping as our primary verification method was difficult only because we were unsure whether prototyping tools could achieve real-time operation. Certainly the limitations of software simulation were clear. In addition to the difficulties of simulating asynchronous functions mentioned earlier, we would face the time-consuming problem of creating accurate software simulation models for all the subsystems with which the ASICs interact. We needed to simulate enough to find basic logic errors and satisfy our ASIC vendor, but we also wanted to avoid the effort and limitations involved in full-scale software simulation of the entire system.

Since we could target our Verilog design at either ASICs or FPGAs by using Design Compiler by Synopsys (Mountain View, Calif.), using FPGA-based emulation provided a good basis for a prototype. We had worked with hard-wired prototypes before, so we believed we could save time by exploiting the versatility of a commercial prototyping system.

One choice was to use an emulation system that had a built-in array of FPGAs. However, in addition to being quite expensive, this system imposed some difficult capacity limits on us. When we started the design process (at the beginning of 1997), 300,000 gates was a lot of logic to fit into an FPGA. The highest-capacity FPGA available was the ORCA 2C40 by Lucent Technologies, Inc. (Allentown, Pa.), and the built-in array used even smaller FPGAs. These smaller FPGAs were also slower, so obtaining real-time speed would be more difficult. Further, the expense of this technology made it impractical to build several copies of the prototype for use by the hardware and software groups.

Thus, a logical emulation choice was the Aptix system. Its open architecture allows users to install many types of devices, including an ECC ASIC that we had already fabricated. Using this approach, we took advantage of the capacity and speed of the ORCA 2C40.

Partitioning raises its head in protest

Even so, we typically had to run multiple routes just to coax one to work at our 12-MHz system speed, and we had to partition the logic in our device-side formatter across multiple FPGAs. This partitioning task turned out to be the major challenge of the prototyping effort.

Although the 2C40 is rated as a 40,000-gate device, we found the average usable capacity to be closer to 20,000 gates. The actual utilization level depended on the nature of the logic involved, the number of clever tricks we could use, and the efficiency of the logic synthesis software. I/O limitations were also an issue, especially as the design matured and our I/O needs increased.

Figure 2 - Emulation Hardware Setup
The reconfigurable prototype hardware setup consisted of the prototype drive and emulated ASIC. Two reconfigurable prototyping boards were used - one for the read channel and one for the write channel. The FPGA-based prototyping boards are configured via Aptix software running on Unix workstations, which can also control logic analyzer probing of FPGA boundaries and internal nodes.

It's a good idea to begin the design process with conservative capacity and I/O usage for a given FPGA so that design changes don't lead to a need for repartitioning. This issue was especially important for us because we needed to run the prototype at real-time speeds. We needed to carefully control the timing across the FPGA boundaries from one design change to the next, so we froze the partition definition fairly early on in the design cycle.

Today, partitioning would pose less of an issue. FPGAs have made enormous strides in both usable gate counts and the number of I/Os. A single FPGA can provide 150,000 usable gates and 400 I/Os, so currently two of these devices could conceivably handle our entire logic design. As it was, we needed 18 2C40s to emulate our logic; consequently, finding sensible boundaries along which to divide the logic took a lot of work.
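The device counts in this section can be checked with a rough capacity estimate. The sketch below divides the design's gate count by the usable gates per device; it is a lower bound from gate capacity alone and deliberately ignores the I/O limits and partition boundaries that, in practice, pushed the count higher. The function name is made up for the illustration.

```python
import math


def fpgas_needed(design_gates: int, usable_gates_per_fpga: int) -> int:
    """Lower bound on device count from gate capacity alone (ignores I/O)."""
    return math.ceil(design_gates / usable_gates_per_fpga)


DESIGN_GATES = 300_000   # formatter logic, from the article
ACTUAL_2C40_COUNT = 18   # devices the team actually used

print(fpgas_needed(DESIGN_GATES, 40_000))    # at the 2C40's rated capacity
print(fpgas_needed(DESIGN_GATES, 20_000))    # at the usable capacity observed
print(fpgas_needed(DESIGN_GATES, 150_000))   # with a 150,000-gate device

# Average logic per device implied by the 18 devices actually used.
print(round(DESIGN_GATES / ACTUAL_2C40_COUNT))
```

At the observed 20,000 usable gates, capacity alone calls for 15 devices. The 18 actually used average about 16,700 gates each, which is consistent with leaving headroom for I/O limits and for boundaries that follow the design's structure.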

We had to perform this complex partitioning manually because no automatic partitioning tool was available at the time. Since then, this barrier to simple partitioning has been addressed, too. Aptix has now introduced a gate-level partitioning tool called Logic Aggregater that groups design blocks across multiple FPGAs. Synplicity, Inc. (Sunnyvale, Calif.) has itself introduced a tool called Certify that handles both synthesis and partitioning for FPGAs; this tool uses synthesis estimates of area and connectivity to partition at the RTL level. While we have not used either of these tools, our experience with the difficulties of manual partitioning prompts us to applaud the development of automated support in this area.

Building the prototype

One problem with traditional prototypes is the difficulty of changing interconnects between devices. Aptix prototyping systems solve this problem with the Field Programmable Interconnect Component (FPIC), an SRAM-based device that can connect any two or more of its 936 I/Os (see the sidebar, "A Block-based Approach to SOC Verification"). The connections appear electrically as a passive resistive-capacitive load, with typical delays of 5 to 10 ns for a two-point connection.

A Block-based Approach to SOC Verification
The System Explorer family uses two proprietary technologies: the Field Programmable Circuit Board (FPCB) and the Field Programmable Interconnect Component (FPIC). The first provides programmable interconnect routing between design blocks, which are represented by components (for example, FPGAs) that users plug into "free-hole" areas of the FPCB. Programmable routing is then accomplished by having all the free-hole connection points prerouted to a set of high-pin-count FPIC devices (see the figure).
Starting with system-level structural descriptions and gate-level netlists, users map ASIC custom logic into FPGAs and implement other system elements using DSPs, microprocessors, memories, and interface circuits. The Explorer software spawns place-and-route jobs across multiple licenses of FPGA vendor tools. The software also generates prototype-programming data, configures the prototyping system, and controls the interface to HP logic analyzers for rapid hardware debugging.
The large capacity of today's FPGAs (approximately 150,000 ASIC gates) eases logic partitioning; logic modules are mapped into FPGAs following the design's natural hierarchy. As users map each circuit block to hardware, building up the prototype block-by-block, they verify its function against system-level C models through the Module Verification Platform (MVP) simulator interface. To perform regression testing, they run test benches at hardware speeds. The completed prototypes run at speeds significantly faster than software simulation, enabling in-system evaluation of the circuit against real-world data.

The prototyping system we used was an Aptix System Explorer MP3. This board has three FPIC devices in its center with a matrix of connections extending from the FPIC devices to plug-in areas for other components such as FPGAs. The system thus allows developers to populate the board with any type of FPGA or other device, to program the FPGAs to implement the design's logic, and to program the FPIC devices to implement the design's interconnects.

Because this technology makes it possible to change other aspects of the design under programmable control, we were able to build our prototype piece by piece, as the parts of the logic became available from team members. This incremental build-up simplified debugging because it allowed us to test different parts of the logic independently. We started by testing the logic required to "write to tape" and "read back", for example, without actually going to the tape head. Then we expanded the test to include writing and reading to the tape head. As we added functionality to the prototype, we moved the FPGAs around as needed. The gradual build-up of the prototype also allowed us to get our testing underway without waiting for all the pieces of the design to be ready.

On the utility of DRAM

For us, using DRAM chips to represent embedded memory was much more efficient than programming FPGA cells as RAM. DRAM chips can be added to represent the four embedded RAM blocks in the device-side formatter ASIC. The four embedded RAMs, which ranged in size from 0.5 to 8 kbytes, would have consumed about 300,000 gates in 18 2C40 FPGAs, doubling the number of FPGAs needed for our prototype and severely slowing the design flow.

However, we still had to find a place to mount these DRAMs, and we were already using two MP3 prototyping boards to accommodate the 18 FPGAs required to implement our logic. We lacked the room to mount the DRAMs on the main board, and we wanted to avoid the routing delays incurred in running the memory I/O through the FPIC devices. Although the delay through the FPIC devices is small, we were just on the edge of being able to achieve our 12-MHz real-time system speed. Any added delays going to and from memory could have been critical.
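A quick timing budget shows why even a small FPIC delay mattered. The sketch below uses the 12-MHz system speed and the 5-to-10-ns FPIC delay quoted in the article. The assumption that a memory access would cross the FPIC twice within one clock cycle (once going out, once coming back) is ours, made for illustration, and is not a measurement from the project.

```python
def clock_period_ns(freq_mhz: float) -> float:
    """Clock period in nanoseconds for a frequency given in MHz."""
    return 1000.0 / freq_mhz


def budget_fraction(delay_ns: float, freq_mhz: float) -> float:
    """Fraction of one clock period consumed by a given delay."""
    return delay_ns / clock_period_ns(freq_mhz)


SYSTEM_MHZ = 12.0             # real-time system speed, from the article
FPIC_DELAY_NS = (5.0, 10.0)   # typical two-point connection delay, from the article
CROSSINGS = 2                 # assumption: to memory and back in one cycle

period = clock_period_ns(SYSTEM_MHZ)
for hop_ns in FPIC_DELAY_NS:
    round_trip = hop_ns * CROSSINGS
    share = budget_fraction(round_trip, SYSTEM_MHZ)
    print(f"{hop_ns:.0f} ns per crossing: {round_trip:.0f} ns round trip, "
          f"{share:.0%} of the {period:.1f} ns period")
```

Under that assumption, routing memory through the FPIC devices would spend roughly 12 to 24 percent of each 83.3-ns cycle on interconnect alone, before any FPGA logic or DRAM access time is counted.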

We solved both of these problems by mounting DRAM chips directly on top of the FPGA modules. This solution turned out to be straightforward, physically, because the DRAMs plugged into the header pins provided on the FPGA modules for additional prototyping flexibility. At the logic level, we had to fix the locations of the DRAM I/O pins in the FPGA netlists so the I/Os wouldn't change from one iteration to the next. To ensure accurate I/O mapping in this complex environment, we created a database specifying the relationships between the FPGA-module I/Os (the top headers) and FPGA pins. Although fixing the FPGA and DRAM I/Os limited the prototype's flexibility slightly, the method worked well. In the final prototype, one MP3 board contained the circuitry for writing data to tape and another board handled reading the data from tape (see Figure 2). We then had room for all the necessary components and the RAM access speeds supported real-time performance.
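The article does not describe how the I/O database was structured. As a minimal sketch of the idea, the example below maps each top-header pin on an FPGA module to an FPGA package pin, derives the fixed location for every DRAM signal, and flags any netlist iteration that moves one. All module names, pin names, and signal names are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class HeaderPin:
    module: str       # hypothetical FPGA-module name, e.g. "U7"
    header_pin: int   # pin number on the module's top header


# Hypothetical database entries: top-header pin -> FPGA package pin.
PIN_DB: Dict[HeaderPin, str] = {
    HeaderPin("U7", 1): "A12",
    HeaderPin("U7", 2): "B12",
    HeaderPin("U7", 3): "C11",
}

# Hypothetical wiring of a stacked DRAM: signal name -> header pin it plugs into.
DRAM_WIRING: Dict[str, HeaderPin] = {
    "dram_addr0": HeaderPin("U7", 1),
    "dram_data0": HeaderPin("U7", 2),
    "dram_we_n": HeaderPin("U7", 3),
}


def locked_pins(wiring: Dict[str, HeaderPin],
                pin_db: Dict[HeaderPin, str]) -> Dict[str, str]:
    """Signal -> FPGA pin that every netlist iteration must keep."""
    return {signal: pin_db[header] for signal, header in wiring.items()}


def find_violations(netlist_pins: Dict[str, str],
                    locked: Dict[str, str]) -> List[str]:
    """Signals that are missing or placed somewhere other than the locked pin."""
    return sorted(signal for signal, pin in locked.items()
                  if netlist_pins.get(signal) != pin)


locked = locked_pins(DRAM_WIRING, PIN_DB)
new_iteration = {"dram_addr0": "A12", "dram_data0": "B13", "dram_we_n": "C11"}
print(find_violations(new_iteration, locked))
```

A check of this kind, run after each place-and-route iteration, would catch a moved DRAM pin before the new configuration reached the hardware.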

The age of integration

The 9840 development team built six prototypes using the rapid-prototyping system, thus allowing the many designers involved in the project to work in parallel. Four of the prototypes were used specifically for software integration. The system supports the creation of relatively inexpensive replicates of a prototype by providing an EEPROM-based subsystem for on-board storage of the FPGA and FPIC programming data and stand-alone hardware configuration. The MP3 prototyping system allows you to download these programs via a LAN or serial connection, but we found it faster to program EEPROMs and plug them into each of our prototypes. The only real disadvantage to this approach is that it led to questions at times about the revision level of a specific prototype (a concern the latest generation of the prototyping system addresses).

During the hardware/software integration, the logic designers were continually updating the prototypes used by the software group. Despite the ongoing logic revisions, the software developers didn't see too many disruptive changes because the hardware/software interface remained relatively stable. This interface stability is crucial for success. At the same time, the odd logic configurations encountered along the way forced the software to go through branches that wouldn't normally occur, so the code is probably more robust than it would otherwise be.

The C code developed for the 9840 ran on a SPARC processor (later changed to an ARM processor). A software analyzer connected to the main processor bus proved extremely helpful in isolating bugs. The software team didn't require full visibility into the logic interconnects, however, so we saved money on the prototype replicates by eliminating the automatic logic-analyzer interface.

As mentioned earlier, the ability to work with changeable hardware was a continual advantage for the software team. With the hardware and software developers cooperating on finding the best fix for each bug, we developed a better design in less time than usual. In the typical integration process, where the software isn't integrated until the logic is in ASIC form, software developers have to figure out ways to correct for logic errors. By using the prototype-based integration, the logic designers quickly took care of HDL errors revealed by software testing. As a testament to the high quality of the code developed on the prototype, we were able to use the prototype code, with minimal modification, in the actual product.

The reconfigurable rapid-prototyping system gave us real-time speed, high logic capacity, the flexibility to accommodate different types of components (including DRAM and the ECC ASIC), fast configuration, and the ability to construct and maintain low-cost replicates. Though it is difficult to say how long the design cycle would have been without the reconfigurable prototype, the prototype probably saved us approximately 6 months. Just as encouraging, test coverage increased by three or more orders of magnitude compared to software simulation. In addition, both the ASICs and the software performed to expectations. The 9840 tape drive is now running in many demanding data-storage installations.
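To put "three or more orders of magnitude" in perspective, the sketch below compares clock cycles executed per unit time on the 12-MHz prototype against a software simulator. The article does not give a simulator speed, so the simulator rates are purely hypothetical, and treating executed cycles as a proxy for test coverage is a simplifying assumption.

```python
import math

PROTOTYPE_HZ = 12_000_000   # 12-MHz system speed, from the article


def cycles(rate_hz: int, seconds: int) -> int:
    """Clock cycles executed at rate_hz over the given time."""
    return rate_hz * seconds


def orders_of_magnitude(prototype_hz: float, simulator_hz: float) -> float:
    """Base-10 log of the speed ratio between prototype and simulator."""
    return math.log10(prototype_hz / simulator_hz)


print(f"prototype cycles in one hour: {cycles(PROTOTYPE_HZ, 3600):,}")

# Hypothetical simulator throughputs, in simulated cycles per second.
for sim_hz in (10, 1_000, 12_000):
    gap = orders_of_magnitude(PROTOTYPE_HZ, sim_hz)
    print(f"simulator at {sim_hz:>6,} cycles/s: {gap:.1f} orders of magnitude slower")
```

On this reading, a three-order gap corresponds to a simulator running at about 12,000 cycles per second, and any slower simulator widens the gap further.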


Ben Baron is a senior software engineer at Storagetek Corp. (Louisville, Colo.). Previously, he worked as a Fortran programmer at the National Oceanic and Atmospheric Administration (NOAA).

Vince Dugar is an advisory development engineer at Storagetek. Prior to Storagetek, he worked at NCR.

Randy Fout is an advisory software development engineer in the embedded systems software division at Storagetek.


Copyright © 2000 Integrated System Design Magazine
