United Business Media EE Times



A Standard IP Interface for Custom SOC Designs

Design mandates integration across system blocks and interconnects.

By Suresh Dholakia


In the early 1990s, systems integration was a board-level function, interconnecting a variety of components on one or more printed circuit boards (PCBs), to fashion a complete system such as a computer, a router, a modem, or a network interface controller.

However, by 1998, when Raj Raghavan and I founded Real-Chip Inc. (Sunnyvale, CA), many of these components had evolved into reusable, synthesizable intellectual property (IP) blocks that could be interconnected on a single piece of silicon to create a system on a chip (SOC).

Real-Chip creates custom communications chips on an outsourced basis for system houses, communications industry suppliers, and network start-ups. Our chip designs are targeted for such applications as VoIP Internet telephony, xDSL modems, cable modems, satellite modems, LAN switches, remote-access switches, digital WANs and VPNs, and routers.

As part of our initial business model, we decided to address one of the major problems facing developers - integrating IP one block at a time. Typically, this is a difficult and time-consuming task that includes coping with a variety of incompatible interfaces, which in turn necessitates complex bus interconnect structures. This approach also makes it hard for developers to take advantage of new advances in IP, particularly modules aimed at network communications and the Internet marketplace. Additionally, the difficulty of designing and manufacturing SOCs has widened the gap between what the chip-system companies need and what they can afford to design, develop, test, and manufacture in-house.

A new approach

To fill this void, we set out to develop a standard interface that would let us integrate each IP block efficiently into every custom-designed SOC. This project was especially important to our communications and networking customers.

Figure 1 - The LX4x80 architecture
The LX4180 supports a coprocessor interface and MIPS I instruction set.

We decided to combine a powerful RISC processor with a flexible interconnect fabric to form an efficient, scalable foundation on which to build custom SOCs for our customers. We looked at a variety of processor and interconnect options and decided that Lexra's (Waltham, MA) LX4xxx and LX5xxx RISC processor families and Palmchip's (San Jose, CA) Coreframe architecture both offered excellent options. Both technologies were well suited to integrating Internet-oriented telecom and datacom IP into customer-specific designs, such as SOCs embedded in routers or SOCs that run protocol stacks incorporating new standards such as DiffServ (differentiated levels of service for IP packet streams) and Session Initiation Protocol (SIP).

The LX4180 32-bit embedded RISC processor core that we used in our SOCs is based on the Harvard architecture, with instruction and data memories connected to the CPU core by separate instruction and data buses (see Figure 1). The LX4180 can support different memory subsystem configurations, including a range of sizes for instruction and data memory, and different memory architectures such as cache, RAM, or ROM. Performance includes up to 200 Dhrystone MIPS at 200 MHz in a 0.18-µm process for the single-issue LX4180, and 260 Dhrystone MIPS for the dual-issue LX4280.
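The quoted figures can be normalized to Dhrystone MIPS per MHz to compare the two cores. The short Python sketch below does that arithmetic; it assumes the LX4280 figure is also measured at 200 MHz, which the text does not state explicitly, and the function name is illustrative.

```python
def dmips_per_mhz(dmips: float, clock_mhz: float) -> float:
    """Return Dhrystone MIPS normalized by clock frequency."""
    return dmips / clock_mhz

# Figures quoted in the text. The 200 MHz clock for the LX4280 is an assumption.
lx4180 = dmips_per_mhz(200, 200)   # single-issue core
lx4280 = dmips_per_mhz(260, 200)   # dual-issue core
speedup = lx4280 / lx4180          # relative gain from dual issue

print(f"LX4180: {lx4180:.2f} DMIPS/MHz, LX4280: {lx4280:.2f} DMIPS/MHz")
```

Under that assumption, the dual-issue core delivers roughly 30 percent more Dhrystone throughput per clock than the single-issue core.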

The Lexra processor supports the MIPS I instruction set, which allowed us to take advantage of existing software development tools. Also, the LX4180's support of a coprocessor interface was important, as many of our custom applications need this additional functionality. Although the LX4180 is a RISC processor that executes the MIPS I instruction set and R3000 exception model, the clocking, pipeline structure, pin-out, and memory interfaces have been designed to meet SOC design needs, deep-submicron (DSM) process technology, and recent advances in design methodology.

Creating a super bus

To create a scalable and flexible backplane for our custom SOC designs, we decided to integrate the LX4180's Lexra bus controller (LBC) - a PCI-like bus architecture - with a high-level SOC interconnect fabric. The integration between the LBC and the Coreframe architecture from Palmchip, which involved some tweaking and optimization of the interfaces on our part, went smoothly, aided by an efficient working relationship between the three companies. Cooperation between the technical staff at each company helped speed up the integration process.

Figure 2 - The Lexra bus
The synchronous bus has a pended architecture that helps reduce cache latency.

The LBC connects the Lexra core to system memory and to peripherals such as USB or IEEE-1394 (FireWire) controllers; it can also interface with an external bus (see Figure 2). The bus supports multiple masters, so master I/O controllers with DMA engines can be connected to it with the appropriate arbitration function. The bus has a pended architecture in which a master holds the bus until all the data is transferred, which simplifies the design of user-supplied bus agents and reduces latency for cache-miss servicing.
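The pended, multi-master behavior can be pictured with a small Python model. This is only a sketch: the fixed-priority arbitration policy, the master names, and the beat counts are assumptions made for illustration and are not taken from the Lexra bus specification.

```python
from dataclasses import dataclass

@dataclass
class Request:
    master: str      # name of the requesting bus master (hypothetical)
    priority: int    # lower number wins arbitration (assumed policy)
    beats: int       # bus clocks needed to move all of the data

def schedule(requests: list[Request]) -> list[str]:
    """Return the bus owner on each clock, one entry per clock.

    All requests are assumed to be pending at clock 0. The winner holds
    the bus for all of its beats before the next request is granted.
    """
    timeline = []
    for req in sorted(requests, key=lambda r: r.priority):
        timeline.extend([req.master] * req.beats)
    return timeline

owners = schedule([
    Request("usb_dma", priority=2, beats=2),
    Request("cpu_cache_refill", priority=0, beats=4),
    Request("ethernet_dma", priority=1, beats=3),
])
print(owners)
```

Because no transfer is interrupted once it starts, a bus agent in this model never has to handle a partially completed transaction, which is the simplification the pended architecture is meant to provide.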

The Lexra bus is synchronous, and signals are registered and sampled at the positive edge of the bus clock. Certain logical operations may be applied to the sampled signals and then new signals - such as those used for address decoding - can be driven immediately, yielding same-cycle turnaround. Developers can select either a synchronous or an asynchronous interface between the processor core and the bus; the bus speed can be set to any speed up to twice the core clock frequency. The data bus is 32 bits wide, allowing the bus to transfer a word, a halfword, or a byte in a single bus clock. The Lexra bus can transfer an entire cache line in a single bus transaction by bursting words of data from incremental addresses on successive clock cycles.
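A cache-line burst of this kind can be sketched as a list of word addresses, one per bus clock. In the Python sketch below, the 16-byte line size and the choice to start the burst at the beginning of the line are assumptions; the text gives only the 32-bit bus width and the incrementing-address behavior.

```python
WORD_BYTES = 4  # the data bus is 32 bits wide

def burst_addresses(miss_addr: int, line_bytes: int = 16) -> list[int]:
    """Word addresses for one cache-line burst, one per bus clock.

    line_bytes is an assumed line size, not a documented LX4180 value.
    """
    if line_bytes % WORD_BYTES:
        raise ValueError("line size must be a whole number of words")
    base = miss_addr - (miss_addr % line_bytes)  # align down to the line start
    return [base + i * WORD_BYTES for i in range(line_bytes // WORD_BYTES)]

addrs = burst_addresses(0x1000_0024)
print([hex(a) for a in addrs])
```

With a 16-byte line, a miss anywhere in the line produces a four-clock burst of consecutive word addresses.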

The LBC contains a write buffer. When the core issues a write request to a bus device, the address and data are saved in the buffer and sent to the device at a later point. The core can continue processing while safely assuming that the write will eventually happen. The LBC drives control signals to enable muxes or tri-state buffers, allowing the bus to have either a bi-directional or point-to-point topology.
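The write buffer behaves like a posted-write FIFO. The Python model below is a minimal sketch of that idea; the buffer depth, the stall-when-full behavior, and the class and method names are assumptions for illustration rather than details of the LBC.

```python
from collections import deque

class WriteBuffer:
    """Posted-write FIFO: the core enqueues, the bus drains later."""

    def __init__(self, depth: int = 4):
        self.depth = depth          # assumed depth, not an LBC parameter
        self.pending = deque()

    def post(self, addr: int, data: int) -> bool:
        """Accept a write from the core; False means the core must stall."""
        if len(self.pending) >= self.depth:
            return False
        self.pending.append((addr, data))
        return True

    def drain_one(self, device: dict) -> bool:
        """Send the oldest buffered write to the device when the bus is free."""
        if not self.pending:
            return False
        addr, data = self.pending.popleft()
        device[addr] = data
        return True

buf = WriteBuffer(depth=2)
device = {}
accepted = [buf.post(0x100, 0xAA), buf.post(0x104, 0xBB), buf.post(0x108, 0xCC)]
buf.drain_one(device)
print(accepted, device)
```

Writes leave the buffer in the order they were posted, so the device sees the same sequence the core issued, only later.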

Because our custom SOC designs are complex, the LBC needed to be augmented by an additional interconnect fabric. By integrating the LBC with the Coreframe interconnect architecture, we created a technological base that helped facilitate taking a system approach to chip design.

Coreframe architecture

From a high-level perspective, the Coreframe architecture, which uses point-to-point connections rather than tri-state buses, resembles a system of buses (see Figure 3). The Palmbus is designed for low-speed access from the CPU core to peripheral blocks. The MBus, also from Palmchip, is designed for high-speed accesses to shared memory from the CPU core and peripheral blocks. Coreframe is processor independent.

Figure 3 - Bus view of Coreframe architecture
The architecture includes a CPU subsystem that acts independently from the peripheral blocks.

The CPU and its dedicated memory are shown as part of an independent CPU subsystem. The CPU bridge, MBus controller, Palmbus controller (PBC), and cache are the bridges between the CPU subsystem and the Coreframe architecture. If a design is ported to a different processor, only the interfaces to the CPU subsystem within the CPU bridge need to be changed.

The cache separates the CPU from the Coreframe system and a bridge is used to connect the CPU subsystem with the peripheral and memory subsystems. The Palmbus and MBus are independent parallel buses rather than a hierarchy of buses. Concurrent activity may be achieved on both buses, maximizing available bandwidth resources.

The channel interface is used to integrate peripheral blocks from multiple sources and is key to our approach to SOC design. Once a library of channels is created, DMA requestors - customized to the peripheral's requirements - can be integrated and verified, since the channel library also includes verification stimuli or testbenches.

At the lowest level, the Coreframe interconnect architecture is a collection of point-to-point and broadcast connections. Because of this underlying simplicity, the Palmchip architecture proved useful in meeting the requirements of the Real-Chip design methodology.

The integration process

The integration of the two technologies in a standard interface consisted of connecting the LX4180, a block of CPU local memory, shared memory, and a number of peripheral cores - including Ethernet and USB - to the Coreframe architecture (see Figure 4). Both Lexra and Palmchip provide synthesizable Verilog register transfer level (RTL) source code, a testbench and test vectors, and other files and scripts needed to complete implementation of the product.

The LX4180 includes Verilog RTL source code for the processor core, plus additional files needed to configure, simulate, synthesize, and test the core. Lexra supplies a Perl 5.0 script called Lconfig that allowed us to choose among approximately 25 different configuration options available for the core.

Figure 4 - Real-Chip SOC block diagram
The custom-designed bridge reorders bits and bytes between the two different protocols.

We received Palmchip's product in the form of a soft-core implementation package (the Palm Pak SOC development platform), which includes Verilog RTL source code for various peripherals and interfaces: a memory bus controller, peripheral bus controller, local CPU-memory interface, configurable memory access controller, direct memory access interface, various timers, system-control module, I/O controller, and serial-link interface.

Our first step in the integration flow was to install the Lexra and Palmchip products and to verify installation by simulating each product using test vectors supplied by the vendors. We completed this step quickly and with no problems. Our next step was to configure the LX4180 core and then integrate it with the Coreframe interconnect architecture. We used the Lexra Lconfig tool to specify memory configuration, to select options we wished to include, and to specify settings for the user-configurable parameters such as cache size, read-and-write buffer size, memory access granularity, clock buffer handling, and type of system bus interface.

Based on memory

The memory configuration we specified included 8K bytes each of instruction and data cache, as well as 2K bytes each of instruction RAM and data RAM. Lconfig automatically created the appropriate Lexra memory interface (LMI) blocks for each of these memories and connected them to the LX4180 instruction and data buses.
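The memory configuration above can be written down and totaled in a few lines of Python. This is an illustrative sketch only: the key names are hypothetical and do not reflect Lconfig's actual option names, and the power-of-two check is an assumed sanity rule rather than a documented Lconfig constraint.

```python
KB = 1024

# Sizes taken from the text. The key names are illustrative only.
memory_config = {
    "instruction_cache": 8 * KB,
    "data_cache": 8 * KB,
    "instruction_ram": 2 * KB,
    "data_ram": 2 * KB,
}

def validate(config: dict) -> int:
    """Check each size is a positive power of two; return the total bytes."""
    for name, size in config.items():
        if size <= 0 or size & (size - 1):
            raise ValueError(f"{name}: {size} is not a power of two")
    return sum(config.values())

total_bytes = validate(memory_config)
print(f"Total CPU-side memory: {total_bytes // KB}K bytes")
```

The four memories add up to 20K bytes of storage attached directly to the LX4180 instruction and data buses.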

We also specified an external RAM interface for the LX4180. Our chip-level architecture included a block of CPU local memory that the core accesses through the LBC interface. This memory is separate from the shared SDRAM, SSRAM, and ROM that resides on the MBus. Although the LX4180 uses the shared memory for loading programs during system start-up, it uses the CPU cache and local memory during normal operation (greater than 90 percent of the time). By providing dedicated CPU local memory, we avoided the performance impact of forcing the LX4180 to continuously compete with the peripherals for access to shared memory.

Lexra offers two options for the interface between the processor's buses and the system bus. The LBC uses a PCI-like protocol to interface to the rest of the system, whereas the CBus provides a very simple interface directly to the processor buses. We chose the LBC interface even though it added one cycle of latency to our design, because the PCI protocol is familiar to our design team and allowed us to complete the integration quickly as compared to designing our own protocol.

We chose not to include the LX4180's optional hardware multiply-accumulate (MAC) unit, which improves performance for signal-processing code, because the LX4180 functions purely as a microcontroller in our design. We did include the optional EJTAG debug interface, which provides access to the processor's internal status and supports software debugging once the core has been embedded in an SOC.

Using the EJTAG interface for on-chip software debug requires a debugger application and an EJTAG probe, such as those supplied by Embedded Performance, Inc. (Milpitas, CA) and Green Hills Software, Inc. (Santa Barbara, CA). A host computer running the debugger communicates with the EJTAG probe through a serial port, a parallel port, or an Ethernet connection. The probe, in turn, communicates with the LX4180 EJTAG interface through a set of pins on the SOC that are part of the IEEE 1149.1 JTAG interface. We use the EJTAG interface to provide debug commands to the LX4180 - for example, to enter single-step mode and monitor the execution of each instruction. Through the EJTAG interface, we're able to access the processor's status, including register contents, program counter address, and settings of various internal status and condition codes. We are also able to set hardware breakpoints on the instruction cache address, data cache address, and LBUS address and data, and directly probe system memory and I/O devices.

We also decided to configure the core for scan insertion and specified four scan chains. We used Syntest's (Sunnyvale, CA) DFT and ATPG tool suites to synthesize the scan chains and to perform automatic test pattern generation (ATPG) and fault simulation. Once we completed our selections for the LX4180, we ran Lconfig to generate the RTL description for our configuration. In addition to generating the RTL description, Lconfig checks for invalid configurations, creates the underlying symbol file needed for simulation and synthesis, and creates a variety of Verilog files for regression testing in the simulation environment. Lconfig can be set to generate behavioral RAM models for memory or set to use actual memory descriptions generated by a memory compiler. We use the behavioral RAM models for fast simulation and the memory compiler descriptions - which will then be used to fabricate the actual chip - for more thorough verification.

Further integration between the LX4180 and the Coreframe architecture required us to design a CPU bridge to connect the LBC to the Coreframe MBus and Palmbus. The bridge handles conversions between the LBC and Coreframe protocols. For example, it performs reordering to account for the fact that the LX4180 uses "big endian" bit-and-byte ordering, whereas the Coreframe architecture assumes "little endian," which is the opposite ordering. In designing the bridge, we considered the maximum future requirements for peripheral and memory blocks that we may later add to support our customers' communications applications.
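The byte-reordering part of that conversion can be illustrated in Python. The sketch below swaps the byte order of a 32-bit word and is meant only to show the idea; the actual bridge is hardware, its bit-level reordering is not modeled here, and the function name is hypothetical.

```python
def swap_bytes32(word: int) -> int:
    """Reverse the byte order of a 32-bit word."""
    return int.from_bytes((word & 0xFFFFFFFF).to_bytes(4, "big"), "little")

big_endian_word = 0x11223344
little_endian_word = swap_bytes32(big_endian_word)
print(hex(little_endian_word))
```

Applying the swap twice returns the original word, so the same logic can serve traffic in both directions across a bridge.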

We connected every peripheral block in our SOC that required access to shared memory to the MBus. We also connected all of the peripheral blocks to the Palmbus, which handles control and status.
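The connection rule can be summarized as a small lookup in Python. This is a sketch under stated assumptions: which peripherals need shared memory is not spelled out in the text, so the flags below (Ethernet and USB using shared memory, a timer not using it) are illustrative guesses.

```python
def buses_for(needs_shared_memory: bool) -> list[str]:
    """Every peripheral gets the Palmbus; shared-memory users also get the MBus."""
    buses = ["Palmbus"]            # control and status path
    if needs_shared_memory:
        buses.append("MBus")       # high-speed path to shared memory
    return buses

# Hypothetical flags: the text does not say which blocks use shared memory.
peripherals = {"ethernet": True, "usb": True, "timer": False}
connections = {name: buses_for(flag) for name, flag in peripherals.items()}
print(connections)
```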

This integration required creating a very small amount of interface logic and was a straightforward process.

Once the integration was complete, we ran extensive functional simulations to verify that all components had been connected and were operating correctly. For this level of testing, we used Synopsys' (Mountain View, CA) VCS Verilog simulator and a bus transaction model supplied by Lexra. The model allowed us to directly generate all of the possible LBC bus cycles without using the LX4180, which would have required many more simulation cycles.
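The idea of driving every bus cycle type directly can be sketched in Python by enumerating combinations. The operation and size lists below are assumptions based on the transfer types mentioned earlier (byte, halfword, word, and cache-line burst); they are not the cycle list of the Lexra bus transaction model, whose interface is not described here.

```python
from itertools import product

# Assumed cycle dimensions, drawn from the transfer types described earlier.
OPERATIONS = ("read", "write")
SIZES = ("byte", "halfword", "word", "line_burst")

def all_bus_cycles() -> list[dict]:
    """Enumerate every operation/size combination as a test stimulus."""
    return [{"op": op, "size": size} for op, size in product(OPERATIONS, SIZES)]

cycles = all_bus_cycles()
for cycle in cycles:
    print(cycle["op"], cycle["size"])
```

Generating stimulus this way exercises each cycle type once without executing processor instructions, which is why a transaction model needs far fewer simulation cycles than running code on the core.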

Our SOC will first be manufactured on a 0.18-µm foundry process. We plan to run the processor, the MBus, and the Palmbus at the same speed - in excess of 150 MHz. For SOC implementation, we used Design Compiler for synthesis and Avant!'s (Fremont, CA) Apollo, Star-RC, and Hercules tools for physical design.

From the beginning of the program to final verification, the integration of the LX4180 and the Coreframe architecture took us about three months. Much of that time was devoted to designing and verifying our SOC micro-architecture. Given that we had several engineers working on the project, this translates to approximately nine engineer-months to design, code, and verify the bridge and interface to the various peripherals.

A flexible fabric

By integrating the system-level building blocks with the interconnect architecture, we believe we achieved our goal of making the configuration as flexible and scalable as possible to meet the needs of a variety of new and existing communications applications. Given that the pace of IP development continues to accelerate, this approach will allow us to build a wide range of custom SOCs for the Internet-based communications marketplace.

Frequently, at this level of system integration, we run into new challenges, especially at the DSM level. We have to perform a wide range of verification testing - including extensive regression testing - at both the system and the silicon level. This is accomplished using a full suite of verification testing tools and gives us the ability to access and isolate major functional blocks within the chip. Therefore we don't have to redesign the entire SOC if an error is detected in one of the IP blocks.

Our systems integration approach to SOC design and the use of third-party technologies act as a foundation upon which to build future custom SOCs with DSP processors, while multiple IP design macros provide a reusable architecture for target applications. We also have the ability to use plug-and-play IP modules to simplify the development process and help our customers achieve shorter time to market for their products.


Suresh Dholakia is a senior vice president and co-founder of Real-Chip, Inc. (Sunnyvale, CA). He co-founded Silicon Automation Systems and Frontline Design Automation. Suresh has over 17 years of experience in semiconductor, VLSI/ASIC design, and IC design automation.

To voice an opinion on this or any other article in Integrated System Design, please e-mail your comments to mikem@isdmag.com. Send electronic versions of press releases to news@isdmag.com.
For more information about isdmag.com e-mail webmaster@isdmag.com
Comments on our editorial are welcome.
Copyright © 2000 Integrated System Design Magazine

