United Business Media EE Times



Speedy processor runs on low power

EE Times


When designing high-speed parallel processors for compute-intensive applications, high performance and low power must be designed in from the beginning. With this in mind, ClearSpeed Technology developed the CS301, a floating-point processor that achieves upward of 25-gigaflop performance. This 40 million-transistor, 0.13-micron chip supports multi-Vdd and dissipates less than 1.8 watts typical. With experienced designers, clear design goals and an integrated system for concurrent analysis and implementation, the company took the CS301 from concept to working parts in less than five months.
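As a rough illustration of what those headline numbers imply, the short sketch below turns the quoted figures into a performance-per-watt estimate. The function name and constants are illustrative only; because the article gives "upward of 25" gigaflops and "less than 1.8" watts, the result is best read as an approximate lower bound rather than a measured value.

```python
# Figures quoted in the article; treated here as bounds, not measurements.
PEAK_GFLOPS = 25.0     # "upward of 25-gigaflop performance"
TYPICAL_WATTS = 1.8    # "less than 1.8 watts typical"


def gflops_per_watt(gflops: float, watts: float) -> float:
    """Return compute efficiency in GFLOPS per watt."""
    if watts <= 0:
        raise ValueError("watts must be positive")
    return gflops / watts


efficiency = gflops_per_watt(PEAK_GFLOPS, TYPICAL_WATTS)
print(f"at least {efficiency:.1f} GFLOPS/W")
```

With the quoted figures this works out to roughly 13.9 GFLOPS per watt.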

ClearSpeed develops and delivers parallel microprocessors optimized for processing scientific calculations and computationally intensive data. These coprocessors offer a combination of high compute performance, C programmability and extremely low power requirements.

ClearSpeed set out to create the CS301 chip to address applications such as computational biology, drug discovery and nanotechnology. The advanced algorithms used on the CS301 enable accurate simulations of large molecules on smaller platforms, dramatically improving the efficiency and turnaround time of the entire scientific process.

To achieve a low-power design, the CS301 development team combined a carefully envisioned architecture, as seen from the top-down view, with the practical engineering detail built from the bottom up. It was a matter of taking everything back to basic principles and finding a solution that looked good when viewed from the top or the bottom. Optimizing the architecture produced the greatest gains, but it's the detail that ultimately determined which approach was best.

Probably the single most important design target for the CS301 was to minimize the number of times information had to be moved, and to move it efficiently when it did. This basic approach was woven into both the architecture and the implementation, from the on-chip network (which has a very simple control structure that allows distributed arbitration and clock gating) to the fundamental structure of the multithreaded array processor. Instead of centralizing control for decision making and processing in a single unit, as a typical microprocessor does, local units make their own decisions, where possible, about what processing is required. This restricts the flow of data, control and clock signals to the units that are actually needed to implement the required functionality.
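The effect of letting local units decide whether they need a clock can be pictured with a toy model. The sketch below is not the CS301's mechanism; the `ProcessingElement` class, the eight-unit array and the workload are all hypothetical, chosen only to show how gating idle units cuts the number of clock edges delivered across the array.

```python
from dataclasses import dataclass


@dataclass
class ProcessingElement:
    """Hypothetical unit that decides locally whether it needs a clock."""
    has_work: bool
    clock_edges: int = 0

    def tick(self, gating_enabled: bool) -> None:
        # With gating enabled, an idle unit receives no clock edge at all.
        if gating_enabled and not self.has_work:
            return
        self.clock_edges += 1


def total_clock_edges(work_flags, cycles, gating_enabled):
    """Count clock edges delivered to all units over a number of cycles."""
    units = [ProcessingElement(has_work=w) for w in work_flags]
    for _ in range(cycles):
        for unit in units:
            unit.tick(gating_enabled)
    return sum(unit.clock_edges for unit in units)


# Illustrative workload: 2 of 8 units are busy for 100 cycles.
flags = [True, True] + [False] * 6
ungated_edges = total_clock_edges(flags, 100, gating_enabled=False)
gated_edges = total_clock_edges(flags, 100, gating_enabled=True)
print(ungated_edges, gated_edges)  # 800 200
```

In this made-up workload, gating removes three quarters of the clock activity; the real saving depends entirely on how often units are idle.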

The replicated processing element played a fundamental part in achieving both the performance and the power efficiency. The microarchitecture was critical, as each decision, from control coding and distribution through to the detail of the compute elements, needed to be evaluated for efficiency. There is no trick to power-efficient design other than making sure the team understands the goal and is inspired to sweat the details and find an optimal solution. The only shortcuts in this design were based on experience and sound engineering principles plus an integrated design environment that provided fast feedback and predictability throughout the flow.

For ClearSpeed, achieving its performance and power goals meant stripping out complexity. Finding low-transistor-count solutions to each aspect of the design allowed the team to reduce the area of each component, reducing capacitance locally and, as a by-product, reducing the capacitance associated with the global control and data flow. An essential requirement of the company's approach was the ability to rapidly take new ideas through to finished layout and to validate expectations. Some ideas looked elegant as RTL but turned out to be inefficient when realized in silicon through a semicustom flow. The ideal flow needed not only to give rapid closure but also to allow the company's engineers to understand the result and modify their design strategies to work with the tools.
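The link between capacitance and power can be made explicit with the standard first-order model of CMOS dynamic power. The article does not state this formula; it is included here as general background, with the usual symbol names assumed.

```latex
P_{\mathrm{dyn}} \approx \alpha \, C \, V_{dd}^{2} \, f
```

Here \alpha is the switching activity factor, C is the switched capacitance, V_{dd} is the supply voltage and f is the clock frequency. Under this model, lowering C by shrinking each component reduces power in direct proportion, and because the supply voltage enters as a square, a multi-Vdd design that runs some blocks at a lower voltage gains more than a proportional saving.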

Originally, the company's engineers had tried using a conventional point-tool IC design flow for the CS301, but various problems caused the team to abandon that method. Timing, signal and power integrity, and routing issues prevented it from achieving design closure. The designers suspected these problems were a result of poor initial placement. The team believed that its point-tool flow was not addressing all of the issues concurrently as was needed. In addition, the point-tool flow provided no feedback, so identifying the causes of the problems was impossible.

The development team adopted a new design flow from Magma Design Automation Inc. (Santa Clara, Calif.). With Magma's Blast Fusion APX, Blast Noise and Blast Rail, the team had an integrated flow that addressed timing, signal and power integrity, and routing issues concurrently throughout the flow. This correct-by-construction approach delivered better placement and provided insight into the design that allowed the team to reduce power significantly. With Magma's system, the team's engineers could accurately and efficiently perform timing-vs.-power and area-vs.-power trade-offs at different stages of the design flow.

Critical feedback came from using simple techniques such as graphically mapping design placement in Blast Fusion and being able to relate this to similar maps from Blast Rail that showed rail integrity and power density. With this, the team was able to tune the architecture and logic design to optimize the physical design and, in consequence, the power efficiency of the device.

The development team was able to easily identify timing, signal integrity and power problems early and to rework the design at a high level to improve the results, something it was unable to do with other software. With the visibility into the design that the Magma system provided, the engineers in charge of developing the CS301 were always confident that they would be able to achieve the team's design objectives.

Russell David is vice president of engineering at ClearSpeed Technology Inc. (Los Gatos, Calif.).

See related chart
Rail integrity and power density maps from Blast Rail were used to tune the CS301's architecture and physical design to reduce power.
Source: ClearSpeed Technology Inc.






All materials on this site Copyright © 2010 TechInsights, a Division of United Business Media LLC All rights reserved.