United Business Media EE Times




Rethinking address modes

EEdesign.com


There is a perennial question in the architectural community about hardware accelerators: where do you put them? One school of thought says that an accelerator should be left alone to do its work in peace. Another school says to attach the hardware accelerator to the system memory bus in some way, often so that it can depend on the host CPU for address generation and exception handling. This school dates back to the days of the Intel floating-point coprocessors in the chip world, and reaches a good deal further back than that into the history of minicomputer and mainframe design.

Yet another school, recently championed by a legion of extensible-instruction-set RISC engines, VLIW engines and processor arrays, says that the place to put acceleration is inside the CPU core, where it can live on the internal datapaths and access CPU resources such as register files.

There is one axis along which this debate can be usefully divided. The process starts with examining the organization of the data that is to be processed.

In these days of cheap disk drives and desktop PCs, a lot of people don't realize that there are different ways to organize data. They assume, without thinking, that all data sets are inherently random-access. Nothing could be further from a useful generalization.

Some data sets -- look-up tables, for example -- are in fact inherently random-access. You want to put them in RAM and leave them there. Other data sets are inherently sequential. Invariably, the next data element you need will be the next one in the file.

Notice that some apparently random data sets can be made sequential by regrouping the data. A stream of header, payload and control fields in a communications protocol, for example, can appear nearly random. On consecutive operations you are usually not accessing consecutive fields. But group the fields into packets, and the data set becomes sequential. You just don't always look at every field in each packet. Hence, a data set may be sequential to one task and random to another, or sequential when grouped one way and random when grouped another.
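As a rough illustration of that regrouping, here is a minimal Python sketch. The three-field packet layout and the names Packet and group_into_packets are assumptions made up for the example; they do not come from any particular protocol.

```python
from collections import namedtuple

# Hypothetical packet layout, assumed purely for illustration.
Packet = namedtuple("Packet", ["header", "payload", "control"])


def group_into_packets(fields):
    """Regroup a flat stream of fields into whole packets of three fields each."""
    if len(fields) % 3 != 0:
        raise ValueError("field stream does not contain whole packets")
    return [Packet(*fields[i:i + 3]) for i in range(0, len(fields), 3)]


flat = ["h0", "p0", "c0", "h1", "p1", "c1", "h2", "p2", "c2"]

# Field by field, a task that reads only headers has to skip around
# the stream: it touches positions 0, 3 and 6.
header_indices = [i for i in range(len(flat)) if i % 3 == 0]

# Packet by packet, the same task simply takes the next packet each time
# and ignores the fields it does not need.
packets = group_into_packets(flat)
headers = [p.header for p in packets]
```

The data is the same in both views. Only the grouping changes, and with it whether the task has to say where the next item lives.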

The point behind all this is not to encourage everyone to go back and review file access courses. It is to save energy. Increasingly, the largest component of energy consumption in many systems is bus activity: generating addresses, gating them onto a bus and transferring data. Notice that even a quick description of a bus has inherent in it the assumption of random access.

Sequentially accessed data sets do not require address generation. The only signal necessary to uniquely specify the next record you need is a "next, please." In general, such data sets should be kept out of RAM and off of buses. Otherwise you consume a great deal of energy, and often time, for a degree of specification entropy you will never use.
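A back-of-the-envelope way to see the cost of specifying an address is to count the bits each style of access has to drive. The sketch below is only a counting exercise under simple assumptions (one full address per random read, one "next" strobe per sequential read). It is not an energy model, and the function names are invented for the example.

```python
def address_bits(num_records):
    """Bits needed to name one record uniquely among num_records."""
    if num_records < 1:
        raise ValueError("need at least one record")
    # (n - 1).bit_length() is an exact integer ceil(log2(n)) for n >= 2.
    return max(1, (num_records - 1).bit_length())


def random_access_bits(num_records, reads):
    """Address bits driven when every read must carry a full address."""
    return reads * address_bits(num_records)


def sequential_bits(reads):
    """Control bits driven when every read is a single 'next' strobe."""
    return reads


records = 1024
print(address_bits(records))                 # bits per random read
print(random_access_bits(records, records))  # one pass, random-access style
print(sequential_bits(records))              # one pass, sequential style
```

Under these assumptions a single pass over 1,024 records drives ten address bits per read in the random-access case against one strobe per read in the sequential case, and the gap widens as the data set grows.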

That brings us back to accelerators. If you are accelerating a transform on an inherently random data set, then you will have to generate addresses. Whether you generate those addresses in dedicated hardware -- as in some advanced DSP cores and some coprocessors -- or under software control inside a CPU core is a matter for further analysis. You'd need to examine the source of the data from which the addresses are derived and the complexity of the address generation task.
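For contrast, here is a small Python sketch of a transform on an inherently random data set: a look-up table in which each input sample is itself the address. The table contents and the name lookup_transform are assumptions chosen only to make the example concrete.

```python
# Hypothetical 16-entry look-up table; the contents are arbitrary.
TABLE = [x * x for x in range(16)]


def lookup_transform(samples):
    """Map each sample through the table, using the sample as the address."""
    out = []
    for s in samples:
        if not 0 <= s < len(TABLE):
            raise ValueError("sample out of table range")
        # The address comes from the data itself, so it cannot be
        # predicted from the previous access.
        out.append(TABLE[s])
    return out


result = lookup_transform([3, 15, 0, 7])
```

Here address generation is trivial because the sample is the address. In other transforms the address may need arithmetic on several values, which is the kind of complexity that decides between dedicated hardware and software.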

But if the data set is sequential for the task at hand, the last thing you want to do is hang your accelerator on a bus that will demand address generation or, even worse, bury it in a CPU that will actually consume instructions and clock cycles generating the useless addresses. You want to exploit the sequential nature of the data by creating an address-free path between the data set and the accelerator. This is probably true even if some random operations are necessary and the sequential nature of the accelerator complicates them.
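In software terms, an address-free path looks something like a FIFO between producer and consumer. The following Python sketch is an analogy only: the Fifo class stands in for a hardware queue, and the running-sum "accelerator" is a hypothetical workload picked for simplicity.

```python
from collections import deque


class Fifo:
    """A tiny first-in, first-out queue standing in for a hardware FIFO."""

    def __init__(self):
        self._items = deque()

    def push(self, item):
        self._items.append(item)

    def next(self):
        """The 'next, please' signal: hand over the oldest item."""
        return self._items.popleft()

    def empty(self):
        return not self._items


def accelerator(fifo):
    """Hypothetical accelerator: a running sum over the incoming stream.

    It only ever asks for the next record and never sees an address.
    """
    total = 0
    results = []
    while not fifo.empty():
        total += fifo.next()
        results.append(total)
    return results


path = Fifo()
for sample in [1, 2, 3, 4]:
    path.push(sample)

running_sums = accelerator(path)
```

Nothing on the consumer side computes or carries an index. If an occasional random access is needed, it has to be handled outside this path, which is the complication the paragraph above concedes.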

This consideration should be useful in deciding just where to put the accelerator. It might also help in starting that even more important task, architecting the memory system.

Ron Wilson is editorial director of ISD Magazine and a contributor to EE Times. He has covered chip-related matters for 15 years for various industry publications, and was once, in the distant past, a designer himself.





The views and opinions expressed in this column are strictly those of the author and should not be taken as an editorial position of EE Times or any of its other editors, publications or Web sites.


All materials on this site Copyright © 2009 TechInsights, a Division of United Business Media LLC All rights reserved.